This document describes the basic features of Chaos Mesh, including fault injection, Chaos workflows, visualized operations, and security guarantees.
Fault injection is at the core of Chaos experiments. Chaos Mesh covers a wide range of faults that might occur in a distributed system and groups them into three fine-grained categories: basic resource faults, platform faults, and application-layer faults.
- Basic resource faults:
  - PodChaos: simulates Pod failures, such as Pod restarts, persistent Pod unavailability, and failures of specific containers in a Pod.
  - NetworkChaos: simulates network failures, such as network latency, packet loss, packet reordering, and network partitions.
  - DNSChaos: simulates DNS failures, such as a domain name that fails to resolve or a wrong IP address being returned.
  - HTTPChaos: simulates HTTP communication failures, such as HTTP communication latency.
  - StressChaos: simulates CPU or memory contention.
  - IOChaos: simulates I/O failures on an application's files, such as I/O delays and read or write failures.
  - TimeChaos: simulates time jumps (clock skew).
  - KernelChaos: simulates kernel failures, such as failed memory allocation for an application.
- Platform faults:
- Application-layer faults:
  - JVMChaos: simulates JVM application failures, such as function call delays.
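
To make the fault types above more concrete, here is a minimal sketch of what a NetworkChaos experiment that injects latency might look like. The resource name, namespaces, label selector, and timing values are placeholders invented for this example; verify the field names against the NetworkChaos reference for your Chaos Mesh version before applying it.

```yaml
# Sketch only: names, namespaces, and labels are hypothetical.
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-delay-example   # hypothetical experiment name
  namespace: chaos-testing      # assumed namespace for the experiment object
spec:
  action: delay                 # fault to inject: network latency
  mode: one                     # pick one Pod among those matched by the selector
  selector:
    namespaces:
      - default                 # assumed namespace of the target application
    labelSelectors:
      'app': 'web-show'         # hypothetical application label
  delay:
    latency: '10ms'             # added latency per packet
    jitter: '0ms'
    correlation: '100'
  duration: '30s'               # the fault is removed after this period
```

Other fault types follow the same overall shape: a `kind` that names the fault, a `selector` that defines the scope, and fault-specific parameters.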
A Chaos workflow includes a set of Chaos experiments and an application status check, so you can complete the entire process of a Chaos engineering project on the platform.
Chaos workflows enable you to perform a series of Chaos experiments while gradually expanding the blast radius (including the scope of attacks) and adding failure types. After running a Chaos workflow, you can view the current state of the application using Chaos Mesh and determine whether to perform follow-up experiments. To reduce the cost of maintaining Chaos workflows, you can also keep refining and accumulating workflows over time and reuse existing experiments in other workflows.
Currently, Chaos workflows provide the following features:
- Orchestrate serial Chaos experiments
- Orchestrate parallel Chaos experiments
- Support checking experiment status and results
- Support pausing a Chaos experiment
- Support using YAML files to define and manage Chaos workflows
- Support using the web UI to define and manage Chaos workflows
For the configuration of a specific workflow, see Create Chaos Mesh workflow.
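
As a rough illustration of serial orchestration defined in YAML, the sketch below runs a CPU stress experiment, waits briefly, and then injects network latency. All template names, labels, and deadlines are hypothetical values chosen for this example, and the exact schema should be checked against the workflow documentation for your Chaos Mesh version.

```yaml
# Sketch only: template names, labels, and deadlines are hypothetical.
apiVersion: chaos-mesh.org/v1alpha1
kind: Workflow
metadata:
  name: serial-workflow-example
spec:
  entry: serial-entry              # template where execution starts
  templates:
    - name: serial-entry
      templateType: Serial         # children run one after another
      deadline: 240s
      children:
        - stress-step
        - pause-step
        - network-step
    - name: stress-step
      templateType: StressChaos
      deadline: 20s
      stressChaos:
        mode: one
        selector:
          labelSelectors:
            'app': 'hello-kubernetes'   # hypothetical target label
        stressors:
          cpu:
            workers: 1
            load: 20
    - name: pause-step
      templateType: Suspend        # wait before the next experiment
      deadline: 10s
    - name: network-step
      templateType: NetworkChaos
      deadline: 20s
      networkChaos:
        action: delay
        mode: all
        selector:
          labelSelectors:
            'app': 'hello-kubernetes'
        delay:
          latency: '90ms'
```

Changing the entry template's `templateType` to `Parallel` is the usual way to run the same children concurrently instead of in sequence.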
Chaos Mesh provides the Chaos Dashboard component for visualized operations, which greatly simplifies Chaos experiments. You can manage and monitor a Chaos experiment directly through the visual interface. For example, with a few clicks, you can define the scope of a Chaos experiment, specify the type of Chaos injection, define scheduling rules, and view the results of the experiment.
Chaos Mesh manages permissions using the native RBAC feature in Kubernetes.
You can create multiple roles based on your actual permission requirements, bind the roles to a user's service account, and then generate a token for that service account. When you log in to the Dashboard using this token, you can perform only the Chaos experiments that the service account's permissions allow.
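
The role, binding, and service account described above could be sketched with standard Kubernetes RBAC objects as follows. The account name, role name, namespace, and the exact rule set are assumptions made for this example; tailor the rules to the permissions you actually want to grant.

```yaml
# Sketch only: names, namespace, and rules are hypothetical.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: chaos-operator-example    # hypothetical account used to log in
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: chaos-manager-example     # hypothetical role
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["chaos-mesh.org"]  # Chaos Mesh custom resources
    resources: ["*"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: chaos-manager-binding-example
  namespace: default
subjects:
  - kind: ServiceAccount
    name: chaos-operator-example
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: chaos-manager-example
```

Because this uses a namespaced `Role` rather than a `ClusterRole`, the resulting token is limited to experiments in the `default` namespace. On Kubernetes 1.24 and later, a token for the account can typically be requested with `kubectl create token chaos-operator-example -n default`.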
In addition, you can restrict Chaos experiments to specific namespaces by setting namespace annotations, which further limits where experiments can run.
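
A namespace opted in to Chaos experiments might look like the sketch below. It assumes that namespace filtering has been enabled in the Chaos Mesh controller and that the annotation key is `chaos-mesh.org/inject`; both points, and the namespace name, should be confirmed against the documentation for your installed version.

```yaml
# Sketch only: assumes namespace filtering is enabled in the controller.
apiVersion: v1
kind: Namespace
metadata:
  name: demo-app                   # hypothetical namespace
  annotations:
    chaos-mesh.org/inject: enabled # assumed annotation that allows experiments here
```

With filtering enabled, namespaces that lack the annotation would be excluded from fault injection, even for users whose roles otherwise permit it.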