Version: 2.0.2

Create Chaos Mesh Workflow

Introduction to Chaos Mesh Workflow

When you use Chaos Mesh to simulate real system faults, you often need continuous validation. You might want to build a series of faults on the Chaos Mesh platform instead of performing individual Chaos injections.

To meet this need, Chaos Mesh provides Chaos Mesh Workflow, a built-in workflow engine. Using this engine, you can run different Chaos experiments in serial or in parallel to simulate production-level errors.

Currently, Chaos Mesh Workflow supports the following features:

  • Serial Orchestration
  • Parallel Orchestration
  • Customized tasks
  • Conditional branch

Typical user scenarios:

  • Use parallel orchestration to inject multiple NetworkChaos faults to simulate complex network environments.
  • Use serial orchestration to perform health checks and use the conditional branch to determine whether to perform the remaining steps.

The design of Chaos Mesh Workflow is, to some extent, inspired by Argo Workflows. If you are familiar with Argo Workflows, you can also quickly get started with Chaos Mesh Workflow.

More workflow examples are available in the Chaos Mesh GitHub repository.

Create a workflow using a YAML file and kubectl

Like the various types of Chaos objects, workflows exist in a Kubernetes cluster as custom resources defined by a CRD. You can create a Chaos Mesh workflow using kubectl create -f <workflow.yaml>. For example, to create a workflow using a local YAML file:

kubectl create -f <workflow.yaml>

Create a workflow using a YAML file from the network:

kubectl create -f https://raw.githubusercontent.com/chaos-mesh/chaos-mesh/master/examples/workflow/serial.yaml

A simple workflow YAML file is defined as follows. In this workflow, StressChaos, NetworkChaos, and PodChaos are injected:

apiVersion: chaos-mesh.org/v1alpha1
kind: Workflow
metadata:
  name: try-workflow-parallel
spec:
  entry: the-entry
  templates:
    - name: the-entry
      templateType: Parallel
      deadline: 240s
      children:
        # Each child must match the name of a template defined below.
        - workflow-stress-chaos
        - workflow-network-chaos
        - workflow-pod-chaos-schedule
    - name: workflow-network-chaos
      templateType: NetworkChaos
      deadline: 20s
      networkChaos:
        direction: to
        action: delay
        mode: all
        selector:
          labelSelectors:
            "app": "hello-kubernetes"
        delay:
          latency: "90ms"
          correlation: "25"
          jitter: "90ms"
    - name: workflow-pod-chaos-schedule
      templateType: Schedule
      deadline: 40s
      schedule:
        schedule: "@every 2s"
        podChaos:
          action: pod-kill
          mode: one
          selector:
            labelSelectors:
              "app": "hello-kubernetes"
    - name: workflow-stress-chaos
      templateType: StressChaos
      deadline: 20s
      stressChaos:
        mode: one
        selector:
          labelSelectors:
            "app": "hello-kubernetes"
        stressors:
          cpu:
            workers: 1
            load: 20
            options: ["--cpu 1", "--timeout 600"]

In the above YAML template, the templates field defines the steps of the experiment. The entry field names the template where execution of the workflow starts.

Each element in templates represents a workflow step. For example:

name: the-entry
templateType: Parallel
deadline: 240s
children:
  - workflow-stress-chaos
  - workflow-network-chaos
  - workflow-pod-chaos-schedule

templateType: Parallel means that the node type is parallel. deadline: 240s means that all parallel experiments on this node are expected to finish within 240 seconds; otherwise, the experiments time out. children lists the names of the other templates to be executed in parallel.

For example:

name: workflow-pod-chaos
templateType: PodChaos
deadline: 40s
podChaos:
  action: pod-kill
  mode: one
  selector:
    labelSelectors:
      'app': 'hello-kubernetes'

templateType: PodChaos means that the node is a PodChaos experiment. deadline: 40s means that the current Chaos experiment lasts for 40 seconds. podChaos is the definition of the PodChaos experiment.

Creating a workflow using a YAML file and kubectl is flexible. You can nest parallel or serial orchestrations to declare complex orchestrations, and even combine the orchestration with conditional branches to achieve a loop effect.
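
As a minimal sketch of nesting, the following workflow runs a Parallel step inside a Serial step. The workflow name and the template names serial-entry, parallel-faults, and cool-down are hypothetical; the Chaos definitions reuse the selectors from the example above, and the deadlines are illustrative values only.

```yaml
apiVersion: chaos-mesh.org/v1alpha1
kind: Workflow
metadata:
  name: try-workflow-nested   # hypothetical name
spec:
  entry: serial-entry
  templates:
    # Outer step: run the children one after another.
    - name: serial-entry
      templateType: Serial
      deadline: 240s
      children:
        - parallel-faults
        - cool-down
        - workflow-pod-chaos
    # Inner step: run both faults at the same time.
    - name: parallel-faults
      templateType: Parallel
      deadline: 60s
      children:
        - workflow-stress-chaos
        - workflow-network-chaos
    # Wait for a while before the last step.
    - name: cool-down
      templateType: Suspend
      deadline: 10s
    - name: workflow-stress-chaos
      templateType: StressChaos
      deadline: 20s
      stressChaos:
        mode: one
        selector:
          labelSelectors:
            "app": "hello-kubernetes"
        stressors:
          cpu:
            workers: 1
            load: 20
    - name: workflow-network-chaos
      templateType: NetworkChaos
      deadline: 20s
      networkChaos:
        direction: to
        action: delay
        mode: all
        selector:
          labelSelectors:
            "app": "hello-kubernetes"
        delay:
          latency: "90ms"
          correlation: "25"
          jitter: "90ms"
    - name: workflow-pod-chaos
      templateType: PodChaos
      deadline: 40s
      podChaos:
        action: pod-kill
        mode: one
        selector:
          labelSelectors:
            "app": "hello-kubernetes"
```

Because a Serial or Parallel template refers to its children only by name, the same Chaos template can be reused from several orchestration steps.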

Field description

Workflow field description

| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| entry | string | Declares the entry of the workflow. Its value is the name of a template. | None | Yes | |
| templates | []Template | Declares the behavior of each step executable in the workflow. See Template field description for details. | None | Yes | |
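
To illustrate how the two fields relate, here is a minimal sketch of a workflow whose entry points at its only template. The names minimal-workflow and entry-point are hypothetical, and the Suspend step is used only because it needs no Chaos definition.

```yaml
apiVersion: chaos-mesh.org/v1alpha1
kind: Workflow
metadata:
  name: minimal-workflow   # hypothetical name
spec:
  # entry must be the name of one of the templates below.
  entry: entry-point
  templates:
    - name: entry-point
      templateType: Suspend
      deadline: 10s
```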

Template field description

| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| name | string | The name of the template, which needs to meet the DNS-1123 requirements. | None | Yes | any-name |
| templateType | string | Type of template. Value options are Task, Serial, Parallel, Suspend, Schedule, AwsChaos, DNSChaos, GcpChaos, HTTPChaos, IOChaos, JVMChaos, KernelChaos, NetworkChaos, PodChaos, StressChaos, and TimeChaos. | None | Yes | PodChaos |
| deadline | string | The duration of the template. | None | No | '5m30s' |
| children | []string | Declares the subtasks under this template. You need to configure this field when the type is Serial or Parallel. | None | No | ["any-chaos-1", "another-serial-2", "any-schedule"] |
| task | Task | Configures the customized task. You need to configure this field when the type is Task. See the Task field description for details. | None | No | |
| conditionalBranches | []ConditionalBranch | Configures the conditional branches that are executed after the customized task. You need to configure this field when the type is Task. See the Conditional branch field description for details. | None | No | |
| awsChaos | object | Configures AwsChaos. You need to configure this field when the type is AwsChaos. See the Simulate AWS Faults document for details. | None | No | |
| dnsChaos | object | Configures DNSChaos. You need to configure this field when the type is DNSChaos. See the Simulate DNS Faults document for details. | None | No | |
| gcpChaos | object | Configures GcpChaos. You need to configure this field when the type is GcpChaos. See the Simulate GCP Faults document for details. | None | No | |
| httpChaos | object | Configures HTTPChaos. You need to configure this field when the type is HTTPChaos. See the Simulate HTTP Faults document for details. | None | No | |
| ioChaos | object | Configures IOChaos. You need to configure this field when the type is IOChaos. See the Simulate File I/O Faults document for details. | None | No | |
| jvmChaos | object | Configures JVMChaos. You need to configure this field when the type is JVMChaos. See the Simulate JVM Application Faults document for details. | None | No | |
| kernelChaos | object | Configures KernelChaos. You need to configure this field when the type is KernelChaos. See the Simulate Kernel Faults document for details. | None | No | |
| networkChaos | object | Configures NetworkChaos. You need to configure this field when the type is NetworkChaos. See the Simulate Network Faults document for details. | None | No | |
| podChaos | object | Configures PodChaos. You need to configure this field when the type is PodChaos. See the Simulate Pod Faults document for details. | None | No | |
| stressChaos | object | Configures StressChaos. You need to configure this field when the type is StressChaos. See the Simulate Heavy Stress on Kubernetes document for details. | None | No | |
| timeChaos | object | Configures TimeChaos. You need to configure this field when the type is TimeChaos. See the Simulate Time Faults document for details. | None | No | |
| schedule | object | Configures Schedule. You need to configure this field when the type is Schedule. See the Define Scheduling Rules document for details. | None | No | |

Note: When you create a Chaos experiment with a duration in the workflow, set the duration in the outer deadline field of the template instead of using the duration field of the Chaos experiment.
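
The following template fragment sketches where the duration goes. The template name delay-for-30s and the 30s value are hypothetical; the fragment belongs under the templates field of a workflow.

```yaml
- name: delay-for-30s        # hypothetical name
  templateType: NetworkChaos
  deadline: 30s              # how long the fault lasts; set here, on the template
  networkChaos:
    # No duration field inside the Chaos definition itself.
    direction: to
    action: delay
    mode: all
    selector:
      labelSelectors:
        "app": "hello-kubernetes"
    delay:
      latency: "90ms"
```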

Task field description

| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| container | object | Defines a customized task container. See Container field description for details. | None | No | |
| volumes | array | If you need to mount a volume in a customized task container, declare the volume in this field. For the detailed definition of a volume, see the Kubernetes documentation - corev1.Volume. | None | No | |
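
As a rough sketch, a Task template that mounts a volume into its container might look like the fragment below. The template name, the ConfigMap health-check-config, the mount path, and the command are all hypothetical; volumeMounts and configMap are standard fields of corev1.Container and corev1.Volume.

```yaml
- name: health-check                 # hypothetical name
  templateType: Task
  task:
    container:
      name: task
      image: busybox:latest
      command: ["sh", "-c", "cat /config/target-url"]   # hypothetical command
      volumeMounts:
        - name: check-config         # must match a volume declared below
          mountPath: /config
    volumes:
      - name: check-config
        configMap:
          name: health-check-config  # hypothetical ConfigMap, assumed to exist
```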

Conditional branch field description

| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| target | string | The name of the template to be executed by the current conditional branch. | None | Yes | another-chaos |
| expression | string | A boolean expression. When a customized task is completed and the expression evaluates to true, the current conditional branch is executed. When this value is not set, the conditional branch is executed directly after the customized task is completed. | None | No | exitCode == 0 |

Currently, two context variables are provided in expression:

  • exitCode means the exit code for a customized task.
  • stdout indicates the standard output for a customized task.

More context variables will be added in later releases.

Refer to this document to write expressions.
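
Putting the pieces together, the sketch below runs a customized task and then injects PodChaos only if the task exits with code 0. The workflow and template names are hypothetical; the container command and the expression are taken from the examples in the tables on this page.

```yaml
apiVersion: chaos-mesh.org/v1alpha1
kind: Workflow
metadata:
  name: health-check-then-chaos   # hypothetical name
spec:
  entry: health-check
  templates:
    # Customized task: its exit code decides which branch runs.
    - name: health-check
      templateType: Task
      task:
        container:
          name: task
          image: busybox:latest
          command: ["wget", "-q", "http://httpbin.org/status/201"]
      conditionalBranches:
        # Executed only when the expression evaluates to true.
        - target: workflow-pod-chaos
          expression: "exitCode == 0"
    - name: workflow-pod-chaos
      templateType: PodChaos
      deadline: 40s
      podChaos:
        action: pod-kill
        mode: one
        selector:
          labelSelectors:
            "app": "hello-kubernetes"
```

If the expression evaluates to false, no branch is selected in this sketch, so the PodChaos step is skipped.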

Container field description

The following table lists only the commonly used fields. For the definitions of more fields, see Kubernetes documentation - corev1.Container.

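| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| name | string | Container name | None | Yes | task |
| image | string | Image name | None | Yes | busybox:latest |
| command | []string | Container commands | None | No | ["wget", "-q", "http://httpbin.org/status/201"] |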