跳到主要内容
版本:Next

Simulate Block Device Incidents

BlockChaos Introduction

Chaos Mesh provides the BlockChaos experiment type. You can use this experiment type to simulate a block device latency or freeze scenario. This document describes how to install the dependencies of a BlockChaos experiment, and create a BlockChaos.

备注

BlockChaos is in an early stage. The installation and configuration experience of it will continue to improve. If you find any issues, please open an issue in chaos-mesh/chaos-mesh to report.

备注

BlockChaos freeze action will affect all processes using the block device, not only the target container.

Install kernel module

BlockChaos delay action depends on the chaos-driver kernel module. It can only be injected on a machine with this module installed. Currently, you have to compile and install the module manually.

  1. Download the source code of this module using the following command:

    curl -fsSL -o chaos-driver-v0.2.1.tar.gz https://github.com/chaos-mesh/chaos-driver/archive/refs/tags/v0.2.1.tar.gz
  2. Uncompress the chaos-driver-v0.2.1.tar.gz file:

    tar xvf chaos-driver-v0.2.1.tar.gz
  3. Prepare the headers of your current kernel. If you are using CentOS/Fedora, you can install the kernel headers with yum:

    yum install kernel-devel-$(uname -r)

    If you are using Ubuntu/Debian, you can install the kernel headers with apt:

    apt install linux-headers-$(uname -r)
  4. Compile the module:

    cd chaos-driver-v0.2.1
    make driver/chaos_driver.ko
  5. Install the kernel module:

    insmod ./driver/chaos_driver.ko

The chaos_driver module has to be installed every time after rebooting. To load the module automatically, you can copy the module to a subdirectory in /lib/modules/$(uname -r)/kernel/drivers, run depmod -a, and then add chaos_driver to the /etc/modules.

If you have upgraded the kernel, the module should be recompiled.

备注

It is recommended to use DKMS or akmod for automatic kernel module compiling or loading. If you want to help us improve the installation experience, creating a DKMS or akmod package and submitting it to different distribution repositories is very welcome.

Create experiments using the YAML file

  1. Write the experiment configuration to the YAML configuration file. The following uses the block-latency.yaml file as an example.

    apiVersion: chaos-mesh.org/v1alpha1
    kind: BlockChaos
    metadata:
    name: hostpath-example-delay
    spec:
    selector:
    labelSelectors:
    app: hostpath-example
    mode: all
    volumeName: hostpath-example
    action: delay
    delay:
    latency: 1s
    备注

    Only hostpath or localvolume is supported.

  2. Use kubectl to create an experiment:

    kubectl apply -f block-latency.yaml

You can find the following magic happened:

  1. The elevator of the volume is changed to ioem or ioem-mq. You can check it through cat /sys/block/<device>/queue/scheduler.
  2. The ioem or ioem-mq scheduler will receive the latency request and delay the request for the specified time.

The fields in the YAML configuration file are described in the following table:

ParameterTypeNoteDefault valueRequiredExample
modestringSpecifies the mode of the experiment. The mode options include one (selecting a random Pod), all (selecting all eligible Pods), fixed (selecting a specified number of eligible Pods), fixed-percent (selecting a specified percentage of Pods from the eligible Pods), and random-max-percent (selecting the maximum percentage of Pods from the eligible Pods).NoneYesone
valuestringProvides parameters for the mode configuration, depending on mode. For example, when mode is set to fixed-percent, value specifies the percentage of Pods.NoneNo1
selectorstructSpecifies the target Pod. For details, refer to Define the experiment scope.NoneYes
volumeNamestringSpecifies the volume to inject in the target pods. There should be a corresponding entry in the pods' .spec.volumes.NoneYeshostpath-example
actionstringIndicates the specific type of faults. The available fault types include delay and freeze. delay will simulate the latency of block devices, and freeze will simulate that the block device cannot handle any requestsNoneYesdelay
delay.latencystringSpecifies the latency of the block device.NoneYes (if action is delay)500ms