Simulate Block Device Incidents
BlockChaos Introduction
Chaos Mesh provides the BlockChaos experiment type. You can use this experiment type to simulate a block device latency or freeze scenario. This document describes how to install the dependencies of a BlockChaos experiment and how to create one.
BlockChaos is in an early stage, and its installation and configuration experience will continue to improve. If you find any problems, please open an issue in chaos-mesh/chaos-mesh.
The BlockChaos freeze action affects all processes using the block device, not only the target container.
Install kernel module
The BlockChaos delay action depends on the chaos-driver kernel module. It can only be injected on a machine with this module installed. Currently, you have to compile and install the module manually.
- Download the source code of this module using the following command:
    curl -fsSL -o chaos-driver-v0.2.1.tar.gz https://github.com/chaos-mesh/chaos-driver/archive/refs/tags/v0.2.1.tar.gz
- Uncompress the chaos-driver-v0.2.1.tar.gz file:
    tar xvf chaos-driver-v0.2.1.tar.gz
- Prepare the headers of your current kernel. If you are using CentOS/Fedora, you can install the kernel headers with yum:
    yum install kernel-devel-$(uname -r)
  If you are using Ubuntu/Debian, you can install the kernel headers with apt:
    apt install linux-headers-$(uname -r)
- Compile the module:
    cd chaos-driver-v0.2.1
    make driver/chaos_driver.ko
- Install the kernel module:
    insmod ./driver/chaos_driver.ko
The chaos_driver module must be loaded again after every reboot. To load the module automatically, copy it to a subdirectory of /lib/modules/$(uname -r)/kernel/drivers, run depmod -a, and then add chaos_driver to /etc/modules.
If you have upgraded the kernel, the module should be recompiled.
It is recommended to use DKMS or akmod to compile and load the kernel module automatically. If you want to help improve the installation experience, contributions of a DKMS or akmod package submitted to distribution repositories are very welcome.
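Before injecting a delay fault, it can be useful to confirm that the module is actually loaded on the node. The following is a minimal sketch, not part of Chaos Mesh: the helper name module_loaded is hypothetical, and it assumes the module appears under the name chaos_driver in /proc/modules (the file that lsmod reads).

```shell
# Hypothetical helper: succeed if a module name appears in a
# /proc/modules-style file.
#   $1: module name (assumed here to be chaos_driver)
#   $2: file to read; defaults to /proc/modules
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Usage: report the state of the chaos_driver module on this machine.
if module_loaded chaos_driver; then
    echo "chaos_driver is loaded"
else
    echo "chaos_driver is not loaded"
fi
```

The second argument exists only so that the check can be exercised against a sample file; on a real node the default is sufficient.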
Create experiments using the YAML file
- Write the experiment configuration to a YAML configuration file. The following uses the block-latency.yaml file as an example.
    apiVersion: chaos-mesh.org/v1alpha1
    kind: BlockChaos
    metadata:
      name: hostpath-example-delay
    spec:
      selector:
        labelSelectors:
          app: hostpath-example
      mode: all
      volumeName: hostpath-example
      action: delay
      delay:
        latency: 1s
  Note: Only hostpath or localvolume is supported.
- Use kubectl to create an experiment:
    kubectl apply -f block-latency.yaml
After the experiment is created, the following changes take effect:
- The elevator (I/O scheduler) of the volume is changed to ioem or ioem-mq. You can check it through cat /sys/block/<device>/queue/scheduler.
- The ioem or ioem-mq scheduler receives the latency request and delays I/O requests for the specified time.
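The scheduler file lists every available scheduler and marks the active one with square brackets, so the check above can be scripted. The following is a minimal sketch under that assumption: the helper name active_scheduler and the sample device name sda are illustrative and do not come from Chaos Mesh.

```shell
# Hypothetical helper: read the contents of a queue/scheduler file on
# stdin and print the scheduler name enclosed in square brackets.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Example with sample file contents (not captured from a real machine):
echo 'mq-deadline kyber [ioem-mq] none' | active_scheduler

# On a node, replace sda with the device that backs the volume:
#   active_scheduler < /sys/block/sda/queue/scheduler
```

If the printed name is ioem or ioem-mq, the delay injection has switched the device to the chaos-driver scheduler.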
The fields in the YAML configuration file are described in the following table:
Parameter | Type | Note | Default value | Required | Example |
---|---|---|---|---|---|
mode | string | Specifies the mode of the experiment. The mode options include one (selecting a random Pod), all (selecting all eligible Pods), fixed (selecting a specified number of eligible Pods), fixed-percent (selecting a specified percentage of Pods from the eligible Pods), and random-max-percent (selecting the maximum percentage of Pods from the eligible Pods). | None | Yes | one |
value | string | Provides parameters for the mode configuration, depending on mode. For example, when mode is set to fixed-percent, value specifies the percentage of Pods. | None | No | 1 |
selector | struct | Specifies the target Pod. For details, refer to Define the experiment scope. | None | Yes | |
volumeName | string | Specifies the volume to inject in the target pods. There should be a corresponding entry in the pods' .spec.volumes. | None | Yes | hostpath-example |
action | string | Indicates the specific type of fault. The available fault types are delay and freeze. delay simulates latency on the block device, and freeze simulates a block device that cannot handle any requests. | None | Yes | delay |
delay.latency | string | Specifies the latency of the block device. | None | Yes (if action is delay) | 500ms |
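The table lists freeze as the second action but the example above only covers delay. The following sketch writes a freeze variant of the same manifest. It assumes that freeze needs no action-specific fields, which the table suggests because delay.latency is the only conditional parameter. The file name block-freeze.yaml and the metadata name hostpath-example-freeze are illustrative.

```shell
# Write a BlockChaos manifest that uses the freeze action.
# Assumption: freeze takes no extra fields beyond those in the table.
cat > block-freeze.yaml <<'EOF'
apiVersion: chaos-mesh.org/v1alpha1
kind: BlockChaos
metadata:
  name: hostpath-example-freeze
spec:
  selector:
    labelSelectors:
      app: hostpath-example
  mode: all
  volumeName: hostpath-example
  action: freeze
EOF

# Apply it in the same way as the delay example:
#   kubectl apply -f block-freeze.yaml
```

Keep in mind that freeze affects every process using the block device, not only the target container, so it is safer to try it on a volume whose device is not shared with other workloads.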