IOChaos Experiment
This document walks you through the IOChaos experiment.
IOChaos allows you to simulate file system faults such as IO delay and read/write errors. It can inject delays and faults while your program is making IO system calls such as `open`, `read`, and `write`.
# Configuration file

Below is a sample YAML file of IOChaos:
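A minimal sketch of such a file, built from the fields in the table that follows. The `apiVersion`, the `IoChaos` kind, the `labelSelectors` key, and the `scheduler.cron` layout are assumptions about the resource schema, and the name, namespace, and label values are hypothetical placeholders:

```yaml
apiVersion: chaos-mesh.org/v1alpha1   # assumed API group/version
kind: IoChaos                         # assumed resource kind
metadata:
  name: io-delay-example              # hypothetical name
  namespace: chaos-testing            # hypothetical namespace
spec:
  action: latency                     # inject IO delay
  mode: one                           # pick one pod matching the selector
  selector:
    labelSelectors:
      app: etcd                       # hypothetical pod label
  volumePath: /var/run/etcd           # mount path of the target volume
  path: "/var/run/etcd/**/*"          # glob of files to affect
  delay: "100ms"                      # latency added to each matching call
  percent: 50                         # affect roughly half of the calls
  duration: "400s"                    # how long each chaos run lasts
  scheduler:
    cron: "@every 10m"                # robfig/cron expression
```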
For more sample files, see examples. You can edit them as needed.
Field | Description | Sample Value |
---|---|---|
mode | Defines the mode of the selector. | one / all / fixed / fixed-percent / random-max-percent |
selector | Specifies the pods to be injected with IO chaos. | |
action | Represents the IOChaos action. Refer to Available actions for IOChaos for more details. | latency / fault / attrOverride |
volumePath | Specifies the mount path of the target volume. | "/var/run/etcd" |
delay | Specifies the latency to inject. The value is a duration string: a sequence of decimal numbers, each with an optional fraction and a unit suffix. Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", and "h". | "300ms" / "2h45m" |
errno | Defines the error code returned by an IO action. See common Linux system errors for more Linux system error codes. | 2 |
attr | Defines the attributes to be overridden and their corresponding values. | examples |
percent | Defines the probability of injecting errors in percentage. | 100 (by default) |
path | Defines the files affected by the IOChaos action. It should be a glob that matches the files into which you want to inject faults or delays. | "/var/run/etcd/**/*" |
methods | Defines the IO methods affected by the IOChaos action, given as an array of strings. See the available methods for more details. | open / read |
duration | Represents the duration of a chaos action. The value is a duration string: a sequence of decimal numbers, each with an optional fraction and a unit suffix. | "300ms" / "2h45m" |
scheduler | Defines the scheduler rules for the running time of the chaos experiment. | see robfig/cron |
# Usage

Assuming that you are using `examples/io-mixed-example.yaml`, you can create a chaos experiment by applying the file to your cluster, for example with `kubectl apply -f examples/io-mixed-example.yaml`.
# IOChaos available actions

IOChaos currently supports the following actions:
- latency: IO latency action. IO operations are delayed by the latency you specify before they return a result.
- fault: IO fault action. In this mode, IO operations return an error.
- attrOverride: Overrides the attributes of a file.
# latency

If you are using the `latency` action, you can edit the specification as below:
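A minimal sketch of such a specification, showing only the fields specific to this action (the selector, `volumePath`, and the other fields from the configuration table are omitted here):

```yaml
spec:
  action: latency
  delay: "1ms"      # latency added to each selected IO method
```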
This injects a latency of 1ms into the selected methods.
# fault

If you are using the `fault` action, you can edit the specification as below:
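A minimal sketch of such a specification, again showing only the fields specific to this action:

```yaml
spec:
  action: fault
  errno: 32         # EPIPE on Linux
```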
The selected methods return error 32, which means broken pipe.
# attrOverride

If you are using the `attrOverride` action, you can edit the specification as below:
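A minimal sketch of such a specification. It assumes that `perm` takes the permission bits as a decimal integer, so octal 110 is written as 72:

```yaml
spec:
  action: attrOverride
  attr:
    perm: 72        # 110 in octal: execute-only for owner and group (assumed decimal encoding)
```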
The permissions of the selected files are then overridden with 110 in octal (72 in decimal), which leaves only the execute bit for the owner and group, so the files cannot be read or modified (without `CAP_DAC_OVERRIDE`). See available attributes for a list of all attributes that can be overridden.
Note:
Attributes might be cached by the Linux kernel, so the override might have no effect if your program has already accessed the file.
# Common Linux system errors

Common Linux system errors are as below:

- `1`: Operation not permitted
- `2`: No such file or directory
- `5`: I/O error
- `6`: No such device or address
- `12`: Out of memory
- `16`: Device or resource busy
- `17`: File exists
- `20`: Not a directory
- `22`: Invalid argument
- `24`: Too many open files
- `28`: No space left on device
Refer to related header files for more information.
# Available methods

Available methods are as below:
- lookup
- forget
- getattr
- setattr
- readlink
- mknod
- mkdir
- unlink
- rmdir
- symlink
- rename
- link
- open
- read
- write
- flush
- release
- fsync
- opendir
- readdir
- releasedir
- fsyncdir
- statfs
- setxattr
- getxattr
- listxattr
- removexattr
- access
- create
- getlk
- setlk
- bmap
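
Method names from this list go into the `methods` array of the specification. As a minimal sketch with illustrative values, the following fragment returns an I/O error (errno 5) for roughly half of the `read` and `write` calls on the selected files:

```yaml
spec:
  action: fault
  errno: 5          # I/O error
  percent: 50       # probability, in percent, of injecting the error
  methods:          # only these methods are affected
    - read
    - write
```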
# Available attributes

Available attributes and their meanings are listed here:

- `ino`, inode of a file
- `size`, total size, in bytes
- `blocks`, number of 512B blocks allocated
- `atime`, time of last access
- `mtime`, time of last modification
- `ctime`, time of last status change
- `kind`, file type. It can be `namedPipe`, `charDevice`, `blockDevice`, `directory`, `regularFile`, `symlink`, or `socket`
- `perm`, permission of a file
- `nlink`, number of hard links
- `uid`, user id of owner
- `gid`, group id of owner
- `rdev`, device ID (if special file)