Version: 2.0.2

Simulate GCP Faults

This document describes how to use Chaos Mesh to inject faults into GCP instances. You can create GCPChaos experiments either in Chaos Dashboard or with YAML files.

GCPChaos introduction

GCPChaos is a fault type in Chaos Mesh. By creating a GCPChaos experiment, you can simulate fault scenarios of the specified GCP instance. Currently, GCPChaos supports the following fault types:

  • Node Stop: stops the specified GCP instance.
  • Node Reset: reboots the specified GCP instance.
  • Disk Loss: detaches the storage volume from the specified GCP instance.

Secret file

To easily connect to the GCP cluster, you can create a Kubernetes Secret file to store the authentication information in advance.

Below is a sample secret file:

    apiVersion: v1
    kind: Secret
    metadata:
      name: cloud-key-secret
      namespace: chaos-testing
    type: Opaque
    stringData:
      service_account: your-gcp-service-account-base64-encode

  • name defines the name of the Kubernetes Secret.
  • namespace defines the namespace of the Kubernetes Secret.
  • service_account stores the service account of your GCP cluster. Base64-encode the service account before you store it in this field.
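
As a minimal sketch of the encoding step, the shell function below Base64-encodes a service account key file and prints a Secret manifest shaped like the sample above. The function name render_secret and the key file path in the usage comment are hypothetical; the Secret name, namespace, and field name are taken from the sample.

```shell
# Sketch only: render_secret is a hypothetical helper, not part of Chaos Mesh.
# It prints a Secret manifest whose service_account value is the
# Base64-encoded content of the given key file.
render_secret() {
  key_file="$1"
  # Encode the key and strip the line wraps that some base64 tools add.
  encoded="$(base64 < "$key_file" | tr -d '\n')"
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: cloud-key-secret
  namespace: chaos-testing
type: Opaque
stringData:
  service_account: ${encoded}
EOF
}

# Example usage (the path is a placeholder):
#   render_secret ./service-account.json | kubectl apply -f -
```

Piping the output to kubectl apply avoids writing the encoded key to a manifest file on disk.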

Create experiments using Chaos Dashboard

note

Before you create an experiment using Chaos Dashboard, make sure the following requirements are met:

  1. Chaos Dashboard is installed.

  2. Chaos Dashboard can be accessed using the kubectl port-forward command:

    kubectl port-forward -n chaos-testing svc/chaos-dashboard 2333:2333

    You can then access the dashboard via http://localhost:2333 in your browser.

  1. Open Chaos Dashboard, and click NEW EXPERIMENT on the page to create a new experiment.

  2. In the Choose a Target area, choose GCP fault and select a specific behavior, such as STOP NODE.

  3. Fill out the experiment information, and specify the experiment scope and the scheduled experiment duration.

  4. Submit the experiment information.

Create experiments using YAML files

A node-stop configuration example

  1. Write the experiment configuration to the gcpchaos-node-stop.yaml file, as shown below:

    apiVersion: chaos-mesh.org/v1alpha1
    kind: GCPChaos
    metadata:
      name: node-stop-example
      namespace: chaos-testing
    spec:
      action: node-stop
      secretName: 'cloud-key-secret'
      project: 'your-project'
      zone: 'your-zone'
      instance: 'your-instance'
      duration: '5m'

    Based on this configuration example, Chaos Mesh injects the node-stop fault into the specified GCP instance, so that the instance is unavailable for 5 minutes.

    For more information about stopping GCP instances, refer to Stop GCP instance.

  2. After the configuration file is prepared, use kubectl to create an experiment:

    kubectl apply -f gcpchaos-node-stop.yaml
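
After the experiment is created, it can help to confirm that it was accepted and to know how to remove it. The sketch below wraps two standard kubectl subcommands, describe and delete. The function names and the KUBECTL override variable are hypothetical, and the sketch assumes the resource can be addressed as gcpchaos, matching the kind in the configuration above. Deleting the object is assumed to end the experiment early; the exact recovery behavior of the instance is not covered here.

```shell
# Sketch only: hypothetical helpers around standard kubectl subcommands.
# KUBECTL can be overridden, for example to point at a wrapper script.
KUBECTL="${KUBECTL:-kubectl}"

# Show the spec, status, and events of a GCPChaos experiment.
# Arguments: experiment name, optional namespace (default: chaos-testing).
inspect_experiment() {
  "$KUBECTL" describe gcpchaos "$1" -n "${2:-chaos-testing}"
}

# Delete a GCPChaos experiment object.
# Arguments: experiment name, optional namespace (default: chaos-testing).
remove_experiment() {
  "$KUBECTL" delete gcpchaos "$1" -n "${2:-chaos-testing}"
}

# Example usage:
#   inspect_experiment node-stop-example
#   remove_experiment node-stop-example
```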

A node-reset configuration example

  1. Write the experiment configuration to the gcpchaos-node-reset.yaml file, as shown below:

    apiVersion: chaos-mesh.org/v1alpha1
    kind: GCPChaos
    metadata:
      name: node-reset-example
      namespace: chaos-testing
    spec:
      action: node-reset
      secretName: 'cloud-key-secret'
      project: 'your-project'
      zone: 'your-zone'
      instance: 'your-instance'
      duration: '5m'

    Based on this configuration example, Chaos Mesh injects the node-reset fault into the specified GCP instance, so that the instance is reset.

    For more information about resetting GCP instances, refer to Resetting a GCP instance.

  2. After the configuration file is prepared, use kubectl to create an experiment:

    kubectl apply -f gcpchaos-node-reset.yaml

A disk-loss configuration example

  1. Write the experiment configuration to the gcpchaos-disk-loss.yaml file, as shown below:

    apiVersion: chaos-mesh.org/v1alpha1
    kind: GCPChaos
    metadata:
      name: disk-loss-example
      namespace: chaos-testing
    spec:
      action: disk-loss
      secretName: 'cloud-key-secret'
      project: 'your-project'
      zone: 'your-zone'
      instance: 'your-instance'
      deviceNames: ['disk-name']
      duration: '5m'

    Based on this configuration example, Chaos Mesh injects the disk-loss fault into the specified GCP instance, so that the specified storage volume is detached from the instance for 5 minutes.

    For more information about detaching storage from GCP instances, refer to Detach GCP storage.

  2. After the configuration file is prepared, use kubectl to create an experiment:

    kubectl apply -f gcpchaos-disk-loss.yaml

Field description

The following table shows the fields in the YAML configuration file.

| Parameter | Type | Description | Default value | Required | Example |
| --- | --- | --- | --- | --- | --- |
| action | string | Indicates the specific type of faults. The available fault types include node-stop, node-reset, and disk-loss. | node-stop | Yes | node-stop |
| mode | string | Indicates the mode of the experiment. The mode options include one (selecting a Pod at random), all (selecting all eligible Pods), fixed (selecting a specified number of eligible Pods), fixed-percent (selecting a specified percentage of the eligible Pods), and random-max-percent (selecting the maximum percentage of the eligible Pods). | None | Yes | one |
| value | string | Provides parameters for the mode configuration, depending on mode. For example, when mode is set to fixed-percent, value specifies the percentage of Pods. | None | No | 2 |
| secretName | string | Indicates the name of the Kubernetes Secret that stores the GCP authentication information. | None | No | cloud-key-secret |
| project | string | Indicates the name of the GCP project. | None | Yes | your-project |
| zone | string | Indicates the zone of the GCP instance. | None | Yes | us-central1-a |
| instance | string | Indicates the ID of the GCP instance. | None | Yes | your-gcp-instance-id |
| deviceNames | []string | Specifies the machine disk IDs. This field is required when the action is disk-loss. | None | No | ["your-disk-id"] |
| duration | string | Indicates the duration of the experiment. | None | Yes | 30s |
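
To illustrate how the fields fit together, the sketch below generates a GCPChaos manifest from command-line arguments and applies two rules stated in the table: action must be node-stop, node-reset, or disk-loss, and deviceNames is required when the action is disk-loss. The function name render_gcpchaos and its argument order are hypothetical. The namespace and Secret name are fixed to the values used in the examples on this page.

```shell
# Sketch only: render_gcpchaos is a hypothetical helper, not part of Chaos Mesh.
# Usage: render_gcpchaos NAME ACTION PROJECT ZONE INSTANCE DURATION [DEVICE...]
render_gcpchaos() {
  if [ "$#" -lt 6 ]; then
    echo "usage: render_gcpchaos NAME ACTION PROJECT ZONE INSTANCE DURATION [DEVICE...]" >&2
    return 2
  fi
  name="$1"; action="$2"; project="$3"; zone="$4"; instance="$5"; duration="$6"
  shift 6   # any remaining arguments are device names

  # Only the three fault types listed in the table are accepted.
  case "$action" in
    node-stop|node-reset|disk-loss) ;;
    *) echo "unsupported action: $action" >&2; return 1 ;;
  esac

  # deviceNames is required when the action is disk-loss.
  if [ "$action" = "disk-loss" ] && [ "$#" -eq 0 ]; then
    echo "disk-loss requires at least one device name" >&2
    return 1
  fi

  cat <<EOF
apiVersion: chaos-mesh.org/v1alpha1
kind: GCPChaos
metadata:
  name: ${name}
  namespace: chaos-testing
spec:
  action: ${action}
  secretName: 'cloud-key-secret'
  project: '${project}'
  zone: '${zone}'
  instance: '${instance}'
EOF
  # Emit deviceNames as a YAML list, only for disk-loss.
  if [ "$action" = "disk-loss" ]; then
    echo "  deviceNames:"
    for device in "$@"; do
      echo "    - '${device}'"
    done
  fi
  echo "  duration: '${duration}'"
}

# Example usage (all values are placeholders):
#   render_gcpchaos disk-loss-example disk-loss your-project us-central1-a \
#     your-instance 5m disk-name > gcpchaos-disk-loss.yaml
```

Validating in the script catches a missing device name before the manifest reaches the cluster.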