Cloud Native Chaos Engineering with Chaos Mesh

This content originally appeared on DEV Community and was authored by Shardul Srivastava

With the move to the cloud, distributed architectures have grown even more complex, and with that complexity comes uncertainty about how a system could fail.

Chaos Engineering aims to test system resiliency by injecting faults to identify weaknesses, such as improper fallback settings for a service, cascading failures due to a single point of failure, or retry storms due to misconfigured timeouts, before they cause massive outages.

History

Chaos Engineering started at Netflix back in 2010, when the company was moving from on-prem servers to AWS and needed a way to test the resiliency of its infrastructure.

In 2012, Netflix open-sourced Chaos Monkey under the Apache 2.0 license as a tool to test the resilience of your application infrastructure.

Cloud Native Chaos Engineering in CNCF Landscape

CNCF focuses on Cloud Native Chaos Engineering, defined as engineering practices focused on (and built on) Kubernetes environments, applications, microservices, and infrastructure.

Cloud Native Chaos Engineering has 4 core principles:

  1. Open source
  2. CRDs for Chaos Management
  3. Extensible and pluggable
  4. Broad Community adoption

At the time of writing, CNCF has two sandbox projects for Cloud Native Chaos Engineering:

  1. Chaos Mesh
  2. Litmus Chaos

Chaos Mesh

Chaos Mesh is a cloud-native Chaos Engineering platform that orchestrates chaos in Kubernetes environments. It is based on the Kubernetes Operator pattern and provides a Chaos Operator to inject faults into applications and Kubernetes infrastructure in a manageable way.

Chaos Operator uses Custom Resource Definitions (CRDs) to define chaos objects. It provides a variety of these CRDs for fault injection, such as:

  1. PodChaos
  2. NetworkChaos
  3. DNSChaos
  4. HTTPChaos
  5. StressChaos
  6. IOChaos
  7. TimeChaos
  8. KernelChaos
  9. AWSChaos
  10. GCPChaos
  11. JVMChaos
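
To give a feel for what one of these chaos objects looks like, here is a minimal sketch of a PodChaos manifest that kills one pod matching a label. The helper name `pod_kill_manifest`, the default label value `my-app`, and the generated object name are assumptions made for illustration; only the general shape (apiVersion, kind, action, mode, selector) follows the Chaos Mesh v1alpha1 API. The function only prints the manifest, so it can be reviewed before anything is applied.

```shell
# Print a PodChaos manifest that kills one randomly chosen pod matching a label.
# $1: value of the `app` label to target (hypothetical default: my-app)
# $2: namespace to select pods from (default: default)
pod_kill_manifest() {
  app="${1:-my-app}"
  ns="${2:-default}"
  cat <<EOF
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: ${app}-pod-kill
spec:
  action: pod-kill
  mode: one
  selector:
    namespaces:
      - ${ns}
    labelSelectors:
      'app': '${app}'
EOF
}

# Review the output first, then pipe it to kubectl once Chaos Mesh is installed:
#   pod_kill_manifest my-app default | kubectl apply -f -
```

Keeping the manifest generation separate from `kubectl apply` makes it easy to inspect or version the YAML before injecting a fault.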

Chaos Mesh Installation

Chaos Mesh can be installed quickly using the installation script. However, the Helm 3 chart is recommended for production environments.

To install Chaos Mesh using Helm:

  • Add the Chaos Mesh chart repository to your Helm repositories.
helm repo add chaos-mesh https://charts.chaos-mesh.org
  • It's recommended to install Chaos Mesh in a separate namespace, so you can either create a chaos-testing namespace manually or let Helm create it automatically if it doesn't exist:
helm upgrade \
     --install \
     chaos-mesh \
     chaos-mesh/chaos-mesh \
     -n chaos-testing \
     --create-namespace \
     --version v2.0.0 \
     --wait

Note: If you're using GKE or EKS with containerd, then use:

helm upgrade \
     --install \
     chaos-mesh \
     chaos-mesh/chaos-mesh \
     -n chaos-testing \
     --create-namespace \
     --set chaosDaemon.runtime=containerd \
     --set chaosDaemon.socketPath=/run/containerd/containerd.sock \
     --version v2.0.0 \
     --wait
  • Verify that the pods are running:
kubectl get pods -n chaos-testing
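
If you're not sure which of the two Helm commands applies to your cluster, the following sketch picks the extra flags based on the container runtime a node reports. The helper name `chaos_daemon_flags` is hypothetical, and the socket path is the one from the note above; other distributions may place the containerd socket elsewhere, so treat the mapping as an assumption to verify for your environment.

```shell
# Map a node's containerRuntimeVersion (e.g. "containerd://1.4.4") to the
# extra Helm flags from the containerd note; print nothing for other runtimes.
chaos_daemon_flags() {
  case "$1" in
    containerd://*)
      echo "--set chaosDaemon.runtime=containerd --set chaosDaemon.socketPath=/run/containerd/containerd.sock"
      ;;
    *)
      : # no extra flags; the chart defaults are used
      ;;
  esac
}

# Usage against a real cluster (reads the runtime reported by the first node):
#   runtime="$(kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.containerRuntimeVersion}')"
#   helm upgrade --install chaos-mesh chaos-mesh/chaos-mesh \
#     -n chaos-testing --create-namespace --version v2.0.0 --wait \
#     $(chaos_daemon_flags "$runtime")
```

The `$(chaos_daemon_flags "$runtime")` substitution is left unquoted on purpose so that the flags are split into separate arguments for Helm.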

Run First Chaos Mesh Experiment

A Chaos Experiment describes what type of fault is injected and how.

  1. Set up an Nginx pod and expose it as a service on port 80.
kubectl run nginx --image=nginx --labels="app=nginx" --port=80 --expose
  2. Open another terminal and set up a test pod to test connectivity to the nginx service:
kubectl run -it test-connection --image=radial/busyboxplus:curl -- sh
curl nginx

This should return the default Nginx welcome page.

  3. Create your first Chaos Experiment by running:
kubectl apply -f - <<EOF
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: nginx-network-delay
spec:
  action: delay
  mode: one
  selector:
    namespaces:
      - default
    labelSelectors:
      'app': 'nginx'
  delay:
    latency: '1s'
  duration: '12s'
EOF

This introduces a delay of 1 second in the responses of the nginx service for 12 seconds.

  4. Test the response of your nginx service now to see the 1-second delay.
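
One way to make the delay visible from the test pod is to time the request instead of eyeballing it. The sketch below uses curl's `--write-out` variable `%{time_total}`; the helper names `request_time` and `is_delayed` are hypothetical. Because the injected latency can affect more than one packet exchange (for example the connection handshake as well as the response), the measured time may be noticeably higher than 1 second, so the check only asserts a lower bound.

```shell
# Print the total time in seconds that a request to URL $1 takes.
# Run this inside the test pod, where the nginx service name resolves.
request_time() {
  curl -o /dev/null -s -w '%{time_total}\n' "$1"
}

# Succeed if the measured time ($1) is at least the expected delay ($2),
# both given in seconds; awk handles the floating-point comparison.
is_delayed() {
  awk -v t="$1" -v d="$2" 'BEGIN { exit !((t + 0) >= (d + 0)) }'
}

# Usage while the experiment is active (within its 12-second window):
#   t="$(request_time nginx)"
#   if is_delayed "$t" 1; then echo "delay observed: ${t}s"; else echo "no delay: ${t}s"; fi
```

Running the same commands again after the experiment's duration has elapsed should show the response time dropping back to its normal value.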


Shardul Srivastava | Sciencx (2021-08-09T19:51:35+00:00) Cloud Native Chaos Engineering with Chaos Mesh. Retrieved from https://www.scien.cx/2021/08/09/cloud-native-chaos-engineering-with-chaos-mesh/