|Type||Description||Tested K8s Platform|
|OpenEBS||Kill the cStor pool pod container and check whether it gets created again||GKE, Konvoy(AWS), Packet(Kubeadm), Minikube, OpenShift(Baremetal)|
Note: In this example, we are using nginx as a stateful application that stores static pages on a Kubernetes volume.
Ensure that the Litmus Chaos Operator is running by executing `kubectl get pods` in the operator namespace (typically, `litmus`). If not, install from here
Ensure that the `openebs-pool-container-failure` experiment resource is available in the cluster. If not, install from here
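The two checks above can be sketched as a small shell helper. This is a minimal sketch, not part of the experiment itself: the function name `check_prerequisites` and the variables `OPERATOR_NS` and `APP_NS` are hypothetical, and it assumes the ChaosExperiment resource was installed in the application namespace.

```shell
# Hypothetical variable names; adjust to match your cluster.
OPERATOR_NS="${OPERATOR_NS:-litmus}"   # namespace of the Litmus Chaos Operator
APP_NS="${APP_NS:-default}"            # namespace assumed to hold the experiment resource

check_prerequisites() {
  # The operator pod should be listed with STATUS Running.
  kubectl get pods -n "$OPERATOR_NS"
  # Fails with a NotFound error if the experiment resource is missing.
  kubectl get chaosexperiments openebs-pool-container-failure -n "$APP_NS"
}
```

Call `check_prerequisites` from a shell that has `kubectl` configured for the target cluster.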
DATA_PERSISTENCE can be enabled by providing the application's info in a configmap volume so that the experiment can perform the necessary checks. Currently, LitmusChaos supports data consistency checks only for MySQL and Busybox.
- For the MySQL data persistence check, create a configmap as shown below in the application namespace (replace with actual credentials):
apiVersion: v1
kind: ConfigMap
metadata:
  name: openebs-pool-container-failure
data:
  parameters.yml: |
    dbuser: root
    dbpassword: k8sDem0
    dbname: test
- For the Busybox data persistence check, create a configmap as shown below in the application namespace (replace with actual values):
apiVersion: v1
kind: ConfigMap
metadata:
  name: openebs-pool-container-failure
data:
  parameters.yml: |
    blocksize: 4k
    blockcount: 1024
    testfile: exampleFile
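To get a feel for the Busybox parameters, the sketch below writes a file locally with the same values. It assumes that `blocksize` and `blockcount` are interpreted like the `bs=` and `count=` arguments of `dd`; the source does not state how the experiment consumes them, so treat this only as an illustration of the resulting data size.

```shell
# Assumption: blocksize/blockcount behave like dd's bs= and count=.
workdir="$(mktemp -d)"

# Write 1024 blocks of 4k each into a file named like the testfile parameter.
dd if=/dev/zero of="$workdir/exampleFile" bs=4k count=1024 2>/dev/null

# 4096 bytes x 1024 blocks = 4194304 bytes (4 MiB).
size_bytes="$(wc -c < "$workdir/exampleFile" | tr -d ' ')"
echo "$size_bytes"

rm -rf "$workdir"
```

Under that assumption, the sample configmap corresponds to a 4 MiB test file, so the volume needs at least that much free space.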
Ensure that the chaosServiceAccount used for the experiment has cluster-scope permissions, as the experiment may involve carrying out the chaos in the `openebs` namespace while performing application health checks in its respective namespace.
Ensure that you have an adequate amount of memory resources available in your cluster to run the experiment.
- Application pods are healthy before chaos injection
- Application writes are successful on OpenEBS PVs
- Stateful application pods are healthy post chaos injection
- OpenEBS Storage target pods are healthy
If the experiment tunable DATA_PERSISTENCE is set to 'enabled':
- Application data written prior to chaos is successfully retrieved/read
- Database consistency is maintained as per db integrity check utils
- This scenario validates the behaviour of stateful applications and OpenEBS data plane upon forced termination of the targeted pool pod container
- Containers are killed using the kill command provided by pumba
- Pumba is run as a daemonset on all nodes in dry-run mode to begin with; the kill command is issued during experiment execution via kubectl exec
- Can test the stateful application's resilience to momentary iSCSI connection loss
- Container kill is achieved using the pumba chaos library for docker runtime.
- The desired lib image can be configured in the env variable LIB_IMAGE.
Steps to Execute the Chaos Experiment
This Chaos Experiment can be triggered by creating a ChaosEngine resource on the cluster. To understand the values to be provided in a ChaosEngine specification, refer to Getting Started
Follow the steps in the sections below to prepare the ChaosEngine & execute the experiment.
Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary cluster role permissions to execute the experiment.
Sample RBAC Manifest
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pool-container-failure-sa
  namespace: default
  labels:
    name: pool-container-failure-sa
---
# Source: openebs/templates/clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: pool-container-failure-sa
  labels:
    name: pool-container-failure-sa
rules:
- apiGroups: ["","apps","litmuschaos.io","batch","extensions","storage.k8s.io","openebs.io"]
  resources: ["pods","jobs","daemonsets","events","pods/log","replicasets","pods/exec","configmaps","secrets","persistentvolumeclaims","cstorvolumereplicas","chaosexperiments","chaosresults","chaosengines"]
  verbs: ["create","list","get","patch","update","delete"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: pool-container-failure-sa
  labels:
    name: pool-container-failure-sa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: pool-container-failure-sa
subjects:
- kind: ServiceAccount
  name: pool-container-failure-sa
  namespace: default
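Once the RBAC manifest is applied, one way to confirm that the service account really has cluster-scope permissions is to impersonate it with `kubectl auth can-i`. This is a minimal sketch: the function name `check_sa_permissions` is hypothetical, and it assumes OpenEBS runs in a namespace named `openebs` and that your own user is allowed to impersonate service accounts.

```shell
# Service account created by the sample RBAC manifest.
SA_NAMESPACE="default"
SA_NAME="pool-container-failure-sa"

# Identity string Kubernetes uses for a service account.
SA_USER="system:serviceaccount:${SA_NAMESPACE}:${SA_NAME}"

check_sa_permissions() {
  # Each command prints "yes" or "no".
  # Chaos is carried out in the OpenEBS namespace (assumed to be "openebs").
  kubectl auth can-i create pods --subresource=exec --as="$SA_USER" -n openebs
  kubectl auth can-i list pods --as="$SA_USER" -n openebs
  # Health checks run in the application namespace.
  kubectl auth can-i get pods --as="$SA_USER" -n "$SA_NAMESPACE"
}
```

If any check prints `no`, the ClusterRoleBinding most likely does not reference the service account's name and namespace correctly.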
- Provide the application info in
- Provide the auxiliary applications info (ns & labels) in
- Override the experiment tunables if desired
Supported Experiment Tunables
|APP_PVC||The PersistentVolumeClaim used by the stateful application||Mandatory||PVC must use OpenEBS cStor storage class|
|TOTAL_CHAOS_DURATION||Amount of soak time for I/O post container kill||Optional||Defaults to 600 seconds|
|LIB_IMAGE||The chaos library image used to kill the container||Optional||Defaults to `gaiaadm/pumba:0.4.8`. Supported: `gaiaadm/pumba:0.4.8`|
|DEPLOY_TYPE||Type of Kubernetes resource used by the stateful application||Optional||Defaults to `deployment`. Supported: `deployment`, `statefulset`|
|DATA_PERSISTENCE||Flag to perform data consistency checks on the application||Optional||Disabled by default (empty/unset). Supports only `mysql` and `busybox`. Ensure that the configmap with the app details is created|
Sample ChaosEngine Manifest
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: target-chaos
  namespace: default
spec:
  # It can be true/false
  annotationCheck: 'false'
  # It can be active/stop
  engineState: 'active'
  #ex. values: ns1:name=percona,ns2:run=nginx
  auxiliaryAppInfo: ''
  appinfo:
    appns: 'default'
    applabel: 'app=nginx'
    appkind: 'deployment'
  chaosServiceAccount: pool-container-failure-sa
  monitoring: false
  # It can be delete/retain
  jobCleanUpPolicy: 'delete'
  experiments:
    - name: openebs-pool-container-failure
      spec:
        components:
          env:
            - name: APP_PVC
              value: 'demo-nginx-claim'
            - name: DEPLOY_TYPE
              value: 'deployment'
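Overriding a tunable amounts to adding another `name`/`value` pair under `env`. The sketch below writes the sample manifest to `chaosengine.yml` with two optional tunables set explicitly. The value `'300'` for TOTAL_CHAOS_DURATION is an arbitrary illustration, not a recommendation, and the LIB_IMAGE entry simply restates the documented default.

```shell
# Write a ChaosEngine manifest with overridden tunables.
# Quoted 'EOF' keeps the YAML literal (no shell expansion).
cat > chaosengine.yml <<'EOF'
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: target-chaos
  namespace: default
spec:
  annotationCheck: 'false'
  engineState: 'active'
  auxiliaryAppInfo: ''
  appinfo:
    appns: 'default'
    applabel: 'app=nginx'
    appkind: 'deployment'
  chaosServiceAccount: pool-container-failure-sa
  monitoring: false
  jobCleanUpPolicy: 'delete'
  experiments:
    - name: openebs-pool-container-failure
      spec:
        components:
          env:
            - name: APP_PVC
              value: 'demo-nginx-claim'
            - name: DEPLOY_TYPE
              value: 'deployment'
            # Illustrative override: soak for 300 seconds instead of the default 600.
            - name: TOTAL_CHAOS_DURATION
              value: '300'
            # Documented default, set explicitly here.
            - name: LIB_IMAGE
              value: 'gaiaadm/pumba:0.4.8'
EOF
```

Environment values are quoted because Kubernetes expects them as strings, so a bare `300` would be rejected.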
Create the ChaosEngine Resource
Apply the ChaosEngine manifest prepared in the previous step to trigger the chaos.
kubectl apply -f chaosengine.yml
Watch Chaos progress
View pod restart count by setting up a watch on the pods in the OpenEBS namespace
watch -n 1 kubectl get pods -n <openebs-namespace>
Check Chaos Experiment Result
Check whether the application is resilient to the pool pod container failure, once the experiment (job) is completed. The ChaosResult resource naming convention is:
kubectl describe chaosresult target-chaos-openebs-pool-container-failure -n <application-namespace>
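Judging by the command above, the ChaosResult name appears to be the ChaosEngine name and the experiment name joined by a hyphen. The sketch below builds the name on that assumption; the variable and function names are hypothetical.

```shell
# Names taken from the sample ChaosEngine manifest.
ENGINE_NAME="target-chaos"
EXPERIMENT_NAME="openebs-pool-container-failure"

# Assumed convention: <engine name>-<experiment name>.
RESULT_NAME="${ENGINE_NAME}-${EXPERIMENT_NAME}"
echo "$RESULT_NAME"

describe_result() {
  # APP_NS is a hypothetical variable for the application namespace.
  kubectl describe chaosresult "$RESULT_NAME" -n "${APP_NS:-default}"
}
```

If you rename the ChaosEngine, update `ENGINE_NAME` accordingly before calling `describe_result`.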
OpenEBS Pool Container Failure Demo [TODO]
- A sample recording of this experiment execution is provided here.