The goal of the LitmusChaos project is to create a complete solution to implement chaos engineering at scale, the Kubernetes way! Of course, this had to be done incrementally by first creating a toolset for chaos injection and then adding additional features to make it a platform. Litmus 1.x achieved the goal of keeping it completely open-source, creating a ChaosHub and the required CRDs, Operators, and Schedulers. With Litmus 1.x, users have a working chaos engineering toolset aligned with the original goals.
Over time, through monthly releases and community engagement, we have added many features and made LitmusChaos much easier for end users. Litmus 2.0 gives users a new way to perform chaos engineering. A detailed list of changes can be found on the release page.
A high-level overview of the Litmus 2.0 features is as follows:
- Addition of Chaos Workflow creation: chaos experiments become the building blocks of a Chaos Workflow, allowing users to create a larger chaos scenario from sequential or parallel experiment executions.
- Addition of ChaosCenter where you can take advantage of all these features and a lot more
  - Workflow Creation
    - From Templates, Custom Workflows from Scratch (using ChaosHubs), From pre-created YAMLs
    - Chaos Experiments Sequence Control (Parallel as well as Sequential steps creation)
    - Creation of either Singular or Cron Workflows as Schedules
    - Attaching priority to Chaos Experiments based on your use cases
  - Users & Teams
  - Monitoring & Observability
    - Connect a Data Source (from any Agent) and monitor workflows
    - Visualize workflow run statistics and aggregated schedules
    - Compare two or more Workflows
    - Upload shared/downloadable dashboards available in the community
    - Edit queries, Tune dashboards to create a custom one from scratch
    - Monitor effect of chaos in real time with interleaved events and metrics from Prometheus Datasource
  - Workflow Management
    - Rolling out automated changes using GitOps
    - Allowing image addition from custom image server (both public and private)
    - Measure and Analyse the Resilience Score of each workflow
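To make the idea of sequential and parallel experiment execution more concrete, here is a minimal conceptual sketch in Python. It is not the Litmus API: the `Experiment` class, `run_workflow`, and `weighted_score` are hypothetical names invented for illustration, the experiment names are only examples, and the scoring rule (share of total priority weight that passed) is an assumption about how a priority-weighted resilience score could be computed, not a statement of how Litmus calculates it.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Experiment:
    """Hypothetical stand-in for one chaos experiment (not a Litmus type)."""
    name: str
    run: Callable[[], bool]  # returns True if the system stayed healthy
    weight: int = 1          # assumed priority attached by the user


# A workflow is a list of stages. Stages run one after another;
# experiments inside the same stage run in parallel.
Workflow = List[List[Experiment]]


def run_workflow(workflow: Workflow) -> Dict[str, bool]:
    """Run every stage in order and return {experiment name: passed}."""
    results: Dict[str, bool] = {}
    for stage in workflow:
        if not stage:
            continue  # nothing to run in an empty stage
        with ThreadPoolExecutor(max_workers=len(stage)) as pool:
            outcomes = list(pool.map(lambda e: e.run(), stage))
        for experiment, passed in zip(stage, outcomes):
            results[experiment.name] = passed
    return results


def weighted_score(workflow: Workflow, results: Dict[str, bool]) -> float:
    """Assumed scoring rule: percentage of total weight that passed."""
    experiments = [e for stage in workflow for e in stage]
    total = sum(e.weight for e in experiments)
    if total == 0:
        return 0.0
    passed = sum(e.weight for e in experiments if results[e.name])
    return 100.0 * passed / total


# Illustrative scenario: one experiment first, then two in parallel.
workflow = [
    [Experiment("pod-delete", lambda: True, weight=8)],
    [
        Experiment("pod-cpu-hog", lambda: True, weight=5),
        Experiment("pod-network-latency", lambda: False, weight=2),
    ],
]
results = run_workflow(workflow)
score = weighted_score(workflow, results)
```

In this sketch the outer list fixes the order of the stages and the inner lists group experiments that run together, so a higher-weight experiment that fails lowers the score more than a low-weight one.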
Litmus itself is composed of microservices, and we made sure that the microservices added for the 2.0 features integrate seamlessly with the existing ones. Litmus 2.0 is completely backwards compatible; no features are deprecated.
The migration path consists of constructing new artifacts, such as Chaos Workflows, that include the chaos experiments users already run.
Below is a high-level comparison between Litmus 1.x and Litmus 2.0, providing a holistic view of the feature additions you get in Litmus 2.0.
|Litmus 1.x|Litmus 2.0|
|---|---|
|Per user|Teams (Multi Tenant)|
|Per cluster|Per organisation (Cross Cloud)|
|Only Public ChaosHub|Public and Private ChaosHubs|
|CLI only|CLI and GUI|
||Integrated and Interleaved monitoring|