Using open source tooling to build an observability solution can dramatically reduce costs. The largest impediment to a streamlined solution is integration; without it you just get three disconnected data pools. If only there were one dashboard to rule them all, one dashboard to find all the data, one dashboard to bring them all together and in enlightenment bind them. Apologies to J.R.R. Tolkien.
The Three Data Pools
Open source offers a choice of mature and highly efficient tooling to handle metrics, logs and traces, and each tool offers class-leading functionality in its niche. However, each tool works in isolation, which leaves engineers with numerous browser tabs open on different dashboards in a futile attempt to correlate across disparate data sources. This is frustrating for DevOps and SRE teams and slows down the restoration of service during an incident.
Prometheus became a CNCF graduated project in 2018 and is now the de facto standard for time series metrics, with many other cloud native projects providing a Prometheus metrics endpoint. The Prometheus ecosystem provides numerous exporters and client libraries, making it easy to get metrics out of just about anything, including frontend frameworks like React. The Prometheus engine is highly efficient and can handle millions of metrics with modest compute and storage resources.
Prometheus is just a time series metrics engine; to visualise the data, Grafana has become the de facto standard pairing.
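To make the idea of a Prometheus metrics endpoint more concrete, here is a minimal sketch of the plain-text exposition format that such an endpoint serves for a counter. It uses only the Python standard library. The `RequestCounter` class and the metric name `http_requests_total` are hypothetical, made up for illustration, and are not part of any official client library. A real service would normally use an official Prometheus client library, which also handles label escaping and other metric types.

```python
from collections import Counter


class RequestCounter:
    """Hypothetical stand-in for a client-library counter (illustration only)."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = Counter()  # maps a sorted tuple of label pairs to a count

    def inc(self, **labels):
        # Sort the labels so the same label set always maps to the same series.
        self._values[tuple(sorted(labels.items()))] += 1

    def render(self):
        """Return the counter in Prometheus text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            if key:
                # Note: label values are not escaped in this sketch.
                label_str = ",".join(f'{k}="{v}"' for k, v in key)
                lines.append(f"{self.name}{{{label_str}}} {value}")
            else:
                lines.append(f"{self.name} {value}")
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    requests = RequestCounter("http_requests_total", "Total HTTP requests.")
    requests.inc(method="GET", path="/")
    requests.inc(method="GET", path="/")
    requests.inc(method="POST", path="/login")
    print(requests.render(), end="")
```

Each distinct combination of label values becomes its own time series, which is why labels matter so much later on when correlating data across tools.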
For logs, the ELK stack (Elasticsearch, Logstash, Kibana) has traditionally been the go-to toolset and still offers great functionality. More recently, Loki has emerged as the challenger to the establishment. It offers some advantages over the ELK stack: it is more efficient in its use of compute and storage resources, and it uses Grafana for visualisation.
OpenTelemetry is the increasingly popular choice for instrumenting code for trace collection. It offers comprehensive support for many programming languages and frameworks. The traces generated by OpenTelemetry client libraries are accepted by numerous backends, including commercial platforms. The open source backend of choice is Jaeger, which can use Elasticsearch, OpenSearch or Cassandra for storage.
Bind Them All
With the application of a few Helm charts, the open source toolset is soon up and running. Finally, adding Asserts as a layer of automation and intelligence on top provides the seamless observability platform you desire. The three data pools are now united, with correlation across metrics, logs and traces. All the information needed to resolve incidents is now automatically collated onto one dynamic dashboard.
A curated library of alert rules and dashboards enables DevOps and SRE teams to be productive from day one. Asserts Data Distiller provides the ability to observe everything while retaining only what matters, considerably reducing the compute and storage resource requirements. Asserts automatically builds and maintains an Entity Graph of the relationships between all application components by intelligently analysing metric and trace labels. When an incident is triggered, the Entity Graph is used to automatically collate all related components onto a dynamically created dashboard. All the information DevOps and SRE teams need for deep insight is just a click away, reducing the number of engineers called in and speeding the restoration of service.
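As a rough illustration of how relationships between components can be inferred from labels, the sketch below builds a small graph from metric label sets and then walks it to find everything related to one component. This is not the Asserts implementation, which is not described here. The label keys (`node`, `pod`, `service`), the function names and the sample data are all assumptions chosen for the example.

```python
from collections import defaultdict

# Assumed label keys that identify entities, ordered from infrastructure to application.
ENTITY_LABELS = ("node", "pod", "service")


def build_entity_graph(label_sets):
    """Link entities that appear together in the same label set."""
    graph = defaultdict(set)
    for labels in label_sets:
        entities = [(key, labels[key]) for key in ENTITY_LABELS if key in labels]
        # Connect each entity to the next one in the assumed hierarchy.
        for a, b in zip(entities, entities[1:]):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def related(graph, start):
    """Return every entity reachable from start, including start itself."""
    seen = {start}
    stack = [start]
    while stack:
        for neighbour in graph.get(stack.pop(), ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


if __name__ == "__main__":
    samples = [
        {"node": "n1", "pod": "api-1", "service": "api"},
        {"node": "n1", "pod": "db-1", "service": "db"},
        {"node": "n2", "pod": "web-1", "service": "web"},
    ]
    graph = build_entity_graph(samples)
    for entity in sorted(related(graph, ("service", "api"))):
        print(entity)
```

In this sample data, starting from the `api` service reaches the `db` service because both run on node `n1`, while the `web` service on node `n2` is left out. A dashboard built from such a traversal would show only the components that could plausibly be involved in the incident.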
Start saving on observability platform costs and switch to open source tooling with Asserts for free, forever.