observability

less guessing, more knowing.

Observability ("o11y" for short) is a trait of your system. How well can you understand the internal health of your system just from its external outputs? The better you can understand your system from the data it gives you, the more observable it is.

The data your system spits out to tell you about its health is called telemetry, which comes from the Greek tele- ("remote") and -metry ("measurement"). You're measuring something inside the system and then reading those measurements somewhere else. The process of adding code or tooling to your system to get telemetry data is called instrumentation.

Most importantly, good observability comes from how you interact with your system's telemetry data. This means there are two sides to improving observability:

  • improving the data

  • improving the interactions

The first comes from better instrumentation. The second comes from tooling designed to let you work with that data in novel ways.

resources

* Note: I've linked to whitepapers that are gated by a form to fill out in order to download them. If you want a direct link to the PDFs, reach out via Twitter or at my work email (shelby (at) honeycomb.io) and I can get them to you.