About the Guru
Hi, I'm Vaishak — Director of Infrastructure Automation & SRE at one of Canada's largest insurers. I've spent nearly two decades designing, breaking, and rebuilding enterprise infrastructure — from storage systems and CRM platforms to cloud-native SRE practices and full-stack observability.
OpenTelemetry is where my work and obsession converge. I've implemented OTel-based observability at enterprise scale, built SRE frameworks around SLIs and SLOs, and led teams that reduced incident resolution time by 60% through better telemetry. This blog is the knowledge I wish I'd had when I started that journey.
I'm also a CNCF contributor, community builder, and sponsor of an enterprise SRE & Automation Community of Practice. Across all of it, my goal is to close the gap between conference talks and production reality.
Why "Guru"?
In the traditional Sanskrit reading, a guru is not a master on a pedestal but a remover of darkness (gu = darkness, ru = remover): someone who shines a light so others don't stumble.
That is the mission here. OpenTelemetry is complex. The "darkness" of distributed systems — where requests vanish into microservice black holes — is real. My goal is to share what I've learned through years of trial, error, and scaling.
From Lab to Production
Most tutorials stop at docker-compose up. This blog is about what comes after:
- What happens when you ingest 100k spans/second?
- How do you convince 500 developers to instrument their code?
- How do you manage sampling so your bill doesn't bankrupt the company?
- What actually breaks at scale — and how do you design around it?
If you're an SRE, DevOps engineer, or just someone tired of debugging in the dark — welcome. The content here is built on real production experience, not toy demos.