About

 


Independent Observability Architect | 25 Years | Fortune 500 Experience

Jack Neely

Jack Neely, Independent Observability Architect, Cardinality Cloud, LLC

Observability isn’t a tax. It’s leverage.

Most SaaS companies treat observability as overhead, like insurance against outages. I help you redesign it as a decision-making system that cuts costs, accelerates incident response, and improves reliability. You don’t need more dashboards. You need better architecture: what to measure, when to alert, and where to invest.

Why Independent Architecture Matters

Observability vendors make more money when you send them more data. More logs, more metrics, and more traces mean bigger bills for you and bigger revenue for them. As an independent architect, I succeed when my clients cut costs and make better decisions with their data. That alignment of incentives changes everything.

Typical engagements cut observability costs by up to 10x and speed up incident response.

What I’ve Built

From pre-seed startups to Fortune 500 enterprises, I’ve architected observability systems at scale:

  • 150+ TiB/day in logs (OpenSearch, Loki)
  • 8M+ samples/second (Prometheus, Thanos, Mimir)
  • 400M+ active time series (Grafana)
  • $2.5M annual savings from Splunk migration
  • 300+ engineers supported
  • 80+ Kubernetes clusters globally

I’ve solved high-cardinality challenges, built platform teams from scratch, and trained engineering organizations on SRE best practices. My focus: ClickHouse/Prometheus/Grafana ecosystems, vendor-neutral architecture, and building teams that ship reliable software at speed.

Open Source & Community

I contribute to the observability ecosystem through open source and education:

  • Open source contributor to the Graphite, Prometheus, and Thanos projects
  • prometheus-alert-generator.com - Free tool for creating SLO alerting rules (100+ users in first week)
  • YouTube - Observability architecture, cost optimization, and technical deep-dives
  • operations.fm podcast host
  • Conference speaker and industry thought leader
  • Gertrude Cox Award recipient for innovative teaching with technology

The common thread? Making better decisions with your data.

Experience Highlights

Fractional CTO, KindHabitLabs, Inc

Pre-seed healthcare SaaS startup building Roo Mi, a HIPAA-compliant behavioral health platform for treatment facilities. Designed production AWS infrastructure (ECS, RDS PostgreSQL, CloudFront), migrated from prototype to production-grade architecture, established CI/CD pipelines, and mentored the development team on cloud architecture and security practices.

Sr. Principal DevOps Observability Architect, Palo Alto Networks

Led the global observability team for Prisma Cloud’s multi-cloud security platform. Architected unified observability across 80+ Kubernetes clusters in all regions, including China and FedRAMP High environments. Managed 50+ TiB/day in OpenSearch and Loki and 150M+ metrics in Prometheus/Thanos/Mimir, and orchestrated the migration from Splunk, saving $2.5 million annually. Solved high-cardinality business intelligence challenges using AWS Kinesis and Apache Flink streaming pipelines.

Systems Architect, Fitbit, Inc

Built Fitbit’s Visibility Engineering team and implemented a Prometheus and Thanos observability platform ingesting 8 million data points per second. Migrated the entire monitoring stack to Google Cloud Platform. Conducted time series forecasting for capacity planning during peak events. Led the global migration from StatsD/Graphite to Prometheus, mentoring engineering teams across Python, Go, and Java. Contributed upstream fixes to the Thanos and Prometheus projects.

Consulting, 42 Lines, Inc

Systems Architect for multiple SaaS products. Built scalable AWS load balancing solutions with Network Load Balancers and HAProxy. Introduced Prometheus and Grafana with Four Golden Signals dashboards. Created conference presentations and webinars on SRE best practices and observability-driven business decisions.

Operations and Systems Specialist, North Carolina State University

Practiced Site Reliability Engineering before the term existed. Built and maintained infrastructure for 100,000+ active users including email, Kerberos, and file storage. Led the “Realm Linux” project, a fully automated installation system deployed across thousands of workstations and servers. Implemented load balancing and configuration management (Bcfg2/Puppet), and trained system administrators campus-wide.