About

 


Solving Hard Observability Problems Since 2000

Jack Neely, Owner, Cardinality Cloud, LLC

Is cardinality crushing your observability platform? High-cardinality data is blocking your ability to understand your applications and track individual customer experiences. These problems are solvable—I’ve solved them at planet scale, and I can help you solve them too.

I transform observability chaos into competitive advantage through data-driven decision making. From pre-seed startups to Fortune 500 enterprises, I’ve architected systems handling 50+ terabytes per day and 150+ million active time series, and I’ve saved organizations millions in infrastructure costs. My focus: Site Reliability Engineering, Prometheus/Grafana ecosystems, and building teams that ship reliable software at speed.

I’m also building open source tools. My prometheus-alert-generator, a React-based tool for creating and maintaining SLO alerting rules in Prometheus, reached 100+ users in its first week.

By the way, it’s all about the data.

Experience Highlights

Fractional CTO, KindHabitLabs, Inc

Pre-seed healthcare SaaS startup building Roo Mi, a HIPAA-compliant behavioral health platform for treatment facilities. Designed production AWS infrastructure (ECS, RDS PostgreSQL, CloudFront), migrated from prototype to production-grade architecture, established CI/CD pipelines, and mentored development team on cloud architecture and security practices.

Sr. Principal DevOps Observability Architect, Palo Alto Networks

Led global observability team for Prisma Cloud’s multi-cloud security platform. Architected unified observability across 80+ Kubernetes clusters in all regions including China and FedRAMP High. Managed 50+ TiB/day in OpenSearch and Loki and 150M+ active time series in Prometheus/Thanos/Mimir, and orchestrated a migration from Splunk, saving $2.5 million annually. Solved high-cardinality business intelligence challenges using AWS Kinesis and Apache Flink streaming pipelines.

Systems Architect, Fitbit, Inc

Built Fitbit’s Visibility Engineering team and implemented a Prometheus and Thanos observability platform ingesting 8 million data points per second. Migrated entire monitoring stack to Google Cloud Platform. Conducted time series forecasting for capacity planning during peak events. Led global migration from StatsD/Graphite to Prometheus, mentoring engineering teams across Python, Go, and Java. Contributed upstream fixes to Thanos and Prometheus projects.

Consulting, 42 Lines, Inc

Systems Architect for multiple SaaS products. Built scalable AWS load balancing solutions with Network Load Balancers and HAProxy. Introduced Prometheus and Grafana with Four Golden Signals dashboards. Created conference presentations and webinars on SRE best practices and observability-driven business decisions.

Operations and Systems Specialist, North Carolina State University

Practiced Site Reliability Engineering before the term existed. Built and maintained infrastructure for 100,000+ active users including email, Kerberos, and file storage. Led the “Realm Linux” project, a fully automated installation system deployed across thousands of workstations and servers. Implemented load balancing and configuration management (Bcfg2/Puppet), and trained system administrators campus-wide.