The SEARCH Method: A Framework for Structured Logging at Scale
The CFO Conversation Nobody Wants
Most teams don’t learn the importance of good logging practices until they’re sitting across from their CFO explaining a five-, six-, or even seven-figure observability bill. By then, you’re in damage control mode. You’re dropping data, restricting access, and losing the visibility your teams need to operate effectively.
The good news? This is preventable. The secret isn’t finding the perfect vendor or the most sophisticated logging platform. It’s establishing solid logging practices from the start.
The Tool Doesn’t Matter (At This Stage)
When you’re at the crawl stage (roughly 1TB per month or less, about 35GB per day), pricing across logging vendors is essentially identical. I’ve done the math, and the total cost of ownership differences are negligible at this scale.
Your choice of logging tool is a distraction at the early stage. What matters is technique and habits.
If you’re on AWS, CloudWatch is perfectly adequate. It’s cost-competitive, avoids data transfer charges (DTO and NAT Gateway fees that can add up quickly), and it works. Don’t overcomplicate things.
Logging Mindset for SREs
Before diving into implementation, adopt these core principles:
Logs are data. They must be machine parsable AND human readable. Structure enables aggregation, alerting, and analytics. Unstructured logs are technical debt.
Logs are immutable. Once written, they’re permanent. If you didn’t mean to log something, don’t log it. This is critical for security and compliance.
Logs are schematized. Consistent structure across services enables powerful querying and reduces token consumption when feeding logs to LLMs for analysis.
Logs are leveled. ERROR, WARN, INFO, DEBUG. Establish a consistent leveling strategy across all applications. Document it. Enforce it.
The Canonical Log Pattern
This concept comes from Stripe’s engineering blog: for every transaction or API request, emit one authoritative log entry that summarizes the entire operation. This is sometimes called a “wide event” with many fields. You can think of it like a receipt of the transaction.
Think of how Apache or Nginx work. Every HTTP request produces exactly one access log entry with status code, bytes transferred, user agent, IP address, and timing. Your application should follow the same pattern.
Your application can emit debug logs, error logs, and trace data throughout a request lifecycle. But there should always be one canonical log that provides the complete picture of what happened.
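Here is a minimal sketch of the pattern in Go using Logrus (one of the structured libraries mentioned later). The handler wrapper, service name, and field names are illustrative assumptions, not a required schema:

```go
package main

import (
	"net/http"
	"time"

	log "github.com/sirupsen/logrus"
)

// canonical wraps a handler and emits exactly one summary log entry per
// request, regardless of what else was logged during the request lifecycle.
func canonical(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		next.ServeHTTP(w, r)

		// A production version would also wrap w to capture the response
		// status code and bytes written, like an Nginx access log line.
		log.WithFields(log.Fields{
			"canonical":   true,
			"service":     "payments-api",
			"method":      r.Method,
			"path":        r.URL.Path,
			"remote_addr": r.RemoteAddr,
			"request_id":  r.Header.Get("X-Request-Id"),
			"duration_ms": time.Since(start).Milliseconds(),
		}).Info("request completed")
	})
}

func main() {
	log.SetFormatter(&log.JSONFormatter{TimestampFormat: time.RFC3339})

	http.Handle("/", canonical(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})))
	http.ListenAndServe(":8080", nil)
}
```

Debug and error logs can still appear inside the handler; the canonical entry is simply the one you can always count on being there.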
The SEARCH Method
When configuring logging systems and libraries, I use the SEARCH method to ensure consistency across teams and services:
S - Schema and Structure
Use JSON for your log format. Adopt a flexible schema that allows teams to add fields as needed, but enforce standards for common fields:
- Timestamp format (ISO 8601, preferably in UTC)
- Service/application name
- Request ID (for distributed tracing)
- User ID or session ID
- IP address
- Environment (prod, staging, dev)
Look to the OpenTelemetry Semantic Conventions or Elastic Common Schema for inspiration. You don’t need rigid enforcement, but common fields should be common.
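As a sketch of what those common fields can look like in practice with Logrus, a base entry can carry the service-wide fields while per-request fields layer on top. The field names below are illustrative, not a mandated schema:

```go
package main

import (
	"os"
	"time"

	log "github.com/sirupsen/logrus"
)

func main() {
	log.SetFormatter(&log.JSONFormatter{TimestampFormat: time.RFC3339}) // ISO 8601 timestamps

	// Base entry carrying the fields every service agrees on.
	base := log.WithFields(log.Fields{
		"service":     "checkout",
		"environment": os.Getenv("APP_ENV"), // prod, staging, dev
	})

	// Per-request fields layer on top of the shared base.
	base.WithFields(log.Fields{
		"request_id": "req-123",
		"user_id":    "u-456",
		"client_ip":  "203.0.113.7",
	}).Info("order created")
}
```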
E - Errors
Log errors with consistent structure. Modern logging libraries can include:
- Error type/class
- Error message
- Stack trace
- Error hash or ID (for deduplication and trending)
Error hashes are particularly valuable. They let you track similar errors across your entire infrastructure and identify patterns.
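One way to produce such a hash, as an illustration only (teams compute these differently, for example from the top stack frame rather than the message), is to hash the error type and message into a short, stable identifier:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"

	log "github.com/sirupsen/logrus"
)

// errorHash derives a short, stable identifier from the error type and
// message so the same failure can be counted and trended across services.
func errorHash(err error) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%T: %s", err, err.Error())))
	return hex.EncodeToString(sum[:])[:12]
}

func main() {
	log.SetFormatter(&log.JSONFormatter{})

	err := errors.New("connection refused")
	log.WithFields(log.Fields{
		"error":      err.Error(),
		"error_type": fmt.Sprintf("%T", err),
		"error_hash": errorHash(err),
	}).Error("payment capture failed")
}
```

Grouping on the hash field then gives you per-error counts across services without any string matching.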
A - Audit and Action
Every log should indicate the action taken: GET, POST, PUT, DELETE, READ, WRITE, UPDATE. This is essential for security auditing and understanding user behavior.
R - Resources and Users
Log what was acted upon:
- Which API endpoint
- Which database table
- Which file or resource
- Which user or service account performed the action
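A minimal sketch combining the action and resource fields into a single entry; the names (`action`, `endpoint`, `table`, `actor`) are illustrative assumptions:

```go
package main

import log "github.com/sirupsen/logrus"

func main() {
	log.SetFormatter(&log.JSONFormatter{})

	// One entry records the action taken, the resource it touched,
	// and who performed it.
	log.WithFields(log.Fields{
		"action":   "UPDATE",
		"endpoint": "/v1/invoices/{id}",
		"table":    "invoices",
		"actor":    "svc-billing", // user or service account
	}).Info("invoice updated")
}
```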
C - Status Codes and Log Levels
Include HTTP status codes (200, 404, 500) in your logs. These are universally understood and enable powerful filtering.
Don’t forget to set appropriate log levels. ERROR means something broke. INFO means normal operation. Be consistent.
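For example, with Logrus the status code can ride along as an ordinary field while the environment drives the level; the `LOG_LEVEL` variable name is an assumption:

```go
package main

import (
	"os"

	log "github.com/sirupsen/logrus"
)

func main() {
	log.SetFormatter(&log.JSONFormatter{})

	// Let each environment pick its own verbosity (e.g. debug in dev,
	// info in prod); fall back to info when the variable is unset.
	if lvl, err := log.ParseLevel(os.Getenv("LOG_LEVEL")); err == nil {
		log.SetLevel(lvl)
	} else {
		log.SetLevel(log.InfoLevel)
	}

	// The status code is just another structured field.
	log.WithFields(log.Fields{
		"status": 502,
		"path":   "/v1/charges",
	}).Error("upstream processor unavailable")
}
```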
H - Human Readable Message
Every log entry needs a message field with human-readable text. This allows you to quickly scan logs with a simple query; for example, something like this jq one-liner (field names will vary with your schema):

```
jq -r '"\(.ts) \(.level) \(.msg)"' app.log
```

This gives you a readable stream of what’s happening without drowning in JSON.
Structured Logging and Byte Costs
Yes, JSON logs are more verbose than unstructured text. Vendors charge by the byte, so this seems counterproductive.
But the benefits vastly outweigh the costs:
- Faster incident response: Structured queries replace regex grep
- Better alerting: Alert on specific field values, not string patterns
- Dashboard creation: Aggregate and visualize without parsing
- LLM efficiency: Feed structured data to AI agents, saving tokens and improving accuracy
- Long-term cost savings: Better insights mean better optimization
To balance verbosity:
- Use shorter field names (`msg` instead of `message`, `ts` instead of `timestamp`); see the sketch below
- Enable compression for transmission and storage
- Drop debug-level logs in production after the retention period
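As a sketch of the shorter-field-names idea with Logrus, the formatter’s FieldMap can rename the default timestamp and level keys; the `ts`/`lvl` choice is just one possible convention:

```go
package main

import (
	"time"

	log "github.com/sirupsen/logrus"
)

func main() {
	// Rename the default timestamp and level keys to shorter ones;
	// Logrus already emits the message under "msg".
	log.SetFormatter(&log.JSONFormatter{
		TimestampFormat: time.RFC3339,
		FieldMap: log.FieldMap{
			log.FieldKeyTime:  "ts",
			log.FieldKeyLevel: "lvl",
		},
	})

	log.Info("cache warmed")
	// emits roughly: {"lvl":"info","msg":"cache warmed","ts":"2025-01-01T00:00:00Z"}
}
```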
The structure pays for itself the first time you need to debug a production incident at 3 AM.
What Not to Log
This is critical for security and compliance:
Never log:
- Source code or application secrets
- Passwords or authentication tokens
- AWS credentials, API keys, private keys
- Credit card numbers, bank account numbers, routing numbers
- Government-issued IDs (SSN, passport numbers, driver’s licenses)
Don’t ask me how I know.
Be careful with PII (Personally Identifiable Information):
Even seemingly innocent fields become problematic in combination:
- Timestamp + User ID + IP Address = user tracking data
If you’re subject to GDPR or other privacy regulations, you need the ability to mask or pseudonymize user identifiers in logs. Structured logging makes this possible. You can hash or encrypt specific fields while keeping logs useful for debugging.
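Here is a minimal sketch of field-level pseudonymization, assuming an HMAC keyed with a secret you control; the key handling is deliberately simplified and not a compliance recipe:

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"

	log "github.com/sirupsen/logrus"
)

// pseudonymize replaces a raw identifier with a keyed hash: logs stay
// joinable for debugging, but the original value never hits disk.
func pseudonymize(key []byte, value string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(value))
	return hex.EncodeToString(mac.Sum(nil))[:16]
}

func main() {
	log.SetFormatter(&log.JSONFormatter{})
	key := []byte("rotate-me") // in practice, load this from a secret store

	log.WithFields(log.Fields{
		"user_id": pseudonymize(key, "alice@example.com"),
	}).Info("password reset requested")
}
```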
Implementation Checklist
- Choose a logging library that supports structured JSON output (log4j2, Logrus, Winston, etc.)
- Define your schema for common fields across all services
- Document your decisions in your architecture decision records (ADRs)
- Implement canonical logs for all API endpoints and transactions
- Configure log levels appropriately for each environment
- Set up log rotation and retention policies
- Enable compression for storage and transmission
- Test your logging by querying logs with jq or your preferred tool
The Lesson: Start Right, Stay Right
Teams learn this lesson eventually. Usually the hard way, with a CFO meeting and a mandate to cut costs by 80% immediately. You’re then forced to make decisions under pressure, often cutting valuable telemetry along with waste.
Start with good practices now. Your future self (and your CFO) will thank you.
Conclusion
The SEARCH method provides a practical framework for establishing logging practices that scale. At the crawl stage, focus on building habits and structure, not on finding the perfect vendor.
Schema, Errors, Actions, Resources, Codes, and Human messages. This mnemonic helps ensure your logs provide value from day one. Combined with the canonical log pattern and a clear understanding of what not to log, you’ll build a foundation for operational excellence.
Your logging infrastructure is part of your product. Invest in it thoughtfully, and it will pay dividends throughout your company’s growth.
Need help reviewing your logging strategy before the CFO meeting? Get in touch to discuss how Cardinality Cloud can help you build cost-effective, scalable observability practices.