Imagine you had amnesia: you wake up every day with no memory of yesterday. Terrifying, right? Without logging, your servers have amnesia. Let’s give your system a perfect memory.
A crime happens in a building:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Without logs (no cameras, no witnesses):
Detective: "Something happened here..."
Evidence: None
Result: Unsolvable ❌
With logs (cameras everywhere):
Detective: "Let me review the footage..."
Evidence:
2:34 PM: Person A entered
2:37 PM: Person B entered
2:40 PM: Loud noise
2:41 PM: Person B exited quickly
Result: Solvable! ✓
Same with your app:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Without logs:
"The app crashed at 3 AM"
No idea why ❌
With logs:
"Let me check what happened..."
03:00:01 - User 12345 placed order
03:00:02 - Payment processing started
03:00:03 - ERROR: Database connection timeout
03:00:04 - System crashed
Result: Found the problem! ✓
Think of log levels like severity levels in a hospital:
1. DEBUG - The Detailed Diary
Hospital Analogy: Recording vitals of the patients every 5 minutes
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
"Patient's blood pressure: 120/80"
"Patient's heart rate: 72 bpm"
"Patient turned over at 2:15 PM"
Useful for: Detailed diagnosis
Too much for: Daily review
Code Example:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Authentication completed successfully
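As a minimal sketch of how DEBUG traces like this could be produced, the snippet below uses a tiny hand-rolled logger with a minimum level. `createLogger`, `authenticate`, and `checkCredentials` are hypothetical names for illustration, not the API of any particular logging library.

```javascript
// Hypothetical minimal logger: prints a message only if its level is at or
// above the configured minimum level.
const LEVELS = { debug: 10, info: 20, warn: 30, error: 40, fatal: 50 };

function createLogger(minLevel, sink = console.log) {
  const log = (level, msg) => {
    if (LEVELS[level] >= LEVELS[minLevel]) sink(`[${level.toUpperCase()}] ${msg}`);
  };
  return {
    debug: (msg) => log("debug", msg),
    info: (msg) => log("info", msg),
  };
}

// Hypothetical authentication step with step-by-step DEBUG traces.
// `checkCredentials` stands in for whatever really verifies the user.
function authenticate(logger, email, checkCredentials) {
  logger.debug(`Looking up user ${email}`);
  const ok = checkCredentials(email);
  logger.debug(`Credential check returned ${ok}`);
  logger.debug(ok ? "Authentication completed successfully" : "Authentication failed");
  return ok;
}

// With the minimum level set to "debug", all three traces are printed.
const devLogger = createLogger("debug");
authenticate(devLogger, "john@example.com", () => true);
```

Creating the logger with `createLogger("info")` instead silences every `debug` call, which is how the same code can stay quiet in production.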
When to use DEBUG:
Development environment
Troubleshooting specific issues
Understanding code flow
Not left on by default in production (too much data!)
2. INFO - The Normal Operations Log
Hospital Analogy: Recording routine events about the patients
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
"Patient admitted at 9:00 AM"
"Medication administered at 10:00 AM"
"Patient discharged at 5:00 PM"
Useful for: Understanding normal flow
Not for: Every tiny detail
Code Example:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
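A minimal sketch of INFO-level logging for routine business events might look like this. `placeOrder` and the order shape are hypothetical; `logger` can be any object with an `info()` method, and `console` is used here as a stand-in.

```javascript
// Hypothetical order handler that records the normal milestones of an order.
function placeOrder(order, logger = console) {
  logger.info(`Order ${order.id} received for user ${order.userId}`);
  const total = order.items.reduce((sum, item) => sum + item.price * item.qty, 0);
  logger.info(`Order ${order.id} confirmed, total ${total.toFixed(2)}`);
  return total;
}

placeOrder({ id: 789, userId: 12345, items: [{ price: 99.99, qty: 1 }] });
```

Two lines per order is enough to reconstruct the normal flow later without recording every internal step.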
When to use INFO:
3. WARN - The Concerning But Not Critical
Hospital Analogy: Potential problems
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
"Patient's temperature slightly elevated (99.5°F)"
"Patient hasn't eaten full meal"
"Unusual blood pressure reading"
Useful for: Catching issues early
Status: Watch closely, but not emergency
Code Example:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
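One way to sketch a WARN is an operation that succeeds but takes longer than it should. The threshold and the `recordPaymentLatency` helper below are assumptions for illustration, not values from any real system.

```javascript
// Hypothetical threshold; tune it to your own latency targets.
const SLOW_PAYMENT_MS = 3000;

// Logs a WARN when a payment succeeded but was slower than the threshold,
// and an INFO otherwise. Returns the level that was used.
function recordPaymentLatency(orderId, elapsedMs, logger = console) {
  if (elapsedMs > SLOW_PAYMENT_MS) {
    logger.warn(`Payment for order ${orderId} took ${elapsedMs}ms (threshold ${SLOW_PAYMENT_MS}ms)`);
    return "warn";
  }
  logger.info(`Payment for order ${orderId} completed in ${elapsedMs}ms`);
  return "info";
}

recordPaymentLatency(789, 5000); // slow but succeeded: WARN
recordPaymentLatency(790, 400);  // normal: INFO
```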
Output:
When to use WARN:
Real-World WARN Example:
Scenario: E-commerce site
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
14:00 - [INFO] Server started
14:15 - [INFO] 100 requests/minute (normal)
14:30 - [WARN] 300 requests/minute (unusual spike)
14:45 - [WARN] 500 requests/minute (concerning)
15:00 - [ERROR] 1000 requests/minute - Server overloaded!
The WARNs gave you 30 minutes to act! You could have scaled up the servers before the ERROR hit.
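The timeline above boils down to two thresholds. Here is a rough sketch of that mapping; the cut-off values are read off the example timeline and would depend on your own capacity, and `levelForRate` is a hypothetical helper.

```javascript
// Thresholds taken from the example timeline; real values depend on capacity.
const WARN_RPM = 300;
const ERROR_RPM = 1000;

// Maps a requests-per-minute reading to a log level.
function levelForRate(requestsPerMinute) {
  if (requestsPerMinute >= ERROR_RPM) return "ERROR";
  if (requestsPerMinute >= WARN_RPM) return "WARN";
  return "INFO";
}

for (const rpm of [100, 300, 500, 1000]) {
  console.log(`[${levelForRate(rpm)}] ${rpm} requests/minute`);
}
```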
4. ERROR - The Problem Log
Hospital Analogy: Medical emergencies
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
"Patient's heart rate dropped to 40 bpm"
"Medication allergy detected"
"Patient fell out of bed"
Status: Immediate attention required!
Action: Doctors respond now
Code Example:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
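A minimal sketch of an ERROR log records what failed, for which request, and why. `chargeCustomer` and the `charge` callback are hypothetical stand-ins for a real payment call.

```javascript
// Hypothetical wrapper: runs a payment call and logs an ERROR with context
// if it throws. Returns true on success, false on failure.
function chargeCustomer(orderId, charge, logger = console) {
  try {
    charge();
    logger.info(`Payment succeeded for order ${orderId}`);
    return true;
  } catch (err) {
    // Say what failed, for which order, and why.
    logger.error(`Payment failed for order ${orderId}: ${err.message}`);
    return false;
  }
}

chargeCustomer(789, () => {
  throw new Error("Database connection timeout");
});
```

Catching the exception here is only one option; the point is that the ERROR line carries enough context to start debugging without reproducing the failure.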
When to use ERROR:
The Key Distinction:
WARN vs ERROR:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
WARN: "Payment took 5 seconds (slow but succeeded)" ⚠️
ERROR: "Payment failed - user got error message" ❌
WARN: "Database query slow (800ms)" ⚠️
ERROR: "Database connection lost" ❌
WARN: "Disk 85% full" ⚠️
ERROR: "Disk 100% full - can't write files" ❌
WARN: Things still work, but concerning
ERROR: Things broke, user impacted
5. FATAL/CRITICAL - The System Killer
Hospital Analogy: Code Blue
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
"Patient cardiac arrest!"
"Critical system failure"
Status: Life-threatening
Action: All hands on deck, emergency protocol
Code Example:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
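As a rough sketch, FATAL fits a failure the process cannot recover from, such as missing configuration at startup. `startService`, the `databaseUrl` field, and the injected `exit` function are assumptions for illustration; `console` has no `fatal()` method, so the logger below maps it to `console.error`.

```javascript
// Hypothetical startup check: without a database URL the service cannot run,
// so the failure is logged as FATAL and the process is asked to exit.
function startService(config, logger, exit = process.exit) {
  if (!config.databaseUrl) {
    logger.fatal("Missing databaseUrl in configuration; cannot start");
    exit(1);
    return false;
  }
  logger.info("Service started");
  return true;
}

// Stand-in logger built on console, with a FATAL level added.
const consoleLogger = {
  info: (msg) => console.info(`[INFO] ${msg}`),
  fatal: (msg) => console.error(`[FATAL] ${msg}`),
};

startService({ databaseUrl: "postgres://localhost/app" }, consoleLogger);
```

Passing `exit` as a parameter keeps the function testable; in a real service the default `process.exit` would stop the process right after the FATAL line is written.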
Output:
When to use FATAL:
Old Way (String Logging):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
logger.info("User john@example.com placed order 789 for $99.99");
Problem: Hard to search, parse, analyze
New Way (Structured Logging):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
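A minimal sketch of the structured version of the order log above: the same facts, but emitted as one JSON object per line. `logEvent` and the field names (`userEmail`, `orderId`, `amount`, `currency`) are hypothetical, and the currency is assumed to be USD.

```javascript
// Hypothetical helper: emits one JSON object per log line instead of free text.
function logEvent(level, event, fields, write = console.log) {
  const entry = { timestamp: new Date().toISOString(), level, event, ...fields };
  write(JSON.stringify(entry));
  return entry;
}

// Same information as the string log, but every value is its own typed field.
logEvent("info", "order_placed", {
  userEmail: "john@example.com",
  orderId: 789,
  amount: 99.99,
  currency: "USD",
});
```

Because `amount` is a number rather than part of a sentence, a log tool can filter or sum it directly.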
Output (JSON):
Benefits:
✓ Easy to search: "Show all orders > $100"
✓ Easy to aggregate: "Total revenue today?"
✓ Machine-readable: Tools can parse automatically
✓ Consistent format
Real-World Debugging Example:
Problem: "Some users can't checkout!"
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
With string logs (hard):
❌ Grep through millions of lines
❌ Parse text manually
❌ Hard to find patterns
With structured logs (easy):
✓ Query: event = "checkout_failed"
✓ Filter: last 24 hours
✓ Group by: error_type
Results:
Found it! "payment_declined" is the main issue!
Drill deeper: payment_declined errors → 40 from the same credit card processor → their API is down! → switch to the backup processor ✓
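The query, filter, and group steps above can be sketched in a few lines over parsed log records. In practice a log platform runs this for you; the `countFailuresByType` helper, the field names, and the sample records below are made up purely to show the shape of the query.

```javascript
// Counts checkout failures per error_type at or after `sinceMs`.
// `entries` are parsed structured log records with numeric timestamps.
function countFailuresByType(entries, sinceMs) {
  const counts = {};
  for (const e of entries) {
    if (e.event !== "checkout_failed" || e.timestamp < sinceMs) continue;
    counts[e.error_type] = (counts[e.error_type] || 0) + 1;
  }
  return counts;
}

// Made-up sample records, only for illustration.
const now = Date.now();
const DAY_MS = 24 * 60 * 60 * 1000;
const sample = [
  { event: "checkout_failed", error_type: "payment_declined", timestamp: now - 1000 },
  { event: "checkout_failed", error_type: "payment_declined", timestamp: now - 2000 },
  { event: "checkout_failed", error_type: "out_of_stock", timestamp: now - 3000 },
  { event: "order_placed", timestamp: now - 4000 },
];

console.log(countFailuresByType(sample, now - DAY_MS));
```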
What to Log:
✓ DO Log:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✓ User actions: login, purchase, upload
✓ System events: startup, shutdown, deployment
✓ External API calls: request, response, latency
✓ Errors with context: what failed, why, when
✓ Performance metrics: slow queries, high memory
✓ Security events: failed logins, suspicious activity
❌ DON'T Log:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
❌ Passwords (NEVER!)
❌ Credit card numbers
❌ Social security numbers
❌ API keys, secrets, tokens
❌ Personal health information
❌ Anything sensitive/private
Example of what NOT to do:
logger.info(`User logged in with password: ${password}`); ❌❌❌
THIS IS A SECURITY BREACH!
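One hedge against leaking secrets by accident is to redact known sensitive fields before anything reaches the logger. This is a minimal sketch: the key list and the `redact` helper are assumptions, and it only checks top-level keys, so nested objects would need extra handling.

```javascript
// Field names treated as sensitive in this sketch; extend for your own data.
const SENSITIVE_KEYS = new Set(["password", "cardNumber", "ssn", "apiKey", "token"]);

// Returns a copy of `fields` with sensitive values replaced before logging.
function redact(fields) {
  const safe = {};
  for (const [key, value] of Object.entries(fields)) {
    safe[key] = SENSITIVE_KEYS.has(key) ? "[REDACTED]" : value;
  }
  return safe;
}

console.info(JSON.stringify(redact({ user: "john@example.com", password: "hunter2" })));
```

Redaction is a safety net, not a substitute for keeping secrets out of log calls in the first place.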
Log Retention:
How Long to Keep Logs:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
DEBUG logs: 1-7 days (huge volume)
INFO logs: 30-90 days (moderate volume)
WARN logs: 90-365 days (important patterns)
ERROR logs: 1-2 years (compliance, analysis)
AUDIT logs: 7+ years (legal requirements)
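A retention policy like this can be written down as a small lookup table. The sketch below uses the upper bound of each range above (and the 7-year minimum for AUDIT, approximated as 2555 days); `RETENTION_DAYS` and `isExpired` are hypothetical names, and real deployments would normally configure this in the log storage system itself.

```javascript
// Maximum age in days per log type, based on the ranges listed above.
const RETENTION_DAYS = { DEBUG: 7, INFO: 90, WARN: 365, ERROR: 730, AUDIT: 2555 };
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True if an entry written at `writtenAtMs` is older than its retention window.
function isExpired(level, writtenAtMs, nowMs = Date.now()) {
  const days = RETENTION_DAYS[level];
  if (days === undefined) return false; // unknown type: keep it rather than guess
  return nowMs - writtenAtMs > days * MS_PER_DAY;
}

const tenDaysAgo = Date.now() - 10 * MS_PER_DAY;
console.log(isExpired("DEBUG", tenDaysAgo)); // true: past the 7-day window
console.log(isExpired("ERROR", tenDaysAgo)); // false: well within 2 years
```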
Storage cost example:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1 million requests/day
Each request: 2 log entries
Each entry: 1 KB
Daily logs: 2 GB
Monthly logs: 60 GB
Yearly logs: 730 GB
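At $0.023/GB/month (AWS S3):
Holding one month of logs (60 GB): about $1.38 per month
Holding a full year of logs (730 GB): about $16.79 per month
Yearly cost at that size: about $201.48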
Worth it for debugging? YES! ✓
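The arithmetic is easy to check. In this sketch the monthly bill depends on how many days of logs are held at once, so the larger monthly figure corresponds to a full year of logs (730 GB) sitting in storage. The price is the one quoted in the example and will vary by region and over time; the variable names are just for illustration.

```javascript
// Inputs from the storage cost example.
const requestsPerDay = 1_000_000;
const entriesPerRequest = 2;
const kbPerEntry = 1;
const pricePerGbMonth = 0.023; // quoted price; actual pricing varies

// 1 GB = 1,000,000 KB (decimal units, matching the 2 GB/day figure).
const gbPerDay = (requestsPerDay * entriesPerRequest * kbPerEntry) / 1_000_000;

// Monthly storage bill when `retainedDays` worth of logs are kept at once.
function monthlyCost(retainedDays) {
  return gbPerDay * retainedDays * pricePerGbMonth;
}

console.log(gbPerDay);                    // 2
console.log(monthlyCost(30).toFixed(2));  // 1.38
console.log(monthlyCost(365).toFixed(2)); // 16.79
```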