## The Query That Runs 10,000 Times
You have an API endpoint. It fetches a user’s profile from the database. Simple query, takes about 15ms. No big deal. Until you check the logs and realize this endpoint gets called 10,000 times per hour, and 80% of those requests are for the same 200 users. You’re running the same query, getting the same result, 8,000 times per hour. Your database doesn’t complain — yet — but it’s doing a lot of work it doesn’t need to do.
The fix isn’t a faster query. It’s not running the query at all.
## What Caching Actually Is
Caching is storing the result of an expensive operation so you can reuse it instead of recomputing it every time. The “expensive operation” is usually a database query, but it could be an API call, a computation, or anything that takes time.
Without cache:
Request → Database (15ms) → Response
With cache:
Request → Redis (0.5ms) → Response ← cache hit
Request → Redis (miss) → Database (15ms) ← cache miss, store in Redis
Redis keeps data in RAM. RAM is fast — sub-millisecond fast. Databases store data on disk and do real work (query parsing, index traversal, transaction management). The difference between 0.5ms and 15ms doesn’t sound like much, but multiply it by 10,000 requests per hour and it adds up. Your database goes from handling 10,000 queries to handling 2,000. That’s 80% less load, which means you can delay that expensive database scaling a lot longer.
## The One Strategy You Need
There are several caching strategies with fancy names — read-through, write-through, write-behind. They all have tradeoffs. But for most applications, cache-aside is the one to start with. It’s the simplest, the most resilient, and it covers 90% of use cases.
The pattern:
Read:
1. Check Redis → data exists? Return it (cache hit)
2. Not in Redis? → query database
3. Store the result in Redis with an expiration (TTL)
4. Return the data
Write:
1. Write to database
2. Delete from Redis
→ Next read will re-populate the cache with fresh data
In code, it looks something like this:
```typescript
async function getUser(userId: string) {
  // Check cache first
  const cached = await redis.get(`user:${userId}`);
  if (cached) return JSON.parse(cached);

  // Cache miss — fetch from database
  const user = await db.users.findById(userId);

  // Store in Redis with a 5-minute TTL
  await redis.set(`user:${userId}`, JSON.stringify(user), "EX", 300);
  return user;
}
```
The beauty of cache-aside is its resilience. If Redis goes down, your app falls back to hitting the database directly. It’s slower, but it works. The cache is an optimization, not a dependency.
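To make that resilience concrete, here is a minimal sketch of a cache-aside read that survives a cache outage. The in-memory `cache` and `db` maps, the `cacheIsDown` flag, and the helper names are all stand-ins invented for this sketch — in production the cache calls would go to a real Redis client and the lookup to your data layer.

```typescript
// In-memory stand-ins so the sketch runs anywhere (assumptions — swap in
// a real Redis client and your actual data layer in production).
type User = { id: string; name: string };
const db = new Map<string, User>([["42", { id: "42", name: "Ada" }]]);
const cache = new Map<string, string>();
let cacheIsDown = false; // flip to simulate a Redis outage

async function cacheGet(key: string): Promise<string | null> {
  if (cacheIsDown) throw new Error("connection refused");
  return cache.get(key) ?? null;
}

async function cacheSet(key: string, value: string): Promise<void> {
  if (cacheIsDown) throw new Error("connection refused");
  cache.set(key, value);
}

// Cache-aside read that degrades gracefully: any cache error is
// swallowed and the request falls through to the database.
async function getUser(userId: string): Promise<User | undefined> {
  try {
    const cached = await cacheGet(`user:${userId}`);
    if (cached) return JSON.parse(cached);
  } catch {
    // Cache unavailable — log it, then carry on to the database.
  }
  const user = db.get(userId); // stand-in for db.users.findById
  if (user) {
    try {
      await cacheSet(`user:${userId}`, JSON.stringify(user));
    } catch {
      // Failing to repopulate the cache is not an error for the caller.
    }
  }
  return user;
}
```

With `cacheIsDown` set to `true`, `getUser` still returns the user — every request just pays the database price until Redis comes back.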
## The Other Strategies (And When They Matter)
Cache-aside isn’t the only approach. There are three others, and they each solve a different problem.
Read-through is like cache-aside, except the cache itself handles the database fetch on a miss. Your app always talks to the cache, never directly to the database. The cache becomes the single read interface. This is cleaner architecturally, but requires a cache library or framework that supports it. Use it when you want to completely abstract the database behind the cache layer.
Write-through sends every write to the cache first, and the cache synchronously writes to the database before returning. The cache is always in sync — no stale data, ever. The cost? Every write is slower because you’re waiting for both the cache and the database. Use it when data is both written and read frequently, and consistency matters more than write speed (think: financial balances, inventory counts).
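The write-through flow can be sketched in a few lines. Everything here is an illustrative stand-in — in-memory maps instead of a real cache and database, and hypothetical function names — but the ordering is the point: the caller does not get an acknowledgment until both stores have the value.

```typescript
// In-memory stand-ins for the cache and database (assumptions — in a
// real system the cache layer wraps a Redis client and your DB driver).
const cache = new Map<string, string>();
const db = new Map<string, string>();

async function dbWrite(key: string, value: string): Promise<void> {
  db.set(key, value); // stand-in for a real, slower database write
}

// Write-through: write lands in the cache, then is synchronously
// persisted to the database before the caller gets control back.
// A production version would also handle a failed DB write (e.g. by
// rolling back the cache entry).
async function writeThrough(key: string, value: string): Promise<void> {
  cache.set(key, value);
  await dbWrite(key, value); // the caller waits for both stores
}

// Reads can trust the cache completely — it is always fresh.
function read(key: string): string | undefined {
  return cache.get(key) ?? db.get(key);
}
```

The `await` on the database write is exactly where the latency cost described above comes from.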
Write-behind (also called write-back) sends writes to the cache and returns immediately. The cache then flushes to the database asynchronously, often in batches. This is the fastest write path, but there’s a real risk: if the cache crashes before flushing, those writes are gone. Use it for write-heavy workloads where you can tolerate some data loss (analytics events, view counters, metrics).
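A write-behind sketch makes both the speed and the risk visible. Again, the maps and names are stand-ins; a real implementation would flush on a timer or batch-size threshold, while this one exposes `flush` explicitly so the behavior is deterministic.

```typescript
// In-memory stand-ins (assumptions): cache, database, and a set of
// keys that have been written but not yet persisted.
const cache = new Map<string, string>();
const db = new Map<string, string>();
const dirty = new Set<string>();

// Write-behind: the write hits the cache and returns immediately.
function writeBehind(key: string, value: string): void {
  cache.set(key, value); // fast path — caller does not wait for the DB
  dirty.add(key);        // remember what still needs persisting
}

// Drain dirty keys to the database in one batch. If the process dies
// before this runs, those writes are gone — the data-loss risk above.
async function flush(): Promise<void> {
  for (const key of dirty) {
    db.set(key, cache.get(key)!);
  }
  dirty.clear();
}
```

Between `writeBehind` and `flush`, the database knows nothing about the write — which is fine for view counters and fatal for account balances.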
| Strategy | Reads | Writes | Consistency | Risk |
|---|---|---|---|---|
| Cache-aside | App checks cache, falls back to DB | App writes to DB, invalidates cache | Eventually consistent | Cold start on first read |
| Read-through | Cache fetches from DB on miss | N/A (pair with write-through) | Eventually consistent | Cache needs DB access |
| Write-through | Fast (cache always fresh) | Slow (sync write to both) | Strong | Write latency |
| Write-behind | Fast | Fastest (async flush) | Eventual | Data loss if cache crashes |
Most applications should start with cache-aside and only move to the others when they have a specific reason. Write-through and write-behind add complexity that isn’t worth it until you’re hitting real performance walls.
## The Actually Hard Part
> “There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton
Caching is easy. Knowing when to stop caching — when the data in your cache no longer matches reality — that’s the hard part.
### TTL: The Timer Approach
Set an expiration on every cache entry. After 5 minutes, it disappears automatically. Next request fetches fresh data from the database.
```typescript
await redis.set("user:123", data, "EX", 300); // gone after 5 minutes
```
The tradeoff is straightforward. Short TTL (10 seconds) = very fresh data, lots of cache misses, more database load. Long TTL (1 hour) = great performance, potentially stale data.
| TTL | Freshness | Hit Rate | Good For |
|---|---|---|---|
| 5-10 seconds | Very fresh | Low | Real-time data |
| 1-5 minutes | Mostly fresh | Medium | User profiles, API responses |
| 1-24 hours | Possibly stale | High | Config, reference data, static content |
### Event-Based: The Explicit Approach
When data changes, explicitly delete the cache entry. No waiting for TTL.
```typescript
async function updateUser(userId: string, data: UserUpdate) {
  await db.users.update(userId, data);
  await redis.del(`user:${userId}`); // cache gone, next read refreshes
}
```
This gives you immediate freshness, but you have to remember to invalidate everywhere data can change. Miss one update path and you have stale data. In practice, most teams use both — event-based invalidation for immediate freshness, TTL as a safety net in case you miss an invalidation.
### The Stampede
Here’s a fun one. You have a cache entry that’s read 1,000 times per second. Its TTL expires. All 1,000 requests simultaneously hit the database to re-populate the cache. The database chokes.
The fix: add randomness to your TTL. Instead of every entry expiring at exactly 300 seconds, use 300 + random(0, 30). Entries expire at slightly different times, spreading the load. It’s a small thing, but at scale it’s the difference between a smooth refresh and a database meltdown.
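The jitter is one line. A small helper like this (the function name and defaults are illustrative, not from any library) keeps the randomness in one place:

```typescript
// TTL jitter: spread expirations across a window so hot keys don't all
// expire at the same instant. 300s base + up to 30s of jitter, per the
// example above; tune both numbers to your workload.
function jitteredTtl(baseTtl = 300, jitterWindow = 30): number {
  return baseTtl + Math.floor(Math.random() * jitterWindow);
}

// Usage with an ioredis-style client:
// await redis.set(key, value, "EX", jitteredTtl());
```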
## Redis: More Than a Cache
I’ve been talking about Redis as a cache, but it does a lot more. It has purpose-built data structures for specific problems:
| Structure | What It Does | Real Example |
|---|---|---|
| String | Simple key-value | Cache a JSON response |
| Hash | Object with fields | Store a user profile (update fields individually) |
| List | Ordered collection | “Recent activity” feed (push new, trim old) |
| Set | Unique elements | Track unique visitors, deduplicate events |
| Sorted Set | Ranked elements with scores | Leaderboards, priority queues |
The sorted set is the one that surprises people. Need a leaderboard? `ZADD leaderboard 1500 "player1"`. Need the top 10? `ZREVRANGE leaderboard 0 9`. It’s sorted automatically by score. Building this in a relational database involves ORDER BY queries that get slower as the table grows. In Redis, it’s O(log N) regardless of size.
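To show the shape of that API, here is a toy in-memory model of the two sorted-set commands used above. It mimics the semantics, not the performance — a real sorted set is a skip list with O(log N) inserts, while this sketch sorts on every read. With an actual client such as ioredis, these would be `zadd()` and `zrevrange()` calls.

```typescript
// Toy in-memory model of Redis sorted-set semantics (assumption: this
// exists only to illustrate ZADD / ZREVRANGE; it is not how Redis works
// internally).
const scores = new Map<string, number>();

// ZADD: upsert a member with its score.
function zadd(member: string, score: number): void {
  scores.set(member, score);
}

// ZREVRANGE start..stop (inclusive): members ordered by score, highest first.
function zrevrange(start: number, stop: number): string[] {
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(start, stop + 1)
    .map(([member]) => member);
}

// Leaderboard usage:
zadd("player1", 1500);
zadd("player2", 1800);
zadd("player3", 900);
const top10 = zrevrange(0, 9); // ["player2", "player1", "player3"]
```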
Redis is also commonly used as a session store (enables stateless servers — any server can validate any session), a rate limiter (count requests per IP with automatic expiry), and a lightweight pub/sub system for real-time broadcasting.
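The rate-limiter use case is usually built on the Redis INCR-plus-EXPIRE idiom: one counter key per IP per window, incremented on each request and expired automatically. The sketch below models that idiom with an in-memory store so it is self-contained; the function name, limit, and window size are illustrative assumptions.

```typescript
// Fixed-window rate limiter modeling the Redis INCR + EXPIRE pattern,
// with an in-memory counter store standing in for Redis (assumption).
const counters = new Map<string, { count: number; resetAt: number }>();

function allowRequest(ip: string, limit = 100, windowMs = 60_000): boolean {
  const now = Date.now();
  const entry = counters.get(ip);
  if (!entry || now >= entry.resetAt) {
    // First request in a fresh window — in Redis: INCR, then EXPIRE
    // so the key deletes itself when the window closes.
    counters.set(ip, { count: 1, resetAt: now + windowMs });
    return true;
  }
  entry.count += 1; // in Redis: INCR on the existing key
  return entry.count <= limit;
}
```

Fixed windows allow brief bursts at window boundaries; sliding-window variants trade extra bookkeeping for smoother enforcement.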
## What to Cache (And What Not To)
Not everything belongs in a cache. Here’s the filter I use:
Cache it if:
- It’s read frequently and changes rarely (user profiles, config, reference data)
- It’s expensive to compute or fetch (aggregated reports, external API responses)
- Staleness is acceptable for some window (5 seconds? 5 minutes?)
Don’t cache it if:
- Every request is unique (personalized search results with no overlap)
- The data changes faster than your TTL (real-time stock prices with a 1-minute TTL are useless)
- It’s security-sensitive (cache adds another place credentials can leak)
- It’s large binary data (use a CDN for images and files, not Redis)
## Redis vs Memcached (The Short Version)
Memcached is a pure cache — strings only, no persistence, no data structures. It’s slightly faster at raw get/set operations (0.09ms vs 0.12ms at p50). Redis does everything Memcached does plus hashes, lists, sets, sorted sets, pub/sub, persistence, clustering, and Lua scripting.
Use Memcached if you need pure key-value caching with the absolute lowest latency and nothing else. Use Redis for everything else. For most teams, Redis is the right default.
One thing worth knowing: Redis changed its license in 2024, which prompted the Linux Foundation to fork it as Valkey — fully open source and API-compatible. If licensing matters to your organization, Valkey is the drop-in replacement.
## The Connection to Scaling
In my scaling post, I said “always optimize before you scale.” Caching is the biggest piece of that optimization. Before you add more pods, before you scale your database vertically, before you add read replicas — check if you’re hitting the database for data you already have.
Add an index for slow queries (free). Add Redis for repeated queries (cheap). Then scale infrastructure if you still need to. The order matters, and caching is almost always step one.
