You Already Know More Than You Think#
Every time you write const users = [], you’ve chosen a data structure. Every time you write const cache = {}, you’ve chosen another one. Every time you .push() and .pop(), you’re using a stack. You don’t think about it because your language hides the implementation. But the choice matters — the wrong data structure turns an O(1) operation into an O(n) one, and at scale, that’s the difference between “fast” and “the page times out.”
This isn’t a computer science lecture. It’s a practical guide to the data structures that show up in real code, when to use them, and when you’re accidentally using the wrong one.
The Big Three (You Use These Every Day)#
Arrays: The Default#
If you’re storing a list of things, you’re probably using an array. Ordered, indexed, and fast for random access.
const users = ["Alice", "Bob", "Charlie"];
users[1]; // "Bob" — O(1), direct index access
users.push("Dan"); // O(1) amortized, append to end
users.includes("Bob"); // O(n) — scans every element
Arrays are great for iteration and random access by position. They’re terrible for searching by value (includes scans the entire array) and for inserting at the beginning (unshift reindexes everything — O(n)).
The mistake I see most often: using array.find() or array.includes() in a loop. You’re doing O(n) work inside an O(n) loop — that’s O(n²). If you’re looking things up by a key, you want a hash map.
Hash Maps: The Problem Solver#
Key-value pairs with O(1) lookup. The most underrated workhorse in programming.
// O(n²) — array lookup inside a loop
const users = [
{ id: 1, name: "Alice" },
{ id: 2, name: "Bob" },
];
orders.forEach((order) => {
const user = users.find((u) => u.id === order.userId); // O(n) each time
});
// O(n) — build a map first, then look up in O(1)
const userMap = new Map(users.map((u) => [u.id, u]));
orders.forEach((order) => {
const user = userMap.get(order.userId); // O(1) each time
});
That refactor — array to hash map — is the single most common performance fix I’ve seen. It shows up everywhere: joining API responses, deduplicating lists, counting occurrences, caching computed results.
Under the hood, a hash function maps the key to an array index. Collisions (two keys hashing to the same index) are handled by chaining or probing. You don’t need to know the implementation — just know that lookup, insert, and delete are all O(1) on average.
Where you see it without realizing: JavaScript objects ({}), Python dictionaries, Ruby hashes, JSON responses, HTTP headers, environment variables, Redis.
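One of those use cases, counting occurrences, is worth seeing in full because it's so short. This is a sketch; the `countBy` name is just for illustration:

```javascript
// Count how often each value appears — one pass, O(n) total,
// because each Map get/set is O(1) on average.
function countBy(items) {
  const counts = new Map();
  for (const item of items) {
    counts.set(item, (counts.get(item) ?? 0) + 1);
  }
  return counts;
}

countBy(["a", "b", "a", "c", "a"]);
// Map { "a" → 3, "b" → 1, "c" → 1 }
```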
Stacks and Queues: Order Matters#
Both are simple but solve fundamentally different problems.
Stack (LIFO): last in, first out. Add and remove from the same end.
const stack = [];
stack.push(1); // [1]
stack.push(2); // [1, 2]
stack.push(3); // [1, 2, 3]
stack.pop(); // 3 — last in, first out
You use stacks constantly without realizing it. Every function call pushes a frame onto the call stack. Every Ctrl+Z pops the last action. Your browser’s back button is a stack. When you write recursive code, the call stack is doing stack operations for you.
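Parsing is another classic stack job. A matched-brackets check, for example, is just a stack of expected closers (a minimal sketch, not tied to any particular parser):

```javascript
// Check that (), [], {} are balanced by pushing each opener's
// expected closing bracket and popping when a closer appears.
function isBalanced(str) {
  const pairs = { "(": ")", "[": "]", "{": "}" };
  const stack = [];
  for (const ch of str) {
    if (ch in pairs) {
      stack.push(pairs[ch]); // opener: remember which closer we need
    } else if ([")", "]", "}"].includes(ch)) {
      if (stack.pop() !== ch) return false; // wrong or missing closer
    }
  }
  return stack.length === 0; // nothing left unclosed
}

isBalanced("{[()]}"); // true
isBalanced("([)]");   // false — closers in the wrong order
```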
Queue (FIFO): first in, first out. Add at the back, remove from the front.
const queue = [];
queue.push(1); // [1]
queue.push(2); // [1, 2]
queue.push(3); // [1, 2, 3]
queue.shift(); // 1 — first in, first out
Queues are everywhere in backend systems: message queues, job queues (background workers), print queues, rate-limiting buffers. Anything where “process in the order received” matters.
Heads up: Array.shift() in JavaScript is O(n) because it reindexes the remaining elements. For performance-critical queues, use a linked list or a dedicated deque implementation.
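Here's what a linked-list-backed queue looks like — a minimal sketch (class and field names are illustrative) that gets O(1) at both ends:

```javascript
// A queue backed by a singly linked list: O(1) enqueue and dequeue,
// with none of the reindexing that Array.shift() does.
class Queue {
  #head = null; // next node to dequeue
  #tail = null; // last node enqueued

  enqueue(value) {
    const node = { value, next: null };
    if (this.#tail) this.#tail.next = node;
    else this.#head = node; // queue was empty
    this.#tail = node;
  }

  dequeue() {
    if (!this.#head) return undefined;
    const { value } = this.#head;
    this.#head = this.#head.next;
    if (!this.#head) this.#tail = null; // queue is now empty
    return value;
  }
}

const q = new Queue();
q.enqueue(1);
q.enqueue(2);
q.dequeue(); // 1 — FIFO, without the O(n) shift
```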
The Next Tier (You’ll Need These Eventually)#
Linked Lists: The Pointer Chain#
Nodes connected by references. Each node points to the next.
A → B → C → D → null
Arrays store elements contiguously in memory — great for cache performance, terrible for inserting in the middle (everything shifts). Linked lists store elements anywhere in memory and connect them with pointers — great for insertion/deletion, terrible for random access (you must traverse from the head).
You rarely build linked lists directly in application code. But they power things you use constantly:
- Queue implementations (O(1) enqueue/dequeue without array shifting)
- LRU caches (doubly linked list + hash map for O(1) eviction)
- Blockchain (each block points to the previous block’s hash)
- Undo history in editors
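The LRU cache is the combination worth internalizing. In JavaScript you can lean on the fact that Map iterates keys in insertion order, so delete-and-reinsert stands in for the doubly linked list's "move to front" — a sketch, not a production cache:

```javascript
// LRU cache using Map's insertion-order iteration: re-inserting a key
// moves it to the "most recent" end; the first key is always the oldest.
class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // delete + set = mark as most recently used
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // evict the least recently used key (first in iteration order)
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest);
    }
  }
}
```

Under the hood that Map is doing the same job as the hash map + linked list combo: O(1) lookup plus O(1) knowledge of what to evict.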
Trees: Hierarchy Everywhere#
A node with children. The most natural structure for hierarchical data.
      10
     /  \
    5    15
   / \     \
  3   7    20
You interact with trees constantly:
- The DOM — every HTML document is a tree of elements
- File systems — directories contain files and subdirectories
- JSON — nested objects are trees
- Database indexes — B-trees power PostgreSQL and MySQL lookups
- React component tree — parent-child component hierarchy
Binary Search Trees (BSTs) keep data sorted: left child < parent < right child. This gives O(log n) search, insert, and delete — if the tree is balanced. Unbalanced BSTs degrade to O(n) (basically a linked list). That’s why databases use self-balancing trees (B-trees, Red-Black trees) that guarantee O(log n).
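The sorted invariant is what makes search logarithmic: every comparison discards an entire subtree. A sketch over plain `{ value, left, right }` nodes:

```javascript
// Search a BST: left subtree < node < right subtree, so each
// comparison eliminates one side — O(log n) when the tree is balanced.
function bstSearch(node, target) {
  while (node) {
    if (target === node.value) return node;
    node = target < node.value ? node.left : node.right;
  }
  return null; // not found
}

// The tree from the diagram above:
const tree = {
  value: 10,
  left: { value: 5, left: { value: 3 }, right: { value: 7 } },
  right: { value: 15, right: { value: 20 } },
};

bstSearch(tree, 7)?.value; // 7 — two comparisons, not a full scan
bstSearch(tree, 8);        // null
```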
Heaps: Fast Access to the Extreme#
A heap always keeps the smallest (min-heap) or largest (max-heap) element at the root. Getting the min/max is O(1). Inserting or removing is O(log n).
Min-Heap:
      1
     / \
    3   2
   / \
  7   5
You don’t build heaps directly very often. But you use them through priority queues:
- Task scheduling — process highest-priority job first
- Dijkstra’s algorithm — shortest path in a graph
- Kubernetes pod scheduling — which node gets the next pod
- Top-K problems — “find the 10 most active users” without sorting the entire dataset
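To make the top-K case concrete, here's a minimal array-backed min-heap and the keep-a-heap-of-size-k trick — a sketch, not a production implementation:

```javascript
// A minimal binary min-heap stored flat in an array: the parent of
// index i is at (i - 1) >> 1, its children at 2i + 1 and 2i + 2.
class MinHeap {
  #data = [];
  get size() { return this.#data.length; }

  push(value) {
    const d = this.#data;
    d.push(value);
    let i = d.length - 1;
    while (i > 0) { // sift up until the parent is no larger
      const parent = (i - 1) >> 1;
      if (d[parent] <= d[i]) break;
      [d[parent], d[i]] = [d[i], d[parent]];
      i = parent;
    }
  }

  pop() {
    const d = this.#data;
    const top = d[0];
    const last = d.pop();
    if (d.length > 0) {
      d[0] = last;
      let i = 0; // sift down until both children are no smaller
      for (;;) {
        const l = 2 * i + 1, r = 2 * i + 2;
        let smallest = i;
        if (l < d.length && d[l] < d[smallest]) smallest = l;
        if (r < d.length && d[r] < d[smallest]) smallest = r;
        if (smallest === i) break;
        [d[smallest], d[i]] = [d[i], d[smallest]];
        i = smallest;
      }
    }
    return top;
  }
}

// Top-K via a min-heap of size k: the root is the smallest of the
// current top k, so it's exactly the one to evict when a new value
// arrives. O(n log k) instead of sorting everything at O(n log n).
function topK(values, k) {
  const heap = new MinHeap();
  for (const v of values) {
    heap.push(v);
    if (heap.size > k) heap.pop();
  }
  const out = [];
  while (heap.size > 0) out.push(heap.pop());
  return out; // ascending order
}

topK([5, 1, 9, 3, 7, 8], 3); // [7, 8, 9]
```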
Graphs: Everything Is Connected#
When relationships between items matter more than the items themselves.
Social network:        Dependency graph:

A — B                  types → gateway
|   |                            ↓
C — D                  ui  →  client
Graphs model social networks, road maps, Turborepo dependency graphs , network topologies, and recommendation engines. If your data has “connections” or “relationships,” it’s a graph problem.
Two key traversal algorithms:
- BFS (breadth-first) — explore level by level. Finds shortest path in unweighted graphs.
- DFS (depth-first) — explore as deep as possible before backtracking. Good for cycle detection, topological sorting.
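BFS in code is just the queue from earlier plus a "seen" map. A sketch over a plain adjacency-list object (names are illustrative):

```javascript
// BFS over an adjacency list: visiting level by level guarantees the
// first time we reach a node is along a fewest-edges path.
function shortestPath(graph, start, goal) {
  const prev = new Map([[start, null]]); // doubles as the visited set
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift(); // fine for a sketch; a deque scales better
    if (node === goal) {
      const path = []; // walk prev links back to the start
      for (let n = goal; n !== null; n = prev.get(n)) path.unshift(n);
      return path;
    }
    for (const next of graph[node] ?? []) {
      if (!prev.has(next)) {
        prev.set(next, node);
        queue.push(next);
      }
    }
  }
  return null; // goal unreachable from start
}

// The social-network square from the diagram above:
const graph = { A: ["B", "C"], B: ["A", "D"], C: ["A", "D"], D: ["B", "C"] };
shortestPath(graph, "A", "D"); // ["A", "B", "D"]
```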
The Decision Flowchart#
Need O(1) lookup by key?               → Hash Map
Need ordered access by index?          → Array
Need LIFO (undo, recursion, parsing)?  → Stack
Need FIFO (queues, scheduling)?        → Queue
Need sorted data with fast operations? → Balanced BST / Sorted Array
Need quick min/max?                    → Heap
Need to model relationships?           → Graph
Need frequent insert/delete?           → Linked List
Need probabilistic "exists?" at scale? → Bloom Filter (next post)
The One Optimization That Matters Most#
If there’s one takeaway from this post, it’s this: when you’re searching for something repeatedly, convert your array to a hash map first.
// Before: O(n) per lookup × m lookups = O(n×m)
orders.forEach((order) => {
const product = products.find((p) => p.id === order.productId);
});
// After: O(1) per lookup × m lookups = O(m)
const productMap = new Map(products.map((p) => [p.id, p]));
orders.forEach((order) => {
const product = productMap.get(order.productId);
});
This pattern — build a lookup map, then iterate — solves more performance problems than any framework optimization, any caching layer, or any infrastructure scaling. It’s free, it’s instant, and it’s the first thing to check when your code is slow.
The next post covers a data structure that takes this optimization philosophy even further — one that’s willing to be wrong sometimes in exchange for being impossibly fast and small.
