Database Indexing 101: B-Trees vs. Hash Indexes
Published on 2026-04-09 11:17 by Frugle Me (Last updated: 2026-04-09 11:19)
Indexing is the most critical tool for database performance. By transforming full-table scans into direct lookups, indexes ensure your application remains fast as data grows.
1. B-Tree Indexes (The All-Rounder)
B-Tree (Balanced Tree) is the default indexing method for almost all relational databases (PostgreSQL, MySQL, Oracle).
How it Works
A B-Tree stores data in a sorted, hierarchical structure.
- Balanced Structure: The distance from the root to any leaf node is always the same, ensuring predictable performance.
- Sorted Keys: Keys are stored in ascending order within nodes.
- Pointers: Internal nodes contain pointers to child nodes representing specific ranges of values.
Pros
- Range Queries: Highly efficient for operators like
<,>,BETWEEN, andLIKE 'prefix%'. - Sorting: Because data is naturally sorted, it helps speed up
ORDER BYandGROUP BYoperations. - Self-Balancing: Automatically adjusts as data is inserted or deleted, maintaining $O(\log n)$ performance.
Cons
- Slightly Slower Lookups: Requires traversing multiple levels from root to leaf, which is $O(\log n)$.
2. Hash Indexes (The Specialist)
Hash indexes are specialized structures designed for extreme speed in specific scenarios.
How it Works
A Hash index uses a Hash Function to map a key directly to a specific "bucket" in a hash table.
- Direct Access: It doesn't traverse a tree; it calculates an address and jumps straight to it.
- Key-to-Bucket: Each entry is stored based on the result of the hash function.
Pros
- Ultra-Fast Equality: Offers $O(1)$ constant-time lookups for exact matches (e.g.,
WHERE id = 500). - Compact: Often consumes less space than B-Trees because it doesn't store the hierarchical pointers.
Cons
- No Range Queries: Since the hash of "1" is totally different from the hash of "2", you cannot search for a range of values.
- No Sorting: Hash tables are inherently unordered.
- Collisions: If two different keys produce the same hash, performance can degrade.
3. Comparison Summary
| Feature | B-Tree Index | Hash Index |
|---|---|---|
| Best For | General purpose, Range queries | Exact match lookups |
| Search Speed | $O(\log n)$ | $O(1)$ (Average) |
| Operators | =, >, <, BETWEEN, LIKE |
Only = and IN |
| Storage Order | Sorted | Unordered |
| Default | Yes (in most RDBMS) | No |
4. When to Use What?
Use B-Tree when:
- You need to perform range searches (e.g., finding all users between age 20 and 30).
- You want to retrieve data in a specific sorted order.
- You are looking for a "safe" default that handles almost any query pattern.
- You use partial matches like
LIKE 'Smith%'.
Use Hash when:
- You are 100% certain you will only ever use equality (
=) checks on that column. - You have a massive table where a tiny speed gain in lookups is critical (e.g., looking up a Session ID or a unique UUID).
- The table has very high cardinality (unique values), which minimizes hash collisions.
Rule of Thumb: Stick with B-Tree unless you have a specific, proven performance bottleneck that only a Hash index can solve.
Comments (0)
Want to join the conversation?
Please log in to add a comment.