Database Design: Fixing the "Flat Table" Antipattern

Published on 2026-03-10 11:21 by Frugle Me (Last updated: 2026-03-10 11:22)

#flat table #antipattern


A common mistake in early-stage development is creating "flat" tables that store everything in one place. While such a table is simple to query at first, a flat orders schema contains several structural flaws that lead to data corruption and performance bottlenecks.

1. What’s Wrong With This Schema?

The flat orders table mixes customer, product, and order attributes in a single relation. Non-key columns such as the customer's city depend on the customer, not on the order itself, which violates Third Normal Form (3NF) and introduces significant data redundancy.
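For concreteness, here is a minimal sketch of the kind of flat table being criticized (the column names are illustrative, not taken from a real system):

```sql
-- Hypothetical flat table: every order row repeats
-- customer and product details verbatim.
CREATE TABLE orders (
    order_id       INT PRIMARY KEY AUTO_INCREMENT,
    customer_name  VARCHAR(100),
    customer_email VARCHAR(100),   -- repeated on every order
    customer_city  VARCHAR(100),   -- repeated on every order
    product_name   VARCHAR(100),   -- repeated on every order
    product_price  DECIMAL(10, 2),
    order_date     TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```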

The Core Issues:

  • Update Anomalies: If a customer changes their email, you must update every order row that references them. Miss one, and your data becomes inconsistent.
  • Insertion Anomalies: You cannot record a new customer's details (name/city) until they place an order.
  • Deletion Anomalies: If you delete an order, you might accidentally delete the only record of a product's price or a customer's existence.
  • Data Redundancy: Storing "San Francisco" or "Smartphone" as strings thousands of times wastes disk space and slows down index scans.
  • Inflexible Pricing: The single price column serves double duty as both the catalog price and the sale price. Without a separate catalog, there is no canonical place to store a product's current price, and updating it risks silently rewriting the history of what past orders actually cost.
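The update anomaly above is visible directly in SQL. Assuming the flat layout repeats a customer_email column on every order row (an illustrative column name, as in the sketch earlier), one logical change becomes a multi-row rewrite:

```sql
-- Flat table: one logical fact, many physical rows.
-- Any row this UPDATE misses is silently left stale.
UPDATE orders
SET customer_email = 'new@example.com'
WHERE customer_email = 'old@example.com';

-- Normalized design: the same fact lives in exactly one row.
UPDATE customers
SET email = 'new@example.com'
WHERE customer_id = 42;
```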

2. The Solution: Normalization (3NF)

To fix this, we split the flat table into three distinct entities: Customers, Products, and Orders. This follows the Third Normal Form (3NF), ensuring every non-key attribute is dependent only on the primary key.

Step 1: The Customers Table

Stores identity and contact information once.

CREATE TABLE customers (
    customer_id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    city VARCHAR(100)
);

Step 2: The Products Table

Stores product details and the current catalog price once.

CREATE TABLE products (
    product_id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    current_price DECIMAL(10, 2) NOT NULL
);

Step 3: The Orders Table

Links customers to products and snapshots the price at the time of sale.

CREATE TABLE orders (
    order_id INT PRIMARY KEY AUTO_INCREMENT,
    customer_id INT NOT NULL,
    product_id INT NOT NULL,
    price_at_purchase DECIMAL(10, 2) NOT NULL, -- Snapshot of price at time of sale
    order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id),
    FOREIGN KEY (product_id) REFERENCES products(product_id)
);
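With the three tables in place, the original flat shape is just a join away, so nothing is lost for reporting:

```sql
-- Reassemble the "flat" view on demand.
SELECT o.order_id,
       c.name  AS customer_name,
       c.city,
       p.name  AS product_name,
       o.price_at_purchase,
       o.order_date
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
JOIN products  p ON p.product_id  = o.product_id;
```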

Part 2: Achieving Sub-100ms Latency

Once normalized, use these strategies to ensure queries return in milliseconds.

1. Indexing Strategies

  • Covering Indexes: Include all columns the query requests within the index itself. This allows the DB to skip the "heap" lookup entirely.
  • Composite Index Ordering: Order columns by how they are used: equality predicates first, then sort columns, then range predicates.
  • Partial Indexes: Index only "active" data (e.g., WHERE status = 'pending') to keep index trees shallow and memory-resident.
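The three strategies above might look like this against the orders table (a sketch: the partial index uses PostgreSQL syntax, since MySQL does not support partial indexes, and it assumes a status column exists):

```sql
-- Covering index: the commented query below can be answered
-- from the index alone, with no heap lookup.
CREATE INDEX idx_orders_customer_date
    ON orders (customer_id, order_date, price_at_purchase);

-- Composite ordering: equality column (customer_id) first,
-- then the sort column (order_date).
-- SELECT price_at_purchase FROM orders
--  WHERE customer_id = 42 ORDER BY order_date;

-- Partial index (PostgreSQL only), assuming a status column:
CREATE INDEX idx_orders_pending
    ON orders (order_date)
    WHERE status = 'pending';
```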
2. Query Optimization

  • Avoid SELECT *: Retrieve only the specific IDs or values needed to minimize data transfer and memory usage.
  • Seek-Method Pagination: Use WHERE id > last_seen_id instead of OFFSET. Offset-based pagination slows down linearly as the page number increases.
  • Prepared Statements: Pre-compile queries to save parsing and planning time (often 5-10 ms) per request.
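Seek-method pagination in SQL; here last_seen_id is the highest order_id from the previous page, supplied by the application:

```sql
-- OFFSET pagination: the database still scans and discards
-- the first 100000 rows before returning anything.
SELECT order_id, order_date
FROM orders
ORDER BY order_id
LIMIT 20 OFFSET 100000;

-- Seek method: jumps straight to the right position via the
-- primary-key index, regardless of page depth.
SELECT order_id, order_date
FROM orders
WHERE order_id > 100020   -- last_seen_id from the previous page
ORDER BY order_id
LIMIT 20;
```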
3. Architecture for Speed

  • Connection Pooling: Keep a pool of warm connections to avoid the 30ms+ overhead of TCP/TLS handshakes.
  • Caching Layer: Use an in-memory store like Redis for frequently accessed rows (e.g., the product catalog) to achieve <5ms response times.
  • Read Replicas: Scale read traffic by distributing queries across multiple secondary nodes, keeping the primary node free for writes.
