Designing an E-Commerce Platform at Scale: Catalog, Cart, Inventory & Order Management

E-commerce platforms are among the most complex distributed systems — they must handle millions of concurrent users, prevent overselling during flash sales, process payments atomically, and coordinate fulfillment across global warehouse networks. This guide covers every major subsystem from product catalog to order delivery tracking.

Md Sanwar Hossain · April 6, 2026 · 22 min read · System Design

TL;DR — Core Architecture Decisions

"Product catalog: Elasticsearch for search, PostgreSQL for truth. Cart: Redis (24h TTL) with persistent backup in Cassandra. Inventory: Redis atomic DECR prevents oversell; soft reservations expire after 15 minutes. Checkout: Saga pattern coordinates inventory → payment → fulfillment with compensation on failure. Orders: PostgreSQL + Kafka event bus."

Requirements & Scale Estimation
Microservices Architecture Overview
Product Catalog — Search, Filtering & CDN
Shopping Cart — Redis & Persistence Strategy
Inventory Management — Oversell Prevention
Flash Sale — Handling 100K TPS Spikes
Checkout Saga — Distributed Transaction
Order Management & Fulfillment
Pricing Engine & Promotions
Scaling Strategy & Data Partitioning
Design Checklist & Conclusion

1. Requirements & Scale Estimation

Functional Requirements

Users browse, search, and filter products from a catalog of 500M+ SKUs
Users add items to cart, apply coupons, and check out
Inventory is reserved atomically during checkout to prevent overselling
Orders are processed, tracked, and fulfilled across global warehouses
Flash sales produce extreme traffic spikes (100K users simultaneously competing for 1K items)
Sellers can list products, manage inventory, and view sales analytics

Scale Estimates (Amazon-like)

Metric	Value	Notes
DAU	100M	Large, Amazon-scale platform
Product catalog	500M SKUs	~1TB product data
Orders/day	5M	~58 orders/sec average
Peak checkout TPS	10,000	Black Friday peak
Flash sale peak TPS	100,000+	Requires queue/rate limiting
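
The figures in the table can be sanity-checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope helper; the class and method names are made up for illustration, and the inputs are the table's own numbers (1TB is taken as 10^12 bytes).

```java
// Back-of-the-envelope check of the scale table; names are illustrative only.
final class ScaleEstimate {
    static final long SECONDS_PER_DAY = 24L * 60 * 60; // 86,400

    /** Average events per second for a daily total. */
    static double averagePerSecond(long perDay) {
        return (double) perDay / SECONDS_PER_DAY;
    }

    /** How many times larger a peak rate is than the daily average rate. */
    static double peakToAverage(double peakPerSecond, long perDay) {
        return peakPerSecond / averagePerSecond(perDay);
    }

    /** Average storage per item, in bytes. */
    static long bytesPerItem(long totalBytes, long itemCount) {
        return totalBytes / itemCount;
    }

    public static void main(String[] args) {
        System.out.println("avg orders/sec: " + averagePerSecond(5_000_000));
        System.out.println("peak/avg ratio: " + peakToAverage(10_000, 5_000_000));
        System.out.println("bytes per SKU:  " + bytesPerItem(1_000_000_000_000L, 500_000_000L));
    }
}
```

5M orders/day works out to about 58 orders/sec, so the 10,000 TPS Black Friday peak is roughly 173x the average, and 1TB over 500M SKUs is about 2KB per SKU. The peak-to-average ratio, not the average, is what sizes the checkout path.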

2. Microservices Architecture Overview

Figure: E-Commerce Platform, full microservices architecture (catalog, cart, inventory, order, payment, and fulfillment services) with Kafka event bus and downstream consumers. Source: mdsanwarhossain.me

Each service owns its domain and database. Services communicate synchronously (gRPC/REST) only when an immediate response is needed (inventory check, price lookup). All post-checkout operations (fulfillment, notifications, analytics) are driven by Kafka events for resilience.

3. Product Catalog — Search, Filtering & CDN

Dual Storage: PostgreSQL + Elasticsearch

Product data lives in two stores simultaneously:

PostgreSQL (source of truth): Product master data — name, description, price, seller, categories, SKUs. ACID transactions for write correctness. Sharded by seller_id (multi-tenant partitioning).
Elasticsearch (search index): Denormalized copy optimized for full-text search, faceted filtering, and ranked results. Updated via Debezium CDC from PostgreSQL → Kafka → Elasticsearch consumer.

// Elasticsearch product document
{
  "product_id": "B08N5WRWNW",
  "title": "Apple iPhone 15 Pro 256GB",
  "brand": "Apple",
  "categories": ["Electronics", "Phones", "Smartphones"],
  "price": 999.99,
  "rating": 4.8,
  "review_count": 12849,
  "in_stock": true,
  "variants": [
    {"color": "Natural Titanium", "storage": "256GB", "sku_id": "SKU-001"}
  ],
  "features": ["5G", "USB-C", "Action Button"],
  "image_urls": ["https://cdn.store.com/B08N5WRWNW/main.webp"],
  "seller_id": "seller_apple_official"
}

// Search query with facets
GET /products/_search
{
  "query": { "multi_match": { "query": "iphone 15", "fields": ["title^3","brand^2","features"] }},
  "aggs": { "brands": { "terms": { "field": "brand" } },
             "price_ranges": { "range": { "field": "price", "ranges": [...] } } },
  "sort": [{ "_score": "desc" }, { "rating": "desc" }]
}
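
The consumer at the end of the PostgreSQL → Kafka → Elasticsearch path has to turn change events into index writes. Here is a minimal sketch of that logic, assuming a simplified event shape: the `op` codes follow Debezium's c/u/d/r convention, but `ChangeEvent`, `CatalogIndexer`, and the in-memory map standing in for the search index are hypothetical, and no real Kafka or Elasticsearch client is involved.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for the Elasticsearch consumer; illustrative names throughout.
final class CatalogIndexer {
    /** Simplified change event: op is "c" (create), "u" (update), "d" (delete) or "r" (snapshot read). */
    record ChangeEvent(String op, String productId, Map<String, Object> after) {}

    private final Map<String, Map<String, Object>> index = new HashMap<>();

    /** Applies one event as a full-document upsert or delete, so replaying an event is harmless. */
    void apply(ChangeEvent e) {
        switch (e.op()) {
            case "c", "u", "r" -> index.put(e.productId(), Map.copyOf(e.after()));
            case "d" -> index.remove(e.productId());
            default -> throw new IllegalArgumentException("unknown op: " + e.op());
        }
    }

    Map<String, Object> get(String productId) {
        return index.get(productId);
    }
}
```

Writing the whole document on every create or update keeps the consumer idempotent, which matters because Kafka consumers typically see at-least-once delivery.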

Product Page Caching

Static product metadata (title, images, description): CDN edge cache with 1-hour TTL — invalidated on price/availability change
Dynamic data (live price, stock count, reviews): Short-lived Redis cache (30s TTL) or served directly from DB
Images: S3 + CloudFront with 30-day cache; served as WebP with multiple resolutions (400px, 800px, 1600px)
Product recommendations: Pre-computed ML recommendations cached in Redis per user (4h TTL); generated nightly batch
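
The short-TTL tier for dynamic data is a classic cache-aside read with explicit invalidation. The sketch below shows the idea with an in-memory map in place of Redis; `TtlCache` and its injected clock are assumptions made so the example runs on its own, and the class is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Cache-aside with a fixed TTL; an in-memory stand-in for the 30s Redis cache.
final class TtlCache<K, V> {
    private record Entry<T>(T value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so expiry can be exercised without sleeping

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns a fresh cached value, otherwise loads from the source and caches the result. */
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry != null && now < entry.expiresAtMillis()) {
            return entry.value();
        }
        V loaded = loader.apply(key);
        entries.put(key, new Entry<>(loaded, now + ttlMillis));
        return loaded;
    }

    /** Explicit invalidation, for example when a price or availability change event arrives. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

With a 30s TTL a stock count can be up to 30 seconds stale on the product page, which is why checkout reads inventory from the authoritative store instead.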

4. Shopping Cart — Redis & Persistence Strategy

The shopping cart is read extremely frequently (on every page view in a shopping session) but has low durability requirements: a lost cart is annoying but not catastrophic. This makes Redis a good fit as the primary store.

// Cart stored as Redis Hash
// Key: cart:{user_id}
// Field: {sku_id}, Value: {quantity, added_at, price_at_add}

HSET cart:user_abc123
  SKU-001 '{"qty":2,"price":999.99,"added_at":1712300000}'
  SKU-002 '{"qty":1,"price":49.99,"added_at":1712300100}'

EXPIRE cart:user_abc123 86400  // 24h TTL; reset on every cart modification

// Read full cart
HGETALL cart:user_abc123

// Add or update quantity
HSET cart:user_abc123 SKU-001 '{"qty":3,...}'

// Remove item
HDEL cart:user_abc123 SKU-001

Cart Persistence for Logged-In Users

On add-to-cart: write to Redis immediately (sync) + publish to Kafka (async)
Kafka consumer persists cart to Cassandra (eventually consistent backup)
On login: merge anonymous guest cart with persisted logged-in cart (union, keeping higher quantity)
Price staleness: prices stored in cart at time of add; on checkout, validate current prices and alert user if changed > 5%
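
The login merge and the price-staleness check are both small pure functions. A minimal sketch, assuming carts are represented as SKU-to-quantity maps and prices are held in cents (the `CartMerge` class and its method names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

final class CartMerge {
    /** Union of both carts; when a SKU appears in both, the higher quantity wins. */
    static Map<String, Integer> merge(Map<String, Integer> guest, Map<String, Integer> saved) {
        Map<String, Integer> merged = new HashMap<>(saved);
        guest.forEach((sku, qty) -> merged.merge(sku, qty, Integer::max));
        return merged;
    }

    /** True when the current price differs from the price at add time by more than 5%. */
    static boolean priceChangedBeyondThreshold(long priceAtAddCents, long currentCents) {
        long diff = Math.abs(currentCents - priceAtAddCents);
        return diff * 100 > priceAtAddCents * 5; // integer math avoids floating-point rounding
    }
}
```

Keeping the higher quantity errs on the side of not silently dropping items; the user can still lower the quantity before checkout.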

5. Inventory Management — Oversell Prevention

Inventory is the most consistency-critical part of e-commerce. Selling more items than you have in stock leads to order cancellations, customer trust loss, and potential legal liability.

Figure: Inventory Reservation State Machine and Atomic Check-and-Reserve (Redis Lua script), which prevents overselling at 100K TPS. Source: mdsanwarhossain.me

Two-Phase Reservation

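Soft reservation (add to cart): DECR Redis counter; if result ≥ 0, item is soft-reserved for 15 minutes. If the result is negative, INCR the counter back and report the item as out of stock. When the 15-minute hold expires, the units are released back to the counter.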
Hard reservation (checkout initiated): Move from soft to hard reservation in DB transaction. Hard reservation holds inventory while payment processes (typically < 30 seconds).
Commit (payment succeeded): Deduct inventory permanently in PostgreSQL. Publish inventory.deducted event.
Rollback (payment failed / timeout): INCR Redis counter + update DB reservation status. Inventory made available again.
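
The four phases above form a small state machine on top of an atomic counter. Here is a sketch of both pieces, with an `AtomicLong` standing in for the Redis counter; `ReservationFlow`, its `State` enum, and the exact transition set are assumptions drawn from the list above, not a prescribed API.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ReservationFlow {
    enum State { SOFT, HARD, COMMITTED, RELEASED }

    private final AtomicLong available; // stands in for the Redis stock counter

    ReservationFlow(long stock) {
        this.available = new AtomicLong(stock);
    }

    /** Soft reservation: decrement first, then undo if the counter went negative. */
    boolean softReserve(long qty) {
        if (available.addAndGet(-qty) >= 0) {
            return true;
        }
        available.addAndGet(qty); // compensate so the counter does not stay negative
        return false;
    }

    /** Rollback or hold expiry returns the units to the pool. */
    void release(long qty) {
        available.addAndGet(qty);
    }

    /** Legal transitions: SOFT -> HARD or RELEASED, HARD -> COMMITTED or RELEASED. */
    static boolean canTransition(State from, State to) {
        return switch (from) {
            case SOFT -> to == State.HARD || to == State.RELEASED;
            case HARD -> to == State.COMMITTED || to == State.RELEASED;
            case COMMITTED, RELEASED -> false; // terminal states
        };
    }

    long available() {
        return available.get();
    }
}
```

Decrement-then-check never oversells because only callers that observe a non-negative result proceed; the brief negative value is visible only to other callers that are about to be rejected anyway.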

-- Inventory table (one row per SKU per warehouse)
CREATE TABLE inventory (
    sku_id         UUID NOT NULL,
    warehouse_id   UUID NOT NULL,
    total_qty      INT NOT NULL,
    reserved_qty   INT NOT NULL DEFAULT 0,     -- soft + hard reservations
    available_qty  INT GENERATED ALWAYS AS (total_qty - reserved_qty) STORED,
    version        BIGINT NOT NULL DEFAULT 0,  -- optimistic locking
    PRIMARY KEY (sku_id, warehouse_id)         -- stock is tracked per warehouse
);

-- Atomic soft reservation with optimistic locking
UPDATE inventory
SET reserved_qty = reserved_qty + :qty, version = version + 1
WHERE sku_id = :sku_id
  AND warehouse_id = :warehouse_id
  AND available_qty >= :qty                    -- check atomically
  AND version = :expected_version;

-- If 0 rows affected: either out of stock or concurrent update (retry)
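
On the application side, the "0 rows affected" case turns into a read-check-write retry loop. The sketch below mimics the conditional UPDATE with an in-memory compare-and-set so it can run without a database; `OptimisticInventory` and `Row` are hypothetical names, and a real implementation would issue the SQL above instead.

```java
import java.util.concurrent.atomic.AtomicReference;

final class OptimisticInventory {
    /** Immutable snapshot of one inventory row. */
    record Row(int totalQty, int reservedQty, long version) {
        int availableQty() {
            return totalQty - reservedQty;
        }
    }

    private final AtomicReference<Row> row;

    OptimisticInventory(int totalQty) {
        this.row = new AtomicReference<>(new Row(totalQty, 0, 0));
    }

    /** Succeeds only if the snapshot that was read is still current when the write lands. */
    boolean reserve(int qty, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Row seen = row.get();                         // read the row and its version
            if (seen.availableQty() < qty) {
                return false;                             // out of stock: retrying cannot help
            }
            Row next = new Row(seen.totalQty(), seen.reservedQty() + qty, seen.version() + 1);
            if (row.compareAndSet(seen, next)) {
                return true;                              // equivalent of "1 row affected"
            }
            // equivalent of "0 rows affected": a concurrent writer won, so re-read and retry
        }
        return false;
    }

    Row snapshot() {
        return row.get();
    }
}
```

Distinguishing "out of stock" from "lost the race" matters: the first should fail fast, while only the second is worth retrying, ideally with a bounded retry count as here.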

6. Flash Sale — Handling 100K TPS Spikes

Flash sales are designed to create urgency — 1000 iPhones at 50% off for 10 minutes. The resulting traffic spike (100K users fighting for 1K units) requires a completely different architecture than normal e-commerce traffic.

Flash Sale Architecture

Pre-load stock to Redis: 10 minutes before sale, set flash_inventory:sale_id = 1000 in Redis
Request queue: Incoming checkout requests enqueued in Redis List (LPUSH). Queue length = 3× stock (3000). Reject requests beyond queue capacity with "sale sold out" immediately.
Virtual waiting room: Show users their queue position and estimated wait time — reduces frustration and retry storms
Worker drains queue: Workers pop from queue (RPOP), call DECR flash_inventory:sale_id, process if result ≥ 0
Async persistence: Successful reservations published to Kafka → DB consumer persists to PostgreSQL
Early termination: Stop queue processing when counter hits 0; reject all remaining requests
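
The queue-and-drain flow above can be simulated in a few lines to see how much traffic is shed at each stage. This is a single-threaded sketch with an `ArrayDeque` standing in for the Redis list and a plain counter for the stock key; `FlashSaleSim` is an illustrative name, not part of any library.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class FlashSaleSim {
    private final Deque<String> queue = new ArrayDeque<>();
    private final int maxQueueSize;
    private long stock;

    FlashSaleSim(long stock) {
        this.stock = stock;
        this.maxQueueSize = (int) (3 * stock); // queue length = 3x stock, as described above
    }

    /** Admission control: reject immediately once the queue is full. */
    boolean enqueue(String userId) {
        if (queue.size() >= maxQueueSize) {
            return false;
        }
        queue.addFirst(userId); // plays the role of LPUSH
        return true;
    }

    /** Worker loop: pop the oldest request (RPOP) until the stock counter reaches zero. */
    List<String> drain() {
        List<String> winners = new ArrayList<>();
        while (stock > 0 && !queue.isEmpty()) {
            winners.add(queue.pollLast());
            stock--; // plays the role of DECR
        }
        queue.clear(); // early termination: anyone still waiting is told the sale is sold out
        return winners;
    }
}
```

With 1,000 units and 100K arrivals, 97,000 requests are rejected at the door, 3,000 are queued, and the first 1,000 queued users win. Only the 3,000 admitted requests ever touch the inventory counter.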

// Flash sale checkout handler
// `redis` and `kafka` are application-level client wrappers; setIfAbsent(key, value, ttlSeconds)
// is assumed to issue a single SET key value NX EX ttl command.
public FlashSaleResult attemptPurchase(String saleId, String userId, int qty) {
    // 1. Check queue capacity (reject early once the queue is full)
    Long queueLen = redis.llen("flash_queue:" + saleId);
    if (queueLen != null && queueLen >= maxQueueSize) {
        return FlashSaleResult.SOLD_OUT;
    }

    // 2. Rate limit per user (max 1 attempt per 5 seconds)
    //    The key and its TTL are set in one atomic command; a separate SETNX + EXPIRE
    //    can leave a key with no TTL if the process dies between the two calls.
    String rateLimitKey = "flash_limit:" + saleId + ":" + userId;
    Boolean allowed = redis.setIfAbsent(rateLimitKey, "1", 5);
    if (!Boolean.TRUE.equals(allowed)) return FlashSaleResult.RATE_LIMITED; // null-safe

    // 3. Atomic decrement inventory
    Long remaining = redis.decrBy("flash_inventory:" + saleId, qty);
    if (remaining != null && remaining < 0) {
        redis.incrBy("flash_inventory:" + saleId, qty); // refund counter
        return FlashSaleResult.SOLD_OUT;
    }

    // 4. Publish async order creation
    kafka.publish("flash.orders", new FlashOrderEvent(saleId, userId, qty));
    return FlashSaleResult.SUCCESS;
}

7. Checkout Saga — Distributed Transaction

Checkout spans multiple services: inventory reservation, payment, and order creation. There's no global transaction across these services — we use the Saga pattern with compensating transactions for failure recovery.

Checkout Saga Steps

Step	Action	Compensating Action
1	Validate cart & prices	—
2	Hard-reserve inventory	Release reservation
3	Apply coupon / calculate final price	Un-apply coupon usage
4	Charge payment (PSP)	Refund charge
5	Create order record	Cancel order (status: cancelled)
6	Confirm inventory deduction, clear cart	—

Orchestration vs. Choreography: For checkout, use orchestration (a dedicated Checkout Orchestrator service manages the saga state machine) rather than choreography — it's easier to reason about failure modes, implement timeouts, and perform compensation in a single place.
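
The core of an orchestrator is a loop that runs steps in order and, on failure, runs the compensations of the completed steps in reverse. A minimal in-memory sketch, assuming each step is a pair of `Runnable`s; `CheckoutSaga` and `Step` are illustrative names, and a production orchestrator would also persist saga state and handle timeouts:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

final class CheckoutSaga {
    /** One saga step: a forward action plus the compensation that undoes it. */
    record Step(String name, Runnable action, Runnable compensation) {}

    /** Runs steps in order; on failure, compensates the completed steps in reverse order. */
    static boolean run(List<Step> steps, List<String> log) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.action().run();
                log.add("done:" + step.name());
                completed.push(step);
            } catch (RuntimeException failure) {
                log.add("failed:" + step.name());
                while (!completed.isEmpty()) {
                    Step undo = completed.pop(); // most recently completed step first
                    undo.compensation().run();
                    log.add("compensated:" + undo.name());
                }
                return false;
            }
        }
        return true;
    }
}
```

If the payment step throws after inventory and coupon steps succeeded, the log reads: coupon un-applied first, then the reservation released. The failed step itself is not compensated because it never completed.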

8. Order Management & Fulfillment

Order State Machine

Orders follow a strict state machine: placed → confirmed → packed → shipped → delivered → completed. Each transition is triggered by an event (from fulfillment, carrier, or user) and stored as an immutable audit log entry. The current state is the latest entry.

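-- PostgreSQL has no inline ENUM(...) column type; define the enum type first
CREATE TYPE order_status AS ENUM
    ('placed','confirmed','packed','shipped','delivered','completed','cancelled');

CREATE TABLE orders (
    order_id         UUID PRIMARY KEY DEFAULT uuidv7(),  -- PostgreSQL 18+; use gen_random_uuid() on older versions
    user_id          UUID NOT NULL,
    status           order_status NOT NULL DEFAULT 'placed',
    total_amount     BIGINT NOT NULL,  -- cents
    currency         CHAR(3) NOT NULL,
    shipping_address JSONB,
    created_at       TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at       TIMESTAMPTZ NOT NULL DEFAULT NOW()
);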

CREATE TABLE order_events (  -- immutable event log
    event_id    UUID PRIMARY KEY,
    order_id    UUID NOT NULL REFERENCES orders(order_id),
    event_type  TEXT NOT NULL,  -- 'status_changed', 'tracking_updated', etc.
    payload     JSONB,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
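
In application code, the state machine can be enforced with a transition table in front of the append-only log. The sketch below follows the forward path given in the text; the edges into `CANCELLED` are an assumption of this example (the source lists the status but not when it is reachable), and `OrderStateMachine` is an illustrative name.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class OrderStateMachine {
    enum Status { PLACED, CONFIRMED, PACKED, SHIPPED, DELIVERED, COMPLETED, CANCELLED }

    private static final Map<Status, Set<Status>> ALLOWED = new EnumMap<>(Status.class);
    static {
        // Forward path from the text; cancellation before shipping is assumed for illustration.
        ALLOWED.put(Status.PLACED, EnumSet.of(Status.CONFIRMED, Status.CANCELLED));
        ALLOWED.put(Status.CONFIRMED, EnumSet.of(Status.PACKED, Status.CANCELLED));
        ALLOWED.put(Status.PACKED, EnumSet.of(Status.SHIPPED, Status.CANCELLED));
        ALLOWED.put(Status.SHIPPED, EnumSet.of(Status.DELIVERED));
        ALLOWED.put(Status.DELIVERED, EnumSet.of(Status.COMPLETED));
        ALLOWED.put(Status.COMPLETED, EnumSet.noneOf(Status.class));
        ALLOWED.put(Status.CANCELLED, EnumSet.noneOf(Status.class));
    }

    // Append-only history; the current state is always the latest entry.
    private final List<Status> history = new ArrayList<>(List.of(Status.PLACED));

    Status current() {
        return history.get(history.size() - 1);
    }

    /** Appends a new state if the transition is legal, otherwise rejects it. */
    void transition(Status to) {
        if (!ALLOWED.get(current()).contains(to)) {
            throw new IllegalStateException(current() + " -> " + to + " is not allowed");
        }
        history.add(to);
    }

    List<Status> history() {
        return List.copyOf(history);
    }
}
```

Rejecting illegal transitions at write time protects against out-of-order carrier webhooks, for example a "delivered" event arriving before "shipped".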

Warehouse Routing & Fulfillment

Warehouse selection: Choose warehouse closest to buyer with sufficient stock. Use geo-proximity scoring + stock availability query.
Split shipments: If no single warehouse has all items, split order across multiple warehouses (each becomes a separate shipment)
Carrier selection: Choose carrier based on SLA, cost, and historical on-time delivery rate for the destination zone
Tracking updates: Carrier webhooks → Kafka → Order Service → notify user via email/SMS/app
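
Warehouse selection can be sketched as "filter by stock, then take the nearest". The example below uses the haversine great-circle distance as the proximity score; `WarehouseRouter`, the `Warehouse` record, and the choice of haversine are assumptions for illustration, since the text only calls for geo-proximity scoring plus a stock check.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class WarehouseRouter {
    record Warehouse(String id, double lat, double lon, int stock) {}

    /** Great-circle distance in kilometres (haversine formula, mean Earth radius 6371 km). */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * 6371.0 * Math.asin(Math.sqrt(a));
    }

    /** Nearest warehouse that can fill the whole quantity; empty means a split shipment is needed. */
    static Optional<Warehouse> select(List<Warehouse> all, double lat, double lon, int qty) {
        return all.stream()
                  .filter(w -> w.stock() >= qty)
                  .min(Comparator.comparingDouble(
                          (Warehouse w) -> distanceKm(lat, lon, w.lat(), w.lon())));
    }
}
```

An empty result is the signal to fall back to the split-shipment path, where the order is divided across several warehouses.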

9. Pricing Engine & Promotions

Pricing in e-commerce is complex: base price, sale price, coupon discounts, loyalty points, bundle deals, and dynamic repricing must all be applied correctly and atomically.

Coupon Design

Coupon storage: Redis SET coupon:{code} stores coupon config (discount %, expiry, usage limit, eligible product categories)
Usage deduplication: Redis SETNX coupon_used:{code}:{user_id} prevents double-use per user
Global usage limit: Redis INCR with a limit check (e.g., first 500 users only) — atomic even under high concurrency
Coupon validation at checkout: Validate in Checkout Service before applying to total — never trust client-side discount calculations
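
The per-user and global checks combine into one redemption routine. The sketch below models it in memory, with a concurrent set standing in for the SETNX keys and an `AtomicLong` for the INCR counter; `CouponRedeemer` is a hypothetical class, and the undo-on-limit step is this example's choice for keeping the two pieces of state consistent.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class CouponRedeemer {
    enum Result { APPLIED, ALREADY_USED, LIMIT_REACHED }

    private final Set<String> usedBy = ConcurrentHashMap.newKeySet(); // stands in for SETNX keys
    private final AtomicLong redemptions = new AtomicLong();           // stands in for the INCR counter
    private final long globalLimit;

    CouponRedeemer(long globalLimit) {
        this.globalLimit = globalLimit;
    }

    Result redeem(String userId) {
        if (!usedBy.add(userId)) {
            return Result.ALREADY_USED;            // like SETNX: only the first claim succeeds
        }
        if (redemptions.incrementAndGet() > globalLimit) {
            redemptions.decrementAndGet();         // undo both writes so rejected users leave no trace
            usedBy.remove(userId);
            return Result.LIMIT_REACHED;
        }
        return Result.APPLIED;
    }
}
```

The same undo logic is what the saga's "un-apply coupon usage" compensation would call if a later checkout step fails.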

Dynamic Pricing

For marketplace platforms (Amazon-style), sellers set their own prices. Dynamic repricing algorithms adjust prices based on competitor pricing and demand elasticity. Prices are updated in PostgreSQL + synced to Elasticsearch and CDN cache via invalidation events.

10. Scaling Strategy & Data Partitioning

Database Partitioning Strategy

Service	Database	Sharding Key	Rationale
Products	PostgreSQL	seller_id	All seller products collocated
Orders	PostgreSQL	user_id	User sees all their orders on one shard
Inventory	PostgreSQL + Redis	sku_id	High contention items spread across shards
Cart	Redis + Cassandra	user_id	Session affinity for fast access
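
Routing a request to its shard comes down to a deterministic function of the sharding key. A minimal sketch using hash-modulo routing, which is an assumption here (the table names the keys, not the routing function); `ShardRouter` is an illustrative name:

```java
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** Maps a sharding key (seller_id, user_id, sku_id) to a shard index in [0, shardCount). */
    int shardFor(String key) {
        // floorMod keeps the result non-negative even when hashCode() is negative
        return Math.floorMod(key.hashCode(), shardCount);
    }
}
```

Plain modulo routing remaps most keys when the shard count changes, so systems that expect to reshard often use consistent hashing or a lookup directory instead.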

Read Scaling

Product listing pages: served from Elasticsearch + CDN (99% of product traffic is reads)
Order history: read replicas for order queries; user-facing queries never hit write primary
Inventory availability (product page): Redis cache with 30s TTL; approximate count acceptable for display ("only 5 left!")
Inventory at checkout: always read from the write primary for accuracy before reserving

11. Design Checklist & Conclusion

E-Commerce System Design Checklist

☐ Product catalog: PostgreSQL (write truth) + Elasticsearch (search) + CDC sync
☐ Cart stored in Redis Hash with 24h TTL; persisted async to Cassandra via Kafka
☐ Inventory uses two-phase reservation (soft at cart, hard at checkout)
☐ Atomic inventory check-and-reserve via Lua script or DB optimistic locking
☐ Flash sale uses Redis DECR + request queue + virtual waiting room
☐ Checkout uses saga pattern with orchestrator (not choreography) for clarity
☐ Payment service is idempotent (idempotency keys for all PSP calls)
☐ Orders stored with immutable event log (every status change appended)
☐ Coupon usage is atomic (Redis SETNX per user + INCR global counter)
☐ All post-checkout work (fulfillment, notifications) driven by Kafka events

E-commerce system design covers a remarkably wide range of distributed systems challenges: from the read-heavy product catalog (Elasticsearch, CDN) to the write-contended inventory (Redis atomics, optimistic locking) to the multi-service checkout (saga pattern) to the event-driven fulfillment pipeline (Kafka). Each subsystem has its own failure modes and scaling patterns — which is exactly why e-commerce is such a rich topic for system design interviews and a rigorous test of distributed systems thinking.
