Designing a Social Media News Feed at Scale: Fan-Out Strategies, Timeline Caching & Personalization
Building a news feed for hundreds of millions of users is one of the hardest system design problems: it demands sub-100ms read latency, support for celebrity accounts with 100M+ followers, personalized ranking, and real-time freshness. This guide covers the architecture patterns popularized by Twitter, Instagram, and Facebook.
TL;DR — Core Decision
"Use fan-out on write (push to Redis timelines) for regular users; use fan-out on read (pull from Cassandra at query time) for celebrities with >1M followers. Merge both at read time for the final feed. Rank with ML. Cache the ranked result per user with a short TTL."
Table of Contents
- Requirements & Scale
- Data Model & Social Graph
- Fan-Out on Write (Push Model)
- Fan-Out on Read (Pull Model)
- Hybrid Strategy — The Real-World Approach
- Storage Architecture & Data Tiers
- Feed Ranking & Personalization
- Media Storage & CDN Delivery
- Real-Time Updates & Push Notifications
- Scaling the Feed at Global Scale
- Design Checklist & Conclusion
1. Requirements & Scale
Functional Requirements
- Users can post text, images, and videos
- Users can follow/unfollow other users
- Home feed shows posts from followed accounts, ranked by relevance
- Feed must reflect new posts within seconds (near real-time freshness)
- Support pagination (infinite scroll) efficiently
- Users can like, comment, and share posts
Scale Estimates
| Metric | Value | Notes |
|---|---|---|
| DAU | 500M | Twitter-scale |
| Posts/day | 100M | ~1,150 posts/sec |
| Feed reads/day | 5B | ~58K reads/sec; 50:1 read:write ratio |
| Avg followers/user | 200 | Celebrities: 10M–100M |
Write amplification challenge: A user with 1M followers posting once = 1M Redis writes. A celebrity with 50M followers = 50M writes per post. This is why a single fan-out strategy doesn't work at scale.
2. Data Model & Social Graph
Core Tables
-- Posts (stored in Cassandra for write-heavy, time-series access)
CREATE TABLE posts_by_user (
user_id UUID,
created_at TIMESTAMP,
post_id UUID,
content TEXT,
media_urls LIST<TEXT>,
-- like_count omitted: Cassandra cannot mix counter and regular columns,
-- so like/comment counts live in a separate counter-only table
PRIMARY KEY (user_id, created_at, post_id)
) WITH CLUSTERING ORDER BY (created_at DESC);
-- Social graph (follows) — stored in a graph DB or Cassandra
CREATE TABLE user_follows (
follower_id UUID,
followee_id UUID,
created_at TIMESTAMP,
PRIMARY KEY (follower_id, followee_id)
);
-- Reverse index for fan-out
CREATE TABLE followers_of (
followee_id UUID,
follower_id UUID,
PRIMARY KEY (followee_id, follower_id)
);
Social Graph Storage
The social graph (who follows whom) is the backbone of the feed. At scale, store it in:
- Cassandra: For the raw follow relationship table — excellent for wide-row fan-out lookups
- Redis Sets: Cache the follower list for active users — SMEMBERS followers:{user_id}
- Neo4j / Dgraph: For graph traversal queries (mutual friends, second-degree connections, recommendations)
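The Redis-cached follower list above is a read-through cache in front of the durable store. A minimal sketch of the pattern, with an in-memory map standing in for Redis and a lookup function standing in for Cassandra (class and method names are illustrative, not from any specific codebase):

```java
import java.util.*;
import java.util.function.Function;

// Read-through follower cache. In production the cache layer would be Redis
// Sets (SMEMBERS / SADD) and the backing store the followers_of table.
class FollowerCache {
    private final Map<String, Set<String>> cache = new HashMap<>();   // stands in for Redis
    private final Function<String, Set<String>> backingStore;         // stands in for Cassandra

    FollowerCache(Function<String, Set<String>> backingStore) {
        this.backingStore = backingStore;
    }

    Set<String> getFollowers(String userId) {
        // Cache hit: serve directly (SMEMBERS followers:{user_id})
        Set<String> cached = cache.get(userId);
        if (cached != null) return cached;
        // Cache miss: read from the store and populate the cache (SADD + EXPIRE)
        Set<String> fromStore = backingStore.apply(userId);
        cache.put(userId, fromStore);
        return fromStore;
    }

    void onFollow(String followeeId, String followerId) {
        // Keep the cached set coherent on new follows instead of invalidating wholesale
        cache.computeIfAbsent(followeeId, k -> new HashSet<>()).add(followerId);
    }
}
```

Updating the cached set in place on follow/unfollow avoids a thundering herd of Cassandra reads when a popular account gains followers quickly.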
3. Fan-Out on Write (Push Model)
When a user posts, immediately push the post ID to each follower's timeline cache. The timeline is stored as a Redis Sorted Set, scored by publish timestamp.
// Fan-out worker (runs as Kafka consumer per partition)
void onPost(PostCreatedEvent event) {
String postId = event.postId;
long score = event.createdAtMs;
// Batch Redis pipeline for all followers
List<String> followers = socialGraph.getFollowers(event.userId);
try (Pipeline pipe = redis.pipelined()) {
for (String followerId : followers) {
String key = "timeline:" + followerId;
pipe.zadd(key, score, postId); // add to sorted set
pipe.zremrangeByRank(key, 0, -801); // keep only 800 latest posts
pipe.expire(key, 86400 * 7); // 7-day TTL
}
pipe.sync();
}
}
// Feed read — O(1) Redis fetch
List<String> getFeed(String userId, int page, int size) {
long offset = (long) page * size;
return redis.zrevrange("timeline:" + userId, offset, offset + size - 1);
}
Pros and Cons
✅ Advantages
- Feed reads are O(1) — just a Redis range query
- Read latency is consistently < 5ms
- No complex merge logic at read time
- Works perfectly for users with ≤ 100K followers
❌ Disadvantages
- Write amplification for celebrities (50M writes per post)
- Inactive users waste Redis memory (they never read the feed)
- Lag for ultra-popular accounts during fan-out
4. Fan-Out on Read (Pull Model)
Posts are stored once in Cassandra. At read time, fetch the most recent posts from each followee and merge them into a ranked feed. No write amplification during posting.
// Fan-out on read — merge N timelines at query time
List<Post> getFeedByMerge(String userId, int size) {
List<String> followees = socialGraph.getFollowees(userId); // users I follow
// Parallel fetch from Cassandra — async
List<CompletableFuture<List<Post>>> futures = followees.stream()
.map(fId -> CompletableFuture.supplyAsync(
() -> postRepo.getLatest(fId, size)))
.toList();
// Merge and sort by created_at
return futures.stream()
.map(CompletableFuture::join)
.flatMap(Collection::stream)
.sorted(Comparator.comparing(Post::getCreatedAt).reversed())
.limit(size)
.toList();
}
Problem: If you follow 500 accounts, this requires 500 Cassandra reads per feed request, pushing tail latency past 500ms. Completely unacceptable for interactive feeds — only works for users following few accounts, or with aggressive result caching.
5. Hybrid Strategy — The Real-World Approach
Twitter, Instagram, and Facebook all use a hybrid approach. The key insight: classify users by follower count and apply different strategies.
Hybrid Fan-Out Rules
- Regular user (< 10K followers): Fan-out on write → push post_id to all follower Redis timelines immediately
- Popular user (10K – 1M followers): Fan-out on write but async (Kafka worker) with priority queue to prevent blocking
- Celebrity (> 1M followers): Skip fan-out entirely. Store post in Cassandra only. Inject at read time.
- Inactive followers: Skip fan-out if the follower hasn't been active in 30 days — saves millions of writes
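The rules above reduce to a small routing function on follower count plus a per-follower activity check. A sketch using the thresholds from this section (enum and class names are illustrative):

```java
// Strategy selection for the hybrid fan-out rules: push for regular users,
// async push for popular users, pull-on-read for celebrities.
enum FanOutStrategy { PUSH_SYNC, PUSH_ASYNC, PULL_ON_READ }

class FanOutRouter {
    static final long POPULAR_THRESHOLD = 10_000;
    static final long CELEBRITY_THRESHOLD = 1_000_000;

    static FanOutStrategy forAuthor(long followerCount) {
        if (followerCount > CELEBRITY_THRESHOLD) return FanOutStrategy.PULL_ON_READ; // store once, inject at read time
        if (followerCount >= POPULAR_THRESHOLD) return FanOutStrategy.PUSH_ASYNC;    // Kafka worker, priority queue
        return FanOutStrategy.PUSH_SYNC;                                             // push to Redis timelines now
    }

    // Skip fan-out writes for followers idle longer than 30 days
    static boolean shouldDeliver(long followerLastActiveMs, long nowMs) {
        long thirtyDaysMs = 30L * 24 * 60 * 60 * 1000;
        return nowMs - followerLastActiveMs <= thirtyDaysMs;
    }
}
```

The fan-out worker would call forAuthor once per post and shouldDeliver once per follower while iterating the follower list.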
Read-time merge: When serving a feed, take the user's Redis timeline (regular followees) + fetch last N posts from celebrity followees in Cassandra + merge + re-rank.
// Hybrid feed assembly
List<Post> getHybridFeed(String userId, int size) {
// 1. Get cached timeline from Redis (regular followees already fan-outed)
List<String> postIdsFromCache = redis.zrevrange("timeline:" + userId, 0, size * 2);
// 2. Identify celebrity followees
List<String> celebFollowees = socialGraph.getCelebrityFollowees(userId);
// 3. Fetch latest posts from celebrities (parallel Cassandra reads)
List<Post> celebPosts = fetchCelebPosts(celebFollowees, size);
// 4. Hydrate post IDs from cache
List<Post> regularPosts = hydrate(postIdsFromCache);
// 5. Merge and rank
return rankingService.rank(merge(regularPosts, celebPosts), userId, size);
}
6. Storage Architecture & Data Tiers
Redis Timeline Cache
- Data structure: Sorted Set — ZADD timeline:{user_id} {timestamp} {post_id}
- Keep only the 800 most recent posts per user — trim with ZREMRANGEBYRANK
- TTL: 7 days — evict timelines of inactive users
- Memory estimate: 500M users × 800 posts × 16 bytes/entry ≈ 6.4TB Redis cluster
- Use Redis Cluster with replication factor 2 for HA; at ~6.4TB the keyspace needs dozens of shards, not a handful
Cassandra Post Store
- Partition key: user_id — all posts for a user on one partition (locality for fan-out)
- Clustering key: created_at DESC — natural time-series ordering without a secondary index
- Replication factor: 3 across 3 datacenters for global availability
- Compaction: TWCS (TimeWindowCompactionStrategy) for time-series write patterns
- TTL: 90 days for content in warm tier; archive older posts to S3 Parquet
7. Feed Ranking & Personalization
A purely chronological feed stopped working at scale: engagement dropped as users missed important content from close friends, buried under high-frequency posters. Ranked feeds reportedly lifted engagement 40–70% at major platforms.
Multi-Stage Ranking Pipeline
| Stage | Goal | Latency Budget |
|---|---|---|
| Candidate Generation | Retrieve 500–2000 candidate posts from Redis + Cassandra | < 20ms |
| Light Scoring | Apply fast heuristics: recency, relationship strength, content type | < 10ms |
| ML Re-ranking | Top-200 candidates scored by engagement prediction model | < 30ms |
| Diversity Filter | Limit same-author posts, inject ads/promoted content | < 5ms |
| Final Feed | Top-20 posts returned to client | — |
Ranking Features
- Recency signal: Exponential decay — posts older than 24h score 50% lower
- Relationship strength: DMs, comments, mutual follows → weighted higher
- Engagement velocity: Post receiving 1000 likes in 10 minutes scores much higher
- Content affinity: User's historical engagement with video vs. text vs. images
- Diversity: Maximum 3 posts from the same author in top-20; inject varied content types
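The recency rule above ("posts older than 24h score 50% lower") is an exponential decay with a 24-hour half-life. A light-scoring sketch combining it with two other features; the weights are illustrative assumptions, not production values:

```java
// Light-scoring heuristic for the fast pre-ranking stage. Only the 24h
// half-life comes from this section; the weight constants are placeholders.
class LightScorer {
    static final double HALF_LIFE_HOURS = 24.0;

    // 1.0 for a fresh post, 0.5 at 24h, 0.25 at 48h, ...
    static double recencyFactor(double ageHours) {
        return Math.pow(0.5, ageHours / HALF_LIFE_HOURS);
    }

    // Combine recency with relationship strength and engagement velocity;
    // log1p damps runaway viral posts so they don't drown out close friends.
    static double score(double ageHours, double relationshipStrength, double likesPerMinute) {
        double base = 1.0 * relationshipStrength + 0.1 * Math.log1p(likesPerMinute);
        return base * recencyFactor(ageHours);
    }
}
```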
8. Media Storage & CDN Delivery
Images and videos dominate storage and bandwidth costs. The media pipeline:
- Upload: Client uploads directly to S3 via presigned URL — bypasses your backend entirely
- Processing: S3 event → Lambda/worker → transcode video (HLS with multiple bitrates), generate image thumbnails (3 sizes)
- Storage: Original in S3 Standard; thumbnails in S3 Standard; videos older than 90 days in S3 Intelligent-Tiering
- CDN delivery: CloudFront/Akamai serves all media — cache at edge PoPs closest to users
- Lazy loading: Feed initially shows placeholder; images load as user scrolls into viewport
- Adaptive quality: Client sends connection speed → serve appropriate resolution (360p/720p/1080p)
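The adaptive-quality step can be a simple threshold lookup on the client's reported bandwidth. A sketch; the kbps cutoffs are illustrative assumptions, not values from any specific platform:

```java
// Map the client's measured connection speed to a video rendition.
class QualitySelector {
    static String pickResolution(int bandwidthKbps) {
        if (bandwidthKbps >= 5_000) return "1080p"; // fast wifi / 5G
        if (bandwidthKbps >= 1_500) return "720p";  // typical LTE
        return "360p";                              // constrained networks
    }
}
```

In practice HLS adaptive streaming handles mid-playback switching; this selection matters for the initial rendition and for static images.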
9. Real-Time Updates & Push Notifications
Users expect to see new posts appear in their feed without refreshing. Two strategies:
Polling vs. Push
- Long polling: Client holds open HTTP connection for 30s; server responds when new posts arrive or times out. Simple but doesn't scale to millions of concurrent connections.
- Server-Sent Events (SSE): Server pushes new post IDs to connected clients over a persistent HTTP stream. Better than WebSocket for unidirectional feed updates.
- Push notifications: For background users (app not open), send APNs/FCM push notification when a followed account posts. Respect notification preferences and rate-limit to prevent notification fatigue.
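An SSE stream carries plain-text frames: optional event: and data: lines terminated by a blank line. A sketch of framing a new-post notification (the event name new_post and the JSON payload shape are illustrative choices):

```java
// Minimal SSE frame builder for the feed-update stream. The wire format
// (event:/data: lines ending in a blank line) is standard Server-Sent Events.
class SseFrames {
    static String newPostEvent(String postId) {
        return "event: new_post\n" +
               "data: {\"post_id\":\"" + postId + "\"}\n" +
               "\n"; // blank line terminates the event
    }
}
```

The browser's built-in EventSource client handles reconnection and Last-Event-ID resume, which is a large part of why SSE beats raw WebSocket for one-way feed updates.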
New Post Indicator Pattern
Rather than live-streaming posts into the feed (jarring UX), show a "X new posts — tap to refresh" banner. This decouples real-time notification from feed refresh and gives users control over when to update.
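The banner count is one sorted-set query against the timeline cache (a ZCOUNT over scores newer than the reader's last-seen timestamp). A sketch with a plain list of publish timestamps standing in for the sorted set; names are illustrative:

```java
import java.util.List;

// Compute the "X new posts" banner from the reader's last-seen watermark.
class NewPostBanner {
    // Count posts published strictly after the last-seen timestamp
    // (equivalent to ZCOUNT timeline:{user_id} (lastSeen +inf).
    static int newPostCount(List<Long> publishTimesMs, long lastSeenMs) {
        int count = 0;
        for (long t : publishTimesMs) if (t > lastSeenMs) count++;
        return count;
    }

    static String bannerText(int count) {
        return count == 0 ? "" : count + " new posts — tap to refresh";
    }
}
```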
10. Scaling the Feed at Global Scale
Geographic Distribution
- Deploy Cassandra with multi-datacenter replication (US-East, EU-West, APAC)
- Redis Cluster per region — users' timelines served from their local region
- Fan-out workers replicate to all regions via Kafka MirrorMaker for global celebrities
- Social graph partitioned by user geography — reduce cross-region latency for followee lookups
Engagement Counter Design
Like/comment counts are read frequently but need not be perfectly accurate at every read. Use approximate counters:
- Increment a Redis counter on every like: INCR post_likes:{post_id}
- Async Kafka consumer writes to Cassandra (eventual consistency — acceptable for counters)
- For viral posts (> 10K likes/min), use Redis HyperLogLog for approximate unique-user count
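The increment-then-drain pattern above can be sketched with an in-memory delta map: likes bump a counter immediately (the INCR step), and a periodic flush hands the accumulated deltas to the async persistence path (the Kafka-consumer step). Class and method names are illustrative:

```java
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Write-behind engagement counter: hot-path increments are in-memory and
// lock-free; a background flush drains deltas for the durable store.
class EngagementCounters {
    private final ConcurrentHashMap<String, LongAdder> deltas = new ConcurrentHashMap<>();

    void onLike(String postId) {
        deltas.computeIfAbsent(postId, k -> new LongAdder()).increment();
    }

    // Drain accumulated deltas; the caller persists them (e.g. batched
    // counter updates to Cassandra) and tolerates the staleness window.
    Map<String, Long> flush() {
        Map<String, Long> snapshot = new HashMap<>();
        for (var e : deltas.entrySet()) {
            long n = e.getValue().sumThenReset();
            if (n > 0) snapshot.put(e.getKey(), n);
        }
        return snapshot;
    }
}
```

The trade-off is explicit: a crash between flushes loses a few seconds of deltas, which is acceptable for like counts but not for anything billing-related.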
11. Design Checklist & Conclusion
News Feed System Design Checklist
- ☐ Fan-out strategy defined: write for regular users, read for celebrities, hybrid at threshold
- ☐ Redis Sorted Sets used for timeline cache with bounded size (800 posts) and TTL
- ☐ Cassandra schema uses (user_id, created_at DESC) primary key for time-ordered fan-out
- ☐ Inactive users skipped during fan-out (30-day activity check)
- ☐ Multi-stage ranking pipeline: candidate gen → light scoring → ML re-rank → diversity filter
- ☐ Media uploads via presigned S3 URLs (never through your API servers)
- ☐ CDN configured for media delivery with aggressive edge caching
- ☐ New post notification via "X new posts" banner rather than live-injecting into feed
- ☐ Read and write paths are independently scalable services
- ☐ Engagement counters use Redis + async Cassandra persistence
The news feed is one of the most read-heavy, latency-sensitive systems in existence. Its design requires accepting fundamental trade-offs: write amplification for fast reads, eventual consistency for engagement counters, and approximate ranking for acceptable latency. Start with a pure fan-out-on-write approach, measure your P99 latencies under celebrity posting scenarios, and adopt the hybrid strategy as your user base grows past 10M accounts.