Designing a Social Media News Feed at Scale: Fan-Out Strategies, Timeline Caching & Personalization
Building a news feed for hundreds of millions of users is one of the hardest system design problems: it demands sub-100ms read latency, support for celebrity accounts with 100M+ followers, personalized ranking, and real-time freshness. This guide walks through the architecture patterns commonly associated with platforms such as Twitter, Instagram, and Facebook.
TL;DR — Core Decision
"Use fan-out on write (push to Redis timelines) for regular users; use fan-out on read (pull from Cassandra at query time) for celebrities with >1M followers. Merge both at read time for the final feed. Rank with ML. Cache the ranked result per user with a short TTL."
Table of Contents
- Requirements & Scale
- Data Model & Social Graph
- Fan-Out on Write (Push Model)
- Fan-Out on Read (Pull Model)
- Hybrid Strategy — The Real-World Approach
- Storage Architecture & Data Tiers
- Feed Ranking & Personalization
- Media Storage & CDN Delivery
- Real-Time Updates & Push Notifications
- Scaling the Feed at Global Scale
- Design Checklist & Conclusion
1. Requirements & Scale
Functional Requirements
- Users can post text, images, and videos
- Users can follow/unfollow other users
- Home feed shows posts from followed accounts, ranked by relevance
- Feed must reflect new posts within seconds (near real-time freshness)
- Support pagination (infinite scroll) efficiently
- Users can like, comment, and share posts
Scale Estimates
| Metric | Value | Notes |
|---|---|---|
| DAU | 500M | Twitter-scale |
| Posts/day | 100M | ~1,150 posts/sec |
| Feed reads/day | 5B | ~58,000 reads/sec; 50× the post rate |
| Avg followers/user | 200 | Celebrities: 10M–100M |
Write amplification challenge: A user with 1M followers posting once = 1M Redis writes. A celebrity with 50M followers = 50M writes per post. This is why a single fan-out strategy doesn't work at scale.
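The write amplification figures can be checked with a quick back-of-envelope calculation. The sketch below uses the numbers from the scale table; the class name, method names, and the 1M writes/sec throughput in the usage note are illustrative assumptions, not measurements.

```java
// Back-of-envelope fan-out cost. All names here are hypothetical helpers.
final class FanOutEstimate {
    static final long SECONDS_PER_DAY = 86_400L;

    /** Average timeline writes per second if every post is pushed to every follower. */
    static long avgTimelineWritesPerSec(long postsPerDay, long avgFollowers) {
        return postsPerDay * avgFollowers / SECONDS_PER_DAY;
    }

    /** Seconds needed to fan out one post at an assumed sustained write rate. */
    static double secondsToFanOut(long followers, long writesPerSec) {
        return (double) followers / writesPerSec;
    }
}
```

With 100M posts/day and 200 followers on average, `avgTimelineWritesPerSec` gives about 231K timeline writes per second. A single post to 50M followers would take about 50 seconds to fan out at an assumed 1M writes/sec, which is why celebrity posts need a different path.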
2. Data Model & Social Graph
Core Tables
-- Posts (stored in Cassandra for write-heavy, time-series access)
CREATE TABLE posts_by_user (
    user_id UUID,
    created_at TIMESTAMP,
    post_id UUID,
    content TEXT,
    media_urls LIST<TEXT>,
    PRIMARY KEY (user_id, created_at, post_id)
) WITH CLUSTERING ORDER BY (created_at DESC, post_id ASC);
-- Like counts live in their own table: Cassandra does not allow COUNTER
-- columns to be mixed with regular columns in the same table
CREATE TABLE post_like_counts (
    post_id UUID PRIMARY KEY,
    like_count COUNTER
);
-- Social graph (follows), stored in a graph DB or Cassandra
CREATE TABLE user_follows (
    follower_id UUID,
    followee_id UUID,
    created_at TIMESTAMP,
    PRIMARY KEY (follower_id, followee_id)
);
-- Reverse index for fan-out
CREATE TABLE followers_of (
    followee_id UUID,
    follower_id UUID,
    PRIMARY KEY (followee_id, follower_id)
);
Social Graph Storage
The social graph (who follows whom) is the backbone of the feed. At scale, store it in:
- Cassandra: For the raw follow relationship table — excellent for wide-row fan-out lookups
- Redis Sets: Cache the follower list for active users, read with SMEMBERS followers:{user_id}
- Neo4j / Dgraph: For graph traversal queries (mutual friends, second-degree connections, recommendations)
3. Fan-Out on Write (Push Model)
When a user posts, immediately push the post ID to each follower's timeline cache. The timeline is stored as a Redis Sorted Set, scored by publish timestamp.
// Fan-out worker (runs as Kafka consumer per partition); a Jedis-style client is assumed
void onPost(PostCreatedEvent event) {
    String postId = event.postId;
    long score = event.createdAtMs;
    // Load the follower list, then batch all timeline writes in one Redis pipeline.
    // For large follower lists, split the work into chunks instead of one huge pipeline.
    List<String> followers = socialGraph.getFollowers(event.userId);
    try (Pipeline pipe = redis.pipelined()) {
        for (String followerId : followers) {
            String key = "timeline:" + followerId;
            pipe.zadd(key, score, postId);       // add to sorted set
            pipe.zremrangeByRank(key, 0, -801);  // drop everything except the 800 newest posts
            pipe.expire(key, 86400 * 7);         // 7-day TTL
        }
        pipe.sync();
    }
}
// Feed read: a single ZREVRANGE, O(log N + M) for M returned posts
List<String> getFeed(String userId, int page, int size) {
    long offset = (long) page * size;
    return redis.zrevrange("timeline:" + userId, offset, offset + size - 1);
}
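To make the trim semantics concrete, here is a minimal in-memory model of one bounded timeline. It is a sketch, not a Redis client: `BoundedTimeline`, `Entry`, and `pageBefore` are hypothetical names, and unlike ZADD it does not update the score of an existing member. It also shows cursor-style paging (by score) as an alternative to page offsets, which can shift when new posts arrive between requests.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// In-memory stand-in for one "timeline:{user_id}" sorted set (illustrative only).
final class BoundedTimeline {
    /** One sorted-set member: the timestamp score plus the post ID. */
    record Entry(long score, String postId) {}

    private final int capacity;
    private final TreeSet<Entry> entries = new TreeSet<>(
            Comparator.comparingLong(Entry::score).thenComparing(Entry::postId));

    BoundedTimeline(int capacity) {
        this.capacity = capacity;
    }

    /** Mirrors ZADD followed by a trim that drops the oldest entries beyond capacity. */
    void add(String postId, long score) {
        entries.add(new Entry(score, postId));
        while (entries.size() > capacity) {
            entries.pollFirst(); // lowest score = oldest post
        }
    }

    /** Newest-first page of posts whose score is strictly below the cursor. */
    List<String> pageBefore(long cursorScoreExclusive, int size) {
        List<String> page = new ArrayList<>();
        // The empty string sorts before every post ID, so this head set holds
        // exactly the entries with score < cursorScoreExclusive.
        for (Entry e : entries.headSet(new Entry(cursorScoreExclusive, ""), false).descendingSet()) {
            if (page.size() == size) break;
            page.add(e.postId());
        }
        return page;
    }
}
```

A client would pass `Long.MAX_VALUE` for the first page and the score of the last post it received for each following page. Posts sharing the exact same millisecond as the cursor are skipped in this simplified version; a production cursor would also carry the post ID as a tie-breaker.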
Pros and Cons
✅ Advantages
- Feed reads are a single Redis range query: O(log N + M), effectively constant for a timeline capped at 800 entries
- Read latency is typically a few milliseconds
- No complex merge logic at read time
- Works well for accounts with up to roughly 100K followers
❌ Disadvantages
- Write amplification for celebrities (50M writes per post)
- Inactive users waste Redis memory (they never read the feed)
- Lag for ultra-popular accounts during fan-out
4. Fan-Out on Read (Pull Model)
Posts are stored once in Cassandra. At read time, fetch the most recent posts from each followee and merge them into a ranked feed. No write amplification during posting.
// Fan-out on read: merge N timelines at query time
List<Post> getFeedByMerge(String userId, int size) {
    List<String> followees = socialGraph.getFollowees(userId); // users I follow
    // Parallel fetch from Cassandra. supplyAsync without an executor runs on the common
    // ForkJoinPool; blocking I/O like this should get a dedicated executor in production.
    List<CompletableFuture<List<Post>>> futures = followees.stream()
        .map(fId -> CompletableFuture.supplyAsync(
            () -> postRepo.getLatest(fId, size)))
        .toList();
    // Wait for every fetch, then merge and sort by created_at (newest first)
    return futures.stream()
        .map(CompletableFuture::join)
        .flatMap(Collection::stream)
        .sorted(Comparator.comparing(Post::getCreatedAt).reversed())
        .limit(size)
        .toList();
}
Problem: If you follow 500 accounts, this requires 500 Cassandra reads per feed request, and the response is only as fast as the slowest of them, so latency can easily reach 500ms or more. That is unacceptable for interactive feeds; the pull model only works for users following few accounts, or with aggressive result caching.
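When the pull path is used, the merge step itself can avoid sorting every fetched post. Because each followee's posts already arrive newest-first, a k-way merge with a heap only touches as many posts as the page needs. The sketch below assumes hypothetical `Post` and `Head` types and works on in-memory lists rather than Cassandra results.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// K-way merge of per-followee lists that are each sorted newest-first (illustrative).
final class TimelineMerge {
    record Post(String id, long createdAtMs) {}

    /** Cursor into one followee's list. */
    private record Head(List<Post> source, int index) {
        Post post() {
            return source.get(index);
        }
    }

    static List<Post> newest(List<List<Post>> perFolloweeNewestFirst, int size) {
        // Max-heap on timestamp: the newest unread post across all lists is on top.
        PriorityQueue<Head> heap = new PriorityQueue<>(
                Comparator.comparingLong((Head h) -> h.post().createdAtMs()).reversed());
        for (List<Post> list : perFolloweeNewestFirst) {
            if (!list.isEmpty()) heap.add(new Head(list, 0));
        }
        List<Post> merged = new ArrayList<>();
        while (merged.size() < size && !heap.isEmpty()) {
            Head head = heap.poll();
            merged.add(head.post());
            // Advance the cursor in the list we just consumed from.
            if (head.index() + 1 < head.source().size()) {
                heap.add(new Head(head.source(), head.index() + 1));
            }
        }
        return merged;
    }
}
```

For k followees and a page of n posts this costs roughly O(k + n log k) comparisons instead of sorting all k × n fetched posts. It does not remove the k storage reads, which remain the dominant cost.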
5. Hybrid Strategy — The Real-World Approach
Large platforms such as Twitter, Instagram, and Facebook are generally described as using some form of hybrid approach. The key insight: classify users by follower count and apply a different strategy to each class.
Hybrid Fan-Out Rules
- Regular user (< 10K followers): Fan-out on write → push post_id to all follower Redis timelines immediately
- Popular user (10K – 1M followers): Fan-out on write but async (Kafka worker) with priority queue to prevent blocking
- Celebrity (> 1M followers): Skip fan-out entirely. Store post in Cassandra only. Inject at read time.
- Inactive followers: Skip fan-out if the follower hasn't been active in 30 days — saves millions of writes
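The rules above can be expressed as a small policy function. This is a minimal sketch: the thresholds come from the list, while `FanOutPolicy`, `Strategy`, and the method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

// Encodes the hybrid fan-out rules as pure functions (names are illustrative).
final class FanOutPolicy {
    enum Strategy { PUSH_SYNC, PUSH_ASYNC, PULL_AT_READ }

    static final long POPULAR_THRESHOLD = 10_000L;
    static final long CELEBRITY_THRESHOLD = 1_000_000L;
    static final Duration INACTIVE_AFTER = Duration.ofDays(30);

    /** Picks the delivery strategy from the author's follower count. */
    static Strategy forAuthor(long followerCount) {
        if (followerCount > CELEBRITY_THRESHOLD) return Strategy.PULL_AT_READ;
        if (followerCount >= POPULAR_THRESHOLD) return Strategy.PUSH_ASYNC;
        return Strategy.PUSH_SYNC;
    }

    /** Followers inactive for more than 30 days are skipped during fan-out. */
    static boolean shouldPushTo(Instant followerLastActive, Instant now) {
        return Duration.between(followerLastActive, now).compareTo(INACTIVE_AFTER) <= 0;
    }
}
```

In practice the thresholds would be configuration rather than constants, so they can be tuned against observed fan-out lag.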
Read-time merge: When serving a feed, take the user's Redis timeline (regular followees) + fetch last N posts from celebrity followees in Cassandra + merge + re-rank.
// Hybrid feed assembly
List<Post> getHybridFeed(String userId, int size) {
    // 1. Get cached timeline from Redis (regular followees, already fanned out).
    //    Over-fetch 2x for ranking; ZREVRANGE bounds are inclusive, hence the -1.
    List<String> postIdsFromCache = redis.zrevrange("timeline:" + userId, 0, size * 2L - 1);
    // 2. Identify celebrity followees
    List<String> celebFollowees = socialGraph.getCelebrityFollowees(userId);
    // 3. Fetch latest posts from celebrities (parallel Cassandra reads)
    List<Post> celebPosts = fetchCelebPosts(celebFollowees, size);
    // 4. Hydrate post IDs from cache
    List<Post> regularPosts = hydrate(postIdsFromCache);
    // 5. Merge and rank
    return rankingService.rank(merge(regularPosts, celebPosts), userId, size);
}
6. Storage Architecture & Data Tiers
Redis Timeline Cache
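- Data structure: Sorted Set, written with ZADD timeline:{user_id} {timestamp} {post_id}
- Keep only the 800 most recent posts per user; trim with ZREMRANGEBYRANK
- TTL: 7 days, which evicts timelines of inactive users
- Memory estimate: 500M users × 800 posts × 16 bytes/entry ≈ 6.4TB of raw payload; real sorted-set entries carry per-element overhead, so treat this as a lower bound
- Use Redis Cluster with enough shards to keep each node's dataset manageable (6 shards would mean more than 1TB per node; at an illustrative 50GB per primary, 6.4TB needs about 128 primaries), with replication factor 2 for HA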
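The sizing arithmetic is easy to get wrong, so it helps to write it down. The sketch below reproduces the memory estimate and derives a shard count from a per-node budget. The 50GB budget is an assumption chosen for illustration, the 16 bytes/entry figure is the raw payload from the estimate above, and `TimelineCacheSizing` is a hypothetical helper.

```java
// Capacity arithmetic for the timeline cache, in decimal units (1 TB = 10^12 bytes).
final class TimelineCacheSizing {
    static long totalBytes(long users, long postsPerTimeline, long bytesPerEntry) {
        return users * postsPerTimeline * bytesPerEntry;
    }

    /** Data each shard must hold if the total is spread evenly. */
    static long bytesPerShard(long totalBytes, int shards) {
        return totalBytes / shards;
    }

    /** Primaries needed to stay under an assumed per-node memory budget. */
    static long shardsNeeded(long totalBytes, long bytesPerShardBudget) {
        return (totalBytes + bytesPerShardBudget - 1) / bytesPerShardBudget; // ceiling division
    }
}
```

With 500M users, 800 entries, and 16 bytes each, the total is 6.4TB. Spread over 6 shards that is more than 1TB per node, while an assumed 50GB budget per primary works out to 128 primaries before replicas.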
Cassandra Post Store
- Partition key: user_id, so all posts for a user sit on one partition (locality for fan-out)
- Clustering key: created_at DESC, which gives natural time-series ordering without a secondary index
- Replication factor: 3 across 3 datacenters for global availability
- Compaction: TWCS (TimeWindowCompactionStrategy) for time-series write patterns
- TTL: 90 days for content in warm tier; archive older posts to S3 Parquet
7. Feed Ranking & Personalization
A purely chronological feed breaks down at scale: users miss important content from close friends because it is buried under posts from high-frequency accounts, and engagement drops. Ranked feeds reportedly increased user engagement by 40–70% on major platforms.
Multi-Stage Ranking Pipeline
| Stage | Goal | Latency Budget |
|---|---|---|
| Candidate Generation | Retrieve 500–2000 candidate posts from Redis + Cassandra | < 20ms |
| Light Scoring | Apply fast heuristics: recency, relationship strength, content type | < 10ms |
| ML Re-ranking | Top-200 candidates scored by engagement prediction model | < 30ms |
| Diversity Filter | Limit same-author posts, inject ads/promoted content | < 5ms |
| Final Feed | Top-20 posts returned to client | — |
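The table describes a funnel: a cheap scorer narrows a large candidate set, and the expensive model only sees the shortlist. A minimal sketch of that shape follows; `RankingFunnel` and its scorer parameters are hypothetical stand-ins for the heuristic and ML stages, and the diversity filter is left out.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Two scoring passes: cheap heuristics first, the costly model on the shortlist only.
final class RankingFunnel {
    private record Scored<T>(T item, double score) {}

    /** Scores each item exactly once, then keeps the k highest-scoring items. */
    static <T> List<T> topK(List<T> items, ToDoubleFunction<T> scorer, int k) {
        return items.stream()
                .map(item -> new Scored<T>(item, scorer.applyAsDouble(item)))
                .sorted(Comparator.comparingDouble((Scored<T> s) -> s.score()).reversed())
                .limit(k)
                .map(Scored::item)
                .toList();
    }

    static <T> List<T> rank(List<T> candidates,
                            ToDoubleFunction<T> lightScorer,
                            ToDoubleFunction<T> modelScorer,
                            int shortlistSize,
                            int pageSize) {
        List<T> shortlist = topK(candidates, lightScorer, shortlistSize);
        return topK(shortlist, modelScorer, pageSize);
    }
}
```

With the budgets in the table, `shortlistSize` would be 200 and `pageSize` 20. Scoring each item once matters for the model stage, since a comparator that calls the model directly would invoke it many times per post during the sort.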
Ranking Features
- Recency signal: Exponential decay with a 24-hour half-life, so a post's recency score halves every 24h
- Relationship strength: DMs, comments, mutual follows → weighted higher
- Engagement velocity: Post receiving 1000 likes in 10 minutes scores much higher
- Content affinity: User's historical engagement with video vs. text vs. images
- Diversity: Maximum 3 posts from the same author in top-20; inject varied content types
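Two of these features are simple enough to sketch directly: an exponential recency weight and the per-author cap. The code below is illustrative only. `FeedHeuristics` and `Candidate` are hypothetical names, and the half-life is a parameter rather than a value fixed by any platform.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Recency decay and a per-author diversity cap (names are illustrative).
final class FeedHeuristics {
    record Candidate(String postId, String authorId, double score) {}

    /** Exponential decay: 1.0 for a brand-new post, 0.5 after one half-life. */
    static double recencyWeight(double ageHours, double halfLifeHours) {
        return Math.pow(0.5, ageHours / halfLifeHours);
    }

    /** Keeps ranked order but admits at most maxPerAuthor posts from any one author. */
    static List<Candidate> capPerAuthor(List<Candidate> rankedBestFirst, int maxPerAuthor, int limit) {
        Map<String, Integer> seenPerAuthor = new HashMap<>();
        List<Candidate> out = new ArrayList<>();
        for (Candidate c : rankedBestFirst) {
            if (out.size() == limit) break;
            int count = seenPerAuthor.merge(c.authorId(), 1, Integer::sum);
            if (count <= maxPerAuthor) out.add(c);
        }
        return out;
    }
}
```

Calling `capPerAuthor(ranked, 3, 20)` applies the "maximum 3 posts from the same author in top-20" rule. Posts that are skipped simply make room for the next-best candidate from another author.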
8. Media Storage & CDN Delivery
Images and videos dominate storage and bandwidth costs. The media pipeline:
- Upload: Client uploads directly to S3 via presigned URL — bypasses your backend entirely
- Processing: S3 event → Lambda/worker → transcode video (HLS with multiple bitrates), generate image thumbnails (3 sizes)
- Storage: Original in S3 Standard; thumbnails in S3 Standard; videos older than 90 days in S3 Intelligent-Tiering
- CDN delivery: CloudFront/Akamai serves all media — cache at edge PoPs closest to users
- Lazy loading: Feed initially shows placeholder; images load as user scrolls into viewport
- Adaptive quality: Client sends connection speed → serve appropriate resolution (360p/720p/1080p)
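The adaptive quality step amounts to mapping a measured bandwidth to one of the available renditions. A minimal sketch follows. The bitrate cut-offs are assumptions picked for illustration, not values from any encoder or CDN, and `RenditionPicker` is a hypothetical name.

```java
// Maps measured bandwidth to a rendition. Thresholds are hypothetical.
final class RenditionPicker {
    enum Rendition { P360, P720, P1080 }

    // Assumed cut-offs in kilobits per second; tune against real playback data.
    static final int MIN_KBPS_720 = 1_500;
    static final int MIN_KBPS_1080 = 5_000;

    static Rendition forBandwidth(int measuredKbps) {
        if (measuredKbps >= MIN_KBPS_1080) return Rendition.P1080;
        if (measuredKbps >= MIN_KBPS_720) return Rendition.P720;
        return Rendition.P360;
    }
}
```

For HLS video the player normally performs this selection itself from the manifest, so a server-side picker like this is mostly relevant for images and initial thumbnails.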
9. Real-Time Updates & Push Notifications
Users expect to see new posts appear in their feed without refreshing. Three delivery mechanisms:
Polling vs. Push
- Long polling: Client holds open HTTP connection for 30s; server responds when new posts arrive or times out. Simple but doesn't scale to millions of concurrent connections.
- Server-Sent Events (SSE): Server pushes new post IDs to connected clients over a persistent HTTP stream. Better than WebSocket for unidirectional feed updates.
- Push notifications: For background users (app not open), send APNs/FCM push notification when a followed account posts. Respect notification preferences and rate-limit to prevent notification fatigue.
New Post Indicator Pattern
Rather than live-streaming posts into the feed (jarring UX), show a "X new posts — tap to refresh" banner. This decouples real-time notification from feed refresh and gives users control over when to update.
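The banner only needs a count of timeline entries newer than the last one the client rendered. Against the Redis timeline this corresponds to a ZCOUNT with an exclusive lower bound; the sketch below models the same query on an in-memory set of scores. `NewPostBanner` and its methods are hypothetical.

```java
import java.util.NavigableSet;

// Counts posts newer than the client's last-seen timestamp (illustrative model).
final class NewPostBanner {
    /** Number of timeline scores strictly greater than lastSeenMs. */
    static int countNewerThan(NavigableSet<Long> timelineScores, long lastSeenMs) {
        return timelineScores.tailSet(lastSeenMs, false).size();
    }

    /** Banner text, or an empty string when there is nothing new to show. */
    static String bannerText(int newPosts) {
        if (newPosts <= 0) return "";
        return newPosts == 1 ? "1 new post" : newPosts + " new posts";
    }
}
```

The client sends the timestamp of its newest rendered post, shows the banner text, and only fetches the full feed when the user taps it.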
10. Scaling the Feed at Global Scale
Geographic Distribution
- Deploy Cassandra with multi-datacenter replication (US-East, EU-West, APAC)
- Redis Cluster per region — users' timelines served from their local region
- Post events are replicated to every region via Kafka MirrorMaker, so each region's fan-out workers and read path see posts from accounts with a global following
- Social graph partitioned by user geography — reduce cross-region latency for followee lookups
Engagement Counter Design
Like/comment counts are read frequently but need not be perfectly accurate at every read. Use approximate counters:
- Increment Redis counter on every like: INCR post_likes:{post_id}
- Async Kafka consumer writes to Cassandra (eventual consistency, acceptable for counters)
- For viral posts (> 10K likes/min), use Redis HyperLogLog for approximate unique-user count
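The async persistence step works best when the consumer aggregates likes in memory and writes one delta per post per flush, rather than one Cassandra write per like. A minimal sketch of such a buffer follows; `LikeCounterBuffer` is a hypothetical class, and the actual Kafka consumer loop and Cassandra write are omitted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Write-behind buffer: accumulate like deltas, flush them in batches (illustrative).
final class LikeCounterBuffer {
    private final ConcurrentHashMap<String, AtomicLong> pending = new ConcurrentHashMap<>();

    /** Called for each like event consumed. */
    void recordLike(String postId) {
        pending.computeIfAbsent(postId, id -> new AtomicLong()).incrementAndGet();
    }

    /**
     * Returns the deltas accumulated since the last drain and resets them to zero.
     * A caller would apply each delta as a single counter update in the durable store.
     * Zeroed entries are kept for simplicity; a real buffer would also evict idle keys.
     */
    Map<String, Long> drain() {
        Map<String, Long> deltas = new HashMap<>();
        for (Map.Entry<String, AtomicLong> e : pending.entrySet()) {
            long delta = e.getValue().getAndSet(0L); // atomic read-and-reset
            if (delta > 0) deltas.put(e.getKey(), delta);
        }
        return deltas;
    }
}
```

A post that receives 10K likes in a minute then produces a handful of counter updates instead of 10K, at the cost of the stored count lagging by up to one flush interval.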
11. Design Checklist & Conclusion
News Feed System Design Checklist
- ☐ Fan-out strategy defined: write for regular users, read for celebrities, hybrid at threshold
- ☐ Redis Sorted Sets used for timeline cache with bounded size (800 posts) and TTL
- ☐ Cassandra schema uses (user_id, created_at DESC) primary key for time-ordered fan-out
- ☐ Inactive users skipped during fan-out (30-day activity check)
- ☐ Multi-stage ranking pipeline: candidate gen → light scoring → ML re-rank → diversity filter
- ☐ Media uploads via presigned S3 URLs (never through your API servers)
- ☐ CDN configured for media delivery with aggressive edge caching
- ☐ New post notification via "X new posts" banner rather than live-injecting into feed
- ☐ Read and write paths are independently scalable services
- ☐ Engagement counters use Redis + async Cassandra persistence
The news feed is one of the most read-heavy, latency-sensitive systems in existence. Its design requires accepting fundamental trade-offs: write amplification for fast reads, eventual consistency for engagement counters, and approximate ranking for acceptable latency. Start with a pure fan-out-on-write approach, measure your P99 latencies under celebrity posting scenarios, and adopt the hybrid strategy as your user base grows past 10M accounts.