URL Shortener System Design: TinyURL & Bitly at Scale (2026)
A URL shortener is a deceptively simple service that hides enormous engineering challenges at scale. Designing one for 100 million URLs per day — like TinyURL or Bitly — forces you to make hard decisions about hash generation, collision handling, caching, database choice, analytics pipelines, and global distribution. This deep-dive covers every layer of the system with Java code examples you can take to an interview or production.
TL;DR — Core Design Decisions at a Glance
- Hash strategy: Use counter-based ID + Base62 encoding; avoid MD5 (collision-prone, expensive to resolve).
- Storage: NoSQL (Cassandra/DynamoDB) for URL mappings; Redis cluster for caching hot short codes.
- Redirects: HTTP 302 for analytics tracking; serve from CDN edge nodes for <10ms latency.
- Capacity: 100M writes/day ≈ 1,157 writes/sec; 10:1 read:write = ~11,570 reads/sec on average.
- Analytics: Async Kafka pipeline; never block the redirect path for click tracking.
Table of Contents
1. What Is a URL Shortener?
2. Functional & Non-Functional Requirements
3. High-Level Architecture
4. Hash Generation Strategies
5. Database Design
6. Caching Strategy
7. Custom Aliases & Rate Limiting
8. Analytics & Tracking
9. Scalability Deep Dive
10. Common Mistakes & Interview Tips
11. Conclusion & Key Takeaways
1. What Is a URL Shortener?
A URL shortener converts a long, unwieldy URL into a compact short code that redirects to the original destination. When a user visits the short URL, the service performs an HTTP redirect to the original long URL — transparently, within milliseconds.
The classic example: https://tinyurl.com/y7k2xq3p expands to a 200-character Amazon product URL. Real-world companies operating at scale include:
- Bitly — 10+ billion links created; enterprise analytics, custom domains, QR codes
- TinyURL — the original (1997), anonymous, no account required
- t.co — Twitter's internal shortener; every tweet link is wrapped automatically
- goo.gl — Google's deprecated shortener (shut down 2019); handled ~1B redirects/day at peak
- rb.gy, short.io, Rebrandly — modern competitors with white-label branded domains
Core Use Cases
- Social media sharing: Twitter's 280-character limit made short URLs essential; every URL consumes exactly 23 characters regardless of length via t.co wrapping.
- SMS & print campaigns: Short URLs are typeable and memorable in offline contexts.
- Link tracking & analytics: Marketers measure CTR, geo-distribution, device breakdown per campaign link.
- API responses & deep links: Programmatic URL generation for mobile deep links, QR codes, and referral systems.
- Expiring/one-time links: Password reset, file download, or payment confirmation URLs that expire after use.
In system design interviews, URL shorteners are a favorite because they are simple to understand but rich in trade-offs: hashing vs. sequential ID generation, SQL vs. NoSQL, cache consistency, 301 vs. 302 redirects, and horizontal scaling — all in one system.
2. Functional & Non-Functional Requirements
Functional Requirements
- Given a long URL, generate a unique short URL (e.g., https://short.ly/aB3kZ9q)
- Redirect a short URL to the original long URL with <10ms added latency
- Support optional custom aliases (e.g., /my-brand-launch)
- Support optional expiry date/time on short URLs
- Track click analytics: timestamp, referrer, geolocation, device/browser
- User accounts: dashboard for managing and auditing personal links
- Delete or deactivate a short URL
Non-Functional Requirements & Capacity Estimation
| Parameter | Assumption | Derived Metric |
|---|---|---|
| Daily URL writes | 100 million | ~1,157 writes/sec |
| Read:Write ratio | 10:1 | ~11,570 reads/sec |
| Storage per URL record | ~500 bytes avg | 50 GB/day; 18 TB/year |
| Short code length | 7 chars Base62 | 62^7 ≈ 3.5 trillion codes |
| Redirect latency SLA | p99 < 20ms | Requires aggressive caching |
| Availability | 99.99% uptime | <52 min downtime/year |
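The derived columns in the table are straight division; the arithmetic can be sanity-checked with a throwaway helper (illustrative only, not part of the system):

```java
// Back-of-envelope capacity math matching the table above
public class CapacityEstimate {
    static final long SECONDS_PER_DAY = 86_400L;

    // 100M writes/day spread evenly over 86,400 seconds
    public static long writesPerSecond(long writesPerDay) {
        return writesPerDay / SECONDS_PER_DAY;
    }

    // Apply the read:write ratio to the write rate
    public static long readsPerSecond(long writesPerDay, int readWriteRatio) {
        return writesPerSecond(writesPerDay) * readWriteRatio;
    }

    // 500 bytes/record converted to GB/day
    public static long storageGbPerDay(long writesPerDay, long bytesPerRecord) {
        return writesPerDay * bytesPerRecord / 1_000_000_000L;
    }
}
```

Multiplying the 50 GB/day figure out by 365 gives the ~18 TB/year storage estimate in the table.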
Key non-functional constraints: low-latency reads (the redirect is on the hot path for every user click), high availability (a dead link is a broken user experience for every downstream campaign), and eventual consistency for analytics (a few delayed click events are acceptable; a failed redirect is not).
3. High-Level Architecture
The system has two primary flows: the write path (create short URL) and the read path (redirect). These paths have very different performance profiles and can be scaled independently.
┌──────────────────────────────────────────────────────────────────┐
│ WRITE PATH │
│ │
│ Client ──► API Gateway ──► Write Service ──► ID Generator │
│ │ │ │
│ ▼ ▼ │
│ URL Validator Zookeeper/Range │
│ │ │
│ ▼ │
│ NoSQL DB (Cassandra) │
│ + Async Kafka → Analytics DB │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│ READ PATH │
│ │
│ User ──► CDN Edge ──► Load Balancer ──► Redirect Service │
│ │ │ │
│ │ (cache HIT) ▼ │
│ └──────────── Redis Cluster ◄───── Lookup │
│ │ │ │
│ cache HIT cache MISS │
│ │ │ │
│ HTTP 302 Cassandra │
│ Redirect Lookup │
│ + populate cache │
└──────────────────────────────────────────────────────────────────┘
Component Responsibilities
- API Gateway: Rate limiting, authentication, TLS termination, routing to write vs. read service clusters
- Write Service: URL validation, short code generation, deduplication check, database write
- Redirect Service: Short code lookup (cache first, DB fallback), HTTP redirect, async click event emit
- ID Generator: Distributed counter with pre-allocated range blocks — eliminates collision by construction
- Redis Cluster: Cache short_code → long_url mappings; 20% of hot URLs absorb 80% of redirect traffic
- Cassandra: Distributed NoSQL store for all URL records; partition key = short_code for O(1) lookup
- Kafka: Click events streamed asynchronously to analytics consumers — zero impact on redirect latency
- CDN: Edge caching of redirect responses for globally distributed low-latency access
4. Hash Generation Strategies
Choosing the right short code generation strategy is the most critical design decision. The wrong approach causes collisions, wasted DB round-trips, and unpredictable performance under load.
Strategy Comparison
| Strategy | Pros | Cons | Verdict |
|---|---|---|---|
| MD5 / SHA-256 Hash | Same URL always maps to same code (deterministic) | Hash collisions require DB check + retry loop; truncation worsens collision rate | ❌ Avoid |
| Random Base62 | Simple, no coordination needed | Collision probability increases as DB fills; must always check DB before write | ⚠ Acceptable |
| Counter + Base62 | Zero collisions; predictable; tiny storage | Requires distributed counter coordination (Zookeeper/Redis) | ✅ Recommended |
| UUID + Truncate | Decentralized generation; no coordination | Truncating 128 bits to 7 chars discards most of the entropy; birthday-bound collisions appear long before the keyspace fills | ❌ Avoid |
Base62 Encoding Explained
Base62 uses the character set 0-9a-zA-Z (62 characters). A 7-character code gives 62^7 = 3,521,614,606,208 unique values. The mapping simply converts an integer counter to its Base62 representation:
// ❌ Bad - MD5 hash with truncation causes collisions and wastes DB round-trips
import java.security.MessageDigest;
public class BadUrlShortener {
public String shorten(String longUrl) throws Exception {
// ❌ MD5 is cryptographically broken and produces 128-bit output
MessageDigest md = MessageDigest.getInstance("MD5");
byte[] hash = md.digest(longUrl.getBytes());
// ❌ Taking only first 7 chars of hex increases collision probability dramatically
String hexHash = bytesToHex(hash);
String shortCode = hexHash.substring(0, 7);
// ❌ Must check DB for collision on EVERY write — O(n) as DB fills
if (codeExists(shortCode)) {
// ❌ Appending counter and re-checking is fragile under concurrency
shortCode = hexHash.substring(1, 8);
if (codeExists(shortCode)) {
throw new RuntimeException("Too many collisions"); // 💀
}
}
return shortCode;
}
private String bytesToHex(byte[] bytes) {
StringBuilder sb = new StringBuilder();
for (byte b : bytes) sb.append(String.format("%02x", b));
return sb.toString();
}
private boolean codeExists(String code) { /* DB lookup */ return false; }
}
// ✅ Good - Counter-based Base62 encoding with range pre-allocation eliminates collisions
public class Base62Encoder {
private static final String ALPHABET =
"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
private static final int BASE = 62;
private static final int CODE_LENGTH = 7;
// ✅ Convert a unique monotonic counter value to a 7-char Base62 string
public String encode(long id) {
StringBuilder sb = new StringBuilder();
while (id > 0) {
sb.append(ALPHABET.charAt((int)(id % BASE)));
id /= BASE;
}
// ✅ Pad to fixed length to prevent code length leaking sequence info
while (sb.length() < CODE_LENGTH) {
sb.append(ALPHABET.charAt(0));
}
return sb.reverse().toString();
}
// ✅ Decode back to the original counter ID (useful for debugging)
public long decode(String code) {
long result = 0;
for (char c : code.toCharArray()) {
result = result * BASE + ALPHABET.indexOf(c);
}
return result;
}
}
// ✅ Distributed ID generator using range pre-allocation
// Each app node claims a range (e.g., 1000 IDs at a time) from Zookeeper/Redis
// and generates IDs locally without network calls — zero collision, high throughput
@Component
public class RangeBasedIdGenerator {
private final RedisTemplate<String, Long> redisTemplate;
private static final String COUNTER_KEY = "url:global:counter";
private static final long RANGE_SIZE = 1_000L;
private long rangeStart = 0;
private long rangeEnd = 0;
private final Object lock = new Object();
public RangeBasedIdGenerator(RedisTemplate<String, Long> redisTemplate) {
this.redisTemplate = redisTemplate;
}
// ✅ Claim a new range atomically from Redis when local range is exhausted
public long nextId() {
synchronized (lock) {
if (rangeStart >= rangeEnd) {
// Atomic increment — claims RANGE_SIZE IDs in one Redis call
Long newEnd = redisTemplate.opsForValue()
.increment(COUNTER_KEY, RANGE_SIZE);
rangeEnd = newEnd;
rangeStart = newEnd - RANGE_SIZE;
}
return rangeStart++;
}
}
}
5. Database Design
The database choice underpins the scalability of your entire system. The URL mapping table is a classic write-once, read-many workload — an almost perfect use case for NoSQL.
SQL vs. NoSQL Trade-off Analysis
| Criterion | SQL (PostgreSQL / MySQL) | NoSQL (Cassandra / DynamoDB) |
|---|---|---|
| Horizontal scaling | Hard — requires sharding middleware or Vitess | Native — add nodes, data redistributes automatically |
| Read throughput at scale | Good with read replicas; cap at ~100K RPS | Excellent; millions of RPS with linear scaling |
| Data model fit | Overkill — joins & ACID not needed for key-value lookup | Perfect — short_code is the partition key, O(1) lookup |
| Consistency model | Strong ACID — guarantees unique short codes natively | Eventual (tunable); use lightweight transactions for uniqueness |
Use SQL for user accounts, billing, and subscription data — where ACID guarantees and relational queries matter. Use NoSQL for the URL mapping table — a pure key-value workload: 18 TB/year, billions of rows, read-heavy.
Cassandra Schema
-- CQL Schema for URL mapping table
CREATE KEYSPACE url_shortener
WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3}
AND durable_writes = true;
CREATE TABLE url_shortener.url_mappings (
    short_code TEXT PRIMARY KEY,  -- partition key = O(1) lookup
    long_url   TEXT,
    user_id    UUID,
    created_at TIMESTAMP,
    expires_at TIMESTAMP,
    is_active  BOOLEAN
);
-- Cassandra forbids mixing counter columns with regular columns,
-- so the approximate click counter lives in its own table
CREATE TABLE url_shortener.url_click_counts (
    short_code  TEXT PRIMARY KEY,
    click_count COUNTER           -- approximate; exact counts in analytics DB
);
CREATE TABLE url_shortener.user_urls (
user_id UUID,
created_at TIMESTAMP,
short_code TEXT,
long_url TEXT,
PRIMARY KEY (user_id, created_at) -- partition by user, cluster by time
) WITH CLUSTERING ORDER BY (created_at DESC);
-- Index for deduplication: same long URL by the same user → return existing short code
CREATE INDEX ON url_shortener.url_mappings (long_url); -- use sparingly!
Why separate tables? The url_mappings table is optimized for the redirect path (lookup by short_code). The user_urls table is optimized for the dashboard query (list all links by user_id, paginated by time). This is Cassandra's denormalization pattern — store data in the shape of your queries.
6. Caching Strategy
Caching is the most impactful optimization for the redirect path. URL access follows a Zipfian distribution — the top 20% of URLs account for 80%+ of redirects. A well-tuned Redis cache can serve 95%+ of redirects without a database hit.
Cache Design Decisions
- Cache aside (lazy loading): On cache miss, redirect service fetches from Cassandra and populates Redis. Simple, no stale data on writes.
- TTL: 24–48 hours for regular URLs; shorter (1 hour) for expiring links. Never cache deleted/deactivated URLs — use a tombstone entry with TTL=5 minutes.
- Cache capacity: 20% of daily URLs fit in cache covers 80% of traffic. At 500 bytes/entry × 20M entries = ~10 GB RAM — very manageable for Redis.
- Eviction policy: allkeys-lru — evict the least recently used key when memory is full. This automatically keeps hot URLs in cache.
- Redis cluster: Shard across 6 nodes (3 primary + 3 replica) for HA and throughput; keys are distributed across the cluster's 16,384 hash slots.
// ✅ Good - Cache-aside pattern with Spring Boot + Redis for URL redirect lookup
@Service
public class UrlRedirectService {
private static final String CACHE_KEY_PREFIX = "url:short:";
private static final Duration CACHE_TTL = Duration.ofHours(24);
private static final String TOMBSTONE = "__DELETED__";
private final StringRedisTemplate redisTemplate;
private final UrlMappingRepository cassandraRepo;
public UrlRedirectService(StringRedisTemplate redisTemplate,
UrlMappingRepository cassandraRepo) {
this.redisTemplate = redisTemplate;
this.cassandraRepo = cassandraRepo;
}
// ✅ Cache-aside: check cache first, fall back to Cassandra, then populate cache
public String resolveLongUrl(String shortCode) {
String cacheKey = CACHE_KEY_PREFIX + shortCode;
// 1. Check Redis cache
String cached = redisTemplate.opsForValue().get(cacheKey);
if (cached != null) {
if (TOMBSTONE.equals(cached)) {
// ✅ Tombstone entry means URL was deleted — avoid DB hit
throw new UrlNotFoundException(shortCode);
}
return cached;
}
// 2. Cache miss — query Cassandra
UrlMapping mapping = cassandraRepo.findByShortCode(shortCode)
.orElseThrow(() -> {
// ✅ Write tombstone to prevent cache stampede on non-existent codes
redisTemplate.opsForValue().set(cacheKey, TOMBSTONE,
Duration.ofMinutes(5));
return new UrlNotFoundException(shortCode);
});
// 3. Validate: check expiry and active flag
if (!mapping.isActive() || isExpired(mapping)) {
redisTemplate.opsForValue().set(cacheKey, TOMBSTONE,
Duration.ofMinutes(5));
throw new UrlExpiredException(shortCode);
}
// 4. Populate cache and return
redisTemplate.opsForValue().set(cacheKey, mapping.getLongUrl(), CACHE_TTL);
return mapping.getLongUrl();
}
// ✅ Invalidate cache on URL deletion or deactivation
public void invalidate(String shortCode) {
String cacheKey = CACHE_KEY_PREFIX + shortCode;
redisTemplate.opsForValue().set(cacheKey, TOMBSTONE, Duration.ofMinutes(5));
}
private boolean isExpired(UrlMapping mapping) {
return mapping.getExpiresAt() != null
&& mapping.getExpiresAt().isBefore(Instant.now());
}
}
7. Custom Aliases & Rate Limiting
Custom Aliases
Custom aliases (e.g., short.ly/my-product-launch) are user-specified short codes. They require additional validation: alias format check, reserved word filtering, uniqueness check, and user tier enforcement (free users get 5 custom aliases; premium get unlimited).
- Validate the alias against the regex ^[a-zA-Z0-9_-]{3,30}$ — no path separators or reserved URI characters
- Block reserved words: admin, api, login, health, metrics, etc.
- Check Cassandra for uniqueness with a conditional write (INSERT ... IF NOT EXISTS, a Cassandra lightweight transaction)
- Return HTTP 409 Conflict if the alias is already taken
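The format and reserved-word checks might be sketched as a small validator (the class name and the exact reserved-word set are illustrative):

```java
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical alias validator covering the first two checks above;
// the uniqueness check still happens in the database via a conditional write
public class AliasValidator {
    private static final Pattern ALIAS_PATTERN =
        Pattern.compile("^[a-zA-Z0-9_-]{3,30}$");

    // Words that would shadow real application routes
    private static final Set<String> RESERVED =
        Set.of("admin", "api", "login", "health", "metrics");

    public static boolean isValid(String alias) {
        if (alias == null || !ALIAS_PATTERN.matcher(alias).matches()) {
            return false; // bad format: wrong length or illegal characters
        }
        return !RESERVED.contains(alias.toLowerCase()); // reject reserved words
    }
}
```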
Rate Limiting
Rate limiting protects the write path from abuse (scraping, competitor bulk creation, DoS). Use a Token Bucket or Sliding Window Counter per user/IP, backed by Redis for distributed enforcement.
// ✅ Good - Sliding window rate limiter using Redis sorted set
// Allows N requests per window, tracked per user ID
@Component
public class SlidingWindowRateLimiter {
private final StringRedisTemplate redisTemplate;
// ✅ Configuration: max 100 URL creations per hour per user
private static final long WINDOW_MILLIS = 3_600_000L; // 1 hour
private static final long MAX_REQUESTS = 100L;
public SlidingWindowRateLimiter(StringRedisTemplate redisTemplate) {
this.redisTemplate = redisTemplate;
}
/**
* ✅ Returns true if the request is allowed; false if rate limit exceeded.
* Uses Redis ZSET: score = timestamp, member = unique request ID
*/
public boolean isAllowed(String userId) {
String key = "ratelimit:create:" + userId;
long now = System.currentTimeMillis();
long windowStart = now - WINDOW_MILLIS;
// ✅ Execute as a Lua script for atomicity — avoids TOCTOU race conditions
String luaScript =
"local key = KEYS[1] " +
"local now = tonumber(ARGV[1]) " +
"local window = tonumber(ARGV[2]) " +
"local max = tonumber(ARGV[3]) " +
"local id = ARGV[4] " +
// Remove timestamps outside the window
"redis.call('ZREMRANGEBYSCORE', key, '-inf', now - window) " +
// Count requests inside the window
"local count = redis.call('ZCARD', key) " +
"if count < max then " +
" redis.call('ZADD', key, now, id) " +
" redis.call('EXPIRE', key, math.ceil(window / 1000)) " +
" return 1 " +
"else " +
" return 0 " +
"end";
Long result = redisTemplate.execute(
new DefaultRedisScript<>(luaScript, Long.class),
List.of(key),
String.valueOf(now),
String.valueOf(WINDOW_MILLIS),
String.valueOf(MAX_REQUESTS),
userId + ":" + now // unique member per request
);
return Long.valueOf(1L).equals(result);
}
}
8. Analytics & Tracking
Analytics is a key differentiator for paid tiers (Bitly Enterprise charges for advanced analytics). Every click generates a rich event: timestamp, referrer URL, IP → geolocation, User-Agent → device/browser/OS, and country/region.
Architecture: Never Block the Redirect
The cardinal rule of analytics in a URL shortener: the redirect must never wait for analytics to complete. Analytics writes must be 100% asynchronous. The flow:
- The Redirect Service resolves the short code, emits a click event to the Kafka topic url.click.events, then immediately returns the HTTP 302 redirect. Total added latency: <1ms for the fire-and-forget Kafka produce call.
- A separate Analytics Consumer (Kafka consumer group) processes click events in micro-batches, enriches them with MaxMind GeoIP and User-Agent parsing, then writes to ClickHouse (or TimescaleDB) for fast analytical queries.
- An Aggregation Job (Spark Streaming or Flink) pre-aggregates hourly counts, top referrers, and geo heatmaps into a reporting table read by the user dashboard.
// ✅ Clean API — Redirect Controller with async analytics
@RestController
@RequestMapping("/")
public class RedirectController {
private final UrlRedirectService redirectService;
private final KafkaTemplate<String, ClickEvent> kafkaTemplate;
private static final String CLICK_TOPIC = "url.click.events";
public RedirectController(UrlRedirectService redirectService,
KafkaTemplate<String, ClickEvent> kafkaTemplate) {
this.redirectService = redirectService;
this.kafkaTemplate = kafkaTemplate;
}
@GetMapping("/{shortCode}")
public ResponseEntity<Void> redirect(
@PathVariable String shortCode,
HttpServletRequest request) {
// 1. Resolve URL (cache → DB)
String longUrl = redirectService.resolveLongUrl(shortCode);
// 2. Fire-and-forget analytics event — non-blocking
ClickEvent event = ClickEvent.builder()
.shortCode(shortCode)
.timestamp(Instant.now())
.ipAddress(extractClientIp(request))
.userAgent(request.getHeader("User-Agent"))
.referer(request.getHeader("Referer"))
.build();
kafkaTemplate.send(CLICK_TOPIC, shortCode, event); // async, returns Future
// 3. 302 redirect — browser will NOT cache this (unlike 301)
// ✅ Use 302 to ensure every click hits the server for analytics tracking
return ResponseEntity.status(HttpStatus.FOUND)
.header(HttpHeaders.LOCATION, longUrl)
.header("Cache-Control", "no-cache, no-store")
.build();
}
// ✅ Extract real client IP — handle X-Forwarded-For from CDN/load balancer
private String extractClientIp(HttpServletRequest request) {
String xff = request.getHeader("X-Forwarded-For");
if (xff != null && !xff.isBlank()) {
return xff.split(",")[0].trim(); // first IP in chain is the client
}
return request.getRemoteAddr();
}
}
// ✅ Clean API — URL creation endpoint
@RestController
@RequestMapping("/api/v1/urls")
public class UrlCreateController {
private final UrlCreateService urlCreateService;
private final SlidingWindowRateLimiter rateLimiter;
@PostMapping
public ResponseEntity<UrlCreateResponse> createShortUrl(
@Valid @RequestBody UrlCreateRequest request,
@AuthenticationPrincipal UserDetails user) {
if (!rateLimiter.isAllowed(user.getUsername())) {
return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS).build();
}
UrlCreateResponse response = urlCreateService.create(
request.getLongUrl(),
request.getCustomAlias(), // nullable
request.getExpiresAt(), // nullable
user.getUsername()
);
return ResponseEntity.status(HttpStatus.CREATED).body(response);
}
}
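The Analytics Consumer itself is not shown above; as a rough sketch, its micro-batch aggregation step might look like this (the event shape is simplified, and the Kafka polling and ClickHouse writes are omitted):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical micro-batch aggregation for the analytics consumer.
// Real code would poll a Kafka consumer and persist to ClickHouse;
// only the pure aggregation logic is shown here.
public class ClickAggregator {
    // Simplified stand-in for the enriched click event
    public record Click(String shortCode, String country) {}

    // Count clicks per short code within one micro-batch
    public static Map<String, Long> countByShortCode(List<Click> batch) {
        Map<String, Long> counts = new HashMap<>();
        for (Click c : batch) {
            counts.merge(c.shortCode(), 1L, Long::sum);
        }
        return counts;
    }
}
```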
9. Scalability Deep Dive
Horizontal Scaling the Application Tier
Both the write service and redirect service are stateless — they hold no local session state. This makes horizontal scaling trivial: add more instances behind a load balancer. The only shared state is in Redis and Cassandra. Target 10–20 redirect service pods per region, autoscaled on CPU and RPS.
Consistent Hashing for Redis Cluster
When adding Redis nodes to handle traffic growth, consistent hashing minimizes cache invalidation. Standard modular sharding (hash % N) invalidates ~100% of keys when N changes. Consistent hashing invalidates only ~1/N keys — critical for a cache serving 100M+ items.
Note that Redis Cluster does not use a classic consistent-hash ring: it partitions the key space into 16,384 fixed hash slots that can be migrated between nodes, which achieves the same goal of minimal key movement when the cluster resizes. Spring Data Redis with Lettuce or Jedis handles cluster-aware routing transparently. Ring-based systems such as Cassandra use virtual nodes (vnodes) to keep key distribution even when nodes have different capacities.
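For intuition, the ring-with-virtual-nodes technique can be sketched as follows (illustrative only; Redis Cluster's slot mechanism differs in detail):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.SortedMap;
import java.util.TreeMap;

// Minimal consistent-hash ring with virtual nodes.
// Each physical node owns `vnodes` points on the ring; a key maps to
// the first vnode clockwise from its own hash position.
public class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int vnodes;

    public ConsistentHashRing(int vnodes) { this.vnodes = vnodes; }

    public void addNode(String node) {
        for (int i = 0; i < vnodes; i++) {
            ring.put(hash(node + "#" + i), node); // place virtual points
        }
    }

    // Walk clockwise to the first vnode at or after the key's hash
    public String nodeFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue()
                              : tail.get(tail.firstKey());
    }

    // First 32 bits of an MD5 digest as an unsigned ring position
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[0] & 0xFF) << 24) | ((d[1] & 0xFF) << 16)
                 | ((d[2] & 0xFF) << 8) | (d[3] & 0xFF);
        } catch (Exception e) { throw new IllegalStateException(e); }
    }
}
```

Adding a node only claims the ring segments adjacent to its vnodes, so roughly 1/N of keys move — the property the text describes.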
CDN for Global Edge Caching
For globally distributed systems, serve redirects from CDN edge nodes (Cloudflare Workers, AWS CloudFront with Lambda@Edge). A user in Tokyo hitting a Cloudflare edge near Tokyo gets a redirect in <5ms instead of <100ms to a US origin. CDN caches the 302 response with a short TTL (e.g., 60 seconds) — balancing freshness with latency.
Important: CDN edge caching of redirects means analytics events are NOT generated at the origin server. Handle this by having the CDN forward a click beacon (async HTTP request) to your analytics ingestion endpoint, or accept that CDN-cached redirects will under-count clicks.
Multi-Region Deployment
- Active-active: Write service runs in all regions; Cassandra uses multi-datacenter replication with LOCAL_QUORUM for writes. Consistency sacrifice: ~100ms replication lag between regions is acceptable for URL creation.
- ID generation: Each region is assigned a unique prefix bit pattern in the counter to prevent cross-region ID collisions without coordination. (Similar to Twitter Snowflake: region bits in the ID.)
- DNS routing: Route 53 or Cloudflare Geo-Routing sends users to the nearest region. Failover is automatic if health checks fail.
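The region-bit idea can be sketched as a Snowflake-style bit layout; the widths below (4 region bits, 12 sequence bits) are illustrative assumptions, not a production layout:

```java
// Hypothetical Snowflake-style ID: [ timestamp | region | sequence ].
// Region bits guarantee cross-region uniqueness with no coordination.
public class RegionAwareId {
    static final int REGION_BITS = 4;    // up to 16 regions
    static final int SEQUENCE_BITS = 12; // 4,096 IDs per ms per region

    public static long compose(long timestampMillis, long region, long sequence) {
        return (timestampMillis << (REGION_BITS + SEQUENCE_BITS))
             | (region << SEQUENCE_BITS)
             | sequence;
    }

    // Extract the region bits back out of an ID
    public static long region(long id) {
        return (id >> SEQUENCE_BITS) & ((1L << REGION_BITS) - 1);
    }

    // Extract the per-millisecond sequence number
    public static long sequence(long id) {
        return id & ((1L << SEQUENCE_BITS) - 1);
    }
}
```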
Handling Thundering Herd / Cache Stampede
When a viral URL's cache entry expires simultaneously, thousands of requests hit Cassandra at once — the classic cache stampede. Mitigations:
- Mutex lock: The first miss acquires a Redis distributed lock (SET key value NX EX ttl, supported by Jedis/Lettuce); other misses wait for the first request to populate the cache.
- Probabilistic early expiry (PER): Refresh the cache entry slightly before it expires, with a small probability that increases as the TTL approaches zero. This prevents synchronized expiry.
- Background refresh: A background thread refreshes cache entries that are approaching TTL expiry, keeping them warm before the stampede window.
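The probabilistic early expiry idea can be sketched with an XFetch-style formula: refresh when a random gap, scaled by the estimated recompute cost and a tunable beta, covers the remaining TTL. The parameter values here are assumptions:

```java
import java.util.Random;

// Probabilistic early refresh: the refresh probability rises smoothly as
// the TTL approaches zero, de-synchronizing refreshes across requests.
public class EarlyRefresh {
    /**
     * @param ttlRemainingMs time until the cache entry expires
     * @param recomputeMs    estimated cost of recomputing the entry
     * @param beta           tuning knob; >1 refreshes more eagerly
     */
    public static boolean shouldRefresh(long ttlRemainingMs, long recomputeMs,
                                        double beta, Random rng) {
        // -log(U) is exponentially distributed; scale it by cost and beta
        double gap = -recomputeMs * beta * Math.log(rng.nextDouble());
        return gap >= ttlRemainingMs; // refresh early when the gap covers the TTL
    }
}
```

With plenty of TTL left the gap almost never reaches it, so refreshes are rare; as expiry nears, refreshes become near-certain, and different requests trigger at different random moments instead of all at once.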
10. Common Mistakes & Interview Tips
Mistakes Candidates Commonly Make
Mistake 1: Using MD5 for Hash Generation
MD5 is not an ID generator — it's a fingerprint. Truncating the 32-char hex digest to 7 characters leaves only 16^7 ≈ 268 million possible codes, so at 1 billion URLs collisions are guaranteed by the pigeonhole principle — and the birthday bound makes them frequent long before that. Always clarify in the interview that you'll use counter-based Base62 encoding instead.
Mistake 2: Using 301 Instead of 302
301 (Moved Permanently) causes browsers to cache the redirect locally and go directly to the destination on future visits — bypassing your server entirely. This eliminates all analytics data for repeat visitors. The correct choice for a tracked URL shortener is 302 (Found), which browsers treat as temporary and do not cache by default.
Mistake 3: Blocking the Redirect for Analytics
Writing analytics data synchronously (calling the analytics DB inside the redirect handler) adds 5–50ms to every single redirect. At 100M redirects/day, even 10ms of extra latency is catastrophic. Always use an async message queue (Kafka, SQS) and decouple analytics writes from the hot path.
Mistake 4: Not Discussing the ID Generator Bottleneck
A single Redis counter for ID generation is a single point of failure and a bottleneck under high write throughput. Always mention range pre-allocation (each write node claims a batch of 1,000 IDs) and Zookeeper-based coordination for true distributed ID generation (Twitter Snowflake pattern).
Mistake 5: Ignoring URL Validation and Security
A URL shortener without validation becomes a phishing vector. Production systems must: validate URL format, check against malware/phishing URL blocklists (Google Safe Browsing API), block javascript: and data: scheme URLs, and show a preview/warning page for suspicious destinations.
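A minimal scheme allowlist for the validation step might look like this (illustrative; a production system would additionally consult a blocklist service such as the Google Safe Browsing API):

```java
import java.net.URI;
import java.util.Set;

// Reject any URL whose scheme is not plain HTTP(S) — this blocks
// javascript:, data:, file:, and other dangerous schemes in one check
public class UrlValidator {
    private static final Set<String> ALLOWED_SCHEMES = Set.of("http", "https");

    public static boolean isSafeToShorten(String url) {
        try {
            String scheme = URI.create(url.trim()).getScheme();
            return scheme != null
                && ALLOWED_SCHEMES.contains(scheme.toLowerCase());
        } catch (IllegalArgumentException e) {
            return false; // malformed URL — refuse to shorten
        }
    }
}
```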
Interview Bonus Points
- Mention Bloom filter as a pre-check before DB lookups for non-existent short codes (reduces unnecessary DB reads for 404 abuse)
- Discuss database TTL / scheduled cleanup for expired URLs to reclaim storage
- Propose short URL recycling for expired codes after a grace period
- Address abuse prevention: CAPTCHA for anonymous creation, honeypot fields, anomaly detection on high-volume creators
- Mention vanity metrics: interviewers love it when you ask "should the system support the same long URL creating multiple distinct short codes or deduplicate?"
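The Bloom filter pre-check from the first bullet can be sketched with a toy implementation (a real system would use a library implementation, e.g. Guava's BloomFilter, sized from expected insertions and a target false-positive rate):

```java
import java.util.BitSet;

// Toy Bloom filter for the "does this short code possibly exist?" pre-check.
// A negative answer is definitive (skip the DB); a positive answer means
// "maybe present" and still requires the real lookup.
public class ShortCodeBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    public ShortCodeBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    public void add(String code) {
        for (int i = 0; i < hashes; i++) bits.set(index(code, i));
    }

    public boolean mightContain(String code) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(code, i))) return false; // definitely absent
        }
        return true; // possibly present
    }

    // Derive k bit positions by perturbing the string hash per round
    private int index(String code, int i) {
        int h = code.hashCode() * 31 + i * 0x9E3779B9;
        return Math.floorMod(h, size);
    }
}
```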
11. Conclusion & Key Takeaways
A production-grade URL shortener at Bitly/TinyURL scale is a masterclass in distributed systems trade-offs. The system is deceptively simple to describe but requires careful engineering at every layer:
Key Takeaways Checklist
- ✅ Hash generation: Counter + Base62 encoding — zero collisions, O(1) generation
- ✅ Storage: NoSQL (Cassandra) for URL mappings; SQL only for user accounts
- ✅ Caching: Redis with LRU eviction and tombstone entries; target 95%+ cache hit rate
- ✅ Redirect: HTTP 302 (not 301) to enable per-click analytics tracking
- ✅ Analytics: Async Kafka pipeline — never block the redirect on analytics writes
- ✅ Rate limiting: Sliding window counter per user in Redis; Lua scripts for atomicity
- ✅ Scalability: Stateless app tier + Redis cluster + CDN edge caching for global distribution
- ✅ Security: URL validation, Safe Browsing API check, reserved alias blocklist
Whether you're taking this to a system design interview or building a real product, the patterns here — cache-aside, async analytics, range-based ID generation, CDN edge caching — apply far beyond URL shorteners. They are the building blocks of any high-throughput, low-latency read-heavy distributed service.
FAQs: URL Shortener System Design
Q: How many characters should a short URL code be?
7 characters using Base62 (a-z, A-Z, 0-9) gives 62^7 ≈ 3.5 trillion unique codes — more than enough for billions of URLs with headroom. TinyURL uses 8 characters; Bitly uses 7. 6 characters yields 62^6 ≈ 56 billion codes, which is still sufficient for most services.
Q: What database should I use for a URL shortener?
NoSQL (Cassandra, DynamoDB) is preferred for the URL mapping table because read throughput far exceeds write throughput (10:1), the data model is a simple key-value lookup, and horizontal scaling is straightforward. Use a relational database only if you need strong transactional guarantees for user accounts or billing.
Q: How do you handle hash collisions in a URL shortener?
First, check the database before inserting — if the short code already exists for a different long URL, append a counter suffix or re-hash with a salt and retry. Using a Bloom filter as a pre-check avoids unnecessary database round-trips. A counter-based pre-allocation strategy (range-based ID generation) eliminates collisions entirely.
Q: How do you scale a URL shortener to handle 100M URLs per day?
Use horizontal scaling at the application tier behind a load balancer, a distributed cache (Redis cluster) for hot URLs, a NoSQL database with sharding for storage, CDN edge nodes for the redirect path, and an async event pipeline (Kafka) for click analytics. The redirect path must be ultra-low latency — target under 10ms p99.
Q: What is the difference between 301 and 302 redirects in a URL shortener?
301 (Moved Permanently) instructs browsers to cache the redirect and go directly to the destination on future visits — this reduces server load but prevents analytics tracking of repeat visits. 302 (Found) forces the browser to hit your server every time, enabling accurate click counting. Services like Bitly use 302 to track every click for analytics.