
Design Twitter (X): A System Design Interview Guide


Designing a system like Twitter (now X) is a classic system design interview question because it touches on almost every aspect of distributed systems: heavy read/write loads, eventual consistency, complex data modeling, and the famous "fan-out" problem.

In this guide, we will architect a platform capable of handling 500 million daily active users (DAU) and on the order of 100 million tweets per day.

1. Requirements & Estimation

Before writing code or drawing boxes, we must define the scale.

Functional Requirements

  1. Post Tweet: Users can post short text/media messages.
  2. Home Timeline: Users can view a feed of tweets from people they follow.
  3. Follow/Unfollow: Users can follow others.
  4. Search: Users can search tweets by keywords (out of scope for this article; we focus on the feed).

Capacity Estimation (Back-of-the-Envelope)

  • DAU: 500 Million.
  • Reads: Each user visits their timeline 5 times/day -> 2.5 Billion reads/day (~29k QPS).
  • Writes: Users post 100 Million tweets/day (~1.2k QPS).
  • Ratio: Roughly 25 reads for every write, so the system is heavily read-dominated.

Conclusion: We need a system optimized for fast reads.
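These numbers are easy to sanity-check; a few lines of Python reproduce them from the stated assumptions:

```python
# Back-of-the-envelope check; the constants are the assumptions listed above.
SECONDS_PER_DAY = 86_400

dau = 500_000_000             # daily active users
timeline_views_per_user = 5   # assumed visits per day
tweets_per_day = 100_000_000

reads_per_day = dau * timeline_views_per_user
read_qps = reads_per_day / SECONDS_PER_DAY
write_qps = tweets_per_day / SECONDS_PER_DAY

print(f"Reads/day:  {reads_per_day:,} (~{read_qps:,.0f} QPS)")   # 2,500,000,000 (~28,935 QPS)
print(f"Writes/day: {tweets_per_day:,} (~{write_qps:,.0f} QPS)")  # 100,000,000 (~1,157 QPS)
print(f"Read:write ratio ~ {read_qps / write_qps:.0f}:1")         # ~25:1
```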


2. High-Level Architecture

We will use a Microservices architecture to separate concerns.

Key Components

  • User Service: Profile management, authentication.
  • Tweet Service: Storage and retrieval of tweet content.
  • Social Graph Service: Tracks who follows who.
  • Timeline Service: The most complex piece. Generates and retrieves news feeds.

3. Data Modeling

Tweets & Scale (The "Snowflake" ID)

We cannot rely on a single database's auto-incrementing ID; we need a globally unique, sortable ID generator. Twitter uses Snowflake (a minimal generator is sketched after this list):

  • 64-bit integer.
  • Sorting by ID roughly equates to sorting by time.
  • Distributed generation without coordination.
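Below is a minimal, illustrative Snowflake-style generator. The bit layout (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence) follows Twitter's published scheme; the class and names here are our own:

```python
import threading
import time

# EPOCH_MS is Twitter's published custom epoch (Nov 2010).
EPOCH_MS = 1288834974657

class Snowflake:
    def __init__(self, worker_id: int):
        assert 0 <= worker_id < 1024          # must fit in 10 bits
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now_ms = int(time.time() * 1000)
            if now_ms == self.last_ms:
                # Same millisecond: bump the 12-bit sequence (wraps at 4096).
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:
                    while now_ms <= self.last_ms:  # sequence exhausted; wait
                        now_ms = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now_ms
            # timestamp | worker | sequence => time-sortable 64-bit integer
            return ((now_ms - EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence

print(Snowflake(worker_id=7).next_id())
```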

Database Choice

  • User Data: MySQL/PostgreSQL. Account data is relational and benefits from ACID guarantees (ACID matters less for tweets, but is critical for user accounts).
  • Tweet Data: Cassandra or DynamoDB. Why? Massive write throughput, simple key-value structure (TweetID -> Data), and linear horizontal scalability.
  • Social Graph: Graph database (Neo4j) or a key-value store optimized for adjacency lists (a Redis-based sketch follows).
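To illustrate the adjacency-list option, follow/unfollow can be modeled with two Redis sets per user. The key names (following:{id}, followers:{id}) are our own convention, not a prescribed schema:

```python
import redis  # pip install redis; assumes a local Redis instance

r = redis.Redis()

def follow(follower_id: int, followee_id: int) -> None:
    # Two adjacency lists per user: who I follow, and who follows me.
    r.sadd(f"following:{follower_id}", followee_id)
    r.sadd(f"followers:{followee_id}", follower_id)

def unfollow(follower_id: int, followee_id: int) -> None:
    r.srem(f"following:{follower_id}", followee_id)
    r.srem(f"followers:{followee_id}", follower_id)

follow(1, 42)
print(r.smembers("following:1"))  # {b'42'}
```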

4. The Core Challenge: Timeline Generation

How do we efficiently show a user a feed of tweets from the hundreds of people they follow? There are three main approaches.

Approach A: Pull Model (Fan-out on Read)

When User A checks their feed (the read path is sketched after the pros/cons below):

  1. Fetch IDs of everyone User A follows.
  2. Fetch recent tweets for all those IDs (e.g., SELECT * FROM tweets WHERE user_id IN (...)).
  3. Merge and sort in memory.
  • Pros: Simple implementation; timelines are always fresh (near real-time).
  • Cons: High latency. If a user follows 2,000 people, the fan-in query is heavy, and that cost is paid on every read. It doesn't scale for a read-heavy system.
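Here is a minimal in-memory sketch of fan-out on read. The dictionaries stand in for the Social Graph and Tweet services, and the data is fabricated for illustration:

```python
import heapq

# Tweet lists are newest-first; IDs are time-sortable (Snowflake).
TWEETS = {
    42: [(1005, "hello"), (1001, "first!")],
    43: [(1004, "lunch"), (1002, "morning")],
}
FOLLOWING = {1: [42, 43]}

def pull_timeline(user_id: int, limit: int = 3) -> list:
    # Fetch each followee's recent tweets, then k-way merge by descending
    # ID; because Snowflake IDs sort by time, this yields newest-first.
    per_author = [TWEETS.get(f, []) for f in FOLLOWING.get(user_id, [])]
    merged = heapq.merge(*per_author, key=lambda t: t[0], reverse=True)
    return list(merged)[:limit]

print(pull_timeline(1))  # [(1005, 'hello'), (1004, 'lunch'), (1002, 'morning')]
```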

Approach B: Push Model (Fan-out on Write)

When User B posts a tweet (the write path is sketched after the pros/cons below):

  1. Find all followers of User B.
  2. Insert the tweet ID into a cached timeline list (e.g., Redis List) for each follower.
  • Pros: Reads are a single cache lookup (O(1)). The timeline is pre-computed.
  • Cons: Write Amplification. If Justin Bieber (100M+ followers) tweets, the system must perform 100 million cache writes for a single tweet. This is the "celebrity" (hot-key fan-out) problem.
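A sketch of the write path, reusing the followers:{id} sets from Section 3 and the timeline:{user_id} lists described in Section 6. In production, this loop would run on asynchronous workers behind a queue, not inline with the request:

```python
import redis

r = redis.Redis()
MAX_TIMELINE = 800  # cap per-user cached timeline (see Section 6)

def fan_out_on_write(author_id: int, tweet_id: int) -> None:
    # Push the new tweet ID onto every follower's cached timeline.
    follower_ids = r.smembers(f"followers:{author_id}")
    pipe = r.pipeline()
    for follower in follower_ids:
        key = f"timeline:{int(follower)}"
        pipe.lpush(key, tweet_id)
        pipe.ltrim(key, 0, MAX_TIMELINE - 1)  # drop entries beyond the cap
    pipe.execute()
```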

Approach C: The Hybrid Strategy (Winner)

We combine the two approaches; a merged read path is sketched after this list.

  • Normal Users: Use Push Model. Their tweet is pushed to their few hundred followers' caches immediately.
  • Celebrities (VIPs): Use Pull Model. Justin Bieber's tweets are not pushed. Instead, when a user views their timeline, they pull Bieber's tweets separately and merge them.
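A sketch of the hybrid read path. The CELEBRITIES set and the get_recent_tweet_ids stub are placeholders for a real follower-count threshold and a Tweet Service call:

```python
import heapq
import redis

r = redis.Redis()
CELEBRITIES = {999}  # placeholder for a follower-count threshold check

def get_recent_tweet_ids(author_id: int) -> list[int]:
    return []  # stub: in production, query the Tweet Service / Cassandra

def read_hybrid_timeline(user_id: int, limit: int = 20) -> list[int]:
    # 1. Pre-computed part: IDs already pushed into this user's cache.
    cached = [int(t) for t in r.lrange(f"timeline:{user_id}", 0, limit - 1)]
    # 2. Pulled part: recent tweets from any followed celebrities.
    followed = {int(f) for f in r.smembers(f"following:{user_id}")}
    pulled = [tid for celeb in followed & CELEBRITIES
              for tid in get_recent_tweet_ids(celeb)]
    # 3. Merge: Snowflake IDs sort by time, so largest == newest.
    return heapq.nlargest(limit, cached + pulled)
```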

5. Storage Optimization: Sharding

We must shard our databases. Sharding by User ID vs Tweet ID?

  • Sharding by User ID: All tweets for a user live on one shard.
    • Pro: Fast to fetch all tweets for one user.
    • Con: "Hot Partition" problem (Celebrities).
  • Sharding by Tweet ID: Tweets are distributed randomly (or by Snowflake time).
    • Pro: Even distribution of load.
    • Con: Fetching timeline requires querying all shards (Scatter-Gather).

Industry Standard: Often a mix. Time-based sharding (which Snowflake IDs make possible) combined with a secondary index on user ID bounds how many shards a timeline query must touch.
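The contrast is easy to see in code. Both routing functions below are illustrative sketches; the second exploits the fact that a Snowflake ID carries its creation timestamp in its top bits:

```python
NUM_SHARDS = 16
EPOCH_MS = 1288834974657  # same custom epoch as the Snowflake sketch

def shard_by_user(user_id: int) -> int:
    # All of a user's tweets land on one shard: fast per-user reads,
    # but a celebrity turns their shard into a hot partition.
    return user_id % NUM_SHARDS

def shard_by_tweet_time(tweet_id: int) -> int:
    # Snowflake IDs carry a millisecond timestamp in the top 41 bits,
    # so we can route by time bucket (here, one bucket per week).
    ms = (tweet_id >> 22) + EPOCH_MS
    week = ms // (7 * 24 * 3600 * 1000)
    return week % NUM_SHARDS
```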


6. Caching & Reliability

Redis Strategy

The timeline should be stored as a Redis List or Sorted Set.

  • Key: timeline:{user_id}
  • Value: List of tweet_ids.
  • We only need to cache the most recent ~800 tweet IDs per user. Older tweets can be fetched from the DB on demand when a user scrolls far back (see the read-path sketch below).
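A sketch of the corresponding read path with the 800-entry cap; fetch_timeline_from_db is a hypothetical fallback, not a real API:

```python
import redis

r = redis.Redis()
MAX_CACHED = 800

def get_timeline_page(user_id: int, offset: int, page_size: int = 20) -> list[int]:
    # Serve from the hot cache while the request fits inside the cached window.
    if offset + page_size <= MAX_CACHED:
        ids = r.lrange(f"timeline:{user_id}", offset, offset + page_size - 1)
        if ids:
            return [int(t) for t in ids]
    # Deep scroll or cache miss: rebuild the page from the database.
    return fetch_timeline_from_db(user_id, offset, page_size)

def fetch_timeline_from_db(user_id: int, offset: int, page_size: int) -> list[int]:
    return []  # stub: query the Tweet / Social Graph services
```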

Replication

  • Primary-Replica: The primary accepts writes; replicas serve reads.
  • Quorum Writes (Cassandra): Ensure each write is acknowledged by at least 2 of 3 replicas (QUORUM at replication factor 3) for durability; a sketch follows.
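With the DataStax Python driver, a quorum write looks like this (a minimal sketch; the keyspace, table, and values are illustrative):

```python
# pip install cassandra-driver; keyspace and table names are illustrative.
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

session = Cluster(["127.0.0.1"]).connect("twitter")

insert = SimpleStatement(
    "INSERT INTO tweets (tweet_id, user_id, body) VALUES (%s, %s, %s)",
    consistency_level=ConsistencyLevel.QUORUM,  # 2 of 3 acks at RF=3
)
session.execute(insert, (1234567890123, 42, "hello world"))
```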

7. Summary

Designing Twitter requires handling conflicting constraints. The Hybrid Fan-out architecture allows us to serve 99% of users with pre-computed (O(1)) timelines while preventing system collapse when celebrities tweet.

By leveraging Redis for hot timelines, Cassandra for massive write ingestion, and a Snowflake ID generator for distributed sorting, we build a system that is both resilient and lightning fast.