Skip to main content

Twitter System Design

Twitter is a social media platform that allows users to post and interact with messages known as "tweets". Let's analyze its system design.

System Requirements

Functional Requirements

  • Users can post tweets (140-280 characters)
  • Users can follow other users
  • Users can view their timeline
  • Users can like and retweet posts
  • Real-time notifications

Non-Functional Requirements

  • High availability
  • Low latency
  • Scalability for millions of users
  • Data consistency
  • Real-time updates

Architecture

High-Level Design

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Web/App │ │ API Layer │ │ Services │
│ Clients │────▶│ (REST/WS) │────▶│ Layer │
└─────────────┘ └─────────────┘ └─────────────┘
│ │
▼ ▼
┌─────────────┐ ┌─────────────┐
│ Caching │ │ Database │
│ Layer │ │ Layer │
└─────────────┘ └─────────────┘

Components

  1. Load Balancers

    • Distribute traffic across API servers
    • Handle SSL termination
    • Implement health checks
  2. API Servers

    • Handle user requests
    • Implement rate limiting
    • Manage authentication
  3. Services

    • Tweet Service
    • User Service
    • Timeline Service
    • Notification Service
  4. Caching

    • Redis for timeline caching
    • Memcached for user data
    • CDN for media content
  5. Database

    • MySQL for user data
    • Cassandra for tweets
    • Redis for timelines
    • MongoDB for analytics

Data Models

User

{
"user_id": "123",
"username": "john_doe",
"name": "John Doe",
"email": "[email protected]",
"followers": ["456", "789"],
"following": ["789"],
"created_at": "2023-01-01T00:00:00Z"
}

Tweet

{
"tweet_id": "abc123",
"user_id": "123",
"content": "Hello, Twitter!",
"created_at": "2023-01-01T00:00:00Z",
"likes": 100,
"retweets": 50,
"replies": 25
}

Key Features

Timeline Generation

  1. Pull Model

    • User requests timeline
    • System fetches tweets from followed users
    • Merge and sort tweets
    • Return to user
  2. Push Model

    • Pre-compute timelines
    • Store in cache
    • Update on new tweets
    • Serve from cache

Real-time Updates

  • WebSocket connections
  • Fan-out on write
  • Event-driven architecture
  • Message queues

Scaling Strategies

Database Scaling

  • Sharding by user_id
  • Read replicas
  • Caching layer
  • Data partitioning

Caching Strategy

  • Multi-level caching
  • Cache invalidation
  • Write-through cache
  • Cache warming

Challenges

  1. Timeline Generation

    • Handling celebrity users
    • Real-time updates
    • Consistency vs. Performance
  2. Data Consistency

    • Eventual consistency
    • Conflict resolution
    • Data synchronization
  3. Performance

    • Latency optimization
    • Resource utilization
    • Cost management

Further Reading