The interview question sounded too simple. It wasn’t.
“Design a multiplayer quiz platform like Kahoot.”
Create quiz. Join room. Show questions. Calculate scores. Done.
That’s also exactly where the trap begins.
Because building a quiz platform for 50 users is genuinely straightforward. Building it for 10,000 users answering the same question at the same moment is an entirely different problem — and that’s what the interviewer is actually testing.
The obvious solution is to broadcast the question immediately.
The problem is that “immediately” means different things to different players.
Some receive the message in 50ms.
Others in 300ms.
A few in 800ms.
That difference sounds small until rankings depend on reaction time.

Start With Clarifying Requirements
Before drawing a single box, the first question to ask is: what does “real-time” actually mean here?
Functional Requirements:
- Host creates a quiz and controls progression
- Players join using a room code
- Questions appear in real time for all players
- Players submit answers within a time window
- Leaderboard updates after each question
Non-Functional Requirements:
- Support 10,000 concurrent players in a single room
- Question synchronization under 1 second across all clients
- High availability — a node going down shouldn’t end an active quiz
- Prevent duplicate answer submissions
The interesting requirement is hidden in the second list: every player should see Question 2 at almost exactly the same moment. That single constraint turns a quiz application into a distributed systems problem.
High-Level Architecture
                Host
                  |
                  v
      Quiz Service → Room Service
                  |
          WebSocket Gateway
                  |
      +-----------+-----------+
      |                       |
Players (10,000+)       Answer Service
                              |
                      Leaderboard Service
                              |
                            Redis
                              |
                           Database
At first glance this looks reasonable. But the real discussion starts the moment the host clicks “Next Question.”
The Question Synchronization Problem — The First Hidden Challenge
Imagine 10,000 users connected to the room. The host clicks “Question 2.”
A naive implementation loops through all connected players and sends the question to each one individually. This works for 50 users. It fails badly for 10,000 — because players at the end of the loop receive the question two or three seconds after players at the beginning. The quiz becomes unfair before it starts. One user effectively received extra thinking time because of where they happened to be in a server-side loop.
The better approach shifts responsibility to the clients:
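{
  "questionId": 2,
  "startAt": "2025-05-10T10:00:30.000Z"
}
Every client receives this event. Every client waits until startAt. The question renders simultaneously for all players regardless of when they received the event, even if network delivery varied by a few hundred milliseconds. This only holds if each client measures startAt against server time rather than its own wall clock, so the client also needs an estimate of its clock offset from the server.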
This is timer-based synchronization. It’s the same technique used in multiplayer games to keep clients in sync without requiring perfect network conditions. The server coordinates the moment; the clients enforce it locally.
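A minimal sketch of the client-side wait, assuming the client already has an estimate of its clock offset from the server. The names `seconds_until_start` and `clock_offset_s` are hypothetical, and how the offset is measured (for example with a ping/pong exchange) is outside this sketch.

```python
from datetime import datetime, timezone


def parse_start_at(start_at: str) -> datetime:
    """Parse an ISO-8601 UTC timestamp such as '2025-05-10T10:00:30.000Z'."""
    # Older Python versions reject a trailing 'Z', so normalize it to an offset.
    return datetime.fromisoformat(start_at.replace("Z", "+00:00"))


def seconds_until_start(start_at: str, local_now: datetime,
                        clock_offset_s: float = 0.0) -> float:
    """Return how long the client should wait before rendering the question.

    clock_offset_s is (server time - local time). It is an assumption of this
    sketch that the client estimates it separately.
    """
    server_now = local_now.timestamp() + clock_offset_s
    remaining = parse_start_at(start_at).timestamp() - server_now
    # If the event arrived after startAt, render immediately instead of waiting.
    return max(0.0, remaining)
```

A client that receives the event 200ms before startAt waits 0.2 seconds; one that receives it late gets 0.0 and renders at once, which is the case the server should try to avoid by choosing startAt far enough ahead.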
Why WebSockets Matter Here
Many candidates immediately reach for REST APIs. That creates a polling model:
Client → "Any new question?" → Server
Client → "Any new question?" → Server
Client → "Any new question?" → Server
At 10,000 concurrent players polling every second, you have 10,000 requests per second just to ask “has anything changed?” For most of the quiz, the server does all that work only to answer “no, nothing yet.”
WebSockets reverse the direction. The server pushes events to connected clients:
QUESTION_STARTED → all players
TIMER_UPDATE → all players
LEADERBOARD_UPDATE → all players
QUIZ_ENDED → all players
One persistent connection per player. Server-initiated events only when something actually changes. At 10,000 players, you have 10,000 open connections — but you’re only pushing data when the host advances the quiz, not responding to a flood of polling requests.
The architecture moves from request-response to event-driven, which is the right model for anything that needs to feel live.
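The difference in server work can be put in rough numbers. The sketch below is back-of-the-envelope arithmetic only; the quiz shape (10 questions, 30 seconds each, 4 pushed events per question) is an assumption for illustration and not a figure from the design above.

```python
def polling_requests(players: int, polls_per_second: float, duration_s: float) -> int:
    """Total requests the server answers when every client polls."""
    return int(players * polls_per_second * duration_s)


def push_messages(players: int, events: int) -> int:
    """Total messages the server sends when each event is pushed once per client."""
    return players * events


# Hypothetical quiz: 10 questions, 30 seconds each, 4 pushed events per question.
players = 10_000
duration_s = 10 * 30
print(polling_requests(players, 1.0, duration_s))  # 3000000
print(push_messages(players, 10 * 4))              # 400000
```

Under these assumptions polling costs several times more messages, and most of them carry no information. The gap narrows if the server pushes frequent events such as a per-second TIMER_UPDATE.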

The Leaderboard Problem — Where the Database Becomes the Bottleneck
10,000 users submit answers simultaneously. The window is typically five seconds. That’s potentially 2,000 answer submissions per second, all needing to update scores and recompute rankings.
The naive solution:
for each answer:
    UPDATE scores SET score = score + points WHERE player_id = ?
    SELECT player_id, score FROM scores ORDER BY score DESC LIMIT 10
The database becomes your bottleneck immediately. Every answer triggers a write and a ranking recalculation. Under 2,000 submissions per second, you’re looking at 2,000 writes and 2,000 sort operations per second on a table that’s growing as the quiz progresses.
Redis Sorted Sets solve this cleanly:
ZADD leaderboard:room123 <score> <playerId>
ZREVRANGE leaderboard:room123 0 9 WITHSCORES // top 10 instantly
ZREVRANK leaderboard:room123 <playerId> // player's rank instantly
Redis Sorted Sets maintain entries in score order automatically. Adding or updating a score is O(log n). Retrieving the top 10 is O(log n + 10). At 10,000 players, this is fast enough that you can push leaderboard updates to all players after every answer window closes without the computation becoming the constraint.
The database still stores canonical scores for persistence — but the real-time leaderboard lives in Redis. The leaderboard becomes an in-memory problem rather than a database problem.
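To make the sorted-set semantics concrete without needing a running Redis server, here is a small in-memory stand-in. `InMemoryLeaderboard` is a hypothetical class that models the behavior only, not the performance: it sorts on every read, whereas a real sorted set keeps order incrementally. `add_points` increments a score the way ZINCRBY does, rather than overwriting it the way ZADD does.

```python
from collections import defaultdict
from typing import List, Optional, Tuple


class InMemoryLeaderboard:
    """Tiny model of a sorted set: member -> score, ranked by score descending."""

    def __init__(self) -> None:
        self._scores = defaultdict(float)

    def add_points(self, player_id: str, points: float) -> float:
        # Increment rather than overwrite, like ZINCRBY.
        self._scores[player_id] += points
        return self._scores[player_id]

    def _ranked(self) -> List[Tuple[str, float]]:
        # Sorting on every read is O(n log n); fine for a model, not for production.
        # Breaking ties by player id is an arbitrary choice made for this sketch.
        return sorted(self._scores.items(), key=lambda item: (-item[1], item[0]))

    def top(self, n: int) -> List[Tuple[str, float]]:
        """Highest n scores, comparable to ZREVRANGE 0 n-1 WITHSCORES."""
        return self._ranked()[:n]

    def rank(self, player_id: str) -> Optional[int]:
        """Zero-based rank, or None for an unknown player, comparable to ZREVRANK."""
        for position, (member, _) in enumerate(self._ranked()):
            if member == player_id:
                return position
        return None
```

With points added for two players across two questions, `top(10)` returns the cumulative totals in descending order and `rank` gives each player's zero-based position.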
Preventing Duplicate Answers — An Idempotency Problem in Disguise
A natural follow-up question: what stops a player from submitting the same answer multiple times?
The solution sounds like a quiz feature but is actually a classic idempotency pattern:
Unique constraint on (roomId, questionId, playerId)
First write wins. All subsequent submissions are rejected.
The first answer is recorded and scored. Every later submission for the same player on the same question is a no-op. The server returns the same response whether it’s the first submission or the twentieth — from the player’s perspective, nothing changes. From the scoring perspective, nothing dangerous happened.
This is the same principle used in payment systems to prevent double charges, in order systems to prevent duplicate orders, and in inventory systems to prevent overselling. The context changes. The pattern is identical.
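A minimal sketch of first-write-wins, using SQLite from the Python standard library as a stand-in for whatever database the Answer Service would use. The table and column names are assumptions for illustration.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # The composite primary key is the uniqueness constraint on
    # (roomId, questionId, playerId).
    conn.execute(
        """
        CREATE TABLE answers (
            room_id     TEXT NOT NULL,
            question_id INTEGER NOT NULL,
            player_id   TEXT NOT NULL,
            choice      TEXT NOT NULL,
            PRIMARY KEY (room_id, question_id, player_id)
        )
        """
    )


def submit_answer(conn: sqlite3.Connection, room_id: str, question_id: int,
                  player_id: str, choice: str) -> str:
    """Record the first answer and return the stored choice either way."""
    # INSERT OR IGNORE turns a uniqueness conflict into a no-op: first write wins.
    conn.execute(
        "INSERT OR IGNORE INTO answers VALUES (?, ?, ?, ?)",
        (room_id, question_id, player_id, choice),
    )
    conn.commit()
    row = conn.execute(
        "SELECT choice FROM answers "
        "WHERE room_id = ? AND question_id = ? AND player_id = ?",
        (room_id, question_id, player_id),
    ).fetchone()
    return row[0]
```

Submitting "B" and then "C" for the same player and question returns "B" both times and leaves a single row, so a retry or a double-click cannot change the score.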
The Event Ordering Problem
Most candidates focus on getting events delivered quickly.
The harder problem is making sure they arrive in the correct order.
Imagine a quiz room where the host advances the game:
Question 2 Started
↓
Timer Started
↓
Question 2 Ended
↓
Leaderboard Updated
But once your system is distributed across multiple servers, message brokers, and WebSocket gateways, there is no guarantee that every client receives those events in exactly the same order.
A player might receive:
Question 2 Started
↓
Leaderboard Updated
↓
Timer Started
↓
Question 2 Ended
Now the leaderboard is being shown before the question has even finished.
The events themselves are correct.
The order is not.
And in real-time systems, wrong ordering can be just as dangerous as losing messages entirely.
Why Does This Happen?
A typical architecture might look like this:
    Quiz Service
         |
         v
   Message Broker
         |
     +---+---+
     |       |
WS Node1   WS Node2
Events travel through multiple components before reaching users.
Different nodes process messages at slightly different speeds.
Network latency fluctuates.
Retries happen.
Temporary failures happen.
An event generated later can occasionally arrive before an event generated earlier.
This is normal behavior in distributed systems.
The challenge is preventing it from breaking the user experience.
A Better Approach: Sequence Numbers
Instead of relying on arrival order, every event is assigned a sequence number.
{
  "roomId": "quiz-123",
  "sequence": 101,
  "event": "QUESTION_STARTED"
}
{
  "roomId": "quiz-123",
  "sequence": 102,
  "event": "TIMER_STARTED"
}
{
  "roomId": "quiz-123",
  "sequence": 103,
  "event": "QUESTION_ENDED"
}
The client keeps track of the latest sequence it has processed.
if (event.sequence > lastProcessedSequence) {
  apply(event);
  lastProcessedSequence = event.sequence;
}
Older or duplicate events are ignored.
Even if messages arrive out of order, the client can maintain a consistent view of the game.
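The same check can be written as a small runnable sketch. `SequencedClient` and its fields are hypothetical names; the event shape follows the JSON examples in this section.

```python
class SequencedClient:
    """Applies an event only if its sequence is newer than anything seen so far."""

    def __init__(self) -> None:
        self.last_processed_sequence = 0
        self.applied = []  # event names in the order they were applied

    def on_event(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was ignored."""
        if event["sequence"] <= self.last_processed_sequence:
            # Duplicate, or older than something already applied: drop it.
            return False
        self.applied.append(event["event"])
        # Advance the watermark so later duplicates are recognized.
        self.last_processed_sequence = event["sequence"]
        return True
```

One consequence of this rule is worth noticing: if 103 arrives before 102, the late 102 is discarded instead of being applied out of order. The view stays consistent, but the client has skipped an event.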
The Even Harder Problem: Missing Events
Now imagine the client receives:
101 → QUESTION_STARTED
103 → QUESTION_ENDED
Sequence 102 never arrived.
The client now knows there is a gap.
Instead of blindly processing the next event, it can request the latest room state from the server and resynchronize itself.
GET /rooms/{roomId}/state
The server responds with the current truth:
Current Question: 3
Time Remaining: 12 seconds
Leaderboard Version: 8
The client catches up and continues without corrupting its state.
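Gap detection and resynchronization can be sketched as follows. `ResyncingClient` and `fetch_state` are hypothetical: `fetch_state` stands in for the state request described here, and the sketch assumes the snapshot carries the sequence number it reflects, which the response shown in this section does not include.

```python
class ResyncingClient:
    """Applies in-order events and falls back to a state snapshot on a gap."""

    def __init__(self, fetch_state) -> None:
        # fetch_state: callable returning a dict snapshot that includes a
        # "sequence" key (an assumption of this sketch).
        self.fetch_state = fetch_state
        self.last_sequence = None
        self.state = {}
        self.resyncs = 0

    def on_event(self, event: dict) -> str:
        seq = event["sequence"]
        if self.last_sequence is not None and seq <= self.last_sequence:
            return "ignored"  # duplicate or stale event
        if self.last_sequence is not None and seq > self.last_sequence + 1:
            # At least one event is missing: replace local state with the
            # server's view instead of applying this event on top of a gap.
            snapshot = self.fetch_state()
            self.state = snapshot
            self.last_sequence = snapshot["sequence"]
            self.resyncs += 1
            return "resynced"
        self.state["lastEvent"] = event["event"]
        self.last_sequence = seq
        return "applied"
```

Receiving 101 and then 103 triggers exactly one resync, and a late 102 arriving afterwards is ignored because the snapshot already covers it.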
The moment your application becomes event-driven, “Did the message arrive?” stops being the interesting question.
“Did the messages arrive in the correct order?” is where the real engineering begins.

The Reconnect Storm — The Scaling Problem That Appears When It Hurts Most
Imagine a WebSocket server node goes down mid-quiz. 3,000 players were connected to that node. Within seconds, 3,000 clients attempt to reconnect — simultaneously, to whatever nodes are still running.
This is called a reconnect storm. The remaining nodes receive a sudden spike of new connection requests and session restoration operations at exactly the moment when the system is already degraded. If the remaining nodes can’t handle the burst, the cascade continues.
A mature design prevents this:
- Load balancer distributes new connections across multiple WebSocket nodes
- Room state and player session metadata live in Redis, not in the WebSocket server’s memory — so any node can serve any player after reconnection
- Exponential backoff with jitter on client reconnection attempts — clients don’t all retry simultaneously, they spread out over a window
- Sticky sessions only where required (for example, if the game state machine is local to a node)
The room should survive a node failure. A single node going down should be a recovery event, not a game-ending event.
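Exponential backoff with jitter is short enough to sketch directly. This version uses the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing ceiling. The function name and the default base and cap values are assumptions for illustration.

```python
import random


def reconnect_delay(attempt: int, base_s: float = 0.5, cap_s: float = 30.0,
                    rng=random) -> float:
    """Return a randomized delay in seconds before reconnect attempt `attempt`.

    The ceiling doubles with each attempt (0.5s, 1s, 2s, ...) up to cap_s, and
    the actual delay is uniform in [0, ceiling] so clients do not retry in
    lockstep.
    """
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


# Example: delays for the first five attempts of one client.
delays = [reconnect_delay(attempt) for attempt in range(5)]
```

Because every client draws its own random delay, 3,000 reconnects spread across the window instead of landing on the surviving nodes in the same instant.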

Scaling to 100,000 Players — The Question Worth Thinking Through
At 10,000 players, the design above holds.
At 100,000 players, the first bottleneck isn’t where most people expect.
The WebSocket connections themselves are manageable — a single server can hold tens of thousands of persistent connections. The connection count isn’t the constraint.
The constraint is the broadcast.
When the host clicks “Next Question,” the server needs to push the synchronization event to 100,000 connected clients. If that broadcast is handled by a single process, it becomes a tight loop that takes non-trivial time — reintroducing the fairness problem you solved with timer-based synchronization.
The solution: a fan-out architecture using a message broker. The host triggers an event that goes to Kafka or Redis Pub/Sub. Multiple consumer processes — each handling a shard of the connected players — pick up the event and broadcast to their slice simultaneously. The broadcast becomes parallel rather than sequential.
This is also where the room partitioning decision matters. A room with 100,000 players might itself be served by multiple WebSocket shards, with each shard responsible for a subset of players.
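A rough sketch of sharded fan-out inside one process. In the design described here each shard would be a separate consumer reading from the broker; threads stand in for those consumers, and `shard_for`, `fan_out` and the `send` callback are hypothetical names.

```python
import hashlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def shard_for(player_id: str, shard_count: int) -> int:
    """Stable shard assignment; a cryptographic hash is consistent across processes."""
    digest = hashlib.sha256(player_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def fan_out(event: dict, player_ids, shard_count: int, send) -> int:
    """Group players by shard and let each shard push to its slice in parallel.

    Returns the number of sends attempted.
    """
    shards = defaultdict(list)
    for player_id in player_ids:
        shards[shard_for(player_id, shard_count)].append(player_id)

    def broadcast(members):
        # Each shard still loops, but only over its own slice of the room.
        for player_id in members:
            send(player_id, event)
        return len(members)

    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        return sum(pool.map(broadcast, shards.values()))
```

With 8 shards, each loop covers roughly one eighth of the room, so the slowest player receives the event after about an eighth of the sequential time, assuming the shards really run in parallel.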
What the Interviewer Is Actually Testing
Most candidates spend time on tables, APIs, and microservice decomposition. Those matter, but they’re not the interesting part of this question.
The real question is: how do you keep thousands of users synchronized in real time? Because that’s where the problem becomes non-trivial.
Timer-based synchronization to ensure fairness. WebSocket push over REST polling to make it feel live. Redis sorted sets to make leaderboards fast at scale. Idempotency keys to make answer submission safe under retries. Reconnect handling to survive infrastructure failures without killing active games.
None of those challenges appear when you design the quiz for 10 users. All of them appear when 10,000 users join the same room at the same time.
A multiplayer quiz platform looks like a CRUD application at first glance. Once thousands of players enter the same room simultaneously, it quietly transforms into a distributed systems problem — and that’s exactly why it appears in system design interviews.
The quiz itself is easy. Making everyone experience the same quiz at the same moment is where the engineering begins.
If you want to practice these kinds of questions in a structured setting — both sides of the table — PracHub is built specifically to make technical interview practice more realistic and transparent.
Part of a series on system design interview questions. Earlier posts cover YouTube streaming architecture, monolith migration under write load, coupon system design, and the distributed caching problems that most interview answers skip entirely.
UBER: Design a Real-Time Quiz Platform Like Kahoot. The Quiz Isn’t the Hard Part. was originally published in Level Up Coding on Medium.