Functionality of the chat service: when a sender sends a message, the chat service stores it and relays it to the receiver.
Senders can use HTTP to send messages to the chat service. The Connection header can be set to keep-alive so that the same connection is reused across requests.
Receivers cannot simply use HTTP because HTTP is client-initiated: the server cannot push a message unless the client has asked first. Common workarounds:
Polling: clients periodically ask servers if there are chat messages available.
Long polling: the server holds a client's request open until a new chat message is available or the request times out.
WebSocket: the most common solution for sending asynchronous updates from server to client. A WebSocket connection is initiated by the client; it starts as an HTTP connection and is then upgraded to a WebSocket connection. WebSocket is persistent and bi-directional.
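The upgrade step can be made concrete with a small sketch of the server side of the opening handshake defined in RFC 6455. The function names are illustrative, and the sketch skips the version check, subprotocol negotiation, and frame parsing that a real server needs.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_response(headers: dict) -> str:
    """Answer an HTTP upgrade request with 101 Switching Protocols (illustrative helper)."""
    normalized = {name.lower(): value for name, value in headers.items()}
    connection_tokens = [t.strip().lower() for t in normalized.get("connection", "").split(",")]
    if normalized.get("upgrade", "").lower() != "websocket" or "upgrade" not in connection_tokens:
        raise ValueError("not a WebSocket upgrade request")
    key = normalized.get("sec-websocket-key")
    if not key:
        raise ValueError("missing Sec-WebSocket-Key header")
    lines = [
        "HTTP/1.1 101 Switching Protocols",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Accept: {websocket_accept(key)}",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"
```

After the 101 response, both sides stop speaking HTTP on that connection and exchange WebSocket frames in either direction.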
High Level Design
Components in a chat system
Stateless API servers for signup, login, profile and other services.
Stateful servers for chat services, presence services.
Message queue and corresponding handlers for asynchronous tasks.
Relational DB storage for profile, friend list data, etc.
NoSQL key-value storage for chat message data.
Reason: a key-value store is more scalable and offers lower latency for the usage pattern of chat message data.
Data model
For profile and friend list data, we can identify records by user id and store them in a relational database.
For chat message data, we can use (channel id, message id) as the key. Message ids are sortable by time within the same channel id.
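One possible way to get time-sortable message ids per channel is to pack a millisecond timestamp and a per-channel sequence number into one integer. This is only a sketch: the 16-bit sequence width, the class name, and the injectable clock are assumptions, not part of the design above.

```python
import time


class ChannelMessageIdGenerator:
    """Generates ids that increase over time within each channel (hypothetical helper)."""

    SEQUENCE_BITS = 16  # assumed width; allows 65536 ids per channel per millisecond

    def __init__(self, clock=None) -> None:
        # The clock is injectable so the behaviour can be tested deterministically.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last = {}  # channel_id -> (timestamp_ms, sequence) of the last id issued

    def next_id(self, channel_id: str) -> int:
        now_ms = self._clock()
        last_ms, sequence = self._last.get(channel_id, (0, -1))
        if now_ms <= last_ms:
            # Same millisecond (or clock moved backwards): reuse the last
            # timestamp and bump the sequence so ids never decrease.
            now_ms, sequence = last_ms, sequence + 1
        else:
            sequence = 0
        if sequence >= (1 << self.SEQUENCE_BITS):
            # Sequence exhausted: borrow the next millisecond.
            now_ms, sequence = now_ms + 1, 0
        self._last[channel_id] = (now_ms, sequence)
        return (now_ms << self.SEQUENCE_BITS) | sequence


def storage_key(channel_id: str, message_id: int) -> tuple:
    """Key under which a message would be stored in the KV store."""
    return (channel_id, message_id)
```

Because ids only need to be ordered within a channel, each channel keeps its own counter and no global coordination is required.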
CUJ: connect to chat services
User connects to API servers.
User logs in
API server uses service discovery (e.g. Zookeeper) to find a suitable chat server
User connects to the chat server using WebSocket
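The notes do not say what makes a chat server "suitable". The sketch below assumes the API server picks the healthy server with the fewest open connections, and uses an in-memory registry as a stand-in for a real service-discovery system; all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChatServer:
    address: str
    connections: int
    healthy: bool = True


class InMemoryRegistry:
    """Stand-in for a service-discovery registry (not a Zookeeper client)."""

    def __init__(self) -> None:
        self._servers = {}  # address -> ChatServer

    def register(self, server: ChatServer) -> None:
        self._servers[server.address] = server

    def pick_chat_server(self) -> ChatServer:
        # Assumed policy: least-loaded healthy server.
        candidates = [s for s in self._servers.values() if s.healthy]
        if not candidates:
            raise LookupError("no healthy chat server registered")
        return min(candidates, key=lambda s: s.connections)
```

The API server would return the chosen address to the client, which then opens its WebSocket connection to that server.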
CUJ: Send Messages
Sending 1-to-1 chat messages
User sends chat data to chat server
Chat server generates unique and incremental message id
Chat server enqueues task to message queue
Message queue handler / chat server writes chat data into KV store
By querying the WebSocket manager, the handler can determine the target user's status
If the target user is online, forward the chat data to the target user's chat server, which delivers it to the target user
If the target user is offline, send the chat data to the notification service
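The 1-to-1 flow above can be sketched with in-memory stand-ins for the KV store, the WebSocket manager, and the notification service. The class and method names are illustrative, and the message queue is left out to keep the routing decision visible.

```python
class MessageHandler:
    """Persists a message, then routes it by the recipient's connection status."""

    def __init__(self) -> None:
        self.kv_store = {}        # (channel_id, message_id) -> record
        self.online_sockets = {}  # user_id -> list standing in for a live WebSocket
        self.notifications = []   # records handed to the notification service

    def connect(self, user_id: str) -> list:
        self.online_sockets[user_id] = []
        return self.online_sockets[user_id]

    def disconnect(self, user_id: str) -> None:
        self.online_sockets.pop(user_id, None)

    def handle(self, channel_id: str, message_id: int,
               sender: str, recipient: str, text: str) -> str:
        record = {"from": sender, "to": recipient, "text": text}
        # Write to storage first so the message survives a failed delivery.
        self.kv_store[(channel_id, message_id)] = record
        socket = self.online_sockets.get(recipient)
        if socket is not None:
            socket.append(record)  # recipient is online: push over the socket
            return "delivered"
        self.notifications.append(record)  # recipient is offline: notify instead
        return "notified"
```

Storing before delivering means an offline or reconnecting recipient can always fetch the message from the KV store later.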
Message synchronization for the same user
The same user may connect to the same chat server from multiple devices
Each device maintains its own maximum message id
Chat servers can query KV store for the current maximum message id in storage
Use the message id to determine the message order
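A minimal sketch of this synchronization, assuming message ids are positive integers and using an in-memory log in place of the KV store (the class names are hypothetical):

```python
import bisect


class ChannelLog:
    """In-memory stand-in for the messages of one channel in the KV store."""

    def __init__(self) -> None:
        self._ids = []       # message ids, kept sorted
        self._payloads = {}  # message_id -> payload

    def append(self, message_id: int, payload: str) -> None:
        if message_id in self._payloads:
            raise ValueError(f"duplicate message id {message_id}")
        bisect.insort(self._ids, message_id)
        self._payloads[message_id] = payload

    def max_id(self) -> int:
        return self._ids[-1] if self._ids else 0

    def since(self, last_seen_id: int) -> list:
        """Return (id, payload) pairs with id greater than last_seen_id, in order."""
        start = bisect.bisect_right(self._ids, last_seen_id)
        return [(i, self._payloads[i]) for i in self._ids[start:]]


class Device:
    """Each device tracks the largest message id it has already received."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.max_message_id = 0
        self.messages = []

    def sync(self, log: ChannelLog) -> int:
        if log.max_id() <= self.max_message_id:
            return 0  # already up to date
        missing = log.since(self.max_message_id)
        self.messages.extend(missing)
        self.max_message_id = missing[-1][0]
        return len(missing)
```

Because each device keeps its own cursor, a phone that is up to date and a laptop that has been offline for a week can sync against the same storage independently.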
Sending messages in group
Each user has its own inbox message queue
The sender will send chat data to each recipient’s inbox message queue
The recipient can receive chat data from different chat servers
Cons: If the group size is too large, write fanout will be huge.
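The write fanout can be seen in a small sketch where each inbox is a deque. The function name and the choice to skip the sender's own inbox are assumptions.

```python
from collections import defaultdict, deque


def fan_out(inboxes, members, sender, message_id, text) -> int:
    """Copy one group message into every other member's inbox; return the write count."""
    writes = 0
    for member in members:
        if member == sender:
            continue  # assumed: the sender does not need a copy in their own inbox
        inboxes[member].append((message_id, sender, text))
        writes += 1
    return writes
```

One message to a group of N members costs N - 1 inbox writes, which is why this approach suits small groups better than very large ones.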
CUJ: Online Presence
User login: user connects to presence servers using WebSocket, presence servers update the last login time in DB
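User logout: the user logs out through the API servers, and the API server sends a request to the presence servers to change the user's status to offline
User disconnection: the client is responsible for sending periodic heartbeat signals to the presence servers to keep its status online; if heartbeats stop arriving, the presence servers treat the user as offline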
Status fanout: when a user’s status changes, send out status update to friends
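The heartbeat and status fanout rules can be sketched together. The timeout value, the explicit `now` arguments (used instead of a real clock so the behaviour is deterministic), and all names are assumptions for illustration.

```python
class PresenceTracker:
    """Tracks online status from heartbeats and queues status updates for friends."""

    def __init__(self, timeout_s: float, friends: dict) -> None:
        self.timeout_s = timeout_s  # assumed: silence longer than this means offline
        self.friends = friends      # user_id -> list of friend ids
        self.last_heartbeat = {}    # user_id -> time of the last heartbeat
        self.online = set()
        self.outbox = []            # (recipient, user_id, status) updates to push

    def _set_status(self, user_id: str, is_online: bool) -> None:
        if is_online == (user_id in self.online):
            return  # no change, so nothing to fan out
        if is_online:
            self.online.add(user_id)
        else:
            self.online.discard(user_id)
        status = "online" if is_online else "offline"
        for friend in self.friends.get(user_id, []):
            self.outbox.append((friend, user_id, status))

    def heartbeat(self, user_id: str, now: float) -> None:
        self.last_heartbeat[user_id] = now
        self._set_status(user_id, True)

    def logout(self, user_id: str) -> None:
        self.last_heartbeat.pop(user_id, None)
        self._set_status(user_id, False)

    def sweep(self, now: float) -> None:
        """Mark users offline whose last heartbeat is older than the timeout."""
        for user_id in list(self.online):
            if now - self.last_heartbeat[user_id] > self.timeout_s:
                self._set_status(user_id, False)
```

Fanning out only on a status change keeps a steady stream of heartbeats from generating a steady stream of updates to friends.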
Other Topics
Support media files: compression, cloud storage, thumbnails etc.
E2E encryption.
Caching messages on the client side.
Use regional storage to improve the data access time.