Slack is mainly known for its Persistent Group Messaging service.
4M daily active users (DAU) and 5.8M weekly active users (WAU)
Slack has about 2.5M WebSocket connections open simultaneously at peak.
Half of DAU are outside the US
Slack uses a conservative tech stack: technologies and tools that are more than 10 years old
Developers at Slack prefer something they’ve already operated over something new and tailor-made, and favor a shallow, transparent stack of abstractions
Slack’s webapp codebase:
PHP monolith of app logic (<1M Lines of Code)
Scaled-out LAMP stack app (Memcache wrapped around sharded MySQL)
Recently migrated to HHVM (HipHop Virtual Machine, Facebook’s just-in-time (JIT) compiling runtime for PHP)
Login and Receive Messages: the “mains”
MySQL Shards:
Source of truth for most customer data (Teams, users, channels, messages, comments, emoji, …)
Replicated across two data centers (remains available through a single-DC failure)
Sharded by team (for performance, fault isolation, and scalability)
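Sharding by team means every query first resolves a team to the shard that holds its data. A minimal sketch of that routing step follows, assuming a simple modulo mapping and two hosts per shard (one per data center). The shard names, host names, and the modulo rule are all hypothetical; a real deployment may well use a lookup table instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    name: str
    hosts: tuple  # one host per data center (hypothetical names)


SHARDS = (
    Shard("shard-0", ("dc1-db0", "dc2-db0")),
    Shard("shard-1", ("dc1-db1", "dc2-db1")),
    Shard("shard-2", ("dc1-db2", "dc2-db2")),
)


def shard_for_team(team_id: int) -> Shard:
    """Map a team to its shard; all of a team's data lives on that shard."""
    return SHARDS[team_id % len(SHARDS)]


def pick_host(shard: Shard, down=frozenset()) -> str:
    """Return the first host of the shard that is not marked as down."""
    for host in shard.hosts:
        if host not in down:
            return host
    raise RuntimeError(f"no host available for {shard.name}")


shard = shard_for_team(7)
primary = pick_host(shard)
fallback = pick_host(shard, down={"dc1-db1"})
```

Because a team never spans shards, a failure of one shard affects only the teams mapped to it, which is the fault-isolation property the notes mention.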
But why MySQL?
Many, many thousands of server-years of operating experience
The relational model is a good discipline
Not because of ACID, though
How is MySQL used then?
At Slack, MySQL runs with master-master replication (MMR)
This helps in retrieving data from one master if the other fails, since write operations are applied on both.
Now what if the same row, or the same value, is written on both sides? How does Slack deal with MMR complications?
By choosing A (availability) in the CAP theorem.
INSERT … ON DUPLICATE KEY UPDATE …
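The point of INSERT with ON DUPLICATE KEY UPDATE is that a write can be applied more than once without failing or creating duplicate rows, which matters when both sides of a replicated pair may see the same write. The sketch below demonstrates that idempotence with Python's built-in sqlite3 module, using SQLite's equivalent ON CONFLICT ... DO UPDATE syntax so it runs without a MySQL server. The `emoji` table, its columns, and the example URL are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database standing in for one side of the pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE emoji ("
        " team_id INTEGER, name TEXT, url TEXT,"
        " PRIMARY KEY (team_id, name))"
    )
    return conn


def upsert_emoji(conn, team_id, name, url):
    # MySQL form, as referred to in the notes:
    #   INSERT INTO emoji (team_id, name, url) VALUES (%s, %s, %s)
    #   ON DUPLICATE KEY UPDATE url = VALUES(url)
    # SQLite's equivalent upsert syntax is used here.
    conn.execute(
        "INSERT INTO emoji (team_id, name, url) VALUES (?, ?, ?) "
        "ON CONFLICT(team_id, name) DO UPDATE SET url = excluded.url",
        (team_id, name, url),
    )


side_a, side_b = make_db(), make_db()
write = (1, "partyparrot", "https://example.invalid/parrot.gif")

upsert_emoji(side_a, *write)   # applied once on side A
upsert_emoji(side_b, *write)   # applied on side B ...
upsert_emoji(side_b, *write)   # ... and replayed, without error

rows_a = side_a.execute("SELECT team_id, name, url FROM emoji").fetchall()
rows_b = side_b.execute("SELECT team_id, name, url FROM emoji").fetchall()
```

Both sides end up with the same single row regardless of how many times the write is replayed. This does not resolve genuinely conflicting values for the same key; it only makes repeated application of the same write safe.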
The RTM API (Real Time Messaging API) does all of the above and returns to the user, among other things:
The identity of every channel in the team,
The ID of every user in the team,
The membership of all channels,
Where the cursor has moved since you were last active on the channel, etc.
But the two important pieces of information are:
{
"ok": true,
"url": "wss:\/\/ms9.slack-msgs.com\/websocket\/7I5yBpcvk",
...
}
Using this per-session information, Slack delivers real-time updates over the WebSocket connection and keeps the state cached in the client's memory current
rtm.start payload
rtm.start returns a snapshot (image) of the whole team
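A client consuming the response shown above would typically check the "ok" flag and then read the WebSocket URL. The sketch below does just that with Python's standard json module. The response body is trimmed to the two fields from the notes, and the function name `websocket_url` is hypothetical.

```python
import json

# Trimmed response with only the two fields discussed above.
# "\/" is a valid JSON escape for "/".
RAW_RESPONSE = (
    r'{"ok": true, '
    r'"url": "wss:\/\/ms9.slack-msgs.com\/websocket\/7I5yBpcvk"}'
)


def websocket_url(raw: str) -> str:
    """Return the WebSocket URL from an rtm.start-style response body."""
    payload = json.loads(raw)
    if not payload.get("ok"):
        raise ValueError("rtm.start reported a failure")
    return payload["url"]


url = websocket_url(RAW_RESPONSE)
```

The client would then open a WebSocket connection to `url` and apply incoming events to the snapshot it received.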
Message Delivery
Deferring Work:
Slack uses Redis as a Job Queue
Link unfurling is the mechanism that fetches (curls) links referenced in a message so they can be expanded into a preview.
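Deferring work through a queue can be sketched as follows: the request path pushes a small job description and returns, and a worker pops jobs and runs them later. This is an illustration under stated assumptions, not Slack's implementation. An in-memory class stands in for a Redis list (where one would use commands such as RPUSH and LPOP), and the job format and handler names are hypothetical.

```python
import json
from collections import deque


class FakeRedisList:
    """In-memory stand-in for a single Redis list key."""

    def __init__(self):
        self._items = deque()

    def rpush(self, payload: str) -> None:
        self._items.append(payload)  # append at the tail, like RPUSH

    def lpop(self):
        # Pop from the head, like LPOP; None when the list is empty.
        return self._items.popleft() if self._items else None


def enqueue_unfurl(queue, channel: str, url: str) -> None:
    """Called on the request path: record the job and return immediately."""
    queue.rpush(json.dumps({"type": "unfurl", "channel": channel, "url": url}))


def run_worker(queue, handlers):
    """Drain the queue, dispatching each job to its handler."""
    results = []
    while True:
        raw = queue.lpop()
        if raw is None:
            break
        job = json.loads(raw)
        results.append(handlers[job["type"]](job))
    return results


queue = FakeRedisList()
enqueue_unfurl(queue, "general", "https://example.invalid/a")
enqueue_unfurl(queue, "random", "https://example.invalid/b")

# A real handler would fetch the page; this one only formats a string.
handlers = {"unfurl": lambda job: f"preview for {job['url']}"}
results = run_worker(queue, handlers)
```

Jobs are serialized as JSON strings so the same payload could be stored in a real Redis list without changes to the producer or the worker.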