Sorry for the late reply :) Looks like we have between 20 TB and 25 TB of data, combined across all MySQL databases, and somewhere between 60B and 80B records.
Edit: this is just for data that's used all the time. Almost nothing is dead, just sitting around.
I can break it down quickly, here:
- Scaling web servers is much easier than scaling the database. Queries on a single host are (mostly) executed one after another, so if one is slow, you suddenly have a queue of queries waiting to be executed. The goal is to execute them as quickly as possible.
- 2 separate queries, both using perfect indexes, will be much faster (insanely faster at this scale) than 1 query with a join. So we just join them in code.
- Sorting is often a problem, since MySQL generally uses only one index per table in a query, so the index that filters the rows often can't also be the one that sorts them.
- Having no foreign keys speeds up insert/update queries and decreases server load.
- Etc, etc.
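To make the "join them in code" point concrete, here is a minimal sketch of the idea, not our actual code. It uses Python's built-in `sqlite3` as a stand-in for MySQL so it runs anywhere; the `users` and `posts` tables, their columns, and the `fetch_posts_with_authors` helper are all made up for illustration. It only shows the pattern and doesn't prove anything about speed.

```python
import sqlite3


def fetch_posts_with_authors(conn, limit=10):
    """Application-side join: two simple indexed queries instead of one JOIN."""
    # Query 1: fetch the posts, walking the primary key.
    posts = conn.execute(
        "SELECT id, user_id, title FROM posts ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
    if not posts:
        return []

    # Query 2: fetch only the users those posts refer to, by primary key.
    user_ids = sorted({user_id for _, user_id, _ in posts})
    placeholders = ",".join("?" * len(user_ids))
    users = conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})",
        user_ids,
    ).fetchall()
    names = dict(users)  # id -> name

    # The "join" itself happens here, in application code.
    return [
        {"id": post_id, "title": title, "author": names.get(user_id)}
        for post_id, user_id, title in posts
    ]


# Hypothetical schema and data; note there is no FOREIGN KEY on posts.user_id.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        title TEXT NOT NULL
    );
    CREATE INDEX idx_posts_user_id ON posts (user_id);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES (10, 1, 'first'), (11, 2, 'second'), (12, 1, 'third');
    """
)

rows = fetch_posts_with_authors(conn)
```

The trade-off is that the app now does the matching (and has to cope with a missing user, hence `names.get`), but each query stays a plain primary-key lookup.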
And thanks for bringing this up. I've added a disclaimer that these are not to be taken for granted; they work for us, at our scale.
I did plan a whole new article about this, with all the benchmarks we've gathered over the years.
Vitess is just one example; there are others. And it depends: some teams might be able to do some work to get along with plain MySQL, like chess.com. As in most cases, there is no silver bullet here.
Yeah, I can agree with that. You really need to know what you're doing. We have benchmarks for most of these "micro-optimizations", though by now we can usually tell how things are going to behave before we even make a change, just from experience.
Definitely not something I'd suggest on an average traffic website.
Yep. From what I've seen, it's because a bunch of chess Twitch and YouTube people have been building up communities + meme-scapes around the game. QG added a bunch of new people who found the online community and got into it -- especially because with the pandemic people have been wandering the internet searching (unintentionally) for new hobbies to get invested in.