My previous article ruffled a few feathers (on HN), mostly regarding my comment about having done “all the standard optimizations”. It turns out those are not as standard, or as widely known, as I’d expected. So I’ve decided to address that topic here.
I’ll quote that piece of the article here:
Not many unused/redundant indexes, the ones that were there were mostly good. We constantly monitored and improved heavy/slow queries. A nice chunk of data was denormalized, as well. No foreign keys, many things were done in the code itself (filtering, sorting, etc, to make sure the DB only ever uses the most efficient indexes) running on the latest version of MySQL, etc, etc.
I’ll try to break this article down into 3 distinct areas (though they can overlap a bit) – client, server and OS. The easiest optimizations for anyone to apply are probably query optimizations (client side). But we can also improve things directly on the MySQL server, and even at the OS level. (Everything I write about here is MySQL with InnoDB, but it should be applicable to many other relational databases.)
Disclaimer: Don’t take these for granted, and before applying them to your database, make sure they work for you. Also, many of these won’t make a difference on a low-traffic database (the ones I work with combine to over 8B queries a day). Get some benchmarking done (use PMM or similar tools), make sure you can easily undo them, etc. I’ve worked with PHP for ages, and since it’s stateless, scaling it horizontally is much easier than scaling MySQL servers, so the database is our bottleneck and we tend to give it a lot of love. This probably isn’t the case for stateful applications, which you can’t scale out infinitely.
Let’s start at a place everyone understands and can apply these (potential) optimizations. The most common goal of client side optimization is to either reduce the number of queries or reduce the load those queries create on your databases. Let’s start with the basics.
Cache your results! Do we even need to talk about this? Technically it’s not even an optimization, but it does decrease database load, and that’s what this article is all about. I bet almost everyone reading this is familiar with caching, but this section may still contain a few quirks worth knowing.
Cache your results! But don’t cache everything. And now things become a bit confusing. You’ll need to make a judgment call and find a balance (and you will see that’s a recurring theme here) between what you cache and what you query from your database.
If you cache everything for a long period of time, the traffic that the database would usually handle just gets redirected elsewhere (I highly recommend Redis for result caching, but really, any in-memory storage will do the trick), and that secondary storage might start to struggle (either get overfilled or can’t handle all the traffic). A few rules of thumb I like to follow are:
– Don’t cache single-record selects where you query by a primary key (unless it’s called way too often and causes a significant chunk of database load)
– You might think it’s a good idea to cache large result sets that don’t change often for a long period of time. Which is probably true, if it’s looked up often. If not – you’re wasting your in-memory storage (potentially gigabytes) on something that’s not used at all.
– Hit rate is important. What’s the point of caching something if the result is almost never found in the cache? In that case we’re just adding network traffic to the cache storage, and then querying the database anyhow.
– If you’re using an ORM that offers caching out-of-the-box, make sure you understand how it works. It will usually cache based on input parameters, and here’s the catch – if one of those parameters is a timestamp of any kind, your caching probably won’t work as you’d expect. Here’s a very simple example (you might recognize PHP and Doctrine):
return $this
    ->createQueryBuilder('b')
    ->select('count(b.id)')
    ->andWhere('b.createDate >= :date')
    ->setParameter('date', new DateTime('-48 hours'))
    ->getQuery()
    ->useResultCache(true)
    ->getSingleScalarResult();
In this case the cache hit rate is close to 0, meaning the cache is useless. The reason is that Doctrine generates the cache key based on the input parameters, and in this case one of them is a DateTime object that is different every second. That means we keep storing new cache entries, but each one is only reusable within the same second. The fix is rather simple, though – floor the timestamp to a reasonable value. In our case it was as simple as rounding it down to the hour:
$date = new DateTime('-48 hours');
$date->setTime((int) $date->format('H'), 0, 0);
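The same flooring idea works outside Doctrine, too. Here’s a minimal sketch in Python (the key scheme and function names are hypothetical, just to illustrate the technique):

```python
from datetime import datetime, timedelta

def floored_cutoff(now: datetime) -> datetime:
    """Compute '48 hours ago', rounded down to the start of the hour,
    so that every request within the same hour gets the same value."""
    cutoff = now - timedelta(hours=48)
    return cutoff.replace(minute=0, second=0, microsecond=0)

def cache_key(cutoff: datetime) -> str:
    # Hypothetical key scheme: query name plus its (floored) parameter.
    return f"recent_count:{cutoff.isoformat()}"
```

Two requests arriving within the same hour now produce identical keys, so the second one becomes a cache hit instead of a fresh database query.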
I could literally write about caching and its quirks and “gotchas” for hours, but then I’d get to nothing else in this article, so I’ll move on.
Update/Delete by primary key(s)
If you need to delete (or update) a ton of records that match the same condition, it’s probably not a smart move to do them all at once, like:
DELETE FROM table_x WHERE some_date > NOW() - INTERVAL 1 YEAR AND is_enabled = 0;
This might lock up table_x, causing subsequent queries to wait in the queue for execution (this actually killed our database once or twice over the years). What we usually do now (or a variation of this, depending on what conditions need to be met) is:
SELECT id FROM table_x WHERE some_date > NOW() - INTERVAL 1 YEAR AND is_enabled = 0;
Depending on the conditions, this select will most likely go to a replica. After getting primary keys, we usually chunk those into batches of 100 to 500 PKs, and queue asynchronous jobs with those PKs. Those jobs will execute queries like:
DELETE FROM table_x WHERE id IN (:ids);
This way the load that the master database suffers is far, far less than what it used to be, as there’s no easier way to delete records (from database perspective) than by primary key.
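The chunking step can be sketched like this (Python; enqueue_delete_job is a hypothetical stand-in for our async job queue):

```python
def chunk(ids, size):
    """Split a list of primary keys into batches of at most `size`."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]

def schedule_deletes(ids, enqueue_delete_job, batch_size=500):
    """Queue one async job per batch; each job later runs
    DELETE FROM table_x WHERE id IN (:ids); against the master."""
    for batch in chunk(list(ids), batch_size):
        enqueue_delete_job(batch)
```

The batch size (100 to 500 in our case) is a balance between the number of jobs and how long each DELETE holds locks for.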
Hinting or forcing indexes
Even though it might sound impossible, it can happen that the MySQL query optimizer fails to recognize the best index to use for a query. This can be especially annoying if you know you have a perfect index for a query, but when you EXPLAIN the query, you get a different one selected by the optimizer.
You can tell the optimizer which indexes to use with the USE INDEX or FORCE INDEX clauses. The former is just a recommendation, and the optimizer might still pick a different index.
But be careful, as this might be a double-edged sword. For example, take a query that kept appearing in the slow logs (where slow queries are logged). After investigating it, it turned out that it ran for over 2 seconds for users with over 100k records in the table. We found a better index for the query, and the query time dropped to under 0.3s. Sounds great, right?
Weeeell, that was a huge mistake. We only investigated queries that appeared in the slow log, and didn’t try to run the query for users with fewer than 100k records (which is over 99% of users). When we started running the new query (with the forced index) in production, our database load skyrocketed. That’s because the forced index significantly decreased the performance of over 99% of the executed queries. Luckily, we rely heavily on feature flags, so undoing it before anything visible happened was simple.
This was just one bad example. We actually managed to improve performance by manually hinting indexes on multiple occasions. And we also failed to do so more than once. So, it’s essential to benchmark everything thoroughly before doing this, and to have a quick way to undo the changes.
“INSERT IGNORE” issues
This might be a problem if the database is using statement-based replication, and the problem comes back to database load, again. Let’s, for example, assume that the cluster has 1 master and 4 replicas, and that it was quicker to write insert ignore into ... instead of selecting from the table to see if the record exists, and only inserting if it doesn’t. Using tools like PMM, you might find stats for the query that look like 800 qps executed (800 queries per second) and 80 rows affected (per second). That means that only every 10th insert ignore query actually inserts something; the remaining 9 are just duplicates due to poor code design.
This might not sound like a problem until you dive a bit deeper. Every single host ends up executing this query 800 times per second, because all of these statements are replicated downstream to the replicas. So, the database cluster is actually running 5 * 800 = 4000 qps of inserts, of which only 80 inserts are really happening.
The solution is simple – check for the existence of the record first (usually possible by PK), before inserting. Depending on the replication delay between the master and the replicas, you could even run that select on a replica. That turns 4000 insert queries per second across the cluster into 800 selects plus only the inserts that actually do something.
Something similar can be said for on duplicate key clauses.
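A minimal sketch of that check-then-insert flow (Python; replica_lookup and master_insert are hypothetical callables standing in for a query against a replica and a write to the master):

```python
def insert_if_missing(pk, replica_lookup, master_insert):
    """Send the INSERT to the master only when the record doesn't
    already exist. The existence check (e.g. SELECT 1 ... WHERE id = ?)
    can go to a replica, if the replication delay is tolerable."""
    if replica_lookup(pk):
        return False  # duplicate: nothing written, nothing replicated
    master_insert(pk)
    return True
```

Only the real inserts reach the master, so only those get replicated to every host downstream.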
Temporary tables on disk
There are a lot of ways to write a query that will make MySQL create a temporary table to perform additional operations. When you run an EXPLAIN for a query and its result contains something like Using where; Using temporary; – that’s OK (not perfect, but manageable). It means that MySQL is creating a temporary table in memory.
But this can get much worse if MySQL needs to create the temporary table on disk. You really want to avoid this, as disk operations are extremely expensive. There are multiple ways to trigger this, but probably the easiest one is if your SELECT clause contains any *TEXT (and in some MySQL versions JSON) data types.
A simple way around that is to simply not select those columns, and instead make sure their primary key is selected. After that, run an additional query like SELECT blob_column FROM table_x WHERE id IN (:ids);. Primary key selects will not create temporary tables.
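A hedged sketch of this two-step pattern (Python; run_query is a hypothetical helper, the table/column names are illustrative, and the %s placeholders assume a MySQL driver):

```python
def fetch_with_blobs(run_query, where_sql, params):
    """Step 1: select only primary keys (no TEXT/JSON columns, so no
    on-disk temporary table). Step 2: fetch the heavy columns by PK."""
    ids = [row[0] for row in run_query(
        f"SELECT id FROM table_x WHERE {where_sql}", params)]
    if not ids:
        return []
    placeholders = ", ".join(["%s"] * len(ids))
    return run_query(
        f"SELECT id, blob_column FROM table_x WHERE id IN ({placeholders})",
        ids,
    )
```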
Non-deterministic queries are a server-side problem created on the client, and they should be fixed there as well. Here are a couple of examples of non-deterministic queries:
DELETE FROM users WHERE email LIKE 'test@%' LIMIT 10;
UPDATE users SET is_enabled = 1 WHERE last_login_date > NOW() - INTERVAL 1 HOUR LIMIT 10;
If you think about these a bit more, you’ll see that depending on when and where they are executed, these queries will not affect the same records in the database. If the database is using statement-based replication, queries like these are a problem: a replica replaying the statement later can pick different rows than the master did. They can also terminate (kill) an alter on a replica, or cause data inconsistency between hosts.
The solution to this problem is similar to the ones I already mentioned above: select the primary keys first, and then issue new queries with those keys.
SELECT id FROM users WHERE email LIKE 'test@%' LIMIT 10;
DELETE FROM users WHERE id IN (1, 2, 3);
Now it’s deterministic, and no matter when or where the query is executed, it will always produce the same outcome.
While learning about relational databases, most of us were forced to understand the principle of data normalization. That’s actually a great principle, especially for people learning about this. But, when you grow too big, this can become a huge burden. So, going against the normalization principle, you need to create redundant data.
If you start with 2 simple tables, user and user_game (where you store results for the games each user has played), you will do great while you’re growing slowly. Getting the count of a user’s games is simple: SELECT COUNT(*) FROM user_game WHERE user_id = ?;. Then you grow a bit more, and you realize you need to add an index on user_game.user_id. Then you grow a lot bigger, and you suddenly have hundreds of millions of records in the user_game table (and you haven’t had time to shard it yet). Now you have a problem. (This is a hypothetical problem I’m creating here. It could also be solved by caching the result and modifying/invalidating the cache when a game finishes. The real-life problems do exist, though.) If you want to decrease the load your count queries are creating, you could just add a new column to the user table, or create a completely new table, user_game_stats, with a column named games_played. Now, instead of querying a table with hundreds of millions of records, you query a far smaller table, by primary key, like SELECT games_played FROM user_game_stats WHERE user_id = ?;. The load will drop significantly, even though you now have to update this table every time a game finishes.
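A hedged sketch of that write path (using Python’s sqlite3 purely for illustration; the table names follow the example above, and the pattern translates directly to MySQL):

```python
import sqlite3

def record_game_finished(conn: sqlite3.Connection, user_id: int, score: int) -> None:
    """Record the game and bump the denormalized counter in the same
    transaction, instead of COUNT(*)-ing a huge table at read time."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO user_game (user_id, score) VALUES (?, ?)",
            (user_id, score),
        )
        # Assumes the user_game_stats row already exists for this user.
        conn.execute(
            "UPDATE user_game_stats SET games_played = games_played + 1 "
            "WHERE user_id = ?",
            (user_id,),
        )
```

Doing both writes in one transaction keeps the counter from drifting out of sync with the real data.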
Unions
Short answer: Don’t.
Long answer: whatever the result set of a union is, any further operation you perform on it (filtering, sorting) won’t use an index, and will most likely create a temporary table on disk. As an alternative, I’d suggest running those union queries separately and doing everything else with them in the code (merge, sort, filter, etc.).
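That alternative can be sketched like this (Python; run_query is a hypothetical helper, and UNION’s duplicate-dropping semantics are reproduced in code):

```python
def union_in_code(run_query, queries, sort_key, limit=None):
    """Run each query separately (each can use its own index), then
    reproduce UNION semantics in code: de-duplicate, sort, limit."""
    seen = set()
    merged = []
    for sql, params in queries:
        for row in run_query(sql, params):
            if row not in seen:  # UNION drops duplicate rows
                seen.add(row)
                merged.append(row)
    merged.sort(key=sort_key)    # the ORDER BY, done in code
    return merged[:limit] if limit is not None else merged
```

This only makes sense while the individual result sets are small enough to hold and sort in application memory.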
Subqueries aren’t as bad as many people think, mostly because the query optimizer knows how to turn them into something more performant. I’d still suggest pulling them out into a separate query, as that might allow you to cache the result.
Too many items in the IN clause
As you could see in the “Update/Delete by primary key(s)” section, we often run these queries by primary key with an IN clause. A few times we’ve run into a weird issue where such queries would fail to use the index, even though they were executed by primary key, which should never really happen. When there were over 2000 IDs in the IN clause, MySQL would just give up and not use the index at all.
The issue isn’t solved yet, unfortunately. We’re just more careful with what we do.
Server-side optimizations are pretty cool, but usually not as effective as fixing the abuse coming from poorly written queries.
Unused and redundant indexes
As the name suggests, these indexes are not needed. They only slow down writes, as they are still maintained but just sit there doing nothing. You can find redundant ones in sys.schema_redundant_indexes (and unused ones in sys.schema_unused_indexes). The important thing to note here is that both the master and its replicas need to be checked, and only indexes that appear unused/redundant on all of them should be dropped.
This comes from the fact that many read queries might be going to the replicas, where those indexes are heavily used. If only the master is checked and the indexes dropped there – well, a lot of pain is ahead.
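A sketch of that cross-host check (Python; run_on is a hypothetical helper that executes SQL on a given host, while sys.schema_redundant_indexes is MySQL’s built-in sys view):

```python
def droppable_indexes(hosts, run_on):
    """An index is only safe to drop when it shows up as redundant on
    the master AND on every replica, so intersect across all hosts."""
    sql = ("SELECT table_name, redundant_index_name "
           "FROM sys.schema_redundant_indexes")
    safe = None
    for host in hosts:
        found = {tuple(row) for row in run_on(host, sql)}
        safe = found if safe is None else safe & found
    return safe or set()
```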
If a table has high write traffic, especially with a lot of deletes and updates, there’s a chance it becomes “fragmented”. MySQL’s table statistics get completely messed up, which can cause a lot of problems (the optimizer picking the wrong index for queries is the most frequent one).
One recent example was with such a table. The table file on disk took 39G, and the table statistics showed that it had only around 173M records. But running a count query on it returned over 370M records. So, the statistics were clearly messed up.
After an alter (which rebuilds the table), the statistics got corrected, and the table size on disk dropped to 21G (almost a 50% reduction). There was also a nice side-effect: execution times for selects on this table dropped significantly:
Make sure you watch your monitoring/analytics tools when upgrading your database. Don’t upgrade everything at once; I’d even suggest running a single host on the new version (they are usually fully backwards compatible) for a while, and checking how it does compared to the others.
Here’s a real-life example. We upgraded one of our hosts to MySQL 8.0.0 recently (from 5.7). It was INSANELY (I can’t stress this enough) underperforming. Disk IO skyrocketed, some queries executed twice as slowly, etc. Needless to say, we decided not to upgrade any other host. After a while, 8.0.22 appeared, and we decided to give it another shot. And voila:
AHI (Adaptive Hash Index)
AHI is a nice feature, but in a database containing only really big tables (>1B records) it becomes a performance killer. To know how it’s performing, just look at the AHI miss ratio. If it looks anything like this, AHI should probably be disabled:
As you can see, our miss ratio was always over 50%, and usually 100%. Clearly, any type of cache/index with a 100% miss ratio is worthless, and keeping it up to date only takes away from performance.
Here’s AHI toggled disabled–enabled–disabled, for comparison:
OS level optimizations
Well, at the moment I’ve got only one thing to mention, and it’s not really an optimization – it’s a hack. A magical one. If someone can explain why it works, please let me know.
So, here’s the trick: if a replica is falling behind (can’t keep up with the master) – run disk reads. And somehow, magically, it speeds the replication up, and the replica catches up. Mikhail Solovyev wrote a script (I won’t post it here) that, when it detects replication lag, runs something like cat /data/mysql/..../random_table_name.ibd. And trust me – it really works. It catches up really quickly. We haven’t figured out why this happens, though we have a wild guess. Also, this is a double-edged sword, as it might overload the host.
This topic is so huge I could probably write about it for days. But at the same time, I don’t have enough time to dedicate to writing, so this article sat almost 90% finished for over 3 weeks. In the end, I decided to publish that 90%, because otherwise it might’ve been forgotten completely.
But even if I had finished it as I’d planned, there would still be many, many other things I didn’t mention. Some were skipped intentionally (because they are either too well known, too convoluted, or just don’t give visible enough results), many I didn’t remember, and even more I’ve never heard of or didn’t have a chance to implement (or see implemented by someone else). So, maybe there’ll be a sequel (follow me on LinkedIn if you want to stay up to date).
If you haven’t had a chance, I recommend reading my previous article, which in turn triggered me to write this one.
5 Replies to “MicroOptimizing MySQL at scale”
great article ikonic!!!
thank you for sharing your experience
Another great article! You got a fantastic domain name, by the way. 🙂
> runs something like this cat /data/mysql/…./random_table_name.ibd. And trust me – it really works. It catches up really quickly.
My first guess is that this forces the file into a memory cache, which improves the read times that MySQL is seeing. Hopefully someone who knows what’s going on internally can comment.
I have just read both of your articles and they are really good “war stories”, thanks for sharing them Alek.
In relation with “Hinting or forcing indexes”, I have made the same mistake of analyzing improvements in “extremely-slow” scenarios and forgot to understand the impact in the 99% of the cases in which queries were running without issues.
“So, it’s essential to nicely benchmark everything before doing this, and to have a quick way to undo the changes.”
In addition to that I would also say that it is important to mention that even if forcing/hinting the index works for all the scenarios now it doesn’t mean that it will still be the best solution in “one year” when the changes in the product cause changes in the data/relations and we lost the engine’s ability to decide the best index for the query. So, even if that “fixes” the problem today it might still be painful at some point in the future.
Fully agreed. Continuous monitoring is a must.