
Category: MySQL

Making large tables smaller: Compression in MySQL

So JF has been busy abusing MySQL again:

  • An Adventure in InnoDB Table Compression (for read-only tables) tries out different KEY_BLOCK_SIZE values in InnoDB to find out which setting yields the best compression for a specific table.

    His sample script copies a table and compresses it with one setting, then does this again with another setting. If the new version is smaller, it keeps the new one; otherwise the previous version is kept. This is repeated until a minimum-sized InnoDB compressed table has been created. JF achieved a compression ratio of 4.53, bringing a 20 TB instance down to 4.5 TB.

  • In Why we still need MyISAM (for read-only tables) he does the same thing with his database in MyISAM format, and then compresses using myisampack, which is ok because his data is read-only archive data.

    MyISAM uncompressed is 22% smaller than InnoDB uncompressed. Compressed, his data is 10x smaller than the raw uncompressed InnoDB data, so his 20 TB of raw data ends up below 2 TB compressed.

Using MyISAM for read-only data is much less critical than it would be for data that is being written to: Data corruption due to the lack of checksums is much less likely, and while the lack of clustered indexes cannot really be compensated, “myisamchk --sort-index” at least keeps the regenerated indexes linear in the MYI files.
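The keep-the-smallest loop described for the InnoDB case can be sketched roughly as follows. This is not JF's script: the copy-table naming, the helper names and the `measure` callback (which would return the on-disk size of the compressed copy) are assumptions made for illustration. The sketch only builds SQL strings and picks the winner; it does not talk to a server.

```python
from typing import Callable, Iterable, List, Tuple


def compress_sql(table: str, key_block_size: int) -> List[str]:
    """Statements that build a compressed copy of `table`.

    The copy name `<table>_kbs<N>` is a made-up convention for this sketch.
    """
    copy = f"{table}_kbs{key_block_size}"
    return [
        f"CREATE TABLE `{copy}` LIKE `{table}`",
        f"ALTER TABLE `{copy}` ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE={key_block_size}",
        f"INSERT INTO `{copy}` SELECT * FROM `{table}`",
    ]


def pick_smallest(sizes: Iterable[int], measure: Callable[[int], int]) -> Tuple[int, int]:
    """Try each KEY_BLOCK_SIZE and keep whichever yields the smallest table.

    `measure(kbs)` is assumed to return the size in bytes of the copy
    compressed with that KEY_BLOCK_SIZE.
    """
    best = None
    for kbs in sizes:
        size = measure(kbs)
        # Keep the new version only if it is smaller than the one we hold.
        if best is None or size < best[1]:
            best = (kbs, size)
    if best is None:
        raise ValueError("no KEY_BLOCK_SIZE candidates given")
    return best
```

In practice `measure` would run the statements from `compress_sql` and then look at the size of the resulting tablespace file; which size wins depends entirely on the data in the table.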


How large can MySQL Replication be in production?

Based on How large can MySQL be in production (article by JF), I have been looking at the awesome Orchestrator visualization of some replication hierarchies.


Names have been removed, but the blue badges indicate some host counts. All of these are installations replicating across multiple cluster and data center boundaries.

Orchestrator is by far the best tool to discover and manage replication setups like these. You can restructure replication topologies with drag and drop.


Using MySQL Partitions (a Python example)

Today somebody had a problem with expiring a large table (a Serendipity Blog table).

In MySQL InnoDB, tables are physically ordered by primary key (InnoDB data is a B+ tree, a balanced tree where the data pages are the leaves of the tree). If you are expiring old data from such a log table, you are deleting from the left hand side of the tree, and since it is a balanced tree, that triggers a lot of rebalancing – hence it is very slow.

If you rename the old table and INSERT … SELECT the data you want to keep back into the original table, that can be faster. But if the data you want to keep is larger than memory, the indexing of the data will still be slow.

Partitions are a nice way to handle log tables. Here is an example. It’s not very cleaned up, but it works on my system.
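As a rough illustration of the idea (not the original script), expiring data from a table that is RANGE-partitioned by day comes down to dropping whole partitions instead of deleting rows. The sketch below assumes a hypothetical table partitioned on TO_DAYS() of a date column, with one partition per day named `pYYYYMMDD`; the function names and the naming scheme are made up. It only generates the ALTER TABLE statements.

```python
from datetime import date, timedelta
from typing import List


def partition_name(day: date) -> str:
    """Hypothetical naming scheme: one partition per day, e.g. p20170611."""
    return f"p{day:%Y%m%d}"


def add_partition_sql(table: str, day: date) -> str:
    """Partition for `day`: holds rows with TO_DAYS(col) below the next day."""
    upper = day + timedelta(days=1)
    return (
        f"ALTER TABLE `{table}` ADD PARTITION "
        f"(PARTITION {partition_name(day)} "
        f"VALUES LESS THAN (TO_DAYS('{upper.isoformat()}')))"
    )


def drop_partition_sql(table: str, day: date) -> str:
    """Dropping a partition discards its rows without row-by-row deletes."""
    return f"ALTER TABLE `{table}` DROP PARTITION {partition_name(day)}"


def rotate(table: str, today: date, keep_days: int, existing: List[date]) -> List[str]:
    """Drop partitions older than `keep_days`, add tomorrow's if it is missing."""
    cutoff = today - timedelta(days=keep_days)
    stmts = [drop_partition_sql(table, d) for d in sorted(existing) if d < cutoff]
    tomorrow = today + timedelta(days=1)
    if tomorrow not in existing:
        stmts.append(add_partition_sql(table, tomorrow))
    return stmts
```

Note that MySQL requires the partitioning column to be part of every unique key of the table, including the primary key, so the table definition has to be designed with this in mind.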


Good Riddance to the Query Cache

MySQL 8.0 will be retiring support for the Query Cache.

The MySQL Query cache is a result cache: The MySQL server will record all result sets that are small enough to keep in the cache, and a hash of the query that produced it. If a query meets certain requirements, and the hash of the same query string is ever seen again, the query will not be actually parsed and executed, but the same result set will be replayed.

There are mechanisms in place that prevent uncacheable queries from being cached in the first place, and that prune outdated data from the query cache.
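To make the mechanism concrete, here is a toy result cache in the same spirit. It is a sketch, not how the server implements it: the class name, the `max_rows` limit, the explicit `tables` argument and the `execute` callback are all assumptions for illustration. It shows the three ingredients described above: results keyed by a hash of the exact query string, a size limit on what gets cached, and pruning when an underlying table changes.

```python
import hashlib
from typing import Callable, Dict, Iterable, List, Set, Tuple


class ToyQueryCache:
    """Toy result cache keyed by a hash of the exact query text."""

    def __init__(self, max_rows: int = 100) -> None:
        self.max_rows = max_rows  # only "small enough" results are kept
        self._results: Dict[str, Tuple[Set[str], List[tuple]]] = {}

    @staticmethod
    def _key(sql: str) -> str:
        # The raw string is hashed, so "select" and "SELECT" are different keys.
        return hashlib.sha256(sql.encode("utf-8")).hexdigest()

    def select(self, sql: str, tables: Iterable[str],
               execute: Callable[[str], List[tuple]]) -> List[tuple]:
        key = self._key(sql)
        hit = self._results.get(key)
        if hit is not None:
            return hit[1]  # replay the stored result, nothing is executed
        rows = execute(sql)
        if len(rows) <= self.max_rows:
            self._results[key] = (set(tables), rows)
        return rows

    def invalidate(self, table: str) -> None:
        """Prune every cached result that read from `table`."""
        self._results = {
            k: v for k, v in self._results.items() if table not in v[0]
        }
```

A write to a table would call `invalidate()` for that table, which is also where the trouble starts on busy systems: every write throws away cached results for that table.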

The query cache exists in the first place because it was easier to create than to teach every PHP CMS developer in the world about sessions. So instead of retrieving the current background color of the current theme from the database over and over, the query cache recognizes the current theme color query and just replays “green” over and over.

But that was then.


Protecting MySQL Network Traffic

Percona Live Talk by Daniël van Eeden: Protecting MySQL Network Traffic.

Warning: It is somewhat more complicated than this:

Slideshare

Check out the performance slide (#22), too.

TL;DR: You want a MySQL compiled against OpenSSL, because of its support for TLS session tickets and AES-NI. YaSSL sucks, hard. With tickets and hardware-accelerated symmetric encryption, TLS support in MySQL is actually no longer slow.

TL;DR 2: MariaDB is actually pretty well positioned here.


Optane/Xpoint and paradigm shift

So Optane is here. It’s a bit-addressable non-volatile storage with a higher density than DRAM. It’s not as good as initially promised, yet, but it’s a first iteration.

It is basically very slow RAM (or very fast flash), which is bit-addressable. So you are not, like with flash, erasing 64 KB sized pages, but you are doing things to individual bits and bytes.

It’s also faster than flash (but slower than DRAM), about 10x faster than old Commodore 64 memory.

And it’s persistent, so if you power off your machine, contents are not gone.

And it is very dense, denser even than the memory you currently use, because no transistors, so less space necessary per bit.

This is going to change a lot of things, but not right now. We need to rethink our approach to persistence.
