Die wunderbare Welt von Isotopp

MySQL: Data for Testing

Avatar of @isotopp@infosec.exchange Kristian Köhntopp - September 26, 2022

Where I work, there is an ongoing discussion about test data generation. At the moment we do not replace or change any production data for use in the test environments, and we don’t generate test data.

That is safe and legal, because production data is tokenized. That is, PII and PCI data is being replaced by placeholder tokens which can be used by applications to access the actual protected data through specially protected access services. Only a very limited circle of people is dealing with data in behind the protected services.

MySQL: Sometimes it is not the database

Avatar of @isotopp@infosec.exchange Kristian Köhntopp - September 19, 2022

Query latencies in one data center are larger than elsewhere for one replication hierarchy, but only in the high percentiles. This impacts production and traffic is being failed away from that data center to protect production.

When the P50 and P90 look okay, but the P99 and the P99.9 do not, the database(s) operate normally, and only some queries are running slow. The initial guess was “for some queries the plan has flipped, but only in that data center.”

MySQL: Artifactory Conclusion

Avatar of @isotopp@infosec.exchange Kristian Köhntopp - September 14, 2022

Two weeks ago I was being drawn into the debug of artdb, the Replication hierarchy used by our Artifactory instance.

TL;DR

Artifactory overloaded the database. This was incident-handled by optimizing a number of slow queries using some covering index trickery, and by upgrading the hardware substantially.

Using the runway we bought, we found and partially fixed the following problems:

  • Fixed: A number of very expensive reporting queries were sped up 16x to 20x using covering indexes, from 180s runtime to 8s-12s runtime.
  • Fixed: We optimized our database and data size by completing the data lifecycle for several repo types, deleting old images.
  • Fixed: Hardware upgrade lowered load even more, and sped up the queries even more.
  • Fixed: Further investigation identified a synchronisation cron job which overloaded the database in 40m intervals. The cron job was throttled and is now on a backlog for refactoring.
  • Fixed: The long running queries were identified in the Artifactory source code, which allowed us and jFrog support to identify that a) the Artifactory internal cache for these queries was bypassed and b) this happens when certain JMX monitoring options are enabled, per application host hitting the database instance. The monitoring was temporarily disabled.
  • Fixed: For each image download, Artifactory writes statistics updates, which can lead to lock contention on hot images. We identified a configuration that could turn this off, and later found that these updates can be happening batched in the background.
  • Unfixed: Artifactory does not understand the concept of Replicas and database scale-out at all, it can only do database scale-up, and that has absolute limits.

The database workload for Artifactory is now well below critical levels and has on the current, upgraded hardware runway for estimated 4x growth. Beyond that, vertical scaling is going to be hard without changes to the architecture of Artifactory.

MySQL: Straight lines

Avatar of @isotopp@infosec.exchange Kristian Köhntopp - September 8, 2022

A database is showing replication delay, and so are all the other instances of the same replication hierarchy, all of which reside in Openstack.

Shortly before 21:30 the database begins to lag, until around 23:45, when it starts to catch up, slowly. After 00:30, we gain delay again, plateau and then around 01:45, we catch up.

The database is moving deep into replication delay sometimes. It does not do that on bare metal.

MySQL: YOLO mode

Avatar of @isotopp@infosec.exchange Kristian Köhntopp - September 2, 2022

OH:

“And now let’s quickly push 2 billion rows into this database VM.”

That is best done in YOLO mode. This is a mode of operation for a database that minimizes disk writes in favor of batched bulk writes.

It is not ACID, so if anything goes wrong during the load, the instance is lost. That is why it is called YOLO mode.

You are supposed to do this on a spare replica and not the production primary. If you are not having at least one more replica than needed in your MySQL deployment, I consider your setup defective.

MySQL: Boiling JFrogs

Avatar of @isotopp@infosec.exchange Kristian Köhntopp - August 25, 2022

A work problem: A commercial application, Artifactory, where we do not control the source or the schema has performance problems involving a certain long running query.

The data size and row counts are not outrageous, and the query itself and the schema are not broken. But the data is very skewed and for certain values the query is very slow, as almost the entire table is selected.

We introduce an experimental covering index, and show a 16x improvement, going from 143s to 9s execution time. We advise the customer to ask for this covering index to be added officially using the normal way through MySQL portal.

MySQL: GIPK (InnoDB and Primary Keys)

Avatar of @isotopp@infosec.exchange Kristian Köhntopp - August 23, 2022

The MySQL InnoDB storage engine lives off primary keys: Records are stored physically in a B+-Tree, a balanced tree where the data pages are the leaves of the primary key B-Tree.

That implies that the data is stored physically in primary key order – records with similar primary keys are usually physically closer together than records with larger differences in primary key value. In literature, such physically ordered indexes are often called clustered indexes.

MySQL: Row Literals

Avatar of @isotopp@infosec.exchange Kristian Köhntopp - August 22, 2022

Question on the Libera/#mysql IRC channel:

Is there a way to split a simple select into multiple returned rows? For example, select 1, 2, 3 to be returned as rows?

This is actually asking for a table literal notation. I know of four ways to construct a table literal in MySQL:

UNION ALL

The oldest way to construct a table literal in any SQL that supports UNION is the UNION ALL construct. Write SELECT statements to return literal rows, and add them together to a table using UNION ALL:

Good Energy News

Avatar of @isotopp@infosec.exchange Kristian Köhntopp - August 15, 2022

Two good news in one day. First, my energy provider tells me:

Levert u meer terug dan u van het net afneemt, dan betalen we u vanaf die datum 16.8 eurocent per kWh. De oude vergoeding was 7 eurocent per kWh.

“If you push energy into the grid, we now pay you 16.8 Eurocent per kWh. The old payment was 7.0 Eurocent per kWh.”

The other good news is that it is very likely that the existing “Salderingsregeling” may be continued until 2025. At least that seems to be the plan, says milieu centraal .

MacOS: How to prevent screen blanker

Avatar of @isotopp@infosec.exchange Kristian Köhntopp - July 30, 2022

When running Keynote, MacOS prevents the screen blanker from kicking in. I needed a similar thing in one of my Python applications, so I needed to find out how it does that.

It turns out, there is an API for that: beginActivity:withOptions: , which returns an Object token activity. Calling endActivity with that Token ends the activity, and resumes the screen blanker countdown.

There is a list of activity options, NSActivityOptions . We are interested into NSActivityIdleDisplaySleepDisabled and NSActivityUserInteractive, mostly.