Die wunderbare Welt von Isotopp

MySQL: Backups and Replication

Kristian Köhntopp - November 27, 2020
There was a question at work about MySQL backups and restore. I needed to explain more. We use databases to make state persistent. That is: As a developer you can think of your database as a single giant, structured global variable with a weird access method, and to make things worse, concurrent access. A database is just a global variable to your code We can log statements that change the state of our database in a log.

Gitlab in Docker

Kristian Köhntopp - November 22, 2020
These installation notes are mostly a note to myself, documenting the installation process of a Gitlab Omnibus Container in Docker, plus Gitlab Runners. OS Setup We are installing into /export/gitlab, a 10G xfs slice from the local flash pool: # lvcreate -n gitlab -L 10G data # mkfs -t xfs /dev/data/gitlab # mkdir /export/gitlab # mount /dev/data/gitlab /export/gitlab # echo "/dev/data/gitlab\t/export/gitlab\txfs\tbsdgroups,usrquota,grpquota,attr2,nofail,noatime 1 2" >> /etc/fstab # mkdir /export/gitlab/{gitlab,gitlab-runner} # mkdir /export/gitlab/gitlab/{config,data,logs} Docker We are using docker-compose to run this, with a .

On the Observability of Outliers

Kristian Köhntopp - November 19, 2020
At work, I am in an ongoing discussion with a number of people on the Observability of Outliers. It started with the age-old question “How do I find slow queries in my application?” aka “What would I want from tooling to get that data and where should that tooling sit?” As a developer, I just want to automatically identify and isolate slow queries! Where I work, we do have SolarWinds Database Performance Monitor aka Vividcortex to find slow queries, so that helps.

My home sensor network

Kristian Köhntopp - November 15, 2020
I have been asked to document my home sensor network. Being married to a person with a background in web security sets boundary conditions: No cloud. We are running all services locally. No control, only metrics. I am collecting data from a number of plugs with power meters over Wi-Fi, using the MQTT protocol. I am also collecting data from a number of temperature sensors over Zigbee, and convert to MQTT.

Rechenzentren und ihren Stromverbrauch regulieren

Kristian Köhntopp - November 1, 2020
Es gibt ein Interview mit Stefan Ramesohl vom Umweltministerium (des Bundes) in Netzpolitik.org: “Warum niemand weiß, wie viele Rechenzentren es in Europa gibt ”. Im Wesentlichen hat das Umweltministerium angesagt, daß es auf europäischer Ebene Rechenzentren erfassen und katalogisieren will, um in einem zweiten Schritt den Energieverbrauch von Rechenzentren zu regulieren. Das ist sehr spannend, denn derzeit gibt es keine Übersicht über Rechenzentren in Europa, und tatsächlich sind einige Rechenzentrumsbetreiber sehr paranoid, was den genauen Standort ihrer Hardware angeht und wieviel und welche Hardware darin ist oder was diese tut.

MySQL: Ecosystem fragmentation

Kristian Köhntopp - October 28, 2020
Sometimes things change in a way that is hard to put a finger on, but I am doing this MySQL thing since 3.23, and commercially since 2005, and the environment is changing. These days, when you talk to people in need of MySQL, the first thing you have to ask them is “Which MySQL”. And by that I do not mean a version number in the first place. The answer may be:

Safe Biking: It's not the right of way that's wrong

Kristian Köhntopp - October 22, 2020
Sven Geggus trolled me . A bunch of nerds were speaking about what’s wrong with biking in Germany, and he wrote: Tweet : Der @isotopp wohnt doch in Holland. Wo sind denn bei euch die Vorfahrtsregeln anders und könnte man da was sinnvoll für .de übernehmen? – @isotopp is living in the Netherlands. So how is the right of way different and how could .de learn from this? Biking in the Netherlands does not suck, and that is not because of any specific traffic rules being any different, but because traffic is fundamentally different, and that somehow escalated.

Fertig gespielt: Assassins Creed: Odyssey

Kristian Köhntopp - October 19, 2020
Ok, so the title is a bit of an exaggeration, I am about 80% done with the main games quests. Still, the shape and feel of the story is firmly established and it is fascinating. I have some 110 hours in “Assassins Creed: Origin”, and some 65 hours or so in “Assassins Creed: Odyssey”. You can play “Assassins Creed: Odyssey” as Alexios or Kassandra, and the change is permanent for the playthrough for storytelling reasons.

Vijzelbuurt is changing

Kristian Köhntopp - October 8, 2020
Back when I was still commuting to work in an office, in the far past, I used to be in the Spaces building in Vijzelgracht in Amsterdam, every day. Mark Wagenbuur of BicycleDutch has been there today, and tweeted : Amsterdam is changing dramatically. I was in Vijzelstraat for the first time in about 2 years and I was pleasantly surprised! Looked up the old situation in Google StreetView for some before and afters.

An unexpected pool size increase

Kristian Köhntopp - October 7, 2020
At work, replication chains have a single primary database node, to which you write, and then multiple replicas, in multiple AZs. Here is what the one sample chain looks like in Orchestrator: instance-918d is the current primary, in the blue AZ. Replicas in orange and green are in other AZs. Blue badges indicate multiple replicas, eg (38) means 38 machines. When you talk to a database, you get two database handles:

IT modernisieren und konsolidieren

Kristian Köhntopp - October 5, 2020
Ich schrieb in einem Twitter Thread über Posix Dateisysteme vs. Object Stores : UNIX FS ist 1974. BSD FFS ist 1984. XFS ist 1994. ZFS (und Btrfs und Wafl) sind LFS, also 2004. Object Storages, LSM, “RocksDB” ist ca. 2014, um den Takt zu halten. und wurde gefragt: “Was kommt 2024”. Meine halb spöttische, halb ernst gemeinte Antwort war: Irrelevant. 2024 läuft Dein Code serverless bei einem professionellen Betreiber und vom lokalen System und dem lokalen Dateisystem kriegst Du nix mehr zu sehen außer einer monatlichen Rechnung.

What are the problems with POSIX?

Kristian Köhntopp - October 5, 2020
Every once in a while there is the IT news article that kind of triggers me. This time it was “Object-Storage-Protokoll könnte Posix ablösen” in german computer news site Golem . The article speaks about mmap(), NVMEoF and object storage and how it could revolutionize or complete object storages, but does not link to an original article, names no persons and no paper. Also, what do these things - mmap, NVMEoF, object storage and Posix, even have in common?

MySQL: Import CSV, not using LOAD DATA

Kristian Köhntopp - September 28, 2020
All over the Internet people are having trouble getting LOAD DATA and LOAD DATA LOCAL to work. Frankly, do not use them, and especially not the LOCAL variant. They are insecure, and even if you get them to work, they are limited and unlikely to do what you want. Write a small data load program as shown below. Not using LOAD DATA LOCAL The fine manual says :

Importing account statements and building a data warehouse

Kristian Köhntopp - September 26, 2020
This is an update and translation of a much older article , which I wrote in German Language back then. I was experimenting with importing the account statements from my German Sparkasse, which at that time were being made available as a CSV. The initial data load The data looked like this: $ head -2 /home/kris/Documents/banking/umsatz-22758031-29122004.csv "Local Account";"Book Date";"Valuta Date";"Transaction Type"; "Purpose"; "Remote Party";"Remote Account";"Bank Code"; "Amount";"Currency";"Info" "08154711";"30.12";"30.12.05";"Direct Debit"; "DRP 08154711 040441777 INKL.

MySQL: automatic partitions surely would be nice

Kristian Köhntopp - September 25, 2020
In Deleting data we have been looking at a process that loads data into MySQL, leveraging partitions to make it easier and faster to later get rid of the data again. For this, we created three processes, a data loader process, and two observers - one for creating partitions, and one for deleting them. The observer processes have been running ANALYZE TABLES and then polling INFORMATION_SCHEMA.PARTITIONS every 1/10th of a second to check if intervention is needed.

MySQL: Deleting data

Kristian Köhntopp - September 24, 2020
Completing the data lifecycle is often harder than originally expected: Deleting data can sometimes cost way more than inserting it in the first place. MySQL Partitions can offer a way out. We have an earlier post on the subject. A sample table, and a problem statement Let’s define a kind of log table, to which data is added with an auto_increment id value and some data. #! /usr/bin/env python3 from time import sleep from random import randint from multiprocessing import Process import click import MySQLdb import MySQLdb.

MySQL: Provisioning .mylogin.cnf

Kristian Köhntopp - September 23, 2020
MySQL uses connection and config parameters from a number of possible sources. The easiest way to find out where it is looking for config files is to run $ mysql --help | grep cnf order of preference, my.cnf, $MYSQL_TCP_PORT, /etc/my.cnf /etc/mysql/my.cnf /Users/kkoehntopp/homebrew/etc/my.cnf ~/.my.cnf As can be seen, my version of the MySQL client checks in this order /etc/my.cnf /etc/mysql/my.cnf /Users/kkoehntopp/homebrew/etc/my/cnf ~/.my.cnf The cnf file is a file in dot-ini syntax, so you have [groups] and each group contains lines with key = value pairs.


Kristian Köhntopp - September 22, 2020
A question to the internal #DBA channel at work: »Is it possible to change a column type from BIGINT to VARCHAR ? Will the numbers be converted into a string version of the number or will be it a byte-wise transition that will screw the values?« Asking yielded more information: »The use-case is to have strings, to have UUIDs.« So we have two questions to answer: Is ALTER TABLE t CHANGE COLUMN c lossy?

MySQL: Encoding fields for great profit.

Kristian Köhntopp - September 18, 2020
Iterating schemas over time is not an uncommon thing. Often requirements emerge only after you have data, and then directed action is possible. Consequently, working on existing data, and structuring and cleaning it up is a common task. In today’s example we work with a log table that logged state transitions of things in freeform VARCHAR fields. After some time the log table grew quite sizeable, and the log strings are repeated rather often, contributing to the overall size of the table considerably.

Mein selbstfahrendes Auto hat Kameras, oh nein!

Kristian Köhntopp - September 17, 2020
Bei der ARD ist man empört! Teslas Kameras: ARD rückt Datenschutzbedenken ins Licht . Will sagen, jemand hat älteren Personen gesteckt, daß ein selbstfahrendes Auto wenig überraschend Kameras braucht, um bei seiner bestimmungsgemäße Verwendung weniger ältere Personen überzumöllern. So weit so wenig überraschend. Bei einem Tesla sind es 9, bei anderen Automobilen anderer Hersteller noch mehr, und Radar, und Lidar und weitere Sensorsysteme. Problematisch sind nur Kameras, weil Menschen als visuelle Wesen zum Bild eine emotional andere Beziehung haben als zu anderen Daten, die objektiv nicht weniger kritisch sind.