Die wunderbare Welt von Isotopp

MySQL: The command line client

When asking for help in Libera Chat , in the #mysql channel, people will ask you to use the mysql command line client. They will also point you to dbfiddle.uk for asking questions. Specifically, when using phpMyAdmin, you will get hate.
Why is that?
When asking for help, it is almost impossible to help a GUI user, because they will need to paste screenshots in order to document what they did.
Meditations on Quitting

I quit often. At least once a year, but there have been years when I have been quitting four times.
Every time before a performance evaluation, or before important meetings, I sit down and write myself a notice. I pull up a word processor, start the empty business letter template, fill in the details and the date, and then write the three or four sentences necessary to inform my employer that the time has come to part ways.
From Hadoop to HTAP?

For the last 15 years, one popular way to persist large amounts of data has been Hadoop, if you needed them persisted in such a way that you can still process them.
Of course, from a database point of view, brute forcing a result by scanning compressed CSV files in parallel, and then building giant distributed Hash Joins is not very elegant. But two facts were true in 2006 influenced Hadoop’s design, and which allowed it to process data at all at a scale where all other things failed:
Tying the BI pipeline together

In Of Stars and Snowflakes we have been looking at the “normal Form” for Data Warehouses/BI structures, and how it differs from normal forms used in transactional systems. In ETL from a Django Model we looked at one implementation of a classical offline DWH with a daily load.
The normal BI structure is a fact table, in which an object identifier (the one we collect facts about) is paired with a point in time to report facts about the object at a certain point in time.
Ansible: List Cross-Join

A friend asked in Discord:
I need a pointer to a solution in Jinja.
Given two lists, x: [a,b,c] and y: [d,e,f], I need the cross-join ["a.d","a.e","a.f","b.d",…,"c.e","c.f"]. I know how to cross-join, but that then is a list of lists, and I want join the inner lists.
After some experimentation the result was a set of nasty templating loops. There has to be a better way.
There are two:
Ansible Custom Filters in Python Playbook We want a custom filter cross, which produces the desired result.
ChatGPT and Limits

Like everybody else, I have been playing with ChatGPT from OpenAI . Specifically, I wanted to test how it could be used as a coding assistant, and what the limits are in terms of size and complexity.
Code Generation I have been using the Labyrinths example as a base. My goal was to have ChatGPT write the Labyrinth class for me.
I did so interactively.
Kris:
Write an empty Python class named Labyrinth
2FA für Mastodon

Multi-Factor Authentication (Identifikation mit mehreren Faktoren) oder 2FA (Two-Factor-Authentication) sind ein Weg, einen Account vor der Übernahme durch Dritte zu schützen.
Statt sich mit Usernamen und Passwort anzumelden ist zusätzlich noch eine wechselnde Pseudozufallszahl notwendig. Diese wird von einem Seed-Wert generiert, der durch eine Buchstabenfolge oder einen QR-Code repräsentiert wird.
Authenticator Anwendung installieren Das Verfahren ist standardisiert und wird von vielen Tools unterstützt. Dazu gehören Google Authenticator, Bitwarden und viele andere Passwortmanager.
Change Data Capture

Change Data Capture is a way to capture, well, events from a system that describe how the data in the system changed. For a system that does business transactions that may be at the lowest level Create, Update, or Delete of entities or relationships. Systems that emit this kind of events are called Entity Services and are kind of the lowest level of events that you can have in such a system.
USENET und Tiernetze

Vor ziemlich genau 30 Jahren gab es in Deutschland die Anfangsgründe des Internet , aber es gab auch Netze, die auf anderer, viel älterer Technologie betrieben wurden – die Mailboxnetze. Das sind dezentrale Netze, bei denen denen lokale Rechner mit Modems ausgestattet wurden, bei denen man anrufen und dann Nachrichten a la Mastodon online lesen konnte. Oder man hatte Software daheim, die bei der Mailbox anrief, die Nachrichten heruntergeladen hat. Dann konnte man offline lesen, Antworten schreiben und ein zweites Mal anrufen.
Systemd and docker -H fd://

Based on what I learned in Systemd Service and Socket Activation and Systemd Service and stdio , we can now have a look at Docker.
The code for -H fd://-Handling is here . The file descriptors are coming from activation.Listeners(), and are in the listeners slice. In our case, the part after the fd:// is empty, so lines 83-85 are activated, and the incoming fd’s are passed to the Docker proper.
Systemd Service and stdio

After yesterday’s article, Arne Blankerts pointed me at a note showing how to install a program using stdio with systemd.
Code and Unit files The code:
#! /usr/bin/env python3 import sys if __name__ == "__main__": while True: line = input().strip() print(f"ECHO: {line}") if line == "QUIT": sys.exit(0) The Socket Unit:
$ systemctl --user cat kris2.socket # /home/kris/.config/systemd/user/kris2.socket [Unit] Description=My second service PartOf=kris2.service [Socket] ListenStream=127.0.0.1:12346 Accept=Yes [Install] WantedBy=sockets.target And the Service Unit, which has to be a template:
Systemd Service and Socket Activation

In today’s Yak Shaving session I needed to understand how to expose the docker socket of a remote machine over the network. You should not do that, it is totally insecure, but I needed to do that to test something.
Socket Activation I discovered that dockerd is running with -H fd://.
# ps axuwww | grep docker[d] root 1616732 0.5 0.1 2930892 52168 ? Ssl 15:32 2:25 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.
ETL from a Django Model

Continued from last weeks article on data warehouses.
At work, I was tasked with building a capacity model for data center growth. The basic assumption of these things is often that the future behaves similarly to the past, so the future predicted capacity model is somehow an extension of past growth. I needed old server usage data, and was indeed able to find that in one of our systems, called ServerDB.
Of Stars and Snowflakes

A sample system When you have an Online Transactional Database, you have to record transactions at some point in time. That means you get a table with time dimension in your OLTP system. Consider for example a system that records Reservations. Users exist and can reserve Things to use, for a day.
You probably get a structure such as this:
In an OLTP database, a reservation is a (resid, userid, thingid, date).
Databases on un-RAID-ed storage?

Where I work, we run bare-metal databases on non-redundant local storage. That is, a database is a very cheap frontend blade server. It has 2 CPUs, with 8 cores/16 threads each. It contains 128 GB of memory, 2 or 4 TB of local NVME and it has a 10 GBit/s network interface. It costs around 120 to 150 Euro per month to run for 5 years, including purchase price and all datacenter costs.
Bandwidth, IOPS and Latency

A harddisk from 1998.
The opening image for this post shows the stock photo of a hard disk platter. You can see a movable arm that can ride in and out of a stack of rotating platters coated with some kind of metal oxide. We sometimes call this kind of storage condescendingly “rotating rust”, when in reality it is a triumph of material science.
Moving an arm costs time, and bringing that arm into the right position and then waiting until the right segment of disk rotates underneath it so that we can write things to disk takes time.
Proper O11y for MySQL

Three years ago, I learned that due to SREcon, Charity Majors was in Amsterdam. I set up a meeting between Benjamin Tyler, Yves Orton and a few more colleagues of mine, and her. That is, because apparently in a case of co-evolution, our company internal “Events” system and Honeycombs observability tooling, modelled after experiencing Fabooks “Scuba” seemed to be doing a lot of the same things.
These days, we are using Honeycomb a lot to record events, and debug code running in distributed systems.
Software Supply Chain Issues

The GitHub Security Lab has a long hard look at “Apache Commons Text” in March this year. That resulted in CVE-2022-42889 . The exploit goes like this:
final StringSubstitutor interpolator = StringSubstitutor.createInterpolator(); String out = interpolator.replace("${script:javascript:java.lang.Runtime.getRuntime().exec('touch /tmp/foo')}"); System.out.println(out); Next to ${script:...} there are apparently also a ${url:...} and `${dns} as other unsuitable substitutions, and they nest.
This was fixed in October 2022, after being reminded by GHSL in May and August.
Groups and Places

In a distributed, asynchronous environment, there is a need for distributed, asynchronous interaction. This interaction is often written, but “writing” these days is actually a media-rich process that includes much more than letters. It also needs to be able to build some structure, and some gateway to level up to more synchronous and even richer communication.
Let’s have a chat about chats, and what properties they have.
Historically, chat was lines of text, without much structure.
Pan Narrans and Better Meetings

When you are looking for a better Remote First culture, you are looking for better meetings. If you go for better meetings, you will also have fewer of them.
“The anthropologists got it wrong when they named our species Homo sapiens (‘wise man’). In any case it’s an arrogant and bigheaded thing to say, wisdom being one of our least evident features. In reality, we are Pan narrans, the storytelling chimpanzee.