Skip to content

Category: Computer Science

The road to hell is paved with outdated passwords…

So I am using Chrome in a corporate context. Outdated password regulations force me to increment my password every three months. The reason for that is well understood (PCI compliance), but can’t be changed from inside the corporation.

Previously, Chrome stored my passwords in the Apple Keychain. So I could script this, using /usr/bin/security and push my password change into all saved passwords, or, alternatively, bulk delete all those old passwords.

Recent Chrome does not do that any more.


PHP: Understanding unserialize()

The history of serialize() and unserialize() in PHP begins with Boris Erdmann and me, and we have to go 20 years back in time. This is the day of the prerelease versions of PHP 3, some time in 1998.

Boris and I were working on Code for a management system for employee education for German Telekom. The front side is a web shop that sells classes and courses, the back end is a complex structure that manages attendance, keeps track of a line manager approval hierarchy and provides alternative dates for overfull classes.

In order to manage authentication, shopping carts and other internal state, we needed something that allowed us to go from a stateless system to a stateful thing, securely. The result was PHPLIB, and especially the code in

That code contained a function serialize(), which created a stringified representation of a PHP variable and appended it to a string. There was no unserialize() necessary, because serialize() generated PHP code. eval() would unserialize().


Making large tables smaller: Compression in MySQL

So JF has been busy abusing MySQL again:

  • An Adventure in InnoDB Table Compression (for read-only tables) is trying out different KEY_BLOCK_SIZES in InnoDB to find out which settings yield the best compression for a specific table.

    His sample script copies a table and compresses it with one setting, then does this again, and if the new version is smaller, keeps the new one. Otherwise the first version of the table is being kept. This is then done over and over until a minimum sized InnoDB compressed table has been created. JF managed to create a compression ratio of 4.53, bringing a 20 TB instance down to 4.5TB.

  • In Why we still need MyISAM (for read-only tables) he does the same thing with his database in MyISAM format, and then compresses using myisampack, which is ok because his data is read-only archive data.

    MyISAM uncompressed is 22% smaller than InnoDB uncompressed. Compressed, his data is 10x smaller than the raw InnoDB uncompressed, so his 20TB raw data is below 2T compressed.

Using MyISAM for read-only data is much less critical than it would be for data that is being written to: Data corruption due to the lack of checksums is much less likely, and while the lack of clustered indexes can not really be compensated, “myisamchk –sort-index” is at least keeping the regenerated indexes linear in the MYI files.

1 Comment

Monitoring – the data you have and the data you want

So you are running systems in production and you want to collect data from your systems. You need to build a monitoring system.

That won’t work and it won’t scale. So please stop for a moment, and think.

What kind of monitoring do you want do build? I know at least three different types of monitoring system, and they have very different objectives, and consequently designs.

Three types of Monitoring Systems

The first and most important system you want to have is checking for incidents. This Type 1 monitoring is basically a transactional monitoring system:


Scaling, automatically and manually

There is an interesting article by Brendan Gregg out there, about the actual data that goes into the Load Average metrics of Linux. The article has a few funnily contrasting lines. Brendan Gregg states

Load averages are an industry-critical metric – my company spends millions auto-scaling cloud instances based on them and other metrics […]

but in the article we find Matthias Urlichs saying

The point of “load average” is to arrive at a number relating how busy the system is from a human point of view.

and the article closes with Gregg quoting a comment by Peter Zijlstra in the kernel source:

This file contains the magic bits required to compute the global loadavg figure. Its a silly number but people think its important. We go through great pains to make it work on big machines and tickless kernels.

Let’s go back to the start. What’s the problem to solve here?

Leave a Comment

What is a Service Mesh? What is the purpose of Istio?

Service Mesh

An article by Phil Calçado explains the Container Pattern “Service Mesh” and why one would want that in a really nice way.

Phil uses early networking as an example, and explains how common functionality needed in all applications was abstracted out of the application code and moved into the network stack, forming the TCP flow control layer we have in todays networking.

A similar thing is happening with other functionality that all services that do a form of remote procedure call have to have, and we are moving this into a different layer. He then gives examples of the ongoing evolution of that layer, from Finagle and Proxygen through Synapse and Nerve, Prana, Eureka and Linkerd. Envoy and the resulting Istio project of CNCF are the current result of that development, but the topic is under research, still.

Leave a Comment

So you want to write a Shell script

So some people, companies even, have guidelines that describe how to write shell scripts, or even unit tests for shell scripts, as if “UNIX Shell” was a programming language. That’s wrong.

“Modern Shells” are based on a language that has been written without a formal language specification. The source looked like this, because somebody didn’t like C and wanted Algol, abusing the preprocessor. The original functionality and language rules had to be reverse engineered from that source, and original shell has a lot of weird rules and quirks:

  • You can use the caret, ‘^’, as replacement for the pipe symbol, ‘|’.
  • Check out the section »Consider a variable which has been picked up by the shell from the environment at startup. Modifying this variable creates a local copy.« in that document, especially the part where they explain this:
    If you call a script directly from a bourne shell (“./script” without shebang),  then the shell only forks off a subhell and reads in the script.
    The split between original and local copy of the variable is still present in the subshell.But if the script is a real executable with #! magic, or if another sh is called, then fork and exec is used and only the original unmodified variable will be visible.

And it gets better if you go down the entirety of that particular document.

If you think Unix Shell is a survivable programming environment, good luck, and please take your code with you while you leave.


AMS-IX CEO leaves

AMS-IX press release: »After 17.5 years, Job Witteman will leave AMS-IX as of October 1st. Job Witteman, who was the founder of the world’s largest internet exchange, will leave at the peak of development of the company, […]«


»AMS-IX has grown to be the world’s largest internet exchange, having now more than 900 customers, operating 7 internet platforms globally and a peak of internet traffic of 5.5 Tbps. In addition, AMS-IX is considered by the Dutch Government as the third main port, together with Schiphol Airport and the port of Rotterdam. This shows that the ongoing development of the digital sector and infrastructure is taken seriously.«


How large can MySQL Replication be in production?

Based on How large can MySQL be in production (article by JF), I have been looking at the awesome Orchestrator visualization of some replication hierarchies.

Click for large

Names have been removed, but the blue badges indicate some host counts. All of these are installations replicating across multiple Cluster/Data Center Boundaries.

Orchestrator is by far the best tool to discover and manage replication setups like these. You can restructure replication topologies with drag and drop.