
Category: Erklärbär

Hashes in Structures

In Hashes and their uses we talked about hash functions in general, and cryptographic hashes in particular. We wanted four things from cryptographic hashes:

  1. The hash is fast to calculate on a large string of bytes.
  2. The hash is slow to reverse (i.e. it can only be reversed by trying all messages and checking each result).
  3. The hash is slow to find collisions for (i.e. it’s hard to find two input strings that have the same hash value).
  4. The hash cascades changes chaotically (i.e. a single bit flip in the original message flips many bits in the hash value).
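
The fourth property can be observed directly. The following is a minimal sketch, assuming SHA-256 from Python's hashlib as the cryptographic hash; the message and the helper name bit_difference are made up for illustration. It flips one bit of the input and counts how many bits of the hash value change.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Illustrative message; any byte string would do.
message = bytearray(b"The quick brown fox jumps over the lazy dog")
original = hashlib.sha256(message).digest()

message[0] ^= 0x01  # flip a single bit in the first byte
flipped = hashlib.sha256(message).digest()

changed_bits = bit_difference(original, flipped)
print(f"{changed_bits} of {len(original) * 8} bits changed")
```

For a hash with good cascading behaviour, roughly half of the 256 output bits should differ.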

With these things and general cryptography we can build three very versatile things that see many applications: digital signatures, eternal logfiles (“blockchains”) and hash trees (“torrents”).


Hashes and their uses

A hash function is a function that maps a large set of arbitrary data values onto a smaller set of contiguous integers.

This simple hash function maps strings of arbitrary length to integers. Some strings are mapped to the same integer: a hash value collision.

The base set here is a set of strings of arbitrary length, which is theoretically unbounded in size. The target is a bounded number of integer values. It is thus inevitable that two strings exist which are mapped to the same target number, a hash value collision.
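
The inevitability of collisions can be shown with a toy example. This is a sketch only: toy_hash and find_collisions are hypothetical names, and the byte-sum function is deliberately weak and is not the function used in the original illustration. Nine different strings are mapped to eight possible values, so at least two of them must share one.

```python
from collections import defaultdict


def toy_hash(text: str, buckets: int = 8) -> int:
    """Sum the UTF-8 bytes of text and reduce the sum modulo the bucket count."""
    return sum(text.encode("utf-8")) % buckets


def find_collisions(items, buckets: int = 8) -> dict:
    """Group items by hash value and keep only values hit more than once."""
    by_value = defaultdict(list)
    for item in items:
        by_value[toy_hash(item, buckets)].append(item)
    return {value: group for value, group in by_value.items() if len(group) > 1}


# Nine inputs, eight possible outputs: a collision is unavoidable.
collisions = find_collisions(list("abcdefghi"))
print(collisions)  # {1: ['a', 'i']}
```

Because the toy function only sums bytes, any two anagrams such as "listen" and "silent" collide as well, which is exactly the kind of weakness a cryptographic hash has to avoid.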

Hash functions are useful in computer science, and you have used them in everyday life, or at least seen them:

  • as checksums
  • to quickly assign a position to an arbitrary object
  • or to create object identity from content.

Conway’s Law

Melvin Conway is a compiler developer and systems designer, who is well known for the eponymous Conway’s Law. Various phrasings of it exist, and a popular one is

Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations.

The original paper and an introductory paragraph can be found on his website. It’s worth reading, because there are more useful insights to be found in the original writeup.

So what does this even mean? Can you give examples from your current or previous work environments?


The Sonos Rage Wave

Update in progress…

Sonos shipped an update – and it contained a revised privacy statement. The new privacy statement can also be found here (de, en). Go read it, it’s shockingly well written.

Even more so considering the really complex situation Sonos is in – as an independent platform for streaming music from dozens of services, and in the future as a platform for digital assistants, they have a bundle of multilateral legal and contractual obligations that they need to handle on top of maintaining a technologically demanding product.

A case of really bad journalism can be found at Heise (Article in German):


Community Management?

Today is a weird day. The first thing is a friend asking for help with community management. And the next thing is Fefe reiterating his longstanding fallacy (Rant in German) that programmers are able to do anything just because they are able to do one thing (here: Community Management).

The TL;DR is that he rants against non-programmers showing interest in programming projects, because the software is actually useful, ruining everything.

Dabei ist es so einfach, sich in einem Projekt Respekt zu erarbeiten. Leiste einfach was. Erwarte nichts als Gegenleistung. Problem: Jede Minute über dich oder deine Leistungen reden macht 10 Minuten tatsächliche Leistung kaputt.

But it is easy to get respect in a project: Just show something useful. Don’t expect a return. Problem: Every minute speaking about yourself or your results ruins ten minutes of actual useful work.

That is, of course, nonsense. It just shows, like his example about the closed umatrix bug tracker, a complete lack of understanding of the communication situation and a failure to organise the communication efficiently.


PHP: Understanding unserialize()

The history of serialize() and unserialize() in PHP begins with Boris Erdmann and me, and we have to go 20 years back in time. These were the days of the prerelease versions of PHP 3, some time in 1998.

Boris and I were working on code for a management system for employee education for German Telekom. The front end is a web shop that sells classes and courses, the back end is a complex structure that manages attendance, keeps track of a line manager approval hierarchy and provides alternative dates for overfull classes.

In order to manage authentication, shopping carts and other internal state, we needed something that allowed us to go from a stateless system to a stateful thing, securely. The result was PHPLIB, and especially the code in

That code contained a function serialize(), which created a stringified representation of a PHP variable and appended it to a string. There was no unserialize() necessary, because serialize() generated PHP code. eval() would unserialize().
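
The idea of a serializer that emits source code can be sketched in a few lines. This is a Python analogue under stated assumptions, not the PHPLIB code: the function name serialize is borrowed from the text, the variable names cart and user are invented, and Python's exec() stands in for PHP's eval().

```python
def serialize(name: str, value) -> str:
    """Return one line of Python source that recreates value under name."""
    return f"{name} = {value!r}\n"


# Build up the saved state by appending generated code to a string.
state = ""
state += serialize("cart", ["course-101", "course-202"])
state += serialize("user", {"id": 42, "approved": True})

# "Unserializing" is nothing more than running the generated code.
namespace = {}
exec(state, namespace)

print(namespace["cart"])  # ['course-101', 'course-202']
```

The catch with this design is that restoring state means executing whatever is in the string, so it is only as trustworthy as the place the string was stored.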


Monitoring – the data you have and the data you want

So you are running systems in production and you want to collect data from your systems. You need to build a monitoring system.

That won’t work and it won’t scale. So please stop for a moment, and think.

What kind of monitoring do you want to build? I know at least three different types of monitoring system, and they have very different objectives, and consequently very different designs.

Three types of Monitoring Systems

The first and most important system you want to have is checking for incidents. This Type 1 monitoring is basically a transactional monitoring system:


Scaling, automatically and manually

There is an interesting article by Brendan Gregg out there, about the actual data that goes into the Load Average metrics of Linux. The article has a few funnily contrasting lines. Brendan Gregg states

Load averages are an industry-critical metric – my company spends millions auto-scaling cloud instances based on them and other metrics […]

but in the article we find Matthias Urlichs saying

The point of “load average” is to arrive at a number relating how busy the system is from a human point of view.

and the article closes with Gregg quoting a comment by Peter Zijlstra in the kernel source:

This file contains the magic bits required to compute the global loadavg figure. Its a silly number but people think its important. We go through great pains to make it work on big machines and tickless kernels.

Let’s go back to the start. What’s the problem to solve here?


So you want to write a Shell script

So some people, companies even, have guidelines that describe how to write shell scripts, or even unit tests for shell scripts, as if “UNIX Shell” was a programming language. That’s wrong.

“Modern Shells” are based on a language that has been written without a formal language specification. The source looked like this, because somebody didn’t like C and wanted Algol, abusing the preprocessor. The original functionality and language rules had to be reverse engineered from that source, and the original shell has a lot of weird rules and quirks:

  • You can use the caret, ‘^’, as replacement for the pipe symbol, ‘|’.
  • Check out the section »Consider a variable which has been picked up by the shell from the environment at startup. Modifying this variable creates a local copy.« in that document, especially the part where they explain this:
    If you call a script directly from a Bourne shell (“./script” without shebang), then the shell only forks off a subshell and reads in the script.
    The split between original and local copy of the variable is still present in the subshell. But if the script is a real executable with #! magic, or if another sh is called, then fork and exec is used and only the original unmodified variable will be visible.

And it gets better if you go down the entirety of that particular document.

If you think Unix Shell is a survivable programming environment, good luck, and please take your code with you while you leave.


Zero Factor Authentication

Dear Internet, Today I Learned that oath-toolkit exists in Homebrew.

So, this is a thing:

$ brew install oath-toolkit
$ alias totp='oathtool --totp -b YOURSECRET32BLA | pbcopy'
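
What that alias computes can be sketched in plain Python. This is a minimal implementation of the TOTP algorithm from RFC 6238, assuming the usual defaults of HMAC-SHA1, a 30 second time step and 6 digits; the function name totp and its parameters are chosen for illustration and are not part of oath-toolkit.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32: str, at=None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code from a base32-encoded shared secret."""
    # Pad the secret so that b32decode accepts lengths that are not a multiple of 8.
    padded = secret_b32.upper() + "=" * (-len(secret_b32) % 8)
    key = base64.b32decode(padded)
    # The moving factor is the number of time steps since the Unix epoch.
    counter = int((time.time() if at is None else at) // step)
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects a 4-byte window.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# RFC 6238 test secret "12345678901234567890", base32-encoded.
print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", at=59, digits=8))  # 94287082
```

The code depends on nothing but the secret and the clock, so anybody who can read the alias can produce the second factor.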

And so is this:

#! /usr/bin/env expect -f
set totp [ exec oathtool --totp -b MYSECRET7W22 ]
spawn ssh
expect "Password:"
sleep 1
send "thisIsN0t1GoodPaszwort@\r"
expect "Two Factor Token:"
sleep 1
send "$totp\n"

Yup, it’s totally possible to laugh and cry at the same time.