
Category: Computer Science

How do people develop for MacOS at scale?

So, how do people develop for MacOS at scale?

Normal people throw compile jobs at their Kubernetes cluster, fanning a compile out across two racks full of 50-core machines, which gives you some 4000 cores to play with for distributed compiles.

Is there a MacOS LLVM docker image that runs the Xcode compiler in Linux containers and can be plugged into this? Or are people piling Mac minis, Mac Pros, or other unrackable bullshit with insufficient remote management into racks, creating a nightmare farm of snowflakes?

How does Apple itself do this? Like animals, on the Desktop?

And how do you integrate such a remote compile farm into Xcode?


What G+ thinks you like to read…

The latest incarnation of “What’s hot”…

Google plus has always had a content discovery feature. In the past, that was the infamous “What’s hot…” feed. Postings that went into that category usually attracted a ton of spammers and even more haters, and one had to wield a pretty blood-crusted banhammer until the waves were through. Ask me how I know…

The current incarnation of “What’s hot” is marginally better at selecting and offering content, because it is somewhat more personalised. This is actually interesting, because the top bar shows a list of clickable keywords, which give you a way to filter and also show you what Google plus associates with your behavior.


From Data Centers to Computronium and Riding Light

So at work we discussed Data Center Design at scale, and then things got out of hand. We ended up discussing Computronium, a hypothetical material that basically is thinking matter, performing computation: the ultimate composable piece of hardware.

Computronium is a problem, though. You can’t just cover the planet in a crunchy Computronium crust – not only because the hotels have to go somewhere, but also because whatever thickness of Computronium you propose, it has to be powered somehow.

Ultimately, it has to be powered by the amount of energy hitting us from the sun. So there is likely a Dyson sphere behind the earth or elsewhere, collecting even more energy from the sun and sending it into the Computronium.
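To put a rough number on that power budget: the Earth intercepts sunlight over its cross-sectional disc, so a back-of-the-envelope sketch only needs the solar constant and the Earth’s radius (both public values; everything else is arithmetic):

```python
import math

SOLAR_CONSTANT = 1361.0   # W/m^2, mean irradiance at Earth's distance from the sun
EARTH_RADIUS = 6.371e6    # m

# Sunlight is intercepted over the planet's cross-sectional disc, not its surface.
cross_section = math.pi * EARTH_RADIUS ** 2      # m^2
total_power = SOLAR_CONSTANT * cross_section     # W

print(f"{total_power:.2e} W")  # roughly 1.7e17 W for the whole planet
```

Whatever the crust computes, it cannot dissipate more than that without the Dyson sphere feeding it extra energy.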


The road to hell is paved with outdated passwords…

So I am using Chrome in a corporate context. Outdated password regulations force me to increment my password every three months. The reason for that is well understood (PCI compliance), but can’t be changed from inside the corporation.

Previously, Chrome stored my passwords in the Apple Keychain. So I could script this using /usr/bin/security and push my password change into all saved passwords, or alternatively bulk-delete all the old ones.

Recent Chrome does not do that any more.
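For keychain items that are still scriptable, the idea looks roughly like this; a minimal Python sketch, where the service and account names are placeholders and the `security` invocation is illustrative (Chrome’s own store is, as noted, no longer reachable this way):

```python
import re
import subprocess  # only used for the keychain call sketched below

def increment_password(pw: str) -> str:
    """Bump a trailing number ('hunter2' -> 'hunter3'), per the quarterly ritual."""
    m = re.search(r"(\d+)$", pw)
    if not m:
        return pw + "1"
    return pw[: m.start()] + str(int(m.group(1)) + 1)

def push_to_keychain(service: str, account: str, new_pw: str) -> None:
    """Write the new password into an existing keychain item (macOS only).

    `add-generic-password -U` updates the item in place if it already exists;
    service and account here are placeholder values."""
    subprocess.run(
        ["/usr/bin/security", "add-generic-password",
         "-U", "-s", service, "-a", account, "-w", new_pw],
        check=True,
    )
```

With that, a loop over the affected services turns the quarterly password rotation into one script run instead of a dozen dialog boxes.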


PHP: Understanding unserialize()

The history of serialize() and unserialize() in PHP begins with Boris Erdmann and me, and we have to go back 20 years in time, to the days of the prerelease versions of PHP 3, some time in 1998.

Boris and I were working on code for a management system for employee education for German Telekom. The front end is a web shop that sells classes and courses; the back end is a complex structure that manages attendance, keeps track of a line-manager approval hierarchy, and provides alternative dates for overfull classes.

In order to manage authentication, shopping carts and other internal state, we needed something that allowed us to go from a stateless system to a stateful thing, securely. The result was PHPLIB, and especially the code in

That code contained a function serialize(), which created a stringified representation of a PHP variable and appended it to a string. There was no unserialize() necessary, because serialize() generated PHP code. eval() would unserialize().
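The same trick – serializing by emitting code and restoring it by evaluating that code – can be sketched in Python as an analogy (this is not the PHPLIB code; `repr()` stands in for the stringifier and `exec()` for PHP’s eval()):

```python
def serialize(name: str, value) -> str:
    """Emit an assignment statement; repr() plays the role of the stringifier."""
    return f"{name} = {value!r}\n"

def unserialize(code: str) -> dict:
    """'Unserialize' by executing the generated code, as eval() did back then."""
    scope: dict = {}
    exec(code, {}, scope)  # the generated code assigns into `scope`
    return scope

# A shopping cart round-trips through its code representation.
cart_code = serialize("cart", ["class-101", "course-202"])
state = unserialize(cart_code)
```

The elegance and the danger are the same as in 1998: the wire format is executable code, so whoever controls the string controls the interpreter.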


Making large tables smaller: Compression in MySQL

So JF has been busy abusing MySQL again:

  • In An Adventure in InnoDB Table Compression (for read-only tables), he tries out different KEY_BLOCK_SIZE settings in InnoDB to find out which yields the best compression for a specific table.

    His sample script copies the table and compresses the copy with one setting, then repeats this with the next setting; if the new version is smaller, it is kept, otherwise the previous version of the table stays. This continues until a minimum-sized InnoDB compressed table has been created. JF achieved a compression ratio of 4.53, bringing a 20 TB instance down to 4.5 TB.

  • In Why we still need MyISAM (for read-only tables) he does the same thing with his database in MyISAM format, and then compresses using myisampack, which is ok because his data is read-only archive data.

    MyISAM uncompressed is 22% smaller than InnoDB uncompressed. Compressed, his data is 10× smaller than the uncompressed InnoDB data, so his 20 TB of raw data comes in below 2 TB compressed.
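The trial-and-keep loop boils down to issuing one ALTER per candidate KEY_BLOCK_SIZE and keeping the setting that produces the smallest table. A sketch of that loop in Python, where the table name and the size probe are placeholders (in practice the size would come from information_schema or the .ibd file on disk):

```python
# Candidate KEY_BLOCK_SIZE values InnoDB accepts with the default 16K page size.
CANDIDATES = [1, 2, 4, 8, 16]

def compression_trials(table: str):
    """Yield (block_size, statement) pairs, one ALTER per candidate setting."""
    for kbs in CANDIDATES:
        yield (kbs,
               f"ALTER TABLE {table} "
               f"ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE={kbs}")

def best_setting(table: str, size_of) -> int:
    """Keep the block size whose compressed copy is smallest.

    `size_of(kbs)` is a placeholder for measuring the compressed table copy."""
    return min((size_of(kbs), kbs) for kbs, _ in compression_trials(table))[1]
```

On real data the winning setting depends entirely on the table’s content, which is why the script measures instead of guessing.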

Using MyISAM for read-only data is much less critical than it would be for data that is being written to: data corruption due to the lack of checksums is much less likely, and while the missing clustered indexes cannot really be compensated for, “myisamchk --sort-index” at least keeps the regenerated indexes linear in the MYI files.


Monitoring – the data you have and the data you want

So you are running systems in production, and you want to collect data from them, so you decide to build a monitoring system.

That won’t work, and it won’t scale. So please stop for a moment, and think.

What kind of monitoring do you want to build? I know at least three different types of monitoring system, and they have very different objectives and, consequently, designs.

Three types of Monitoring Systems

The first and most important system you want to have is checking for incidents. This Type 1 monitoring is basically a transactional monitoring system:


Scaling, automatically and manually

There is an interesting article by Brendan Gregg out there about the actual data that goes into the Load Average metric on Linux. The article has a few funnily contrasting lines. Brendan Gregg states

Load averages are an industry-critical metric – my company spends millions auto-scaling cloud instances based on them and other metrics […]

but in the article we find Matthias Urlichs saying

The point of “load average” is to arrive at a number relating how busy the system is from a human point of view.

and the article closes with Gregg quoting a comment by Peter Zijlstra in the kernel source:

This file contains the magic bits required to compute the global loadavg figure. Its a silly number but people think its important. We go through great pains to make it work on big machines and tickless kernels.
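Those “magic bits” implement an exponentially damped moving average of the number of runnable (and uninterruptibly waiting) tasks, sampled every five seconds with a decay constant per averaging window. A simplified Python sketch of the idea (not the kernel’s fixed-point code):

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds; the kernel samples the run queue on this tick

def decay(window_seconds: float) -> float:
    """exp(-interval/window): how much of the old average survives one tick."""
    return math.exp(-SAMPLE_INTERVAL / window_seconds)

def update(load: float, nr_active: float, window_seconds: float) -> float:
    """One tick of the damped average, as for the 1-, 5- and 15-minute figures."""
    e = decay(window_seconds)
    return load * e + nr_active * (1.0 - e)

# A machine pinned at 4 runnable tasks: the 1-minute figure converges toward 4.
load1 = 0.0
for _ in range(120):  # ten minutes of 5-second ticks
    load1 = update(load1, 4.0, 60.0)
```

The smoothing is exactly why the number suits a human glancing at `uptime` better than an auto-scaler reacting to it: it lags every change by design.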

Let’s go back to the start. What’s the problem to solve here?
