
Category: Performance

What GPUs can do…

PC Gamer reports “Nvidia CEO says Moore’s Law is dead and GPUs will replace CPUs”. Now, Jensen Huang might be a bit biased here, but he reminded us that “GPUs are advancing at a much faster pace than CPUs” and “that GPUs will replace CPUs soon, adding that at this point, designers can hardly work out advanced parallel instruction architectures for CPUs.”

So what can a modern GPU do? Well, apparently font rendering is still a hard problem for GPUs, and a bottleneck in modern browsers. That’s not to say it’s not being done – the linked article contains a lot of pointers.

And an older article about Ubershaders explains how Dolphin, the GameCube/Wii emulator, uses modern GPU hardware to emulate the 2002/2006 GPU hardware live, in real time – for a short time, while the CPU in the background compiles more optimised, precompiled GPU configurations and shader code.


Monitoring – the data you have and the data you want

So you are running systems in production and you want to collect data from your systems. You need to build a monitoring system.

That won’t work and it won’t scale. So please stop for a moment, and think.

What kind of monitoring do you want to build? I know of at least three different types of monitoring system, and they have very different objectives and, consequently, designs.

Three types of Monitoring Systems

The first and most important system you want is one that checks for incidents. This Type 1 monitoring is basically a transactional monitoring system:
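A Type 1 check of this kind performs a real transaction against the system and records whether it succeeded and how long it took. A minimal sketch in Python (the URL, timeout and result fields are illustrative, not from the original post):

```python
import time
import urllib.request


def check_http(url, timeout=5.0):
    """Transactional check: perform one real request, report success and latency."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            ok = 200 <= resp.status < 300
    except Exception:
        # Any failure (refused connection, timeout, HTTP error) is an incident signal.
        ok = False
    return {"ok": ok, "latency_s": time.monotonic() - start}
```

An incident pipeline would run such checks on a schedule and alert on failing results, rather than on raw resource metrics.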


Scaling, automatically and manually

There is an interesting article by Brendan Gregg out there about the actual data that goes into the load average metrics on Linux. The article has a few amusingly contrasting lines. Brendan Gregg states

Load averages are an industry-critical metric – my company spends millions auto-scaling cloud instances based on them and other metrics […]

but in the article we find Matthias Urlichs saying

The point of “load average” is to arrive at a number relating how busy the system is from a human point of view.

and the article closes with Gregg quoting a comment by Peter Zijlstra in the kernel source:

This file contains the magic bits required to compute the global loadavg figure. Its a silly number but people think its important. We go through great pains to make it work on big machines and tickless kernels.

Let’s go back to the start. What’s the problem to solve here?


On cache problems, and what they mean for the future

This is a disk utilization graph on a heavily loaded Graphite box. In this case, a Dell with a MegaRAID, but that actually does not matter too much.

Go-carbon was lagging and buffering on the box, because the SSD was running at its IOPS limit. At 18:10, the write-back cache and the “intelligent read-ahead” are being disabled, that is, the MegaRAID is being force-dumbed down to a regular non-smart controller. The effect is stunning.

/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NORA -l0 -aALL
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WT -l0 -aALL

and also, on top of that,

# Direct I/O instead of cached
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp DIRECT -l0 -aALL
# Force the SSD's own write cache on (our SSD has super-capacitors, so it is safe to enable)
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp -EnDskCache -l0 -aALL

What we observe here is part of an ongoing pattern, and we will see more of it, and at more layers of the persistence-stack in our systems.


“Usage Patterns and the Economics of the Public Cloud”

The paper (PDF) is, to say it in the words of Sascha Konietzko, eine ausgesprochene Verbindung von Schlau und Dumm (“a very special combination of smart and stupid”)

The site mcafee.cc is not related to the corporation of the same name, but the site of one of the authors, R. Preston McAfee.

The paper looks at utilization data from a number of public clouds and tries to apply some dynamic price-finding logic to it. The authors are surprised by the level of stability in cloud purchases and actual usage, and try to hypothesize why this is the case. They claim that a more dynamic price-finding model might improve yield and utilization at the same time (but in the conclusion they discover why, in reality, that has not happened).


The attack of the killer microseconds

In the Optane article I have been writing about how persistent bit-addressable memory will be changing things, and how network latencies may be becoming a problem.

The ACM article Attack of the Killer Microseconds has another, more general take on the problem. It highlights how our machines are prepared to deal with very short delays on the order of nanoseconds, and how they are also prepared to deal with very long delays on the order of milliseconds. It’s the waits in between – network latencies, sleep-state wakeups and SSD access waits – that are too short to do something else in, and too long to busy-wait in a spinlock.
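A common way to handle these in-between waits is a hybrid: busy-wait for a small budget, then fall back to a blocking sleep. A minimal sketch in Python (the budget and poll interval are illustrative; real implementations use OS primitives rather than sleep-polling):

```python
import time


def hybrid_wait(ready, spin_budget_s=50e-6, poll_interval_s=1e-3):
    """Spin for up to spin_budget_s; if still not ready, sleep-poll.

    Returns True once ready() reports True."""
    deadline = time.monotonic() + spin_budget_s
    while time.monotonic() < deadline:   # busy-wait: cheap for microsecond waits
        if ready():
            return True
    while not ready():                   # blocking fallback for longer waits
        time.sleep(poll_interval_s)
    return True
```

The structure is the point: the spin phase absorbs the microsecond-scale waits without a context switch, while the sleep phase stops the CPU burning cycles on the millisecond-scale ones.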


Gaming Laptops – your recommendations?

The current vacation is hard on me, because I hardly get to use my own computer – the best wife of all and the Schnuppel both compete for time on my machine in order to play Transport Fever and Cities: Skylines. That’s an annoyance not only because I can’t get at the keyboard, but also because a MacBook Pro apparently sucks as a gaming machine.

So this website lists a bunch of relatively recent laptops with proper graphics cards, and household peace seems to require a premade machine and a transportable device (not a desktop device).

What would be your recommendation (see above, and maybe Elite Dangerous and No Man’s Sky), and why?
