Don’t say Backup, say Restore

Kristian Köhntopp - August 23, 2023

This is roughly the third story I have heard about a Fediverse instance losing all its data because of a CI/CD mistake.

Lily Cohen has bad news.

Hugops, but also the usual grizzled old sysadmin advice:

Never say backup. Always say restore. This changes your mindset.

A backup is a cost center. It has no value, it has only cost. Only a restore has a proven value, and comes with knowledge:

  • You know you actually can restore: the backup was complete, and the restored instance does connect.
  • You know how long the restore took, so you know the time to restore when asked. Not an estimate. The actual time.
  • You know the restore procedure.

Restore every backup all the time, then throw the recovered instance away. Keep the metrics, keep the backup.
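A minimal sketch of what such a restore test could look like, in Python. The names `run_restore_test`, `RestoreResult` and the three callbacks (`restore`, `verify`, `discard`) are made up for illustration; you would plug in whatever your database or storage tooling actually provides. The point is only the shape: restore into a scratch instance, check that it works, record the real duration, throw the instance away.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RestoreResult:
    """The metrics worth keeping after the restored instance is thrown away."""
    backup_id: str
    succeeded: bool
    duration_seconds: float
    error: Optional[str] = None


def run_restore_test(
    backup_id: str,
    restore: Callable[[str], object],    # hypothetical: restores into a scratch instance
    verify: Callable[[object], bool],    # hypothetical: connects and checks the data
    discard: Callable[[object], None],   # hypothetical: destroys the scratch instance
    clock: Callable[[], float] = time.monotonic,
) -> RestoreResult:
    """Restore one backup, verify it, measure the actual time, then discard it."""
    started = clock()
    instance = None
    try:
        instance = restore(backup_id)
        succeeded = bool(verify(instance))
        error = None if succeeded else "verification failed"
    except Exception as exc:
        # A failed restore is a result too: record it instead of crashing.
        succeeded, error = False, f"{type(exc).__name__}: {exc}"
    duration = clock() - started

    # Throw the recovered instance away; the backup itself is never touched.
    if instance is not None:
        discard(instance)
    return RestoreResult(backup_id, succeeded, duration, error)
```

Run something like this for every backup and store the `RestoreResult` records. When somebody asks how long a restore takes, you answer from the recorded `duration_seconds`, not from an estimate.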

There is no such thing as immutability, statelessness or whatever.

Parts of your setup may be stateless deployments with immutable images. That is because you collected all system state and put it into one or two selected locations. You can redeploy everything but these selected locations.

If you drop them, if you make a config mistake, these things are gone gone. They cannot be redeployed unless you have taken measures to do so. See above, item 1.

Devops is easy except for the stateful parts.

That is why the storage people and the database people all look down on you hipster devops people and make condescending remarks. 🙂 Yah, ok, they are nicer than you probably think they are, but they do have a completely different outlook on operations.

Listen and learn. Also, restore test.

Also, ArgoCD: do not prune resources, and set the Kubernetes PV reclaim policy to “Retain”, not “Delete”.
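One way to keep yourself honest about this is a small check in the pipeline. The following Python sketch assumes your manifests are already parsed into plain dicts (for example by a YAML loader); the function name `find_data_loss_risks` and the choice of which kinds count as stateful are my assumptions, not anything ArgoCD or Kubernetes ships. The field names `persistentVolumeReclaimPolicy` and `reclaimPolicy`, and the `argocd.argoproj.io/sync-options: Prune=false` annotation, are the real ones.

```python
from typing import Dict, List

# Argo CD honors this annotation per resource; "Prune=false" disables pruning.
PRUNE_ANNOTATION = "argocd.argoproj.io/sync-options"

# Assumption: the kinds we treat as "holding state" in this sketch.
STATEFUL_KINDS = {"PersistentVolume", "PersistentVolumeClaim", "StatefulSet"}


def find_data_loss_risks(manifests: List[Dict]) -> List[str]:
    """Return human-readable findings for manifests that CI/CD could destroy."""
    findings = []
    for manifest in manifests:
        kind = manifest.get("kind", "")
        meta = manifest.get("metadata") or {}
        name = meta.get("name", "<unnamed>")

        if kind == "PersistentVolume":
            # Insist on an explicit value rather than relying on defaults.
            policy = (manifest.get("spec") or {}).get("persistentVolumeReclaimPolicy")
            if policy != "Retain":
                findings.append(f"{kind}/{name}: reclaim policy is {policy!r}, expected 'Retain'")

        if kind == "StorageClass":
            # An unset reclaimPolicy on a StorageClass defaults to Delete.
            policy = manifest.get("reclaimPolicy", "Delete")
            if policy != "Retain":
                findings.append(f"{kind}/{name}: reclaim policy is {policy!r}, expected 'Retain'")

        if kind in STATEFUL_KINDS:
            # The annotation value is a comma-separated list of options.
            raw = (meta.get("annotations") or {}).get(PRUNE_ANNOTATION, "")
            options = [option.strip() for option in raw.split(",")]
            if "Prune=false" not in options:
                findings.append(f"{kind}/{name}: missing {PRUNE_ANNOTATION}: Prune=false")
    return findings
```

Fail the pipeline when the returned list is not empty. This does not replace a restore test, it only makes the most common way of deleting your own data a bit harder.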

There are people who have taken steps to prevent their CI/CD from messing with EBS volumes, S3 buckets or K8s Persistent Volumes, and there are people who will lose data in the future.

Don’t be in the second group.

“Nobody wants backup. Everybody wants restore.” – Martin Seeger

See also GitLab Data Loss.