Oops Recovery

Park Sehun
2 min read · Aug 13, 2021

Have you ever deleted something very important that you can’t easily get back? Have you ever dropped a table by accident? Have you typed “destroy” instead of “apply” and deleted all your cloud resources?

There are many cases where you make a mistake and find it’s really difficult to get back to the previous state. And Oops recovery is NOT the same thing as a backup/restore strategy.

If you have a database with a read replica, your backup/restore plan may be to promote the read replica to primary (master). That may be a good plan, but not for an “Oops situation”, because any corruption in the primary database is replicated to the read replica as well. In that case, PITR (point-in-time recovery) can be the solution for Oops recovery. The drawback is that it usually requires a full database restore, which takes a long time.
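To make the difference concrete, here is a minimal sketch in Python. It is a toy model, not how any real database implements replication or PITR: the `Change` record, the `replay` and `restore_to` helpers, and the logical timestamps are all hypothetical names made up for illustration. It assumes a base backup plus an ordered change log, which is the general idea behind point-in-time recovery.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Change:
    """One logged write (hypothetical record format)."""
    ts: int                # logical timestamp of the write
    key: str
    value: Optional[str]   # None means the key was deleted


def replay(changes: List[Change], state: Dict[str, str]) -> Dict[str, str]:
    """Apply changes in order on top of a copy of the starting state."""
    result = dict(state)
    for change in changes:
        if change.value is None:
            result.pop(change.key, None)
        else:
            result[change.key] = change.value
    return result


def restore_to(base: Dict[str, str], log: List[Change], target_ts: int) -> Dict[str, str]:
    """PITR-style restore: replay only the changes made at or before target_ts."""
    return replay([c for c in log if c.ts <= target_ts], base)


base_backup = {"user:1": "alice"}
log = [
    Change(1, "user:2", "bob"),
    Change(2, "user:3", "carol"),
    # The "Oops": everything is deleted at ts=3.
    Change(3, "user:1", None),
    Change(3, "user:2", None),
    Change(3, "user:3", None),
]

# A read replica applies every change, including the accidental deletes.
replica = replay(log, base_backup)

# PITR stops just before the mistake.
recovered = restore_to(base_backup, log, target_ts=2)

print(replica)    # the replica is as empty as the primary
print(recovered)  # all three users are back
```

Note that `restore_to` always starts again from the base backup and replays the log, which mirrors why a real point-in-time restore tends to be slow on a large database.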

Of course, I’m not talking only about data backup/restore, but also about applications and infrastructure. The important thing is to have plans (strategies); they may be part of your troubleshooting or restoration guides. But when humans or robots make a mistake and it keeps being repeated, it’s not a mistake anymore; it’s a decision. You should prepare a recovery path for the frequent errors. Then… how?

First, you can define the categories for which you need to prepare Oops recovery:

  • Application
  • Data
  • Infrastructure
  • etc.
