How to recover a 'rm -rf /var/lib/mysql/*'
There’s a saying I like:
In life, there are two types of system administrators: The one who has already made a big mistake in production And the one that will soon be
Well, I just moved from the second to the first category in a command line:
And of course on the production server, and not on the future one.
In short, the BIG dumpling. On an 80GB database that takes several hours to restore.
And very big spoiler, we managed to restore everything.
Before I read on, I don’t guarantee that this will work for you. And declines any responsibility if you lose your data.
But you may be happy to be able to get as much of your bases/tables as possible.
First of all, the first thing to do is to cut off any activity that may be present on this instance:
- Shutdown applications can access to Mysql
- Put an iptable rule
Whatever, but above all, reduce access to this instance.
The second thing to keep in mind, DO NOT STOP Mysql. That’s what’s going to save us.
I have since written a Python script that will allow me to save as much as possible.
Then I will quickly explain why all this can work.
But first, here’s how to reproduce this in your home (in a Vagant). Don’t try that on a real server.
Setting up the test environment
We’re going to use Vagrant
We’re going to start the restoration:
And now, let’s let the magic work.
The script in --help
mode
Recovery of deleted files
Extraction of data in CSV formats for greater security
Well, if the database is ever corrupted, we’ll be happy to have the data in CSV format. However, in my tests, I had a lot of crashes with the Mysql engine, so you have to be very careful with the controls. But also to think that it might crash and that we’ll lose everything.
And voilà!
Put the data back in place
Various instructions
- Never stop Mysql, never, never, never, never, never, never, never, never, never
- Do not try to mysqldump while using this tool, never (again)
- We identify the mysql process with the command
ps ax | grep[m]ysqld
- We should only have one process. Write it down for later
- We’ll check the files we can recover
sudo lsof -p $MYSQL_PID
- We can see all the files deleted
- And as long as we have Mysql running, we’ll be able to get those files back
Why it can work
Already, it is necessary to understand how Linux works in ext* when deleting a file.
We’re going back to the base. When a process opens a file, it opens an FD “file descriptor” in lsof
.
A file is defined in the ext* file system as a path that points to an inode.
When you delete a file, you delete this link in the file allocation table.
And ext* frees up the space when an inode no longer has an FD open on the inode.
That’s why sometimes you delete a large file, but the space is not cleared.
Or when you make a `mv’ of a log file, touch the new file and the process continues to write to the file that has been renamed. The FD is still open on the old inode. It hurts a little bit of a headache, doesn’t it?
And so since Mysql keeps the files open, we can make a cat /proc/$PID/df/9
.
So much for the recovery part of the deleted files.
Now, the stories around touch
.
This is the part I have less control over, but I can imagine Mysql’s behavior.
When doing a show databases;
Mysql must read the list of directories in /var/lib/mysql
. That’s why you can end up with a lost+found
database.
Then when we do a show tables;
Mysql must read the list of files *.MYI
or *.MYD
(or both, or when one misses mysql may be crash). So a touch of the files brings up the tables. But since the files are already open, he will be able to access the old deleted files content.
However, a desc table
or select
will not work. For this purpose Mysql needs the *.frm
files that contain the table structure. And here, the frm/MYD/MYI
files must be properly aligned.
That’s it, we only have the Mysqldump that doesn’t work. I think he tries to read directly into the content of the files and that the `touch’ mess disturbs him.
I’m done with this article. I hope I taught you some things. That you’ll never have to put all this into practice on production.
And if I have any advice to give:
- Make backups
- Testing restorations
- Note the time of these operations to be aware of the time it will take in case of a crash
- If you do mysqldump, also think about making regular backups of.frm files