Monday this week was the first day I came back to work from my long-expected vacation. An odd surprise was thrown at me on this very day right after I stepped into the office and a bizarre issue was awaiting for me to deal with: one of those medium sized databases (~40GB) for a QA team’s testing project was corrupted last Friday and consequently their test suspended at somewhere close to a check point. As I am the SME on these matters, the teammates left it for me to work a way out once I got back to work. Good guys!
Swearing almost burst out from my mouth after my preliminary examination of the issue because all the bad settings and practices you can think of were actually found there (and they had claimed they did not need a DBA to look after their databases – are you kidding me?). And there was not even a single backup to get the database restored, and therefore not even a way to recover the data at the price of losing some data after the last backup.
For almost a day, I was trying out all the skills I possess, and searching on the web wildly as well, only hope to get some of the data back – I could not count on any possible support from the vendor as the software has been way out of their supporting period. And DBCC CHECKDB repair did not work even I managed to make the database “accessible”!
Eventually, I came to realize that there was not a silver bullet for the kill, but more practically a piece by piece recovery.
So I set the database into emergency mode and then single user mode to make sure it is accessible for me myself only, and then started to extract data table by table for 475 tables.
Starting from Tuesday morning, pumping data out form the corrupted database to a new image of it was the only task I was focusing on. Some of the 475 tables contain merely 100s rows while others multi-millions. And at the end of the day, after 472 tables done successfully, it all came down to 3 multi-million tables that could not be recovered – they were the ones all sitting on one same physical data file damaged at the data page level! One table has 8+ million rows of data, another 5+ millions rows, and still another I could not even tell how many rows it had as the head sector of the data page corrupted itself.
But there is no way to export all the rows from the 3 – they are corrupted, period!
A wild idea arose up and that could be my last resort, I thought for myself. Similar to recovering other tables, I used a piece by piece method again, only this time a simple INSERT … SELECT command pair applied repeatedly to transfer data from the 3 old and corrupted tables to 3 new ones rows by rows.
Ran into off-hours quite a while that day.
And the result was magnificent!
For the 5+ million-row table, I got 99.99% rows recovered, 99.97% for the 8+ million-row table and 2.8 million rows for the third table.
This is even better than restoring the database from a previous backup and the latter will certainly loss all of the “new” dada that had been entered into the database since the last backup — supposedly there were one, and that could well mean millions of rows being lost!
A major data recovery effort, said the manager!
So true! and as true was the odd way of "Welcome back to work"