SQL Server with Mr. Denny

Dec 20 2010   2:00PM GMT

Database Restores and Torn Pages



Posted by: Denny Cherry
Tags:
Log Shipping
SQL Server
Storage

This weekend I was migrating some databases from one server to another.  As these were rather large databases (about 2 TB in size) and I didn’t have the clusters setup on a SAN (using DAS instead) I needed to setup log shipping from one cluster to another so that the downtime when moving was kept to a minimum.

But a little problem came up, when I restored one database I got a torn page error when trying to roll the logs forward.  My first thought was ok, maybe there was a network hiccup or a problem with the storage on the destination server.  So I restored the database again, and got a torn page error again.  However the second time the error was on a different page.  This is very important to know, because the torn page had moved from one page (1:1396373) to another (1:24815312) and I used the same database backup both times, this told me something very important.  That the torn page wasn’t caused by the source database or the database backup.  If it was then the page which was torn would have been the same both times.

Now you’ll notice that the torn page was detected when the transaction logs were being rolled forward, not while the actual restore was happening.  The reason for this is that torn pages aren’t checked for during a full backup restore.  Torn pages are only detected during two specific operations, when the pages are read into the buffer pool (my database is in NORECOVERY so that isn’t going to happen), and when the pages are changed (in this case they are changed when the transaction log rolls forward).  So the only other way I would have been able to verify that there wasn’t a torn page would have been to bring the database up in STANDYBY mode and to a scan of every index (both clustered and non-clustered) which would have caused all the pages to be read into the buffer pool and caused the torn page message to be thrown.  But this can’t happen in this case because I’m log shipping form SQL Server 2005 to SQL Server 2008 and the database can’t be brought up in STANDYBY mode because a database upgrade needs to be done.  So checking for additional torn pages requires taking a fully rolled forward database out of NORECOVERY and putting it into RECOVERY, querying every table on every index so that every page is touched (by using index hints) to ensure that the torn page isn’t there.

This not so quick bit of testing (the database backup in question is about 300 Gigs) told me that there was a problem with either the HBAs or the DAS that the cluster was using.  To further test we failed the SQL Cluster over to the other node and ran the restore again which didn’t cause any torn page error messages when rolling the logs forward.  This would appear to put the DAS in the clear, and put the problem either on the other node of the cluster, the HBAs in that node, cables, etc.  Something which only that server uses to connect to the storage.

If you ever setup log shipping and you get torn pages let this be a lesson, don’t assume that there is a problem on the source database.  It might be a problem on the destination database instead.

Denny

 Comment on this Post

 
There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when other members comment.

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to: