Discussion: xfs_repair of critical volume
Eli Morris
2010-10-31 07:54:13 UTC
Hi,

I have a large XFS filesystem (60 TB) that is composed of 5 hardware RAID 6 volumes. One of those volumes had several drives fail in a very short time and we lost that volume. However, the other four volumes seem OK. We are in a worse state because our backup unit failed a week later when four drives simultaneously went offline, so we are in a very bad state. I am able to mount the filesystem that consists of the four remaining volumes. I was thinking about running xfs_repair on the filesystem in the hope that it would recover all the files that were not on the bad volume (those are obviously gone). Since our backup is gone, I'm very concerned about doing anything that would lose the data we still have. I ran xfs_repair with the -n flag and I have a lengthy file of things the program would do to our filesystem. I don't have the expertise to decipher the output and figure out whether xfs_repair would fix the filesystem in a way that would retain our remaining data, or whether it would, let's say, truncate the filesystem at the data loss boundary (our lost volume was the middle one of the five), returning 2/5 of the filesystem, or produce some other undesirable result.

I would post the xfs_repair -n output here, but it is more than a megabyte. I'm hoping one of you XFS gurus will take pity on me and let me send you the output to look at, or give me an idea of what you think xfs_repair is likely to do if I run it, or suggest how to get back as much data as possible in this recovery.
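For reference, the dry run that produced that file amounts to something along these lines; the device path is just a placeholder for our logical volume:

  # no-modify mode: report what xfs_repair would do, without writing anything
  xfs_repair -n /dev/vgname/lvname 2>&1 | tee /tmp/vol_repair_test.txt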

thanks very much,

Eli
Stan Hoeppner
2010-10-31 09:54:46 UTC
Post by Eli Morris
Hi,
I have a large XFS filesystem (60 TB) that is composed of 5 hardware RAID 6 volumes. One of those volumes had several drives fail in a very short time and we lost that volume. However, the other four volumes seem OK. [...] I'm hoping one of you XFS gurus will take pity on me and let me send you the output to look at, or give me an idea of what you think xfs_repair is likely to do if I run it, or suggest how to get back as much data as possible in this recovery.
This isn't the storage that houses the genome data, is it?

Unfortunately I don't have an answer for you, Eli, or at least not one
you would like to hear. One of the devs will be able to tell you whether
you need to start typing the letter of resignation or loading the suicide
pistol. (Apologies if the attempt at humor during this difficult time
is inappropriate. Sometimes a grin, giggle, or laugh can help with the
stress, even if only for a moment or two. :)

One thing I recommend is simply posting the xfs_repair output to a web
page so you don't have to email it to multiple people. If you don't
have an easily accessible resource for this at the university, I'll
gladly post it on my webserver and post the URL here to the XFS
list; it takes me about 2 minutes.
--
Stan
Emmanuel Florac
2010-10-31 14:10:00 UTC
Post by Eli Morris
I have a large XFS filesystem (60 TB) that is composed of 5 hardware
RAID 6 volumes. One of those volumes had several drives fail in a
very short time and we lost that volume.
You may still have a slight chance to repair the broken RAID volume.
What is the type and model of the RAID controller? What is the model of
the drives? Did you aggregate the RAID arrays with LVM?


Most drive failures (particularly on late-2009 Seagate SATA drives) are
both relatively frequent and transitory, i.e. they may randomly recover
after a while.

What did you try? Can you power down the faulty RAID array
entirely and power it up after a while? Did you try actually freezing
the failed drives (it may revive them for a while)? Did you try running
SpinRite or another utility on the broken drives individually?
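If the drives are visible to a Linux host, either directly or through the
controller's passthrough, smartctl can at least tell you model, serial and
basic SMART health; a rough sketch, with hypothetical device names:

  # identity: model, serial number, firmware
  smartctl -i /dev/sdb
  # overall health plus the error and self-test logs
  smartctl -H -l error -l selftest /dev/sdb
  # drives behind some RAID controllers need a -d option, e.g.:
  # smartctl -i -d megaraid,4 /dev/sda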
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <***@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
Steve Costaras
2010-10-31 14:41:37 UTC
Post by Emmanuel Florac
Did you try to actually freeze the failed drives (it may revive them
for a while)?
Do NOT try this. It's only good for some /very/ specific types of
issues with older drives. With an array of your size you are probably
running relatively current drives (i.e. from the past 5-7 years) and this
has a very large probability of causing more damage.

The other questions are to the point: they help determine the
circumstances around the failure and the state of the array at the
time. Take your time and do not rush anything; you are already hanging
over a cliff.

The first thing, if you are able, is to make a bit copy of the physical
drives to spares; that way you can always get back to the same point where
you are now. This may not be practical with such a large array, but if
you have the means, it's worth it.
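If you can do that, a rescue-oriented imager is safer than plain dd because
it keeps going past read errors and records where they were; a rough sketch,
with source and target devices purely as placeholders:

  # image a suspect drive onto a known-good spare of equal or larger size;
  # the mapfile records unreadable areas so the copy can be resumed later
  ddrescue -f /dev/sdX /dev/sdY /root/sdX.mapfile
  # plain dd alternative (no retry log; unreadable blocks are zero-padded):
  # dd if=/dev/sdX of=/dev/sdY bs=1M conv=noerror,sync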

You want to start from the lowest component and work your way up: make
sure that the RAID array itself is sane before looking to fix any
volume-management functions, and do that before looking at your
filesystems. When dealing with degraded or failed arrays, be careful
about what you do if you have write cache enabled on your controllers.
Talk to the vendor! Whatever operations you perform on the card could
cause that cached data to be lost, and it can be substantial with some
controllers (MiB to GiB ranges). Normally we run with write cache
disabled (both on the drives and on the RAID controllers) for critical
data, to avoid having too much data in flight if a problem ever did occur.
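For the drive-level cache on plain SATA disks the OS can see, hdparm can
query and disable it; only a sketch, and disks hidden behind a hardware
RAID controller usually have to be configured through the controller's
own tool instead:

  # query the current write-cache setting
  hdparm -W /dev/sdX
  # turn the on-drive write cache off
  hdparm -W0 /dev/sdX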

The points that Emmanuel mentioned are valid, though I would hold off
on powering down until you are able to get all the geometry information
from your RAIDs (unless you already have it). I would also hold off
until you determine whether you have any dirty caches on the RAID
controllers. Most controllers keep a rotating buffer of events,
including failure pointers; if you reboot, the re-scanning of drives at
startup may push that pointer further down the stack until it gets
lost, and then you won't be able to recover the outstanding data. I've
seen this set at 128-256 entries on various systems, which is another
reason to keep per-controller drive counts down.
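Before any reboot it's also worth dumping whatever geometry the OS side
can already see to a file kept off this storage; a sketch, assuming the
arrays are aggregated with LVM as Emmanuel asked about:

  # LVM view: PVs, the VG, and which devices each LV segment sits on
  pvs -v > /root/geometry.txt
  vgs -v >> /root/geometry.txt
  lvs -a -o +devices,segtype >> /root/geometry.txt
  # XFS view: block size, AG count and size, log location
  xfs_info /path/to/mountpoint >> /root/geometry.txt
  # controller-side geometry (stripe size, drive order) has to come
  # from the vendor's own management tool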

Steve
Roger Willcocks
2010-10-31 16:52:13 UTC
Don't do anything which has the potential to write to your drives until you have a full bit-for-bit copy of the existing volumes.

In particular, don't run xfs_repair. This is a hardware issue. It can't be fixed with software.

Now stop and think. There's a good chance a professional data repair outfit can get stuff off your failed drives.

So before you go any further:

* carefully label all the drives, note down their serial numbers, and their positions in the array. You need to do this for the 'failed' drives too.

* speak to your raid vendor. They will have seen this before.

* try and find out why multiple drives failed on both your main and your backup systems. Was it power related? Temperature? Vibration? Or a bad batch of disks?

* speak to the drive manufacturer. They will have seen this before.

Come back to this list and give us an update. This isn't an xfs problem per se, but there are several people here who work regularly with multi-terabyte arrays.


--
Roger
Post by Eli Morris
Hi,
I have a large XFS filesystem (60 TB) that is composed of 5 hardware RAID 6 volumes. One of those volumes had several drives fail in a very short time and we lost that volume. However, the other four volumes seem OK. [...] I'm hoping one of you XFS gurus will take pity on me and let me send you the output to look at, or give me an idea of what you think xfs_repair is likely to do if I run it, or suggest how to get back as much data as possible in this recovery.
thanks very much,
Eli
_______________________________________________
xfs mailing list
http://oss.sgi.com/mailman/listinfo/xfs
Eli Morris
2010-10-31 19:56:33 UTC
Post by Eli Morris
Hi,
I have a large XFS filesystem (60 TB) that is composed of 5 hardware RAID 6 volumes. One of those volumes had several drives fail in a very short time and we lost that volume. However, the other four volumes seem OK. [...] I'm hoping one of you XFS gurus will take pity on me and let me send you the output to look at, or give me an idea of what you think xfs_repair is likely to do if I run it, or suggest how to get back as much data as possible in this recovery.
thanks very much,
Eli
Hi guys,

Thanks for all the responses. On the XFS volume that I'm trying to recover here, I've already re-initialized the RAID, so I've kissed that data goodbye. I am using LVM2. Each of the 5 RAID volumes is a physical volume. Then a logical volume is created out of those, and the filesystem lies on top of that. So now we have, in order: 2 intact PVs, 1 OK but blank PV, and 2 intact PVs. On the RAID where we lost the drives, replacements are in place and I created a now-healthy volume. Through LVM, I was then able to create a new PV from the re-constituted RAID volume and put it into our logical volume in place of the destroyed PV. So now I have a logical volume that I can activate, and I can see the filesystem. It still reports having all the old files as before, although it doesn't. So the hardware is now OK; it's just a question of what to do with our damaged filesystem, which has a huge chunk missing out of it. I put the xfs_repair trial output on an HTTP server, as suggested (good suggestion), and it is here:

http://sczdisplay.ucsc.edu/vol_repair_test.txt
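The usual way to slot a new device in place of a lost PV while keeping the
volume group's layout looks roughly like this; the names and UUID below are
placeholders rather than the exact commands used:

  # give the new RAID volume the lost PV's old UUID, taking the layout
  # from LVM's automatic metadata backup
  pvcreate --uuid "<old-pv-uuid>" --restorefile /etc/lvm/backup/vgname /dev/sdX
  # restore the VG metadata so the LV maps onto the new PV
  vgcfgrestore vgname
  # reactivate and check the mapping
  vgchange -ay vgname
  lvs -a -o +devices vgname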

Now I also have the problem of our backup RAID unit that failed. That one failed after I re-initialized the primary RAID, but before I could restore the backups to the primary. I'm having some good luck, huh? On that RAID unit, everything was fine until the next time I looked at it a couple of hours later, when 4 drives went offline and it reported the volume as lost. On that unit, the only thing I have done so far is power cycle it a couple of times; other than that, it is untouched. In it we are using the Caviar Green 2 TB drives, which our vendor told us were fine to use. However, I have read in the last couple of days that they have an issue with timing out as they remap sectors, as noted here:

http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery#Western_Digital_Time_Limit_Error_Recovery_Utility_-_WDTLER.EXE

Thus, I've learned that they are not recommended for use in RAID volumes. So I am looking hard into ways of trying to recover that data as well, although it is only a partial backup of our main volume; it contains about 10 TB of the most critical files. Fortunately, this isn't the human genome, but it is climate modeling data that graduate students have been generating for years, so losing all of it could set them back years on their PhDs. I take the situation pretty seriously. In this case we are thinking about going with a data recovery company, but this isn't industry; our lab doesn't have very deep pockets, and $10K would be a huge chunk of money to spend. So I would welcome suggestions for this unit as well. I believe the drives themselves in this unit are OK, since four going out within one minute, as the log shows, is not something that makes a lot of sense. My guess is that they were under heavy load for the first time in a few months, four of the drives started remapping sectors at pretty much the same time, and the RAID controller in this 16-drive DAS box tried to contact the drives, reached a timeout, and marked them all as dead. We are also considering that we are having some sort of power problem, as we seem to be unusually unlucky in the last couple of weeks, although we do have everything behind a pretty nice $7K UPS that isn't reporting any problems.
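From what I've read, the relevant knob on those Greens is SCT Error Recovery
Control, which smartctl can query where the drives are reachable; many
desktop Greens simply don't support it, which is exactly the TLER problem.
A sketch, device name hypothetical:

  # read the current SCT ERC (error recovery) timeouts, if supported
  smartctl -l scterc /dev/sdX
  # on drives that do support it, cap read/write recovery at 7 seconds
  smartctl -l scterc,70,70 /dev/sdX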

OK, that's a long tale of woe. Thanks for any advice.

Eli
Emmanuel Florac
2010-10-31 20:40:21 UTC
Post by Eli Morris
OK, that's a long tale of woe. Thanks for any advice.
OK, so what we'd like to do is get the backup RAID volume back into
working order. You said it's made of 2 TB Caviar Green drives, but you
didn't mention the RAID controller type... As I understand it, you
power-cycled the RAID array, so the cache is gone, whatever was in it...

All the arrays I know of will happily reassemble a working RAID if you
successfully revive the failed drives.

Logically, the failed drives are almost certainly not really dead, but
in a temporary failure state. First, check WD support and utilities to
see whether something applies to your configuration. In any case,
checking the failed drives' health with the Western Digital disk
utility should allow you to determine whether they're toast or not.

If they're not actually dead, you could try to revive the bad blocks
with SpinRite (www.grc.com); it has saved my life a couple of times,
but it's quite risky when used on SMART-tripped drives.
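A less invasive first pass than SpinRite is the drive's own long self-test
plus a read-only surface scan; purely a sketch, device name hypothetical:

  # start the extended internal self-test (read-only, takes hours)
  smartctl -t long /dev/sdX
  # later, review the result and the attribute/error counters
  smartctl -a /dev/sdX
  # host-side read-only surface scan (badblocks writes nothing by default)
  badblocks -sv /dev/sdX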
--
------------------------------------------------------------------------
Emmanuel Florac | Direction technique
| Intellique
| <***@intellique.com>
| +33 1 78 94 84 02
------------------------------------------------------------------------
Steve Costaras
2010-10-31 21:10:06 UTC
Post by Eli Morris
Hi guys,
Thanks for all the responses. On the XFS volume that I'm trying to recover here, I've already re-initialized the RAID, so I've kissed that data goodbye. I am using LVM2. Each of the 5 RAID volumes is a physical volume. Then a logical volume is created out of those, and the filesystem lies on top of that. So now we have, in order: 2 intact PVs, 1 OK but blank PV, and 2 intact PVs. [...]
What was your RAID stripe size (hardware)? Did you have any
partitioning scheme on the hardware RAID volumes, or did you just use
the native devices? When you created the volume group and LV, did you
do any striping, or just concatenate the LUNs? If striping, what were
your lvcreate parameters (stripe size et al.)?
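If you don't remember the lvcreate parameters, LVM records them;
something like this should show whether the LV is striped or a plain
concatenation, and at what stripe size (VG/LV names are placeholders):

  # per-segment map: segment type, stripe count, stripe size, underlying PVs
  lvdisplay -m /dev/vgname/lvname
  # the automatic metadata backup spells out the segment layout in plain text
  cat /etc/lvm/backup/vgname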

You mentioned that you lost only 1 of the 5 arrays; I assume the
others did not have any failures? You wiped the array that failed, so
you have 4/5 of the data and 1/5 is zeroed, which removes the
possibility of vendor recovery/assistance.

Assuming that everything is equal, there should be an even distribution
of files across the AGs, and the AGs should have been distributed
across the 5 volumes. Do you have the xfs_info data? I think you
may be a bit out of luck here with xfs_repair. I am not sure how XFS
handles files/fragmentation between AGs, or how AGs relate to the
underlying 'physical volume'; i.e., a problem would arise if a particular
file's AG was on a different volume than the blocks of the actual file,
and another complication would be fragmented files whose data was not
contiguous. What is the average size of the files that you had on the
volume?
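The AG geometry itself is easy to pull, and from it you can work out
which AGs sat on the zeroed PV (for a concatenated LV the mapping is
just a linear offset); a read-only sketch with placeholder names:

  # AG count and AG size (in filesystem blocks), plus the block size
  xfs_info /path/to/mountpoint
  # or read the superblock directly without mounting
  xfs_db -r -c "sb 0" -c "p agcount agblocks blocksize" /dev/vgname/lvname
  # byte offset of AG n = n * agblocks * blocksize; compare that against
  # the start/end of the replaced PV as shown by 'lvdisplay -m'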

In similar circumstances, where files were small enough to sit on the
remaining disks and were contiguous/non-fragmented, I've had some luck
with the forensic tools Foremost and Scalpel.
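For what it's worth, a typical Foremost run against the raw device or an
image looks something like this (paths hypothetical; the output directory
must live on different storage, and carving recovers file contents but
not filenames):

  # carve known file types out of the block device by signature
  foremost -v -t pdf,jpg,doc -i /dev/vgname/lvname -o /recovery/carved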

Steve
