recovering RAID from an old server
Hi all,
I'm trying to recover some data from an old Snap Server 4200 (c. 2003) belonging to a local charity. It has four 80GB IDE drives, and runs some sort of Linux kernel with their (Snap's) own applications on top. It won't boot to the Snap OS (Guardian OS 3.1.079 - quite an old one, major versions 4 and 5 have succeeded it) but it does boot to a "recovery" console with a simple web page showing some details. From my Google searches the OS resides on the disks (perhaps just the first one?) but I don't know where this recovery console is coming from.

I've managed to put Gentoo 2008.0_beta2 minimal (because I happened to have the ISO) on a USB key and made it bootable. It boots, and fdisk -l shows me the four drives and some partitions. (Ubuntu wouldn't even boot ;)

I don't have the original CDs with the OS recovery on them, nor can I download it (upgraded versions are $600+). I can't even find any *ahem* backup versions online in the usual channels.

OK, so the question: how can I recover the RAID data? It's RAID5 (probably) with 4 disks. Can I just run some up-to-date RAID tools and mount the drives, or do I have to get exactly the same kernel and setup? I don't have much experience with RAID. (It's software RAID - no card, just 2 IDE channels with master and slave.)

Once I've recovered the data I don't really care what goes on it - there are some great free NAS OSes - but it's mounting the RAID partition that I'm not sure about. Can I randomly mount partitions read-only or will this screw things up further?

thanks for any suggestions,
-- 
Iain Buchanan <iaindb at netspace dot net dot au>

"Don't fear the pen. When in doubt, draw a pretty picture."
--Baker's Third Law of Design.
On 19 Feb 2010, at 12:15, Iain Buchanan wrote:
> ...
> Can I randomly mount partitions read-only or will this screw things up
> further?

If this is unsafe I will have ketchup & mustard on my baseball cap.

Stroller.
On Fri, 2010-02-19 at 14:44 +0000, Stroller wrote:
> On 19 Feb 2010, at 12:15, Iain Buchanan wrote:
> > ...
> > Can I randomly mount partitions read-only or will this screw things up
> > further?
>
> If this is unsafe I will have ketchup & mustard on my baseball cap.

er... could you translate that? How about "dead horse on my baggy green"?

Should I be able to mount them automatically and let the SW RAID module sort it out or do I have to know how they're tied together beforehand?

The message from the kernel is:

Linux version 2.4.19-snap (root@BuildSys) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #1 Tue Jul 13 20:24:35 PDT 2004

and later there's output from "md" which is (I assume) the Linux software RAID module (this is a grep, so there are other messages in between):

md: linear personality registered as nr 1
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
md: spare personality registered as nr 8
md: md driver 0.91.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
md: bind<hdg2,1>
md: bind<hde2,2>
md: bind<hda2,3>
md: hda2's event counter: 0000039d
md: hde2's event counter: 0000039d
md: hdg2's event counter: 0000039d
md: md100: raid array is not clean -- starting background reconstruction
md: RAID level 1 does not need chunksize! Continuing anyway.
md100: max total readahead window set to 124k
md100: 1 data-disks, max readahead per data-disk: 124k
raid1: md100, not all disks are operational -- trying to recover array
raid1: raid set md100 active with 3 out of 4 mirrors
md: updating md100 RAID superblock on device
md: hda2 [events: 0000039e]<6>(write) hda2's sb offset: 546112
md: recovery thread got woken up ...
md: looking for a shared spare drive
md100: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: hde2 [events: 0000039e]<6>(write) hde2's sb offset: 546112
md: hdg2 [events: 0000039e]<6>(write) hdg2's sb offset: 546112
md: bind<hdg5,1>
md: bind<hde5,2>
md: bind<hda5,3>
md: hda5's event counter: 000003a4
md: hde5's event counter: 000003a4
md: hdg5's event counter: 000003a4
md: md101: raid array is not clean -- starting background reconstruction
md: RAID level 1 does not need chunksize! Continuing anyway.
md101: max total readahead window set to 124k
md101: 1 data-disks, max readahead per data-disk: 124k
raid1: md101, not all disks are operational -- trying to recover array
raid1: raid set md101 active with 3 out of 4 mirrors
md: updating md101 RAID superblock on device
md: hda5 [events: 000003a5]<6>(write) hda5's sb offset: 273024
md: recovery thread got woken up ...
md: looking for a shared spare drive
md101: no spare disk to reconstruct array! -- continuing in degraded mode
md: looking for a shared spare drive
md100: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: hde5 [events: 000003a5]<6>(write) hde5's sb offset: 273024
md: hdg5 [events: 000003a5]<6>(write) hdg5's sb offset: 273024
XFS mounting filesystem md(9,100)
Ending clean XFS mount for filesystem: md(9,100)

The partitions look like:

   9   100     546112 md100
   9   101     273024 md101
  34     0   78150744 hdg
  34     1      16041 hdg1
  34     2     546210 hdg2
  34     3          1 hdg3
  34     4   76656636 hdg4
  34     5     273104 hdg5
  34     6     273104 hdg6
  33     0   78150744 hde
  33     1      16041 hde1
  33     2     546210 hde2
  33     3          1 hde3
  33     4   76656636 hde4
  33     5     273104 hde5
  33     6     273104 hde6
  22     0   78150744 hdc
  22     1      16041 hdc1
  22     2     546210 hdc2
  22     3          1 hdc3
  22     4   76656636 hdc4
  22     5     273104 hdc5
  22     6     273104 hdc6
   3     0   78150744 hda
   3     1      16041 hda1
   3     2     546210 hda2
   3     3          1 hda3
   3     4   76656636 hda4
   3     5     273104 hda5
   3     6     273104 hda6

many thanks!
-- 
Iain Buchanan <iaindb at netspace dot net dot au>

By golly, I'm beginning to think Linux really *is* the best thing since sliced bread.
-- Vance Petree, Virginia Power
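One thing the md output above suggests: only partitions on hda, hde and hdg are ever bound, so hdc looks like the member missing from both degraded mirrors. A minimal sketch for pulling that out of a saved log follows. The function name bound_disks is made up for illustration, the sample input is copied from the log above, and it assumes a grep that supports -o (GNU grep does).

```shell
#!/bin/sh
# Hypothetical helper: list the disks that md bound at least one partition from.
# Reads kernel log text on stdin. Assumes grep supports -o (GNU grep does).
bound_disks() {
    grep -o 'bind<[a-z]*' |   # keep only "bind<hdX" (letters stop before the partition number)
        sed 's/^bind<//' |    # strip the "bind<" prefix, leaving the disk name
        sort -u               # one line per disk
}

# Sample input: the bind lines from the log quoted in the post above.
bound_disks <<'EOF'
md: bind<hdg2,1>
md: bind<hde2,2>
md: bind<hda2,3>
md: bind<hdg5,1>
md: bind<hde5,2>
md: bind<hda5,3>
EOF
```

On a live system the same function could be fed from dmesg instead of a here-document. Any disk listed in the partition table but absent from this output (hdc here) never made it into an array.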
On Sat, 2010-02-20 at 14:01 +0930, Iain Buchanan wrote:
> On Fri, 2010-02-19 at 14:44 +0000, Stroller wrote:
> > On 19 Feb 2010, at 12:15, Iain Buchanan wrote:
> > > ...
> > > Can I randomly mount partitions read-only or will this screw things up
> > > further?

OK, I've randomly mounted partitions, and now I'm stuck because I don't know what the original /etc/raidtab was. /proc/mdstat just says:

Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>

which looks like nothing is used in any RAID set. Autodetect seems not to be working, perhaps because the partition ID wasn't set to 0xFD (253).

Each drive has identical partitions:

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1           2       16041+  83  Linux
/dev/hda2               3          70      546210   83  Linux
/dev/hda3              71         138      546210    5  Extended
/dev/hda4             139        9682    76656636   83  Linux
/dev/hda5              71         104      273104+  83  Linux
/dev/hda6             105         138      273104+  83  Linux

and /dev/hd[aceg]1 is "/boot" on each one. For all the others, /dev/hd[aceg][2-6], mount says:

mount: unknown filesystem type 'linux_raid_member'

Obviously this is the RAID. But how do I get to it?

All the "/boot"s mount OK and are readable, with some kernel files and stuff; however, /dev/hdc1 gives some errors:

hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=585, sector=575
hdc: possibly failed opcode: 0x25
end_request: I/O error, dev hdc, sector 575
__ratelimit: 22 callbacks suppressed
Buffer I/O error on device hdc1, logical block 528
Buffer I/O error on device hdc1, logical block 529
Buffer I/O error on device hdc1, logical block 530
Buffer I/O error on device hdc1, logical block 531
Buffer I/O error on device hdc1, logical block 532
Buffer I/O error on device hdc1, logical block 533
Buffer I/O error on device hdc1, logical block 534
Buffer I/O error on device hdc1, logical block 535
Buffer I/O error on device hdc1, logical block 536
Buffer I/O error on device hdc1, logical block 537

so it looks like there are some problems with hdc. Are there any disk hardware testing tools on the Gentoo minimal live CD?

thanks,
-- 
Iain Buchanan <iaindb at netspace dot net dot au>

It's simply unbelievable how much energy and creativity people have invested into creating contradictory, bogus and stupid licenses...
--- Sven Rudolph about licences in debian/non-free.
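For members that report linux_raid_member, the usual route with current tools is mdadm rather than /etc/raidtab. What follows is a hedged sketch, not a tested recipe for this box: it assumes the data array lives on partition 4 of each disk (the thread only says RAID5 "probably"), that hdc is left out because of the I/O errors above, and that /dev/md0 and /mnt/recover are free to use. The run wrapper and the function name are illustrative. By default the script only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set in the environment.
# ASSUMPTIONS: data array is on partition 4 of each disk; /dev/md0 and
# /mnt/recover are unused; hdc is excluded because it reported I/O errors.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

inspect_and_assemble() {
    # 1. Print the md superblock of each candidate member (level, member
    #    count, event counter). This step only reads from the disks.
    for dev in /dev/hda4 /dev/hdc4 /dev/hde4 /dev/hdg4; do
        run mdadm --examine "$dev"
    done

    # 2. Assemble from the three healthy disks. --readonly starts the array
    #    read-only; --run starts it even though one member is missing.
    run mdadm --assemble --readonly --run /dev/md0 /dev/hda4 /dev/hde4 /dev/hdg4

    # 3. Mount read-only. If the filesystem turns out to be XFS with a dirty
    #    log, "-o ro,norecovery" may be needed so no log replay is attempted.
    run mount -o ro /dev/md0 /mnt/recover
}

inspect_and_assemble
```

The --examine output is worth reading before step 2: it reports the RAID level and member count recorded on disk, which answers the "how are they tied together" question without guessing.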
On Saturday 20 February 2010 06:29:03 Iain Buchanan wrote:
> so it looks like there's some problems with hdc. Are there any disk
> hardware testing tools on the gentoo minimal live cd?

If you want to check the disk use sys-apps/smartmontools, but this problem may be a fs corruption - which could of course have been caused by the hardware failing.

-- 
Regards,
Mick
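As a rough illustration of what checking the disk with smartmontools might look like: the sketch below uses smartctl, the command-line tool that package installs, against /dev/hdc from the error messages earlier in the thread. The run wrapper and the check_disk name are illustrative, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

check_disk() {
    disk="$1"
    run smartctl -H "$disk"        # overall health self-assessment
    run smartctl -A "$disk"        # vendor attributes (reallocated/pending sectors etc.)
    run smartctl -l error "$disk"  # the drive's own error log
}

check_disk /dev/hdc
```

A growing count of reallocated or pending sectors would point at failing hardware rather than plain filesystem corruption.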
> Should I be able to mount them automatically and let the SW RAID
> module sort it out or do I have to know how they're tied together
> beforehand?

> md: looking for a shared spare drive
> md100: no spare disk to reconstruct array! -- continuing in degraded
> mode
> md: recovery thread finished ...
> md: hde5 [events: 000003a5]<6>(write) hde5's sb offset: 273024
> md: hdg5 [events: 000003a5]<6>(write) hdg5's sb offset: 273024
> XFS mounting filesystem md(9,100)
> Ending clean XFS mount for filesystem: md(9,100)
>
> The partitions look like:
> 9 100 546112 md100
> 9 101 273024 md101

It seems it has correctly mounted its partition... Can't you find it?

I have the feeling that you are messing it up. If I understand it correctly the server has a hardware RAID controller, which has to be managed via its drivers.

Software RAID tools aren't suitable for mounting this setup correctly; I would mount random partitions for testing purposes only, on a spare machine.

The wiser thing to do is to find an old livecd supporting PERC SAS (or whatever RAID card is in that Snap) RAID cards and assemble the array in degraded mode for data recovery.

Another thing can come very useful: we once had a similar problem, and we ended up borrowing one identical disc from another running server to put the array back online. We recovered our data, then restored the other server's array.

HTH
Francesco

-- 
Linux Version 2.6.32-gentoo-r5, Compiled #2 SMP PREEMPT Wed Feb 17 20:30:02 CET 2010
Two 1GHz AMD Athlon 64 Processors, 4GB RAM, 4021.84 Bogomips Total
aemaeth
On 20 Feb 2010, at 04:31, Iain Buchanan wrote:
> On Fri, 2010-02-19 at 14:44 +0000, Stroller wrote:
>> On 19 Feb 2010, at 12:15, Iain Buchanan wrote:
>>> ...
>>> Can I randomly mount partitions read-only or will this screw things up
>>> further?
>>
>> If this is unsafe I will have ketchup & mustard on my baseball cap.
>
> er... could you translate that? How about "dead horse on my baggy green"?

http://idioms.thefreedictionary.com/I'll+eat+my+hat

I just don't see how you can break anything *as long as* you don't let the system write anything to the disks. How can read-only be unsafe?

One might be paranoid enough to clone images of the drive before proceeding, however.

My one concern is over how you know this system uses software RAID. You know that EIDE hardware RAID was available, right? I'm sure this would rarely be available built-in to the motherboard.

Stroller.
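For anyone who does want to clone the drives first, here is a cautious sketch. It assumes GNU ddrescue is installed and that a backup filesystem with enough free space is mounted at /mnt/backup; both are assumptions, as are the run wrapper and the image_disk name. It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set in the environment.
# ASSUMPTIONS: GNU ddrescue is available; BACKUP_DIR has room for the images.
DRY_RUN="${DRY_RUN:-1}"
BACKUP_DIR="${BACKUP_DIR:-/mnt/backup}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

image_disk() {
    disk="$1"                       # whole-disk device, e.g. /dev/hdc
    name="$(basename "$disk")"
    # Ask the kernel to treat the disk as read-only at the block layer
    # (on older kernels the partitions may need the same treatment).
    run blockdev --setro "$disk"
    # Copy the disk to an image file; the map file lets an interrupted run
    # resume. -n skips the slow scraping pass over bad areas.
    run ddrescue -n "$disk" "$BACKUP_DIR/$name.img" "$BACKUP_DIR/$name.map"
}

for d in /dev/hda /dev/hdc /dev/hde /dev/hdg; do
    image_disk "$d"
done
```

ddrescue is used here instead of plain dd because it keeps going past unreadable sectors, which matters for a disk that is already throwing errors.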
On Sat, 2010-02-20 at 10:46 +0100, Francesco Talamona wrote:
> > Should I be able to mount them automatically and let the SW RAID
> > module sort it out or do I have to know how they're tied together
> > beforehand?
>
> > md: looking for a shared spare drive
> > md100: no spare disk to reconstruct array! -- continuing in degraded
> > mode
> > md: recovery thread finished ...
> > md: hde5 [events: 000003a5]<6>(write) hde5's sb offset: 273024
> > md: hdg5 [events: 000003a5]<6>(write) hdg5's sb offset: 273024
> > XFS mounting filesystem md(9,100)
> > Ending clean XFS mount for filesystem: md(9,100)
> >
> > The partitions look like:
> > 9 100 546112 md100
> > 9 101 273024 md101
>
> It seems it has correctly mounted its partition... Can't you find it?

This is with the server recovery console, which is basically just a web page. No shell access. There's not much I can do to get at md100 and md101 (is this what software RAID devices usually appear as?)

> I have the feeling that you are messing it up. If I understand it
> correctly the server has a hardware RAID controller, which has to be
> managed via its drivers.

I think it's software RAID. There is no RAID controller AFAICT. All 4 drives are visible to the BIOS as Primary and Secondary Master and Slaves.

> Another thing can come very useful: we once had a similar problem, we
> ended up borrowing one identical disc from another running server to put
> the array back online, we recovered our data, then restored the other
> server's array.

That's a possibility given what I can find on Google; however, these servers are few and far between, so I'd have to find someone willing to send their drive to me (or vice versa) or send me the OS, which overlandstorage doesn't like!

thanks,
-- 
Iain Buchanan <iaindb at netspace dot net dot au>

Come quickly, I am tasting stars!
-- Dom Perignon, upon discovering champagne.
On Sat, 2010-02-20 at 13:39 +0000, Stroller wrote:
> On 20 Feb 2010, at 04:31, Iain Buchanan wrote:
>
> > On Fri, 2010-02-19 at 14:44 +0000, Stroller wrote:
> >> On 19 Feb 2010, at 12:15, Iain Buchanan wrote:
> >>> ...
> >>> Can I randomly mount partitions read-only or will this screw
> >>> things up further?
> >>
> >> If this is unsafe I will have ketchup & mustard on my baseball cap.
> >
> > er... could you translate that? How about "dead horse on my baggy
> > green"?
>
> http://idioms.thefreedictionary.com/I'll+eat+my+hat

yeah, I got that, I was just picking on your use of ketchup & baseball. Over here it's tomato sauce (dead horse) and cricket (baggy greens) :) Most of my jokes need explaining %-)

> I just don't see how you can break anything *as long as* you don't let
> the system write anything to the disks. How can read-only be unsafe?

Perhaps something to do with the superblock or "last mount time" or something? I don't know! I know that mounting a drive while a system is hibernated, even ro, will kill kittens.

> One might be paranoid enough to clone images of the drive before
> proceeding, however.

I don't have enough spare...

> My one concern is over how you know this system uses software RAID.
> You know that EIDE hardware RAID was available, right? I'm sure this
> would rarely be available built-in to the motherboard.

well there appears to be no RAID controller, unless it's onboard, but as I mentioned to Francesco the BIOS can see all drives, and so can Gentoo minimal...

I've since found that the OS is in flash RAM, and only the help files are on disk, so maybe I have bigger problems if I can't boot :( I hope to get a copy of Guardian OS somehow...

thanks,
-- 
Iain Buchanan <iaindb at netspace dot net dot au>

"Go ahead, bake my quiche"
-- Magrat instructs the castle cook (Terry Pratchett, Lords and Ladies)
On Saturday 20 February 2010, Iain Buchanan wrote:
> On Sat, 2010-02-20 at 10:46 +0100, Francesco Talamona wrote:
> > > Should I be able to mount them automatically and let the SW RAID
> > > module sort it out or do I have to know how they're tied
> > > together beforehand?
> > >
> > > md: looking for a shared spare drive
> > > md100: no spare disk to reconstruct array! -- continuing in
> > > degraded mode
> > > md: recovery thread finished ...
> > > md: hde5 [events: 000003a5]<6>(write) hde5's sb offset: 273024
> > > md: hdg5 [events: 000003a5]<6>(write) hdg5's sb offset: 273024
> > > XFS mounting filesystem md(9,100)
> > > Ending clean XFS mount for filesystem: md(9,100)
> > >
> > > The partitions look like:
> > > 9 100 546112 md100
> > > 9 101 273024 md101
> >
> > It seems it has correctly mounted its partition... Can't you find
> > it?
>
> This is with the server recovery console, which is basically just a
> web page. No shell access. There's not much I can do to get at
> md100 and md101 (is this what software RAID devices usually appear
> as?)
>
> > I have the feeling that you are messing it up. If I understand it
> > correctly the server has a hardware RAID controller, which has to
> > be managed via its drivers.
>
> I think it's software RAID. There is no RAID controller AFAICT. All
> 4 drives are visible to the BIOS as Primary and Secondary Master and
> Slaves.

This isn't proof: most hardware RAID setups are proprietary software solutions pretending to be hardware. Linux without the driver can't see the logical volume and shows all the physical drives.

You should do some research about that server hardware... Aren't Snap servers equipped with a PERC controller?

> > Another thing can come very useful: we once had a similar problem,
> > we ended up borrowing one identical disc from another running
> > server to put the array back online, we recovered our data, then
> > restored the other server's array.
>
> That's a possibility given what I can find on Google, however these
> are few and far between, so I'd have to find someone willing to send
> their drive to me (or vice versa) or send me the OS, which
> overlandstorage doesn't like!

What happens if you physically remove the drive marked as bad? You may image it for backup, then low-level format it, then put it back in place as if it were brand new. Or add a similar disk to be considered a spare by the controller (given that it is looking for a spare disk in the first instance).

Most controllers have automated procedures to manage failures, disk swaps and so on. For this reason you can't be sure that the inspection operations you are doing are read-only, unless the drives are attached to another machine with a trusted OS doing nothing on its own.

The ideas given above may cause you to lose all of your data, so be very careful and patient.

Good luck.
Francesco

-- 
Linux Version 2.6.32-gentoo-r5, Compiled #2 SMP PREEMPT Wed Feb 17 20:30:02 CET 2010
Two 2.9GHz AMD Athlon 64 Processors, 4GB RAM, 11659 Bogomips Total
aemaeth