Old 08-02-2010, 02:20 AM
Daniel Bareiro
 
Status of RAID (md)

Hi all!

Today I found this e-mail notification of a fail event from mdadm:

---------------------------------------------------------------------
This is an automatically generated mail message from mdadm
running on antares

A Fail event had been detected on md device /dev/md2.

It could be related to component device /dev/sdd3.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 sda3[0] sdd3[4](F) sdc3[2]
2136170880 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]

md1 : active raid1 sda2[0] sdd2[3] sdc2[2]
19534976 blocks [4/3] [U_UU]

md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
979840 blocks [4/4] [UUUU]

unused devices: <none>

---------------------------------------------------------------------
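
(For reference, reading the md2 line above: "[4/2]" means the array is
configured for four devices but only two are active, and in "[U_U_]" each
"U" marks an in-sync device and each "_" a missing or failed slot, here
slots 1 and 3.)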

It seems that one disk of the RAID-5 disappeared and another one has
failed. A closer inspection shows:

antares:~# mdadm --detail /dev/md2
/dev/md2:
Version : 00.90
Creation Time : Thu Dec 17 13:18:29 2009
Raid Level : raid5
Array Size : 2136170880 (2037.21 GiB 2187.44 GB)
Used Dev Size : 712056960 (679.07 GiB 729.15 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 2
Persistence : Superblock is persistent

Update Time : Sun Aug 1 16:38:59 2010
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 64K

UUID : be723ed5:c2ac3c34:a640c0ed:43e24fc2
Events : 0.726249

Number Major Minor RaidDevice State
0 8 3 0 active sync /dev/sda3
1 0 0 1 removed
2 8 35 2 active sync /dev/sdc3
3 0 0 3 removed

4 8 51 - faulty spare /dev/sdd3


That is to say, the RAID has four disks, and both the spare disk and
another disk from the array have failed. What is unclear to me is why,
with two active disks remaining, the RAID seems broken: the filesystem
is mounted read-only:

# pvs
PV VG Fmt Attr PSize PFree
/dev/md2 backup lvm2 a- 1,99T 0

# lvs
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
space backup -wi-ao 1,99T


# mount
/dev/md1 on / type ext3 (rw,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
procbususb on /proc/bus/usb type usbfs (rw)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/dev/mapper/backup-space on /space type ext3 (ro)
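
Since ext3 aborts its journal and goes read-only after an I/O error, I
suppose the degraded array started failing writes at some point. Something
like this should show it in the kernel log (the grep pattern is just a
guess at the relevant messages):

# dmesg | grep -iE 'md2|ext3|I/O error'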


Then, I tried to add the missing disk, but the situation did not change:


# mdadm --add /dev/md2 /dev/sdb3
mdadm: re-added /dev/sdb3


antares:~# mdadm --detail /dev/md2
/dev/md2:
Version : 00.90
Creation Time : Thu Dec 17 13:18:29 2009
Raid Level : raid5
Array Size : 2136170880 (2037.21 GiB 2187.44 GB)
Used Dev Size : 712056960 (679.07 GiB 729.15 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 2
Persistence : Superblock is persistent

Update Time : Sun Aug 1 17:03:19 2010
State : clean, degraded
Active Devices : 2
Working Devices : 3
Failed Devices : 1
Spare Devices : 1

Layout : left-symmetric
Chunk Size : 64K

UUID : be723ed5:c2ac3c34:a640c0ed:43e24fc2
Events : 0.726256

Number Major Minor RaidDevice State
0 8 3 0 active sync /dev/sda3
1 0 0 1 removed
2 8 35 2 active sync /dev/sdc3
3 0 0 3 removed

4 8 19 - spare /dev/sdb3
5 8 51 - faulty spare /dev/sdd3


# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 sdb3[4](S) sda3[0] sdd3[5](F) sdc3[2]
2136170880 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]

md1 : active raid1 sda2[0] sdd2[3] sdc2[2]
19534976 blocks [4/3] [U_UU]

md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
979840 blocks [4/4] [UUUU]

unused devices: <none>


# mount
/dev/md1 on / type ext3 (rw,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
procbususb on /proc/bus/usb type usbfs (rw)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/dev/mapper/backup-space on /space type ext3 (ro)



What could be the problem?

Thanks in advance for your reply.

Regards,
Daniel
--
Fingerprint: BFB3 08D6 B4D1 31B2 72B9 29CE 6696 BF1B 14E6 1D37
Powered by Debian GNU/Linux Lenny - Linux user #188.598
 
Old 08-02-2010, 06:30 AM
martin f krafft
 
Status of RAID (md)

also sprach Daniel Bareiro <daniel-listas@gmx.net> [2010.08.02.0420 +0200]:
> md2 : active raid5 sda3[0] sdd3[4](F) sdc3[2]
> 2136170880 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]
[…]
> That is to say, the RAID has four disks, and both the spare disk
> and another disk from the array have failed. What is unclear to me
> is why, with two active disks remaining,

There is no spare disk. sdd is listed as "faulty spare" because it is
out of sync. Remove and re-add it:

mdadm --remove /dev/md2 /dev/sdd3
mdadm --add /dev/md2 /dev/sdd3
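
You can then watch the resync progress with something like (just a
convenience sketch; plain "cat /proc/mdstat" does the job too):

watch -n5 cat /proc/mdstat
mdadm --detail /dev/md2 | grep -iE 'state|rebuild'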

> md2 : active raid5 sdb3[4](S) sda3[0] sdd3[5](F) sdc3[2]
> 2136170880 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]

This is indeed a bit strange, but the array might start syncing in
the new disk (called a spare) as soon as you remove sdd3 (see
above).
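
Once the array is back in sync, you will probably also have to fsck
/space and remount it read-write, e.g. (a sketch, assuming nothing else
keeps it read-only):

umount /space
fsck.ext3 -f /dev/mapper/backup-space
mount /space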

--
.'`. martin f. krafft <madduck@d.o> Related projects:
: :' : proud Debian developer http://debiansystem.info
`. `'` http://people.debian.org/~madduck http://vcs-pkg.org
`- Debian - when you have better things to do than fixing systems

"if there's anything more important than my ego,
i want it caught and shot now."
-- zaphod beeblebrox
 
Old 08-02-2010, 01:10 PM
Daniel Bareiro
 
Status of RAID (md)

Hi, Martin.

On Monday, 02 August 2010 08:30:03 +0200,
martin f krafft wrote:

> also sprach Daniel Bareiro <daniel-listas@gmx.net> [2010.08.02.0420 +0200]:
> > md2 : active raid5 sda3[0] sdd3[4](F) sdc3[2]
> > 2136170880 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]
> […]
> > That is to say, the RAID has four disks, and both the spare disk
> > and another disk from the array have failed. What is unclear to me
> > is why, with two active disks remaining,

> There is no spare disk. sdd is listed as "faulty spare" because it
> is out of sync. Remove and re-add it:
>
> mdadm --remove /dev/md2 /dev/sdd3
> mdadm --add /dev/md2 /dev/sdd3

# mdadm --remove /dev/md2 /dev/sdd3
mdadm: hot removed /dev/sdd3


# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 sdb3[4](S) sda3[0] sdc3[2]
2136170880 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]

md1 : active raid1 sda2[0] sdd2[3] sdc2[2]
19534976 blocks [4/3] [U_UU]

md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
979840 blocks [4/4] [UUUU]

unused devices: <none>


# mdadm --add /dev/md2 /dev/sdd3
mdadm: re-added /dev/sdd3


# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 sdd3[4](S) sdb3[5](S) sda3[0] sdc3[2]
2136170880 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]

md1 : active raid1 sda2[0] sdd2[3] sdc2[2]
19534976 blocks [4/3] [U_UU]

md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
979840 blocks [4/4] [UUUU]

unused devices: <none>


What draws my attention is that both sdb3 and sdd3 are now shown as
spare disks.
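
Though thinking about it: with only two of four devices active, the
RAID-5 is missing more than its single parity can cover, so I suppose
md simply cannot reconstruct the data onto a new disk, which would
explain why the re-added devices just sit there as spares.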

> > md2 : active raid5 sdb3[4](S) sda3[0] sdd3[5](F) sdc3[2]
> > 2136170880 blocks level 5, 64k chunk, algorithm 2 [4/2] [U_U_]

> This is indeed a bit strange, but the array might start syncing in
> the new disk (called a spare) as soon as you remove sdd3 (see
> above).

I tried to add sdb3 because, for some reason, the disk did not appear
in "cat /proc/mdstat" (not even as failed), although "mdadm --detail
/dev/md2" showed both sdd3 and sdb3 as removed. What could cause sdb3
not to appear in "cat /proc/mdstat"?

It is somewhat confusing to see a disk labeled as a spare when it is
not one. Is this "normal"?

If that is the case, the only explanation I can think of is that after
the first disk failure the array was in the middle of recovery, and the
second disk failed during it.
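
Is there a way to force the array back together? I was thinking of
comparing the event counters of the superblocks and, if they are close,
forcing an assemble (just a guess at the procedure, probably risky with
a disk that dropped out mid-write, and the volume group on top would
have to be deactivated first):

# mdadm --examine /dev/sd[abcd]3 | grep -E 'Events|Update Time'
# umount /space
# vgchange -an backup
# mdadm --stop /dev/md2
# mdadm --assemble --force /dev/md2 /dev/sd[abcd]3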


Thanks for your reply.

Regards,
Daniel
--
Fingerprint: BFB3 08D6 B4D1 31B2 72B9 29CE 6696 BF1B 14E6 1D37
Powered by Debian GNU/Linux Lenny - Linux user #188.598
 
Old 08-12-2010, 09:08 AM
martin f krafft
 
Status of RAID (md)

also sprach Daniel Bareiro <daniel-listas@gmx.net> [2010.08.02.1510 +0200]:
> I tried to add sdb3 because, for some reason, the disk did not appear
> in "cat /proc/mdstat" (not even as failed), although "mdadm --detail
> /dev/md2" showed both sdd3 and sdb3 as removed. What could cause sdb3
> not to appear in "cat /proc/mdstat"?
>
> It is somewhat confusing to see a disk labeled as a spare when it is
> not one. Is this "normal"?
>
> If that is the case, the only explanation I can think of is that after
> the first disk failure the array was in the middle of recovery, and
> the second disk failed during it.

I cannot analyse this further remotely. If you are still
experiencing problems, I'd be prepared to have a look, but I'd need
root or sudo access to the host. Contact me off-list if you want
that.

--
.'`. martin f. krafft <madduck@d.o> Related projects:
: :' : proud Debian developer http://debiansystem.info
`. `'` http://people.debian.org/~madduck http://vcs-pkg.org
`- Debian - when you have better things to do than fixing systems

"with sufficient thrust, pigs fly just fine. however, this is not
necessarily a good idea. it is hard to be sure where they are going
to land, and it could be dangerous sitting under them as they fly
overhead."
-- rfc 1925
 
