02-28-2010, 08:03 PM
John R Pierce

puzzling md error?

This has never happened to me before, and I'm somewhat at a loss. I got an
email from the weekly cron job...

/etc/cron.weekly/99-raid-check:

WARNING: mismatch_cnt is not 0 on /dev/md10
WARNING: mismatch_cnt is not 0 on /dev/md11
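
(For reference, that cron script is just reporting the kernel's per-array mismatch counter out of sysfs; the same numbers can be read by hand for the two arrays named above. A minimal check, assuming the standard /sys layout:

# cat /sys/block/md10/md/mismatch_cnt
# cat /sys/block/md11/md/mismatch_cnt

or, for every md device at once:

# for f in /sys/block/md*/md/mismatch_cnt; do echo "$f: $(cat $f)"; done
)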


OK, md10 and md11 are each RAID1s made from 2 x 72GB SCSI drives, on a
Dell 2850 or similar dual single-core 3 GHz server.

These two md devices in turn make up a striped LVM volume group.
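
(For reference, a striped volume group over two md PVs like this would typically have been created along these lines; the LV name, size, and stripe size below are illustrative, not taken from this box:

# pvcreate /dev/md10 /dev/md11
# vgcreate vg1 /dev/md10 /dev/md11
# lvcreate -n lv1 -L 50G -i 2 -I 64 vg1

The -i 2 stripes each logical volume across both physical volumes, which is why every LV in the group has extents on both mirror pairs.)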

dmesg shows....

md: syncing RAID array md10
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than
200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 143374656 blocks.
md: syncing RAID array md11
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than
200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 143374656 blocks.
md: md10: sync done.
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sdc1
disk 1, wo:0, o:1, dev:sdd1
md: md11: sync done.
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sde1
disk 1, wo:0, o:1, dev:sdf1

I'm not sure what that's telling me. The last thing prior to this in
dmesg was when I added a swap volume to this VG last week.


and mdadm --detail shows...

# mdadm --detail /dev/md10
/dev/md10:
Version : 0.90
Creation Time : Wed Oct 8 12:54:48 2008
Raid Level : raid1
Array Size : 143374656 (136.73 GiB 146.82 GB)
Used Dev Size : 143374656 (136.73 GiB 146.82 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 10
Persistence : Superblock is persistent

Update Time : Sun Feb 28 04:53:29 2010
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

UUID : b6da4dc5:c7372d6e:63f32b9c:49fa95f9
Events : 0.84

Number Major Minor RaidDevice State
0 8 33 0 active sync /dev/sdc1
1 8 49 1 active sync /dev/sdd1
# mdadm --detail /dev/md11
/dev/md11:
Version : 0.90
Creation Time : Wed Oct 8 12:54:57 2008
Raid Level : raid1
Array Size : 143374656 (136.73 GiB 146.82 GB)
Used Dev Size : 143374656 (136.73 GiB 146.82 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 11
Persistence : Superblock is persistent

Update Time : Sun Feb 28 11:49:45 2010
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

UUID : be475cd9:b98ee3ff:d18e668c:a5a6e06b
Events : 0.62

Number Major Minor RaidDevice State
0 8 65 0 active sync /dev/sde1
1 8 81 1 active sync /dev/sdf1



I don't see anything wrong here?

LVM shows no problems that I can detect either...

# lvs vg1
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
glassfish vg1 -wi-ao 10.00G
lv1 vg1 -wi-ao 97.66G
oradata vg1 -wi-ao 30.00G
pgdata vg1 -wi-ao 25.00G
pgdata_lss_idx vg1 -wi-ao 20.00G
pgdata_lss_tab vg1 -wi-ao 20.00G
swapper vg1 -wi-ao 3.00G
vmware vg1 -wi-ao 50.00G


# pvdisplay /dev/md10 /dev/md11
--- Physical volume ---
PV Name /dev/md10
VG Name vg1
PV Size 136.73 GB / not usable 2.31 MB
Allocatable yes
PE Size (KByte) 4096
Total PE 35003
Free PE 1998
Allocated PE 33005
PV UUID oAgJY7-Tmf7-ac35-KoUH-15uz-Q5Ae-bmFCys

--- Physical volume ---
PV Name /dev/md11
VG Name vg1
PV Size 136.73 GB / not usable 2.31 MB
Allocatable yes
PE Size (KByte) 4096
Total PE 35003
Free PE 2560
Allocated PE 32443
PV UUID A4Qb3P-j5Lr-8ZEv-FjbC-Iczm-QkC8-bqP0zv


_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
 
02-28-2010, 08:15 PM
Eero Volotinen

puzzling md error?

2010/2/28 John R Pierce <pierce@hogranch.com>:
> This has never happened to me before, and I'm somewhat at a loss. I got an
> email from the weekly cron job...
>
>   WARNING: mismatch_cnt is not 0 on /dev/md10
>   WARNING: mismatch_cnt is not 0 on /dev/md11
> [rest of the original message snipped]
Maybe this helps: http://www.arrfab.net/blog/?p=199

--
Eero
 
02-28-2010, 08:16 PM
Peter Hinse

puzzling md error?

On 28.02.2010 22:03, John R Pierce wrote:
> WARNING: mismatch_cnt is not 0 on

Have a look at http://www.arrfab.net/blog/?p=199
It says:

> A `echo repair >/sys/block/md0/md/sync_action` followed by a `echo
> check >/sys/block/md0/md/sync_action` seems to have corrected it. Now
> `cat /sys/block/md0/md/mismatch_cnt` returns 0 …
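
Applied to the two arrays in this thread, that sequence would look roughly like the following (run as root; `repair` rewrites any out-of-sync blocks, and the follow-up `check` pass is what recounts the mismatches):

# echo repair > /sys/block/md10/md/sync_action
# echo repair > /sys/block/md11/md/sync_action
    (wait for the resync to finish; progress shows up in /proc/mdstat)
# echo check > /sys/block/md10/md/sync_action
# echo check > /sys/block/md11/md/sync_action
# cat /sys/block/md10/md/mismatch_cnt /sys/block/md11/md/mismatch_cnt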

Regards,

Peter

 
02-28-2010, 08:21 PM
Clint Dilks

puzzling md error?

On 01/03/10 10:16, Peter Hinse wrote:
> Have a look at http://www.arrfab.net/blog/?p=199
> [repair/check suggestion snipped]

Hi,

This is happening specifically because of the way swap works, so the
issue will re-appear, but it isn't actually anything to worry about.
I'd suggest that you remove the particular array from the list being
scanned.
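
How you do that depends on the mdadm package: on builds whose /etc/cron.weekly/99-raid-check sources /etc/sysconfig/raid-check, the exclusion could look something like the sketch below. Treat the file and the SKIP_DEVS variable as assumptions to verify against your local copy of the script, since older CentOS 5 packages may not have them; failing that, the device list can be filtered in the cron script itself.

# cat /etc/sysconfig/raid-check
ENABLED=yes
CHECK=check
# space-separated list of arrays to leave out of the weekly scan
SKIP_DEVS="md10 md11"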





 
02-28-2010, 08:23 PM
John R Pierce

puzzling md error?

Peter Hinse wrote:
> Have a look at http://www.arrfab.net/blog/?p=199
> [repair/check suggestion snipped]

Thanks. I was trying to figure out how to do a scan from the mdadm
commands (ugh!).

# cat /sys/block/md10/md/mismatch_cnt
8448
# cat /sys/block/md11/md/mismatch_cnt
7296

Fugly. Since the mirrors aren't checksummed, can I assume this means
there are likely some data mess-ups here?

Anyway, the repair is running on both md10 and md11; I'll check back
with my final results...
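
While that runs, progress can be followed in /proc/mdstat and the per-array state in sysfs, roughly:

# cat /proc/mdstat
# cat /sys/block/md10/md/sync_action /sys/block/md11/md/sync_action

Once sync_action is back to "idle", the mismatch_cnt after a repair pass reflects what that pass found and rewrote; a follow-up check pass is the one that should come back as 0.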




 
02-28-2010, 08:27 PM
Clint Dilks

puzzling md error?

On 01/03/10 10:23, John R Pierce wrote:
> Since the mirrors aren't checksummed, can I assume this means
> there are likely some data mess-ups here?
> [rest snipped]
Hi,

It has to do with aborted writes in swap. Your data should be fine.
 
02-28-2010, 08:31 PM
John R Pierce

puzzling md error?

Clint Dilks wrote:
> It has to do with aborted writes in SWAP. Your data should be fine

So swap on LVM on MD mirrors is a bad idea?


Frankly, I usually avoid LVM, but I figured I'd set this system up with
it and see how it goes. It's just a dev box, but we're about to put some
Oracle stuff on it (for development, but still).


 
02-28-2010, 08:33 PM
Clint Dilks

puzzling md error?

On 01/03/10 10:27, Clint Dilks wrote:
> It has to do with aborted writes in swap. Your data should be fine.
> [rest snipped]
See http://forum.nginx.org/read.php?24,16699 for more info
 
02-28-2010, 08:38 PM
Clint Dilks

puzzling md error?

On 01/03/10 10:31, John R Pierce wrote:
> So swap on LVM on MD mirrors is a bad idea?
> [rest snipped]
Swap inside LVM is fine in my experience. Personally, I consider this a
benign error and generally ignore it unless the mismatch count is very high.
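
If you do want the weekly mail only above some threshold, a tiny wrapper around the sysfs counter is one way to do it. A minimal sketch, with the threshold value purely illustrative:

#!/bin/bash
# report only the arrays whose mismatch count exceeds a chosen threshold
THRESHOLD=10000
for f in /sys/block/md*/md/mismatch_cnt; do
    dev=$(echo "$f" | cut -d/ -f4)      # e.g. md10
    cnt=$(cat "$f")
    [ "$cnt" -gt "$THRESHOLD" ] && echo "WARNING: $dev mismatch_cnt=$cnt"
done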
 
02-28-2010, 09:37 PM
John R Pierce

puzzling md error?

Clint Dilks wrote:
> SWAP inside LVM is fine in my experience. Personally I consider this a
> benign error and generally ignore it unless the mismatch count is very high

And how do I know all these mirror data mismatches are swap? Doesn't each
mismatch mean the mirrors disagree, which means one of them is wrong? Which
one? Since they aren't timestamped or checksummed (as VxVM and ZFS do), I am
playing 'data maybe'. As someone who administers database servers, I have a
real problem with that.

BTW, this is CentOS 5.4 + latest updates, x86_64; it's primarily running
Postgres and our in-house Java middleware apps, and it was going to be an
Oracle grid operations server.


 
