12-01-2009, 11:51 AM
Farkas Levente

/etc/cron.weekly/99-raid-check

On Tue, Dec 1, 2009 at 11:06, RedShift <redshift@pandora.be> wrote:
> Jancio Wodnik wrote:
>> On 30.11.2009 14:08, Farkas Levente wrote:
>>> hi,
>>> it's been a few weeks since rhel/centos 5.4 was released, and there has
>>> been much discussion about this new "feature", the weekly raid partition
>>> check. we have a lot of servers with raid1, and I have already tried to
>>> configure them not to send these messages, without success: I already
>>> added all of my swap partitions to SKIP_DEVS (since I read on the
>>> linux-kernel list that those can show a mismatch_cnt, even though I
>>> still don't understand why). but even the data partitions (i.e. all
>>> raid1 partitions on all of my servers) produce this error (i.e. their
>>> mismatch_cnt is never 0 at the weekend), and this causes all of my
>>> raid1 partitions to be rebuilt during the weekend. and I don't like it :-(
>>> so my questions:
>>> - is it a real bug in the raid1 code?
>>> - is it a real bug in the disks running the raid (hard to believe,
>>> since this happens on dozens of servers)?
>>> - is /etc/cron.weekly/99-raid-check wrong in rhel/centos-5.4?
>>> or what is the problem?
>>> can someone enlighten me?
>>> thanks in advance.
>>> regards.
>>>
>>>
>> Hi. I have this problem on my 2 servers (both CentOS 5.4): every
>> weekend my raid1 set is rebuilt, because mismatch_cnt is never 0 at the
>> weekend. What is really going on? My 1TB raid1 disks are rebuilt every
>> weekend.
>>
>
> They aren't being rebuilt; they are being checked to verify that the data on the RAID disks is consistent. There are various reasons why mismatch_cnt can be higher than 0, for example aborted writes. Generally it's not something to worry about if you have, for example, a swap partition in your RAID array. If you do a repair and then a check, the mismatch_cnt should reset to 0.

the mismatch_cnt is not 0, so the arrays are automatically checked and
repaired every weekend. and these are not swap partitions; as I wrote,
they are /srv partitions holding only data.


--
Levente "Si vis pacem para bellum!"
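
For reference, the repair-then-check cycle RedShift describes maps onto the md
sysfs interface. A minimal sketch, assuming an array named md0 and a kernel
that exposes /sys/block/md0/md (as the CentOS 5 kernels do):

  # Read the mismatch count recorded by the last completed check.
  cat /sys/block/md0/md/mismatch_cnt

  # Rewrite any inconsistent regions, then re-run the read-only check;
  # once both passes finish, mismatch_cnt should read 0 again.
  echo repair > /sys/block/md0/md/sync_action
  # ... wait for the repair to finish (watch /proc/mdstat) ...
  echo check > /sys/block/md0/md/sync_action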
 
12-01-2009, 12:05 PM
Paul Bijnens

/etc/cron.weekly/99-raid-check

On 2009-12-01 13:51, Farkas Levente wrote:
> On Tue, Dec 1, 2009 at 11:06, RedShift <redshift@pandora.be> wrote:
>> [...]
>> They aren't being rebuilt; they are being checked to verify that the
>> data on the RAID disks is consistent. [...] If you do a repair and then
>> a check, the mismatch_cnt should reset to 0.
>
> the mismatch_cnt is not 0, so the arrays are automatically checked and
> repaired every weekend. and these are not swap partitions; as I wrote,
> they are /srv partitions holding only data.

I have the problem on 2 servers, and both of those servers are also running
a VMware image (very small, but constantly used) under VMware Server 2.
Could it be that the .vmem file, or even the virtual disk, is constantly
being written to, and the raid is constantly out of sync because of that?
(All my other VMware servers have hardware raid cards, or are still on
CentOS 4.)

The mismatch count is never large: 128 is usual; 512 is the maximum I've seen.

Actually, one of the servers passed the test this weekend, but already has
a non-zero mismatch count again at this moment.

Two weeks ago, I had time to shut down one server completely, and had the
mismatch count fixed twice; soon after startup, the mismatch count was
again != 0.


--
Paul Bijnens, Xplanation Technology Services Tel +32 16 397.525
Interleuvenlaan 86, B-3001 Leuven, BELGIUM Fax +32 16 397.552
************************************************************************
* I think I've got the hang of it now: exit, ^D, ^C, ^, ^Z, ^Q, ^^,    *
* quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, ~.,  *
* stop, end, ^]c, +++ ATH, disconnect, halt, abort, hangup, KJOB,      *
* ^X^X, :, kill -9 1, kill -1 $$, shutdown, init 0, Alt-F4,            *
* Alt-f-e, Ctrl-Alt-Del, Alt-SysRq-reisub, Stop-A, AltGr-NumLock, ...  *
* ... "Are you sure?" ... YES ... Phew ... I'm out                     *
************************************************************************
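
A quick way to watch the counts Paul describes across every array on a box.
A sketch, assuming the same sysfs layout as above:

  # Print each array's mismatch count and what md is currently doing.
  for d in /sys/block/md*/md; do
      md=$(basename "$(dirname "$d")")
      printf '%s: mismatch_cnt=%s sync_action=%s\n' \
          "$md" "$(cat "$d/mismatch_cnt")" "$(cat "$d/sync_action")"
  done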
 
12-01-2009, 12:36 PM
Thomas Harold

/etc/cron.weekly/99-raid-check

On 12/1/2009 8:05 AM, Paul Bijnens wrote:
> I have the problem on 2 servers, and both of those servers are also running
> a VMware image (very small, but constantly used) under VMware Server 2.
> Could it be that the .vmem file, or even the virtual disk, is constantly
> being written to, and the raid is constantly out of sync because of that?
> (All my other VMware servers have hardware raid cards, or are still on
> CentOS 4.)

... that fills me with dread. The whole point of RAID-1 is supposed to
be that data written to one drive also gets written to the other drive.
But yes, apparently you will see this on systems where a file is being
constantly written to.

http://bergs.biz/blog/2009/03/01/startled-by-component-device-mismatches-on-raid1-volumes/

http://www.issociate.de/board/goto/1675787/mismatch_cnt_worries.html
(this is a post from 2007 that discusses the issue)

http://forum.nginx.org/read.php?24,16699

Apparently, a non-zero number is common on RAID-1 and RAID-10 due to
various (harmless?) issues like aborted writes in a swap file.

http://www.centos.org/modules/newbb/viewtopic.php?topic_id=23164&forum=37

That thread also mentions that it can happen with VMware VM files.

And lastly, "please explain mismatch_cnt so I can sleep better at night".

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=405919

So my take on all of that is: if you see it on RAID-5 or RAID-6, you
should worry. But if it's on a RAID-1 or RAID-10 array with memory-mapped
files or swap files/partitions, it's less of a worry.
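
To apply Thomas's rule of thumb you need to know each array's personality.
A sketch: the fourth field of /proc/mdstat names it, as does the sysfs
level attribute (assuming it is exposed on this kernel):

  # Prints e.g. "md0 raid1", "md4 raid5". Note that extra status words
  # such as "(auto-read-only)" can shift the field positions.
  awk '/^md/ {print $1, $4}' /proc/mdstat

  # Equivalent per-array query via sysfs.
  cat /sys/block/md0/md/level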
 
12-01-2009, 01:00 PM
"jancio_wodnik@wp.pl"

/etc/cron.weekly/99-raid-check

On 2009-12-01 11:06, RedShift wrote:
> Jancio Wodnik wrote:
>> [...]
>> mismatch_cnt is never 0 at the weekend. What is really going on? My 1TB
>> raid1 disks are rebuilt every weekend.
>>
> They aren't being rebuilt; they are being checked to verify that the data
> on the RAID disks is consistent. [...] If you do a repair and then a
> check, the mismatch_cnt should reset to 0.
>
Hi. Nope, they are rebuilt: /dev/md0 -> /boot and /dev/md1 -> /home, and
from the log:

/etc/cron.weekly/99-raid-check:

WARNING: mismatch_cnt is not 0 on /dev/md0
WARNING: mismatch_cnt is not 0 on /dev/md1

And from dmesg:

md: syncing RAID array md7
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000
KB/sec) for reconstruction.
md: using 128k window, over a total of 976759936 blocks.
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000
KB/sec) for reconstruction.
md: using 128k window, over a total of 104320 blocks.
md: delaying resync of md1 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md3 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md1 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md5 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md1 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md3 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md6 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md3 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md1 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md5 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md8 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md5 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md1 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md3 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md6 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md4 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md6 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md3 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md1 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md5 until md0 has finished resync (they share one
or more physical units)
md: delaying resync of md8 until md0 has finished resync (they share one
or more physical units)
md: md0: sync done.
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sda1
disk 1, wo:0, o:1, dev:sdb1
md: delaying resync of md8 until md4 has finished resync (they share one
or more physical units)
md: delaying resync of md5 until md6 has finished resync (they share one
or more physical units)
md: delaying resync of md1 until md3 has finished resync (they share one
or more physical units)
md: delaying resync of md3 until md5 has finished resync (they share one
or more physical units)
md: delaying resync of md6 until md8 has finished resync (they share one
or more physical units)
md: syncing RAID array md4
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000
KB/sec) for reconstruction.
md: using 128k window, over a total of 4096448 blocks.
md: md4: sync done.
md: delaying resync of md6 until md8 has finished resync (they share one
or more physical units)
md: delaying resync of md3 until md5 has finished resync (they share one
or more physical units)
md: delaying resync of md1 until md3 has finished resync (they share one
or more physical units)
md: delaying resync of md5 until md6 has finished resync (they share one
or more physical units)
md: syncing RAID array md8
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000
KB/sec) for reconstruction.
md: using 128k window, over a total of 929914368 blocks.
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sda6
disk 1, wo:0, o:1, dev:sdb6
md: md7: sync done.
md: delaying resync of md5 until md6 has finished resync (they share one
or more physical units)
md: delaying resync of md1 until md3 has finished resync (they share one
or more physical units)
md: delaying resync of md3 until md5 has finished resync (they share one
or more physical units)
md: delaying resync of md6 until md8 has finished resync (they share one
or more physical units)
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sdc1
disk 1, wo:0, o:1, dev:sdd1
md: md8: sync done.
md: syncing RAID array md6
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000
KB/sec) for reconstruction.
md: using 128k window, over a total of 7823552 blocks.
md: delaying resync of md5 until md6 has finished resync (they share one
or more physical units)
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sda9
disk 1, wo:0, o:1, dev:sdb9
md: delaying resync of md1 until md3 has finished resync (they share one
or more physical units)
md: delaying resync of md3 until md5 has finished resync (they share one
or more physical units)
md: md6: sync done.
md: delaying resync of md3 until md5 has finished resync (they share one
or more physical units)
md: syncing RAID array md5
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000
KB/sec) for reconstruction.
RAID1 conf printout:
md: using 128k window, over a total of 2048192 blocks.
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sda8
md: delaying resync of md1 until md3 has finished resync (they share one
or more physical units)
disk 1, wo:0, o:1, dev:sdb8
md: md5: sync done.
md: delaying resync of md1 until md3 has finished resync (they share one
or more physical units)
md: syncing RAID array md3
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000
KB/sec) for reconstruction.
md: using 128k window, over a total of 8193024 blocks.
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sda7
disk 1, wo:0, o:1, dev:sdb7
md: md3: sync done.
md: syncing RAID array md1
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sda5
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000
KB/sec) for reconstruction.
md: using 128k window, over a total of 8193024 blocks.
disk 1, wo:0, o:1, dev:sdb5
md: md1: sync done.
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:sda3
disk 1, wo:0, o:1, dev:sdb3
CIFS VFS: cifs_mount failed w/return code = -6
CIFS VFS: cifs_mount failed w/return code = -6
SELinux: initialized (dev cifs, type cifs), uses genfs_contexts
[... a second, near-identical check cycle over md0-md8 followed ...]

This is some kind of madness!

Jancio_Wodnik




>> Has anybody filed this in bugzilla?
>>
> I don't think so; this is a feature, not a bug... And as long as it's
> shipped by upstream, it will be shipped with CentOS.
>
> Best regards,
>
> Glenn

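
A note on the log above: it was produced by the weekly check, yet it reads
like a resync, because these kernels print the same "md: syncing RAID array"
lines for a requested check as for a real rebuild. While a pass is running,
sysfs tells the two apart; a sketch, assuming md0:

  # "check"  -> the weekly read-only verify (no data is rewritten)
  # "repair" -> inconsistent blocks are being rewritten
  # "resync"/"recover" -> a genuine rebuild is in progress
  cat /sys/block/md0/md/sync_action

  # The /proc/mdstat progress line should likewise say "check" rather
  # than "resync" during the cron-driven pass.
  grep -A 1 '^md0' /proc/mdstat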
 
12-01-2009, 01:51 PM
Farkas Levente

/etc/cron.weekly/99-raid-check

On 12/01/2009 02:36 PM, Thomas Harold wrote:
> So my take on all of that is, if you see it on RAID-5 or RAID-6, you
> should worry. But if it's on an array with memory mapped files or swap
> files/partitions that is RAID-1 or RAID-10, it's less of a worry.

but then does /etc/cron.weekly/99-raid-check need to rebuild all of my
terabyte raid1 arrays every weekend? if not, then imho it's a bug :-(

--
Levente "Si vis pacem para bellum!"
 
12-01-2009, 02:09 PM
Ross Walker

/etc/cron.weekly/99-raid-check

On Dec 1, 2009, at 9:51 AM, Farkas Levente <lfarkas@lfarkas.org> wrote:

> On 12/01/2009 02:36 PM, Thomas Harold wrote:
>> So my take on all of that is: if you see it on RAID-5 or RAID-6, you
>> should worry. But if it's on a RAID-1 or RAID-10 array with
>> memory-mapped files or swap files/partitions, it's less of a worry.
>
> but then does /etc/cron.weekly/99-raid-check need to rebuild all of my
> terabyte raid1 arrays every weekend? if not, then imho it's a bug :-(

I agree. I think the real problem is that Linux MD RAID doesn't
quiesce the arrays when checking the mismatch_cnt, so for RAID1/10
arrays you will see transactions committed to one side but not yet
to the other. For RAID5/6 the whole stripe with parity must be
committed atomically, so there should never be a mismatch_cnt.

There should be a way to specify arrays to be skipped during the check
until the real problem of quiescing the arrays is fixed. Or make the
RAID1/10 write transactions atomic like RAID5/6 (which they should
be, in my opinion, though that lowers write performance).

Maybe touch a file with the name of the array to skip in a particular
directory somewhere, and have the script check that directory for
arrays to skip?

-Ross

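
One way Ross's touch-a-file idea could look. This is only a sketch: the
/etc/raid-check.skip directory and the loop below are hypothetical, not part
of the shipped 99-raid-check script:

  # Hypothetical skip mechanism: "touch /etc/raid-check.skip/md1"
  # would exclude md1 from the weekly pass.
  SKIP_DIR=/etc/raid-check.skip
  for dev in /sys/block/md*; do
      md=$(basename "$dev")
      [ -e "$SKIP_DIR/$md" ] && continue   # admin flagged this array: skip it
      echo check > "$dev/md/sync_action"
  done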
 
12-01-2009, 02:23 PM
Arturas Skauronas

/etc/cron.weekly/99-raid-check

On Tue, Dec 1, 2009 at 5:09 PM, Ross Walker <rswwalker@gmail.com> wrote:
> There should be a way to specify arrays to be skipped during the check
> until the real problem of quiescing the arrays is fixed. Or make the
> RAID1/10 write transactions atomic like RAID5/6 (which they should
> be, in my opinion, though that lowers write performance).
> -Ross

You can already do this: edit /etc/sysconfig/raid-check.
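
For example, the SKIP_DEVS variable mentioned at the top of this thread lives
in that file. A sketch; the exact set of supported variables depends on the
raid-check version shipped:

  # /etc/sysconfig/raid-check
  # Space-separated md devices the weekly check should leave alone.
  SKIP_DEVS="md0 md1"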
 
