kjournald blocked in D state
I have a system on which kjournald becomes blocked in D state quite often.
Looking at a core file, we have 5 mounted ext3 filesystems:

crash> mount
VFSMOUNT     SUPERBLK     TYPE         DEVNAME            DIRNAME
10037e07b00  10037e4ec00  rootfs       rootfs             /
10037e07ec0  10037e4e400  proc         /proc              /proc
10037e07d40  102188abc00  tmpfs        none               /dev
10037e07e00  102188b2400  ext3         /dev/root          /
10037e07200  102188abc00  tmpfs        none               /dev
10037e07140  10037e4e400  proc         /proc              /proc
1021652bc00  102188b1c00  usbfs        /proc/bus/usb      /proc/bus/usb
1021652bf00  10037e4c400  sysfs        /sys               /sys
1021652bb40  10006967400  devpts       devpts             /dev/pts
1021652b180  100dfeda400  ext3         /dev/cciss/c0d0p1  /boot
1021652b240  100dfecb800  ext3         /dev/sys/home      /home
1021652b300  100dfecbc00  ext3         /dev/sys/tmp       /tmp
1021652b3c0  100dfeda800  ext3         /dev/sys/var       /var
1021652b480  100dfedac00  tmpfs        tmpfs              /dev/shm
1021652bcc0  100dfecb400  binfmt_misc  none               /proc/sys/fs/binfmt_misc

So we have 5 corresponding journal threads:

crash> ps | grep kjournald
    626      1   2  10218109030  IN   0.0       0      0  [kjournald]
   3015      1   0  102168f2030  IN   0.0       0      0  [kjournald]
   3016      1   1  102168f27f0  UN   0.0       0      0  [kjournald]
   3017      1   1  1021837b030  IN   0.0       0      0  [kjournald]
   3018      1   7  10216fd0030  UN   0.0       0      0  [kjournald]

2 are in the UNINTERRUPTIBLE state.
But only PID 3018 shows __wait_on_buffer in its stack:

crash> bt -f 3018
PID: 3018   TASK: 10216fd0030   CPU: 7   COMMAND: "kjournald"
-----snip-----
 #2 [10215a83b30] __wait_on_buffer at ffffffff8017d504
    10215a83b38: 000001005fa12ce8 0000000000000000
    10215a83b48: 0000010216fd0030 ffffffff8017d38a
    10215a83b58: 0000010215a83b88 0000010215a83b88
    10215a83b68: 000001005fa12ce8 0000000000000000
    10215a83b78: 0000010216fd0030 ffffffff8017d38a
    10215a83b88: ffffffff804ac808 ffffffff804ac808
    10215a83b98: 000001005fa12ce8 0000000000000001
    10215a83ba8: 000001004f4e90e0 ffffffffa0080ffe
-----snip-----

I'm not a crash expert, so I then looked at the last address pushed onto its stack and traced it down to the inode semaphore:

crash> struct file.f_dentry 000001005fa12ce8
  f_dentry = 0x1021f4e5510,
crash> struct dentry.d_inode 0x1021f4e5510
  d_inode = 0x100c95c17c0,
crash> struct inode.i_sem 0x100c95c17c0
  i_sem = {
    count = {
      counter = -916711312      <---- This looks wrong
    },
    sleepers = 256,
    wait = {
      lock = {
        lock = 497690456,
        magic = 258
      },
      task_list = {
        next = 0x100000000000,  <---- This also looks wrong
        prev = 0x30f75c3
      }
    }
  },

At this point I'm not sure how to continue, or even whether I went down the right path. From this info, can anyone tell what's wrong? Or did I take the wrong path to reach this conclusion?

-- mikem

In this case /home is a heavily accessed filesystem.

_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
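One way to sanity-check the i_sem output above is to look at the raw memory behind the decoded fields. The sketch below is only an illustration in Python, not a crash command. It assumes a little-endian x86_64 machine (suggested by the ffffffff8... text addresses) and assumes that count/sleepers and lock/magic are adjacent 32-bit fields, as the struct output implies. The helper name join_le32 is made up for this example.

```python
import struct

def join_le32(low: int, high: int) -> int:
    """Reinterpret two adjacent 32-bit fields (low word first in memory)
    as a single little-endian 64-bit word."""
    # crash prints the first field as a signed int, so pack it as signed.
    raw = struct.pack("<iI", low, high)
    return struct.unpack("<Q", raw)[0]

# Values copied from the crash output in the post.
inode = 0x100c95c17c0
count_sleepers = join_le32(-916711312, 256)   # i_sem.count + i_sem.sleepers
lock_magic = join_le32(497690456, 258)        # wait.lock.lock + wait.lock.magic

print(hex(count_sleepers))          # 0x100c95c1870
print(hex(count_sleepers - inode))  # 0xb0
print(hex(lock_magic))              # 0x1021daa2758
```

Read this way, count and sleepers together form 0x100c95c1870, which is the inode address plus 0xb0, and lock and magic together form 0x1021daa2758, which resembles the other kernel addresses in the dump. That is what pointer fields would look like, not a semaphore count and a spinlock. It does not say what is wrong by itself, but it may mean that the value taken from the stack was not really a struct file, or that the structure layout crash is using does not match the running kernel.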
On Thu, 17 Jun 2010 at 11:08, Mike Miller wrote:
> I have a system on which kjournald becomes blocked in D state quite often.

Did this happen "just now", or after a kernel upgrade? Which kernel are you using? Do other systems (with the same kernel?) show similar behaviour?

Christian.
--
BOFH excuse #414:

tachyon emissions overloading the system
On Sun, Jun 20, 2010 at 12:44:37AM -0700, Christian Kujau wrote:
> On Thu, 17 Jun 2010 at 11:08, Mike Miller wrote:
> > I have a system on which kjournald becomes blocked in D state quite often.
>
> Did this happen "just now", or after a kernel upgrade? Which kernel are
> you using? Do other systems (with the same kernel?) show similar
> behaviour?

The kernel is a 2.6.9 variant. According to the user, 2.6.9-89 exhibits the problem; kernel 2.6.9-78 does not appear to. Aside from that, I've seen the same symptoms reported against 2.6.18 and 2.6.32 kernels.

It's not easy to reproduce. The customer is using clusters of 50+ nodes, all using internal storage. AFAIK, they are not sharing filesystems between nodes. The driver differences between the 2 kernels are minimal, with nothing in the main code path.

-- mikem

> Christian.
> --
> BOFH excuse #414:
>
> tachyon emissions overloading the system