FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > Debian > Debian User

 
 
LinkBack Thread Tools
 
Old 08-07-2012, 11:54 PM
Claudius Hubig
 
Default Kernel Oops during/after rsync

Hello,

recently, I _sometimes_ encounter a kernel oops during a backup run.
This did not occur until a few days ago, furthermore, all the crashes
so far (a backup takes place every half hour and I’ve only seen three
during the last five days) occurred while I was playing Patrician III
with Wine. I don’t know whether Wine/Patrician III has anything to do
with that; if anything, I suspect temperature issues, but the same
machine survives five hours of video encoding just fine (though the
CPU frequency is sometimes automatically lowered due to high
temperature).

The backup script, attached below, consists of the following stages:

a) log that the backup started
b) rsync some data
c) log that the backup finished
d) call sync
e) log that the sync finished

a) and c) are logged:

Aug 7 23:30:01 ares Backup_Alix[638]: Backup started.
Aug 7 23:30:05 ares Backup_Alix[648]: Backup finished.

but not e). At the same time, the following is
written to kern.log, the screen shows part of that message and the
computer powers off after a while.

Aug 7 23:30:04 ares kernel: [75763.117797] BUG: unable to handle kernel paging request at ffff8812e5463620
Aug 7 23:30:04 ares kernel: [75763.117844] IP: [<ffffffff810d6985>] __d_lookup_rcu+0x136/0x156
Aug 7 23:30:04 ares kernel: [75763.117880] PGD 1a0c063 PUD 0
Aug 7 23:30:04 ares kernel: [75763.117901] Oops: 0000 [#1] SMP
Aug 7 23:30:04 ares kernel: [75763.117924] CPU 3
Aug 7 23:30:04 ares kernel: [75763.117934] Modules linked in: e1000e tp_smapi(O) thinkpad_ec(O) iwlwifi [last unloaded: e1000e]
Aug 7 23:30:04 ares kernel: [75763.117987]
Aug 7 23:30:04 ares kernel: [75763.117996] Pid: 641, comm: rsync Tainted: G O 3.4.1.a2017.1 #1 LENOVO 291239G/291239G
Aug 7 23:30:04 ares kernel: [75763.118042] RIP: 0010:[<ffffffff810d6985>] [<ffffffff810d6985>] __d_lookup_rcu+0x136/0x156
Aug 7 23:30:04 ares kernel: [75763.118082] RSP: 0018:ffff880143349c48 EFLAGS: 00010282
Aug 7 23:30:04 ares kernel: [75763.118107] RAX: 0000000000000008 RBX: ffff8812e5463608 RCX: 0000000000000014
Aug 7 23:30:04 ares kernel: [75763.118139] RDX: 0000000000000f40 RSI: ffffc90000002000 RDI: ffff88011c927a40
Aug 7 23:30:04 ares kernel: [75763.118171] RBP: ffff88011c927a40 R08: 0000000000000008 R09: 2f2f2f2f2f2f2f2f
Aug 7 23:30:04 ares kernel: [75763.118203] R10: 00000000ef94b111 R11: ffffffff8d9093ff R12: ffff880143349e18
Aug 7 23:30:04 ares kernel: [75763.118235] R13: ffff880143349d70 R14: ffff880143349d70 R15: ffff880143349e18
Aug 7 23:30:04 ares kernel: [75763.118268] FS: 00007fa33fd9f700(0000) GS:ffff88023bd80000(0000) knlGS:0000000000000000
Aug 7 23:30:04 ares kernel: [75763.118303] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 7 23:30:04 ares kernel: [75763.118330] CR2: ffff8812e5463620 CR3: 00000001427fd000 CR4: 00000000000007e0
Aug 7 23:30:04 ares kernel: [75763.118362] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 7 23:30:04 ares kernel: [75763.118393] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Aug 7 23:30:04 ares kernel: [75763.118425] Process rsync (pid: 641, threadinfo ffff880143348000, task ffff88023073d980)
Aug 7 23:30:04 ares kernel: [75763.118460] Stack:
Aug 7 23:30:04 ares kernel: [75763.118471] ffff880115b097d0 0000000000000081 ffff8801e8ce6015 ffff88023073d980
Aug 7 23:30:04 ares kernel: [75763.118511] ffff88023073d980 ffff8801e8ce6015 ffff880143349cf4 0000000000000008
Aug 7 23:30:04 ares kernel: [75763.118551] ffff880115b097d0 ffff880143349e08 ffff880143349d48 ffff88011c927a40
Aug 7 23:30:04 ares kernel: [75763.118590] Call Trace:
Aug 7 23:30:04 ares kernel: [75763.118606] [<ffffffff810cdac8>] ? do_lookup+0x41/0x291
Aug 7 23:30:04 ares kernel: [75763.118632] [<ffffffff810cf160>] ? path_lookupat+0xe0/0x670
Aug 7 23:30:04 ares kernel: [75763.118660] [<ffffffff810e709c>] ? __getblk+0x24/0x245
Aug 7 23:30:04 ares kernel: [75763.118684] [<ffffffff810cf70e>] ? do_path_lookup+0x1e/0x9a
Aug 7 23:30:04 ares kernel: [75763.118711] [<ffffffff810cd876>] ? getname_flags+0x148/0x1e2
Aug 7 23:30:04 ares kernel: [75763.118738] [<ffffffff810d0fcb>] ? user_path_at_empty+0x47/0x7b
Aug 7 23:30:04 ares kernel: [75763.118768] [<ffffffff81158eb2>] ? jbd2_journal_stop+0x1f0/0x202
Aug 7 23:30:04 ares kernel: [75763.118797] [<ffffffff8113a266>] ? __ext4_journal_stop+0x6e/0x74
Aug 7 23:30:04 ares kernel: [75763.118825] [<ffffffff81136aa0>] ? ext4_link+0x121/0x156
Aug 7 23:30:04 ares kernel: [75763.118851] [<ffffffff810c86e7>] ? vfs_fstatat+0x32/0x60
Aug 7 23:30:04 ares kernel: [75763.118876] [<ffffffff810ce9e8>] ? vfs_link+0x161/0x17f
Aug 7 23:30:04 ares kernel: [75763.118901] [<ffffffff810c883c>] ? sys_newlstat+0x12/0x2b
Aug 7 23:30:04 ares kernel: [75763.118928] [<ffffffff810db57d>] ? mntput_no_expire+0x10/0xf4
Aug 7 23:30:04 ares kernel: [75763.118955] [<ffffffff810d1672>] ? sys_linkat+0x1b8/0x1c7
Aug 7 23:30:04 ares kernel: [75763.118983] [<ffffffff8159cca2>] ? system_call_fastpath+0x16/0x1b
Aug 7 23:30:04 ares kernel: [75763.119012] Code: 00 00 00 00 49 83 c8 ff 48 31 f2 49 d3 e0 49 f7 d0 49 85 d0 75 0d 48 8b 44 24 30 44 89 28 4d 89 1e eb 16 48 8b 1b 48 85 db 74 0b <44> 39 53 18 75 f2 e9 24 ff ff ff 45 31 ff 48 83 c4 48 4c 89 f8
Aug 7 23:30:04 ares kernel: [75763.119236] RIP [<ffffffff810d6985>] __d_lookup_rcu+0x136/0x156
Aug 7 23:30:04 ares kernel: [75763.119266] RSP <ffff880143349c48>
Aug 7 23:30:04 ares kernel: [75763.119282] CR2: ffff8812e5463620
Aug 7 23:30:05 ares kernel: [75763.550822] ---[ end trace f053761f106b0def ]---

The most curious thing here is that apparently, the kernel oops
occurred at 23:30:04, while the script was able to continue until
23:30:05 (when it logged that the backup finished). However, the
backup did not make it to the disk (i. e. neither of the directories
inc_alix-2012-08-07T23_30 (temporary) nor alix-2012-08-07T23_30
(final) exists).

Some data about the computer: 8 GB of RAM, Intel Core i5 520M;
self-compiled kernel 3.4.1, Debian Testing/Sid, rsync at version
3.0.9-3. Wifi, Bluetooth, UMTS or GPS have not been used during the
crashes, but the directory whereto rsync was supposed to store the
backup was located on an external hard drive with LUKS encryption.

Unfortunately, googling for anything from the above trace or just
kernel oopses in relation to rsync turned out unsuccessful, I would
hence be very thankful if someone could provide me with additional
pointers

Best regards,

Claudius

---- backup script: ----
#!/bin/bash

if [ -e /var/run/backup_alix ]; then
exit 1;
fi

if [ -e /tmp/disbackup ]; then
exit 2;
fi

# ok, this is certainly rather ugly...
if [ ! $(mount | grep -c "datapriv") -eq 1 ]; then
exit 3;
fi

logger -t Backup_Alix -i -p cron.info Backup started.

date=`date "+%Y-%m-%dT%H_%M"`

touch /var/run/backup

rsync -e "ssh -o VisualHostKey=no" --stats -aq --link-dest=/datapriv/backup/alix/current --delete --delete-excluded --exclude=/sys/* --exclude=/proc/* --exclude=/dev/* --exclude=/tmp/* alix.home.chubig.net:/ /datapriv/backup/alix/inc_alix-$date

mv /datapriv/backup/alix/inc_alix-$date /datapriv/backup/alix/alix-$date
rm -f /datapriv/backup/alix/current
ln -s /datapriv/backup/alix/alix-$date /datapriv/backup/alix/current

rm /var/run/backup
logger -t Backup_Alix -i -p cron.info Backup finished.
sync
logger -t Backup_Alix -i -p cron.info Sync finished.
---- :backup script ----
--
A board is the planck unit of boredom.
http://chubig.net telnet nightfall.org 4242
 
Old 08-08-2012, 03:20 PM
Camaleón
 
Default Kernel Oops during/after rsync

On Wed, 08 Aug 2012 01:54:11 +0200, Claudius Hubig wrote:

> recently, I _sometimes_ encounter a kernel oops during a backup run.
> This did not occur until a few days ago, furthermore, all the crashes so
> far (a backup takes place every half hour and I’ve only seen three
> during the last five days) occurred while I was playing Patrician III
> with Wine. I don’t know whether Wine/Patrician III has anything to do
> with that; if anything, I suspect temperature issues, but the same
> machine survives five hours of video encoding just fine (though the CPU
> frequency is sometimes automatically lowered due to high temperature).

Is the Patrician III a stable pattern for the crash? I mean, is it always
crashing when running it and it keeps stable when at the time the routine
runs the computer is idle?

> The backup script, attached below, consists of the following stages:
>
> a) log that the backup started
> b) rsync some data
> c) log that the backup finished
> d) call sync
> e) log that the sync finished
>
> a) and c) are logged:
>
> Aug 7 23:30:01 ares Backup_Alix[638]: Backup started. Aug 7 23:30:05
> ares Backup_Alix[648]: Backup finished.
>
> but not e). At the same time, the following is written to kern.log, the
> screen shows part of that message and the computer powers off after a
> while.
>
> Aug 7 23:30:04 ares kernel: [75763.117797] BUG: unable to handle kernel paging request at ffff8812e5463620
^^^

(...)

Can you reproduce the bug from a Debian kernel? If yes, you can report it
at the BTS; otherwise you can file a bug report upstream.

(...)

> Unfortunately, googling for anything from the above trace or just kernel
> oopses in relation to rsync turned out unsuccessful, I would hence be
> very thankful if someone could provide me with additional pointers

Google points to some user reports about a hardware problem (memory) but of
course, nothing conclusive.

Greetings,

--
Camaleón


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/jvu03f$kgo$11@dough.gmane.org
 
Old 08-08-2012, 03:40 PM
Claudius Hubig
 
Default Kernel Oops during/after rsync

Hello Camaleón,

Camaleón <noelamac@gmail.com> wrote:
> Is the Patrician III a stable pattern for the crash? I mean, is it always
> crashing when running it and it keeps stable when at the time the routine
> runs the computer is idle?

I am not entirely sure. So far, the computer crashed thrice during a
backup, and, at all times, Patrician III had been running. However,
I’ve been playing Patrician III for more than four weeks now (not all
that much to do at the moment…). Hence, I can say that, whenever the
computer crashed, Patrician III was running and a backup had just
finished, but there have been many more times when a backup succeeded
while Patrician III was running and the computer did not crash.

> Can you reproduce the bug from a Debian kernel? If yes, you can report it
> at the BTS; otherwise you can file a bug report upstream.

The kernel is compiled from Debian sources. I will see into running a
‘standard’ Debian kernel if the problem persists.

> Google points to some user reports about a hardware problem (memory) but of
> course, nothing conclusive.

I should have mentioned that running memtest86+ didn’t find any
problems :

Best regards & thank you very much for your reply,

Claudius
--
A board is the planck unit of boredom.
http://chubig.net telnet nightfall.org 4242
 

Thread Tools




All times are GMT. The time now is 12:34 PM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org