10-12-2008, 01:00 PM
Alexander Puchmayr

Kernel crash - howto find out what happened?

On Sunday, 12 October 2008, Alan McKinnon wrote:
> On Sunday 12 October 2008 13:12:20 Alexander Puchmayr wrote:
> > > If it's a kernel panic you actually get debugging information on the
> > > console. It's just hidden "behind" the X server. Maybe you can
> > > reproduce the problem working without X (If you can do your work
> > > purely from the VTs)
> >
> > I've tried, but unfortunately the X driver on my laptop (i965) also
> > seems to have stability problems: after about an hour it froze using
> > 100% CPU time and could not be killed (neither kill nor kill -9
> > worked). I guess it didn't wake up from DPMS :-(
>
> Here's a thought: if you have a spare machine, you could ssh in to your
> desktop and continue to work normally. The ssh session would be tailing
> an appropriate log, so even if the desktop goes south there's a good
> chance the error log is visible
>
> For something more persistent, you could try temporarily sending all logs
> to a remote log server. Remote logging is quite efficient; I usually find
> the only thing that gets in its way is a complete instant kernel halt
> that brings the whole machine down without warning - this is extremely
> rare on production kernels

I really doubt that this works: the logger does not get the chance to write
anything once the kernel has crashed, neither to a local disk nor to a remote
host. This seems to be what you called the "instant kernel halt", and I have
the luck to be dealing with one of these rare cases :-(

But to give it a chance, I'm running a "cat /proc/kmsg" on the desktop,
started via ssh as you suggested.

Alex
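
A minimal sketch of the "cat /proc/kmsg over ssh" idea described above, with a timestamp added to each line so the last message can be matched against the time of the freeze. The host name "desktop" and the log file name are placeholders, not taken from the thread. Reading /proc/kmsg normally needs root, and if a syslog/klogd daemon is reading the same file, the two readers may each see only part of the messages.

```shell
#!/bin/sh
# Prefix every line read from stdin with the local wall-clock time of the
# machine running this script (i.e. the one that survives the crash).
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Hypothetical usage, run on the spare machine ("desktop" and the log
# file name are assumptions):
#   ssh root@desktop 'cat /proc/kmsg' | stamp_lines | tee -a desktop-kmsg.log
```

Because the log file lives on the spare machine, whatever made it across the network before the hang should still be there after the desktop is reset.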
 
10-13-2008, 03:09 PM
"Duane Griffin"

Kernel crash - howto find out what happened?

2008/10/12 Alexander Puchmayr <alexander.puchmayr@linznet.at>:
> My Gentoo system (an amd64@4400+, 2GB RAM, nforce4 chipset)
> worked fine for nearly two years, but now it frequently freezes, sometimes
> (not always) with the Scroll Lock and Caps Lock LEDs blinking.

If you have another machine lying around, try setting up netconsole
and/or serial console logging. They should catch any dying messages
from your kernel. Blinking LEDs indicate a panic, which means you
should get a message in those cases, at least.

Using a serial console is the easiest and most reliable way, but it
requires a serial cable. Netconsole only needs Ethernet but isn't as
reliable. Take a look at Documentation/serial-console.txt and
Documentation/networking/netconsole.txt under your kernel source
directory for more info.

Cheers,
Duane.

--
"I never could learn to drink that blood and call it wine" - Bob Dylan
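
As a rough illustration of the netconsole suggestion above, the helper below assembles the parameter string in the format given in Documentation/networking/netconsole.txt. The function name, the IP addresses and the interface name are made up for the example; 6665 and 6666 are the default source and target ports from that document. Leaving the target MAC address empty should make the kernel fall back to broadcast.

```shell
#!/bin/sh
# Build a netconsole= parameter string in the documented format:
#   netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# Arguments: src-ip dev tgt-ip [tgt-mac] [tgt-port]
netconsole_param() {
    src_ip=$1; dev=$2; tgt_ip=$3; tgt_mac=${4:-}; tgt_port=${5:-6666}
    printf 'netconsole=6665@%s/%s,%s@%s/%s\n' \
        "$src_ip" "$dev" "$tgt_port" "$tgt_ip" "$tgt_mac"
}

# Example with invented addresses: crashing box 192.168.0.10 on eth0,
# log receiver 192.168.0.20.
netconsole_param 192.168.0.10 eth0 192.168.0.20

# Hypothetical usage on the crashing box (as root):
#   modprobe netconsole "$(netconsole_param 192.168.0.10 eth0 192.168.0.20)"
# and on the receiver, something like (netcat option syntax varies):
#   netcat -u -l -p 6666
# For a serial console, the kernel command line would instead carry
# something like: console=ttyS0,115200 console=tty0
```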
 
10-13-2008, 11:30 PM
"Daniel da Veiga"

Kernel crash - howto find out what happened?

On Sun, Oct 12, 2008 at 07:08, Alexander Puchmayr
<alexander.puchmayr@linznet.at> wrote:
> Hi there!
>
> My Gentoo system (an amd64@4400+, 2GB RAM, nforce4 chipset) worked fine for
> nearly two years, but now it frequently freezes, sometimes (not always)
> with the Scroll Lock and Caps Lock LEDs blinking.
>
> Since I'm using the box as a desktop, I have only a frozen X server and no
> possibility to switch to a console (maybe there's some hint about what
> happened?).
>
> How do I find out what happened and why it crashed? Modern systems have
> MCE logs, but how do I read them in this case? After a reboot, all
> information seems to be gone, since mcelog is always empty.
>
> I assume there's a hardware problem; I already tested the RAM with
> memtest86, but it found no errors.
>

I had one of these freezes today.
I simply killed X using CTRL+SYSREQ+K and got back a console with error messages.

Have you tried the SYSREQ keys?

--
Daniel da Veiga
 
10-14-2008, 04:31 PM
Alexander Puchmayr

Kernel crash - howto find out what happened?

On Tuesday, 14 October 2008, Daniel da Veiga wrote:
> On Sun, Oct 12, 2008 at 07:08, Alexander Puchmayr
> <alexander.puchmayr@linznet.at> wrote:
> > Hi there!
> >
> > My Gentoo system (an amd64@4400+, 2GB RAM, nforce4 chipset) worked fine
> > for nearly two years, but now it frequently freezes, sometimes (not
> > always) with the Scroll Lock and Caps Lock LEDs blinking.
> >
> > Since I'm using the box as a desktop, I have only a frozen X server and
> > no possibility to switch to a console (maybe there's some hint about
> > what happened?).
> >
> > How do I find out what happened and why it crashed? Modern systems have
> > MCE logs, but how do I read them in this case? After a reboot, all
> > information seems to be gone, since mcelog is always empty.
> >
> > I assume there's a hardware problem; I already tested the RAM with
> > memtest86, but it found no errors.
>
> I had one of these freezes today.
> I simply killed X using CTRL+SYSREQ+K and got back a console with error
> messages.
>
> Have you tried the SYSREQ keys?

How does this work? I've tried it, but I couldn't get it working at all.
AFAIK, the first step is to compile CONFIG_MAGIC_SYSRQ into the kernel.
Then, make sure there's a "1" in /proc/sys/kernel/sysrq; well, there is.
/usr/src/linux/Documentation/sysrq.txt says to press "ALT-SysRq-<command key>".
I've tried it with SysRq=PrintScreen and cmd='h' for help, but nothing
happens, even under normal conditions. What did I do wrong?

Alex
 
10-14-2008, 05:09 PM
Alex Schuster

Kernel crash - howto find out what happened?

Alexander Puchmayr writes:

> On Tuesday, 14 October 2008, Daniel da Veiga wrote:
> >
> > I had one of these freezes today.
> > I simply killed X using CTRL+SYSREQ+K and got back a console with error
> > messages.
> >
> > Have you tried the SYSREQ keys?
>
> How does this work? I've tried it, but I couldn't get it working at all.
> AFAIK, the first step is to compile CONFIG_MAGIC_SYSRQ into the kernel.
> Then, make sure there's a "1" in /proc/sys/kernel/sysrq; well, there is.
> /usr/src/linux/Documentation/sysrq.txt says to press "ALT-SysRq-<command
> key>". I've tried it with SysRq=PrintScreen and cmd='h' for help,
> but nothing happens, even under normal conditions. What did I do
> wrong?

Try a key other than 'h'. The space key will show a little help, which is
probably what you expected to see with 'h'. Oh, and you need to be on a text
console (ctrl-alt-f1) to get visible output.

http://en.wikipedia.org/wiki/Magic_SysRq_key

Wonko
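
The SysRq checks discussed above could be scripted roughly as follows. This is only a sketch: the function name is invented, and the optional path argument exists purely so the function can be tried on an ordinary file. According to Documentation/sysrq.txt, 0 disables SysRq, 1 enables all functions, and larger values are a bitmask of allowed functions.

```shell
#!/bin/sh
# Report whether magic SysRq is enabled, based on the sysctl value.
# By default this reads the real /proc entry.
sysrq_status() {
    file=${1:-/proc/sys/kernel/sysrq}
    if [ ! -r "$file" ]; then
        echo "unavailable (CONFIG_MAGIC_SYSRQ not built in?)"
        return 1
    fi
    value=$(cat "$file")
    case $value in
        0) echo "disabled" ;;
        1) echo "all functions enabled" ;;
        *) echo "partially enabled (bitmask $value)" ;;
    esac
}

sysrq_status || true

# To exercise the kernel side without the keyboard (as root):
#   echo h > /proc/sysrq-trigger
#   dmesg | tail
```

If the help text shows up in dmesg after writing to /proc/sysrq-trigger, the kernel side is presumably fine and the problem is more likely the key combination or the keyboard itself.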
 
10-19-2008, 09:58 AM
Alexander Puchmayr

Kernel crash - howto find out what happened?

Hi there!

As my system froze again just now, I tried to reproduce it using some of
the hints given to me in this thread, and made the following
observations:

* The system freezes under heavy I/O on my SATA hard disks, especially when
copying MPEG files (>2GB) from one disk to another.

* A "cat /proc/kmsg" started via ssh from another machine showed these as
its last lines:

<4>ata6: timeout waiting for ADMA IDLE, stat=0x440
<4>ata6: timeout waiting for ADMA LEGACY, stat=0x440

* SysRq does not work at all (why?? I configured it identically to my
notebook; it works on the notebook but not on the desktop. There is simply
no reaction when pressing alt-sysrq-something, even under normal conditions.)

The SATA controller is an NVIDIA one (onboard on my nforce-based mainboard),
driven by the sata_nv driver (the one from the kernel; no proprietary NVIDIA
chipset/SATA driver is installed). The kernel in question is
gentoo-2.6.24-r8; I'll try an upgrade to the latest stable Gentoo kernel.

Thanks to all who gave suggestions
Alex
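
For reading raw /proc/kmsg output like the two lines above: the number in angle brackets is the kernel log level. A small sketch of decoding it (the function name is made up for illustration):

```shell
#!/bin/sh
# Translate the <N> prefix of a raw /proc/kmsg line into the kernel log
# level name (0 = KERN_EMERG ... 7 = KERN_DEBUG).
kmsg_level() {
    case $1 in
        '<0>'*) echo emerg ;;
        '<1>'*) echo alert ;;
        '<2>'*) echo crit ;;
        '<3>'*) echo err ;;
        '<4>'*) echo warning ;;
        '<5>'*) echo notice ;;
        '<6>'*) echo info ;;
        '<7>'*) echo debug ;;
        *)      echo unknown ;;
    esac
}

kmsg_level '<4>ata6: timeout waiting for ADMA IDLE, stat=0x440'
```

So the ADMA timeouts were logged at warning level, not as a panic or oops.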
 
