12-15-2008, 05:36 PM
"Aaron D. Johnson"

Bug#508820: linux-image-2.6.26-1-mckinley: MCA and panic after bringing up loopback interface on ia64

Package: linux-image-2.6.26-1-mckinley
Version: 2.6.26-11
Severity: grave
Justification: renders package unusable

Attempting to boot linux-image-2.6.26-1-mckinley (2.6.26-11) as an HP
Integrity Virtual Machines guest OS results in a panic during network
startup. Console messages leading up to this:

[...]
[16578579.202741] EXT3 FS on dm-1, internal journal
[16578579.204938] EXT3-fs: mounted filesystem with ordered data mode.
done.
Activating swapfile swap...[16578579.463160] Adding 4194272k swap on /dev/mapper/vg00-swap. Priority:-1 extents:1 across:4194272k
done.
Setting up networking....
Configuring network interfaces...[16578581.213754] e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[16578582.433694] NET: Registered protocol family 10
[16578582.437702] lo: Disabled Privacy Extensions
[16578582.580016] Entered OS MCA handler. PSP=fff2e12c cpu=1 monarch=1
[16578582.580019] All OS MCA slaves have reached rendezvous
[16578582.607085] mlogbuf_finish: printing switched to urgent mode, MCA/INIT might be dodgy or fail.
[16578582.611888] Delaying for 5 seconds...

**** Dumping Guest Image ****

**** Done with dump (9612Kbytes) ****


*** VM restarting ***
[end of console logs]

Hardware information:
The physical system is an HP Integrity rx3600 running HP-UX 11.23 0609
and Integrity VM version A.03.00. The VM has two CPUs, 3Gbytes of RAM,
a single SCSI DVD drive, two SCSI hard drives, and two "Intel Corporation
82540EM Gigabit Ethernet Controller" NICs.

lspci output:
00:00.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
00:01.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
00:03.0 ISA bridge: Intel Corporation 82372FB PIIX5 ISA (rev 01)
07:00.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller
07:01.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller

Integrity VM saved a core dump file which is available if desired.
Physical access to this machine may be available for developers in the
northern Colorado area (dannf, want to come and lay hands on?)

linux-image-2.6.26-1-mckinley(2.6.26-8) worked fine.

-- System Information:
Debian Release: lenny/sid
APT prefers testing
APT policy: (500, 'testing')
Architecture: ia64

Kernel: Linux 2.6.25-2-mckinley (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages linux-image-2.6.26-1-mckinley depends on:
ii debconf [debconf-2.0] 1.5.24 Debian configuration management sy
ii initramfs-tools [linux-initra 0.92j tools for generating an initramfs
ii module-init-tools 3.4-1 tools for managing Linux kernel mo

linux-image-2.6.26-1-mckinley recommends no packages.

Versions of packages linux-image-2.6.26-1-mckinley suggests:
ii elilo 3.8-1 Bootloader for systems using EFI-b
pn fdutils <none> (no description available)
pn linux-doc-2.6.26 <none> (no description available)

-- debconf information:
linux-image-2.6.26-1-mckinley/preinst/elilo-initrd-2.6.26-1-mckinley: true
linux-image-2.6.26-1-mckinley/postinst/depmod-error-2.6.26-1-mckinley: false
linux-image-2.6.26-1-mckinley/postinst/bootloader-test-error-2.6.26-1-mckinley:
* shared/kernel-image/really-run-bootloader: true
linux-image-2.6.26-1-mckinley/postinst/bootloader-error-2.6.26-1-mckinley:
linux-image-2.6.26-1-mckinley/preinst/abort-overwrite-2.6.26-1-mckinley:
linux-image-2.6.26-1-mckinley/postinst/create-kimage-link-2.6.26-1-mckinley: true
linux-image-2.6.26-1-mckinley/postinst/kimage-is-a-directory:
linux-image-2.6.26-1-mckinley/preinst/lilo-initrd-2.6.26-1-mckinley: true
* linux-image-2.6.26-1-mckinley/preinst/already-running-this-2.6.26-1-mckinley:
linux-image-2.6.26-1-mckinley/postinst/old-system-map-link-2.6.26-1-mckinley: true
linux-image-2.6.26-1-mckinley/postinst/depmod-error-initrd-2.6.26-1-mckinley: false
linux-image-2.6.26-1-mckinley/preinst/overwriting-modules-2.6.26-1-mckinley: true
linux-image-2.6.26-1-mckinley/postinst/old-initrd-link-2.6.26-1-mckinley: true
linux-image-2.6.26-1-mckinley/preinst/initrd-2.6.26-1-mckinley:
linux-image-2.6.26-1-mckinley/preinst/lilo-has-ramdisk:
linux-image-2.6.26-1-mckinley/prerm/removing-running-kernel-2.6.26-1-mckinley: true
linux-image-2.6.26-1-mckinley/postinst/old-dir-initrd-link-2.6.26-1-mckinley: true
linux-image-2.6.26-1-mckinley/preinst/bootloader-initrd-2.6.26-1-mckinley: true
linux-image-2.6.26-1-mckinley/preinst/abort-install-2.6.26-1-mckinley:
linux-image-2.6.26-1-mckinley/preinst/failed-to-move-modules-2.6.26-1-mckinley:
linux-image-2.6.26-1-mckinley/prerm/would-invalidate-boot-loader-2.6.26-1-mckinley: true



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
 
12-16-2008, 07:34 AM
dann frazier

Bug#508820: linux-image-2.6.26-1-mckinley: MCA and panic after bringing up loopback interface on ia64

On Mon, Dec 15, 2008 at 11:36:37AM -0700, Aaron D. Johnson wrote:
> Package: linux-image-2.6.26-1-mckinley
> Version: 2.6.26-11
> Severity: grave
> Justification: renders package unusable
>
> Attempting to boot linux-image-2.6.26-1-mckinley (2.6.26-11) as an HP
> Integrity Virtual Machines guest OS results in a panic during network
> startup. Console messages leading up to this:
>
> [...]
> [16578579.202741] EXT3 FS on dm-1, internal journal
> [16578579.204938] EXT3-fs: mounted filesystem with ordered data mode.
> done.
> Activating swapfile swap...[16578579.463160] Adding 4194272k swap on /dev/mapper/vg00-swap. Priority:-1 extents:1 across:4194272k
> done.
> Setting up networking....
> Configuring network interfaces...[16578581.213754] e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> [16578582.433694] NET: Registered protocol family 10
> [16578582.437702] lo: Disabled Privacy Extensions
> [16578582.580016] Entered OS MCA handler. PSP=fff2e12c cpu=1 monarch=1
> [16578582.580019] All OS MCA slaves have reached rendezvous
> [16578582.607085] mlogbuf_finish: printing switched to urgent mode, MCA/INIT might be dodgy or fail.
> [16578582.611888] Delaying for 5 seconds...

I've got a theory - can you search the /var/log/kern.log* files on
this guest for any Oops messages? Do you recall experiencing a hang
during your kernel upgrade? I'm wondering if there was an oops at the
time you upgraded your kernel package. Also, can you mount your efi
partition and capture the md5sums of the files under
/boot/efi/efi/debian?

If my theory is correct, you may be able to get back up and running by
booting an older kernel (if you have one), running 'elilo', then
booting back into the 2.6.26-11 kernel.

--
dann frazier




 
12-16-2008, 04:08 PM
"Aaron D. Johnson"

Bug#508820: linux-image-2.6.26-1-mckinley: MCA and panic after bringing up loopback interface on ia64

dann frazier writes:
> I've got a theory - can you search the /var/log/kern.log* files on
> this guest for any Oops messages?

No Oopses going back to 3 Dec:
ajohnso2@spielplatz:~$ sudo zgrep -i oops /var/log/kern.log*
ajohnso2@spielplatz:~$ sudo gzip -dc /var/log/kern.log.6.gz | head -n 1
Dec 3 06:26:12 spielplatz kernel: [15525239.819366] postgres(22142): floating-point assist fault at ip 40000000003de402, isr 0000040000000008
ajohnso2@spielplatz:~$

Countless floating-point assist fault messages, though. It seems that
PostgreSQL needs some help in this department.

> Do you recall experiencing a hang during your kernel upgrade?

I remember a hang on shutdown for some system during the last week,
but nothing during the kernel package upgrade proper.

> I'm wondering if there was an oops at the time you upgraded your
> kernel package. Also, can you mount your efi partition and capture
> the md5sums of the files under /boot/efi/efi/debian?

ajohnso2@spielplatz:~$ sudo mount -v -t vfat -o ro /dev/sda1 /mnt
ajohnso2@spielplatz:~$ md5sum /mnt/efi/debian/*
9fa2639fa5dca1521df76c7c254f4e04 /mnt/efi/debian/elilo.conf
5bec2375858e01c4590976f3fb479a3c /mnt/efi/debian/elilo.efi
f6d26c846defcbb6a255365b32205e69 /mnt/efi/debian/initrd.img
f43e07c02fff08489e5d1f60dc0046ae /mnt/efi/debian/initrd.img.old
35a0f1cd6e79fc7ffd93ca1dddb5df01 /mnt/efi/debian/readme.txt
384b24d661e30ca549569954ab9dc3ae /mnt/efi/debian/vmlinuz
67a9622f681abd91cc4710da8894b743 /mnt/efi/debian/vmlinuz.old
ajohnso2@spielplatz:~$

> If my theory is correct, you may be able to get back up and running
> by booting an older kernel (if you have one), running 'elilo', then
> booting back into the 2.6.26-11 kernel.

OK, so that worked. What change did re-running elilo make? Based on
the MD5sums, there are new initrd and vmlinuz files. Seems like
installing linux-image-2.6.26-1-mckinley should have done that in its
postinst script.
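
One way to confirm that the copies on the EFI partition have fallen behind is to compare checksums directly. The following is only a sketch: the function names are made up for illustration, and it assumes the kernel and initrd that elilo copies are reachable as `vmlinuz` and `initrd.img` in some source directory (on this system that would likely be the `/vmlinuz` and `/initrd.img` symlinks, but check /etc/elilo.conf).

```shell
#!/bin/sh
# Sketch only: function names and directory layout are assumptions.

# same_md5 A B: succeed only if both files are readable and have the same MD5 sum.
same_md5() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    sum_a=$(md5sum < "$1" | cut -d' ' -f1)
    sum_b=$(md5sum < "$2" | cut -d' ' -f1)
    [ "$sum_a" = "$sum_b" ]
}

# check_efi_copies SRC_DIR EFI_DIR: report whether the vmlinuz and initrd.img
# copies in EFI_DIR match the ones in SRC_DIR. Returns non-zero if any differ.
check_efi_copies() {
    src_dir=$1
    efi_dir=$2
    stale=0
    for name in vmlinuz initrd.img; do
        if same_md5 "$src_dir/$name" "$efi_dir/$name"; then
            echo "$name: in sync"
        else
            echo "$name: stale or missing"
            stale=1
        fi
    done
    return $stale
}

# Example (with the EFI partition mounted read-only on /mnt as above):
#   check_efi_copies / /mnt/efi/debian || echo "consider re-running elilo"
```

If either file is reported as stale, the firmware would still load the old kernel or initrd at the next boot.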

What happens to the poor user who doesn't know to re-run elilo? (Not
that I expect there are too many "poor users" running ia64 systems.)

Thanks.

- Aaron



 
12-16-2008, 04:50 PM
dann frazier

Bug#508820: linux-image-2.6.26-1-mckinley: MCA and panic after bringing up loopback interface on ia64

On Tue, Dec 16, 2008 at 10:08:59AM -0700, Aaron D. Johnson wrote:
> dann frazier writes:
> > I've got a theory - can you search the /var/log/kern.log* files on
> > this guest for any Oops messages?
>
> No Oopses going back to 3 Dec:
> ajohnso2@spielplatz:~$ sudo zgrep -i oops /var/log/kern.log*
> ajohnso2@spielplatz:~$ sudo gzip -dc /var/log/kern.log.6.gz | head -n 1
> Dec 3 06:26:12 spielplatz kernel: [15525239.819366] postgres(22142): floating-point assist fault at ip 40000000003de402, isr 0000040000000008
> ajohnso2@spielplatz:~$
>
> Countless floating-point assist fault messages, though. It seems that
> PostgreSQL needs some help in this department.
>
> > Do you recall experiencing a hang during your kernel upgrade?
>
> I remember a hang on shutdown for some system during the last week,
> but nothing during the kernel package upgrade proper.
>
> > I'm wondering if there was an oops at the time you upgraded your
> > kernel package. Also, can you mount your efi partition and capture
> > the md5sums of the files under /boot/efi/efi/debian?
>
> ajohnso2@spielplatz:~$ sudo mount -v -t vfat -o ro /dev/sda1 /mnt
> ajohnso2@spielplatz:~$ md5sum /mnt/efi/debian/*
> 9fa2639fa5dca1521df76c7c254f4e04 /mnt/efi/debian/elilo.conf
> 5bec2375858e01c4590976f3fb479a3c /mnt/efi/debian/elilo.efi
> f6d26c846defcbb6a255365b32205e69 /mnt/efi/debian/initrd.img
> f43e07c02fff08489e5d1f60dc0046ae /mnt/efi/debian/initrd.img.old
> 35a0f1cd6e79fc7ffd93ca1dddb5df01 /mnt/efi/debian/readme.txt
> 384b24d661e30ca549569954ab9dc3ae /mnt/efi/debian/vmlinuz
> 67a9622f681abd91cc4710da8894b743 /mnt/efi/debian/vmlinuz.old
> ajohnso2@spielplatz:~$
>
> > If my theory is correct, you may be able to get back up and running
> > by booting an older kernel (if you have one), running 'elilo', then
> > booting back into the 2.6.26-11 kernel.
>
> OK, so that worked. What change did re-running elilo make? Based on
> the MD5sums, there are new initrd and vmlinuz files. Seems like
> installing linux-image-2.6.26-1-mckinley should have done that in its
> postinst script.

Here's what I think happened:
- Running 2.6.26-8
- Upgraded to 2.6.26-11
  - unpacked 2.6.26-11
  - generated initramfs
  - called elilo
    - elilo loads modules it needs to mount EFI partition,
      but the modules available are now for 2.6.26-11 and
      are incompatible with 2.6.26-8.
    - system tries to mount efi partition and hangs due to
      incompatible modules - kernel/initrd in the efi partition
      is now out of date with respect to the files in /boot
- system boots 2.6.26-8 again
  - initramfs loads, works fine (still using 2.6.26-8 initramfs)
  - system mounts root
  - system starts loading modules from the root partition (which
    are now 2.6.26-11 modules), and does bad things.

The bug would therefore be that we created a kernel with the same
abiname that was actually incompatible with the modules from an
earlier release.
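
Because the abiname (and so `uname -r`) is identical for both builds, the mismatch is only visible in the Debian package version. A rough sketch of a check follows. It assumes /proc/version carries a "(Debian <version>)" tag, as Debian kernel builds of this era appear to, and the function names are invented for illustration.

```shell
#!/bin/sh
# Sketch only: assumes a "(Debian X)" tag in /proc/version.

# debian_version_from_banner: read a /proc/version-style line on stdin and
# print the package version found inside "(Debian ...)", if any.
debian_version_from_banner() {
    sed -n 's/.*(Debian \([^)]*\)).*/\1/p'
}

# versions_match RUNNING INSTALLED: succeed only if both are non-empty and equal.
versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# report_kernel_module_skew: compare the running kernel build with the
# installed package that owns the modules under /lib/modules/$(uname -r).
report_kernel_module_skew() {
    running=$(debian_version_from_banner < /proc/version)
    installed=$(dpkg-query -W -f='${Version}' "linux-image-$(uname -r)")
    if versions_match "$running" "$installed"; then
        echo "running kernel matches installed modules ($running)"
    else
        echo "running '$running' but modules on disk are from '$installed'"
        return 1
    fi
}
```

A mismatch here does not prove the modules are incompatible; it only flags the window in which loading a module from disk might misbehave.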

> What happens to the poor user who doesn't know to re-run elilo? (Not
> that I expect there are too many "poor users" running ia64 systems.)

Unfortunately, I don't know that there's any way to retroactively solve
this problem. The cat is out of the bag, as they say.

It would be a nice safety procedure to make sure the modules we need
are loaded before we unpack the new modules - i.e., in the
preinst. One way to do this would be to call 'elilo' in the preinst.

Savvy users can configure their systems to do this themselves by adding
a preinst hook in /etc/kernel-img.conf.
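
A minimal sketch of such a hook, under some assumptions: the wrapper path /usr/local/sbin/elilo-preinst is hypothetical, the hook is assumed to be invoked with the kernel version and image path as arguments (which a bare 'elilo' would not expect, hence the wrapper), and the ELILO variable exists only so the wrapper can be exercised without touching the real EFI partition.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. installed as /usr/local/sbin/elilo-preinst and
# referenced from /etc/kernel-img.conf with a line such as:
#   preinst_hook = /usr/local/sbin/elilo-preinst

# run_elilo_hook [VERSION [IMAGE]]: ignore the arguments supplied by the
# kernel package and run elilo while the old modules are still on disk.
# ELILO may be set to substitute another command for testing (assumption).
run_elilo_hook() {
    "${ELILO:-elilo}"
}

# In the installed wrapper, the last line would be:
#   run_elilo_hook "$@"
```

Whether a failing hook should abort the upgrade is a policy choice; as written, the wrapper simply passes elilo's exit status through.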

--
dann frazier




 
