 
Old 08-11-2010, 06:04 PM
"Koornstra, Reinoud"
 
crash: invalid structure member offset

Hi Everyone,

I am trying to read a core file into crash, but I'm having bad luck, as you can see below.
Is the core file corrupt? It is a vmcore from a 32-bit kernel that was compiled with PAE; could that have corrupted things?
Any hints here?
Thanks,

Reinoud.

$ crash System.map-2.6.27 ./vmlinux-2.6.27 ./vmcore

crash 4.0-3.7
Copyright 2002, 2003, 2004, 2005, 2006 Red Hat, Inc.
Copyright 2004, 2005, 2006 IBM Corporation
Copyright 1999-2006 Hewlett-Packard Co
Copyright 2005 Fujitsu Limited
Copyright 2005 NEC Corporation
Copyright 1999, 2002 Silicon Graphics, Inc.
Copyright 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...

please wait... (gathering kmem slab cache data)
crash: invalid structure member offset: kmem_cache_s_c_num
FILE: memory.c LINE: 6891 FUNCTION: kmem_cache_init()

[/usr/bin/crash] error trace: 80827a9 => 8095398 => 80aa7ef => 8131e88
/usr/bin/nm: /usr/bin/crash: no symbols
/usr/bin/nm: /usr/bin/crash: no symbols
/usr/bin/nm: /usr/bin/crash: no symbols
/usr/bin/nm: /usr/bin/crash: no symbols

WARNING: Because this kernel was compiled with gcc version 4.1.2, certain
commands or command options may fail unless crash is invoked with
the "--readnow" command line option.

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
Old 08-12-2010, 01:13 PM
Dave Anderson
 
crash: invalid structure member offset

----- "Reinoud Koornstra" <koornstra@hp.com> wrote:

> Hi Everyone,
>
> I am trying to read a core file into crash, but I've got bad luck as you can see below.
> Is core file corrupt? It is a vmcore file from a 32 bits kernel that
> was compiled with PAE, could that have corrupted things?
> Any hints here?
> Thanks,
>
> Reinoud.
>
> $ crash System.map-2.6.27 ./vmlinux-2.6.27 ./vmcore
>
> crash 4.0-3.7

I don't know if the vmcore is corrupt, but PAE wouldn't be an issue.

However, you are running a version of crash that was released almost
4 years ago (13-Oct-2006) against a two-year-old kernel that was
released 15-Oct-2008. That's pretty much a guarantee of failure.
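
As a quick check of the arithmetic behind those dates, here is a minimal sketch that uses only the three dates quoted in this thread (the variable names are illustrative, and a year is approximated as 365.25 days):

```python
from datetime import date

# Dates taken from the thread; nothing else is assumed.
crash_release = date(2006, 10, 13)   # crash 4.0-3.7
kernel_release = date(2008, 10, 15)  # kernel 2.6.27
posted = date(2010, 8, 12)           # date of this reply


def years_between(earlier: date, later: date) -> float:
    """Approximate number of years between two dates."""
    return (later - earlier).days / 365.25


crash_age = years_between(crash_release, posted)
kernel_age = years_between(kernel_release, posted)
gap_days = (kernel_release - crash_release).days

print(f"crash binary age:  {crash_age:.2f} years")
print(f"kernel age:        {kernel_age:.2f} years")
print(f"kernel is newer than crash by {gap_days} days")
```

The point the numbers make is that the crash binary predates the kernel it is asked to analyze by about two years, so it cannot know about structures that changed in between.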

Try updating to version 5.0.6 and see what happens.

And BTW, if the vmlinux file is the exact same kernel as the
one that generated the vmcore file, you don't need a System.map
argument.

Dave



 
Old 08-12-2010, 06:56 PM
"Koornstra, Reinoud"
 
crash: invalid structure member offset

Thanks,

Using crash 5.0.6 worked nicely.
However, I can't examine much because of a bad EIP value.

[ 726.601381] 802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
[ 726.601384] All bugs added by David S. Miller <davem@redhat.com>
[ 726.646757] BUG: unable to handle kernel NULL pointer dereference at 00000000
[ 726.732410] IP: [<00000000>]
[ 726.766933] *pdpt = 0000000000431001 *pde = 0000000000000000
[ 726.766937] Oops: 0010 [#1] SMP
[ 726.790844] Modules linked in: 8021q iptable_filter ip_tables x_tables ip_gre af_packet i2c_dev i2c_qs i2c_algo_bit i2c_core garp stp llc ixgbe inet_lro psmouse serio_raw intel_agp shpchp iTCO_wdt pci_hotplug iTCO_vendor_support agpgart ext3 jbd mbcache sd_mod crc_t10dif sg ata_piix ata_generic ahci libata scsi_mod ehci_hcd uhci_hcd usbcore [last unloaded: 8021q]
[ 726.790844]
[ 726.790844] Pid: 4, comm: ksoftirqd/0 Tainted: P (2.6.27)
[ 726.790844] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
[ 726.790844] EIP is at 0x0
[ 726.790844] EAX: e7f4c498 EBX: 00000000 ECX: 77470000 EDX: e7f4c498
[ 726.790844] ESI: 4bd1d300 EDI: 00000007 EBP: f784df88 ESP: f784df78
[ 726.790844] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 726.790844] Process ksoftirqd/0 (pid: 4, ti=f784c000 task=f783a5b0 task.ti=f784c000)
[ 726.790844] Stack: 40168080 00000001 403daaa0 4042c500 f784df90 401681bf f784dfb0 4012fe92
[ 726.790844] 0000000a 00000000 40429340 00000246 00000000 40130120 f784dfbc 4012ff55
[ 726.790844] 4042c500 f784dfcc 40130182 fffffffc 00000000 f784dfe0 4013e707 4013e6c0
[ 726.790844] Call Trace:
[ 726.790844] [<40168080>] ? __rcu_process_callbacks+0x70/0x190
[ 726.790844] [<401681bf>] ? rcu_process_callbacks+0x1f/0x40
[ 726.790844] [<4012fe92>] ? __do_softirq+0x82/0x100
[ 726.790844] [<40130120>] ? ksoftirqd+0x0/0xe0
[ 726.790844] [<4012ff55>] ? do_softirq+0x45/0x50
[ 726.790844] [<40130182>] ? ksoftirqd+0x62/0xe0
[ 726.790844] [<4013e707>] ? kthread+0x47/0x80
[ 726.790844] [<4013e6c0>] ? kthread+0x0/0x80
[ 726.790844] [<4010494f>] ? kernel_thread_helper+0x7/0x10
[ 726.790844] =======================
[ 726.790844] Code: Bad EIP value.
[ 726.790844] EIP: [<00000000>] 0x0 SS:ESP 0068:f784df78

So now I can't figure out the piece of code where this dereferencing occurred.


crash> bt
PID: 4 TASK: f783a5b0 CPU: 0 COMMAND: "ksoftirqd/0"
#0 [f784de88] crash_kexec at 401534a8
#1 [f784df28] __slab_free at 4019677f
#2 [f784df8c] rcu_process_callbacks at 401681ba
#3 [f784df94] __do_softirq at 4012fe90
#4 [f784dfb4] do_softirq at 4012ff50
#5 [f784dfd0] kthread at 4013e705
#6 [f784dfe4] kernel_thread_helper at 4010494d

Thanks,

Reinoud.


 
Old 08-12-2010, 07:18 PM
Dave Anderson
 
crash: invalid structure member offset

----- "Reinoud Koornstra" <koornstra@hp.com> wrote:

> Thanks,
>
> Using crash 5.0.6 worked nicely.
> However, I can't really look at a lot because of a bad EIP code.
>
> [ 726.601381] 802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
> [ 726.601384] All bugs added by David S. Miller <davem@redhat.com>
> [ 726.646757] BUG: unable to handle kernel NULL pointer dereference at 00000000
> [ 726.732410] IP: [<00000000>]
> [ 726.766933] *pdpt = 0000000000431001 *pde = 0000000000000000
> [ 726.766937] Oops: 0010 [#1] SMP
> [ 726.790844] Modules linked in: 8021q iptable_filter ip_tables
> x_tables ip_gre af_packet i2c_dev i2c_qs i2c_algo_bit i2c_core garp
> stp llc ixgbe inet_lro psmouse serio_raw intel_agp shpchp iTCO_wdt
> pci_hotplug iTCO_vendor_support agpgart ext3 jbd mbcache sd_mod
> crc_t10dif sg ata_piix ata_generic ahci libata scsi_mod ehci_hcd
> uhci_hcd usbcore [last unloaded: 8021q]
> [ 726.790844]
> [ 726.790844] Pid: 4, comm: ksoftirqd/0 Tainted: P (2.6.27)
> [ 726.790844] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
> [ 726.790844] EIP is at 0x0
> [ 726.790844] EAX: e7f4c498 EBX: 00000000 ECX: 77470000 EDX: e7f4c498
> [ 726.790844] ESI: 4bd1d300 EDI: 00000007 EBP: f784df88 ESP: f784df78
> [ 726.790844] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> [ 726.790844] Process ksoftirqd/0 (pid: 4, ti=f784c000 task=f783a5b0 task.ti=f784c000)
> [ 726.790844] Stack: 40168080 00000001 403daaa0 4042c500 f784df90 401681bf f784dfb0 4012fe92
> [ 726.790844] 0000000a 00000000 40429340 00000246 00000000 40130120 f784dfbc 4012ff55
> [ 726.790844] 4042c500 f784dfcc 40130182 fffffffc 00000000 f784dfe0 4013e707 4013e6c0
> [ 726.790844] Call Trace:
> [ 726.790844] [<40168080>] ? __rcu_process_callbacks+0x70/0x190
> [ 726.790844] [<401681bf>] ? rcu_process_callbacks+0x1f/0x40
> [ 726.790844] [<4012fe92>] ? __do_softirq+0x82/0x100
> [ 726.790844] [<40130120>] ? ksoftirqd+0x0/0xe0
> [ 726.790844] [<4012ff55>] ? do_softirq+0x45/0x50
> [ 726.790844] [<40130182>] ? ksoftirqd+0x62/0xe0
> [ 726.790844] [<4013e707>] ? kthread+0x47/0x80
> [ 726.790844] [<4013e6c0>] ? kthread+0x0/0x80
> [ 726.790844] [<4010494f>] ? kernel_thread_helper+0x7/0x10
> [ 726.790844] =======================
> [ 726.790844] Code: Bad EIP value.
> [ 726.790844] EIP: [<00000000>] 0x0 SS:ESP 0068:f784df78
>
> So now I can't figure out the piece of code where this dereferencing
> occurred.

Yeah, I don't know why the exception frame wasn't displayed in the bt
output below, but I think crash may have been confused by the kernel text
region starting at 0x40000000 (instead of the typical 3G/1G user/kernel
virtual address split). I'm guessing your kernel is configured as 1G/3G
user/kernel? (I've never seen that before...)

Anyway, somehow the EIP got zeroed out, and it took a fault trying
to handle that. That can happen if a kernel function corrupts its
own stack by incorrectly writing to its own local stack variables,
and in so doing writes a zero into the return address saved on the
stack. Then when the function returns, that zero is loaded into the
EIP, and you'd see something like the above.
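
To make that failure mode concrete, here is a toy model of a 32-bit call stack. It is a sketch only: the frame layout, the two-word local buffer, and the addresses are made up for illustration and are not taken from the actual kernel code path.

```python
def call_and_return(overrun_words: int) -> int:
    """Simulate one call whose callee zeroes its local buffer.

    `overrun_words` is how many words past the end of the buffer the
    callee mistakenly clears. Returns the value 'ret' would load into EIP.
    """
    return_address = 0x401681BF  # illustrative resume point in the caller
    stack = {}                   # address -> 32-bit word
    esp = 0xF784DF90             # illustrative stack pointer before the call

    # 'call': push the return address.
    esp -= 4
    stack[esp] = return_address

    # Callee prologue: push the saved EBP, reserve a 2-word local buffer.
    esp -= 4
    stack[esp] = 0xF784DFB0
    esp -= 8
    buf = esp

    # Callee clears its buffer, possibly running past the end of it.
    for i in range(2 + overrun_words):
        stack[buf + 4 * i] = 0

    # Callee epilogue: drop the locals, pop EBP, then 'ret' pops EIP.
    esp += 8
    esp += 4
    return stack[esp]


print(hex(call_and_return(0)))  # buffer respected: return address intact
print(hex(call_and_return(2)))  # two words too many: 'ret' jumps to 0
```

With an overrun of one word only the saved EBP is destroyed; it takes two words in this layout to reach the return address, after which the function "returns" to address 0.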

The exception frame in the log shows that the ESP is f784df78,
and looking at the trace data below, it looks like rcu_process_callbacks()
may have ended up calling something that led to the EIP corruption.

Just a guess though...
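
One way to line the log up with the bt output is to attach a stack address to each word of the "Stack:" dump, starting from the ESP in the exception frame. The sketch below does only that; the words and symbol names are copied from the log, while the helper name `annotate` is made up for illustration.

```python
ESP = 0xF784DF78  # from "SS:ESP 0068:f784df78" in the log

# The three "Stack:" lines from the log, in order.
STACK_WORDS = """
40168080 00000001 403daaa0 4042c500 f784df90 401681bf f784dfb0 4012fe92
0000000a 00000000 40429340 00000246 00000000 40130120 f784dfbc 4012ff55
4042c500 f784dfcc 40130182 fffffffc 00000000 f784dfe0 4013e707 4013e6c0
""".split()

# Addresses and symbols copied from the "Call Trace:" lines of the log.
CALL_TRACE = {
    0x40168080: "__rcu_process_callbacks+0x70",
    0x401681BF: "rcu_process_callbacks+0x1f",
    0x4012FE92: "__do_softirq+0x82",
    0x40130120: "ksoftirqd+0x0",
    0x4012FF55: "do_softirq+0x45",
    0x40130182: "ksoftirqd+0x62",
    0x4013E707: "kthread+0x47",
    0x4013E6C0: "kthread+0x0",
}


def annotate(esp, words):
    """Return (stack address, value, symbol or '') for each dumped word."""
    rows = []
    for i, word in enumerate(words):
        value = int(word, 16)
        rows.append((esp + 4 * i, value, CALL_TRACE.get(value, "")))
    return rows


rows = annotate(ESP, STACK_WORDS)
for address, value, symbol in rows:
    print(f"{address:08x}: {value:08x}  {symbol}")
```

The word at f784df8c comes out as rcu_process_callbacks+0x1f, which agrees with frame #2 [f784df8c] in the bt output, and the word at the ESP itself is the __rcu_process_callbacks+0x70 entry from the log's call trace.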

Dave

>
> crash> bt
> PID: 4 TASK: f783a5b0 CPU: 0 COMMAND: "ksoftirqd/0"
> #0 [f784de88] crash_kexec at 401534a8
> #1 [f784df28] __slab_free at 4019677f
> #2 [f784df8c] rcu_process_callbacks at 401681ba
> #3 [f784df94] __do_softirq at 4012fe90
> #4 [f784dfb4] do_softirq at 4012ff50
> #5 [f784dfd0] kthread at 4013e705
> #6 [f784dfe4] kernel_thread_helper at 4010494d
 
Old 08-12-2010, 09:03 PM
"Koornstra, Reinoud"
 
crash: invalid structure member offset

> Yeah, I don't know why the exception frame wasn't displayed in the
> bt output, but I think crash may have been confused by the kernel text
> region starting at 0x40000000 (instead of the typical 3G/1G user/kernel
> virtual address split). I'm guessing your kernel is configured as 1G/3G
> user/kernel?

That's right, the kernel is configured as 1G/3G user/kernel.
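
For readers unfamiliar with the split, here is a small sketch of what it means for addresses on 32-bit x86. It assumes the simple model in which user space takes the low N GiB of the 4 GiB virtual range and the kernel starts right above it; the function names are hypothetical, and the sample address is the crash_kexec frame from the bt output.

```python
GIB = 1 << 30


def page_offset(user_gib: int) -> int:
    """First kernel virtual address when user space gets `user_gib` of 4 GiB."""
    if not 1 <= user_gib <= 3:
        raise ValueError("user space must get 1, 2 or 3 GiB in this model")
    return user_gib * GIB


def is_kernel_address(address: int, user_gib: int) -> bool:
    """True if `address` lies in the kernel part of the chosen split."""
    return address >= page_offset(user_gib)


crash_kexec = 0x401534A8  # frame #0 in the bt output

print(hex(page_offset(3)))                # typical 3G/1G split
print(hex(page_offset(1)))                # the 1G/3G split used here
print(is_kernel_address(crash_kexec, 3))  # looks like a user address
print(is_kernel_address(crash_kexec, 1))  # is a kernel address
```

Under the typical split the same address would fall in user space, which fits the guess that a tool expecting kernel text near 0xc0000000 could be confused by it.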

> (I've never seen that before...)

It's a weird config indeed. I'll try rewriting some things to consume far less memory so that a normal user/kernel split can be used.
Nevertheless, why the pointer became null remains unsolved for the moment. :-)
Would the user/kernel split also be an issue on 64-bit?

Reinoud.

 
Old 08-13-2010, 01:19 PM
Dave Anderson
 
crash: invalid structure member offset

----- "Reinoud Koornstra" <koornstra@hp.com> wrote:

> It's a weird config indeed. I'll try rewriting some things to
> consume far less memory so that a normal user/kernel split can be used.
> Nevertheless, why the pointer became null remains unsolved for the moment. :-)
> Would the user/kernel split also be an issue on 64-bit?

I wouldn't expect you'd ever need to modify the user-kernel split in x86_64,
if that's what you're asking? The 64-bit virtual address range is so vast
that it's hard to conceive of a need to do anything like that.
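
To put rough numbers on "so vast", the sketch below compares the two address spaces. It assumes 48-bit virtual addresses divided evenly between user and kernel halves, which is the common x86_64 arrangement with 4-level paging; that assumption is not stated in the thread.

```python
GIB = 1 << 30
TIB = 1 << 40

# 32-bit x86: 4 GiB in total, shared between user space and the kernel.
total_32 = 1 << 32

# x86_64, assuming 48-bit virtual addresses split into two equal halves.
total_48 = 1 << 48
half_48 = total_48 // 2

print(f"32-bit total:      {total_32 // GIB} GiB")
print(f"x86_64 per half:   {half_48 // TIB} TiB")
print(f"ratio to a 3 GiB kernel region: {half_48 // (3 * GIB)}x")
```

Even the generous 3 GiB kernel region discussed in this thread is tens of thousands of times smaller than either half of the 64-bit layout.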

Dave

