HEAD'S UP -- problem with kernels built with gcc-4.6.0
Thanks Dave,
As always we all appreciate the huge effort you put into this for us.
Will keep an eye out for this.
Laurence
----- Original Message -----
From: "Dave Anderson" <anderson@redhat.com>
To: "Discussion list for crash utility usage, maintenance and development" <crash-utility@redhat.com>
Sent: Thursday, May 5, 2011 5:18:04 PM
Subject: [Crash-utility] HEAD'S UP -- problem with kernels built with gcc-4.6.0
This is a heads-up for those of you who are working with kernels
that were compiled with the new gcc-4.6.0.
I had thought that gcc-4.6.0 was painful only as far as compiling
the crash utility was concerned, where there were a bunch of new
"error: variable <variable> set but not used [-Werror=unused-but-set-variable]"
messages that I fixed in crash-5.1.2 and -5.1.3. And you may be aware that
those for-the-most-part useless warnings recently caused an LKML shitstorm
w/respect to building kernels.
But it's worse than that -- there is a problem with crash's embedded gdb
determining the member offsets of the (large) pglist_data structure if
the kernel was compiled with gcc-4.6.0. This is not specific to the
gdb-7.0 version that is built into crash; it happens with all gdb
versions as far as I can tell, certainly with gdb-7.2-48.el6
and gdb-7.2.50.20110328-31.fc15.
The problem is most clearly seen with "struct -o pglist_data", which
dumps the structure, showing the offset of each member.
For comparison, here is the output from a (good) 2.6.38-rc4 kernel
that was compiled with gcc-4.5.1:
crash> help -k | grep gcc_version
gcc_version: 4.5.1
crash> struct -o pglist_data
struct pglist_data {
[0x0] struct zone node_zones[4];
[0x1c00] struct zonelist node_zonelists[2];
[0x13e40] int nr_zones;
[0x13e44] spinlock_t node_size_lock;
[0x13e48] long unsigned int node_start_pfn;
[0x13e50] long unsigned int node_present_pages;
[0x13e58] long unsigned int node_spanned_pages;
[0x13e60] int node_id;
[0x13e68] wait_queue_head_t kswapd_wait;
[0x13e80] struct task_struct *kswapd;
[0x13e88] int kswapd_max_order;
[0x13e8c] enum zone_type classzone_idx;
}
SIZE: 0x13f00
crash>
While here is the output from a 2.6.38.2-9.fc15 kernel that
was compiled with gcc-4.6.0:
crash> help -k | grep gcc_version
gcc_version: 4.6.0
crash> struct -o pglist_data
struct pglist_data {
[0x0] struct zone node_zones[4];
[0x1c00] struct zonelist node_zonelists[2];
[0x0] int nr_zones;
[0x0] spinlock_t node_size_lock;
[0x0] long unsigned int node_start_pfn;
[0x0] long unsigned int node_present_pages;
[0x0] long unsigned int node_spanned_pages;
[0x0] int node_id;
[0x0] wait_queue_head_t kswapd_wait;
[0x0] struct task_struct *kswapd;
[0x0] int kswapd_max_order;
[0x0] enum zone_type classzone_idx;
}
SIZE: 0x13f00
crash>
It's interesting that it gets the size correct, but the member offset
values beyond the node_zonelists[] array are returned as 0.
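For anyone who wants to check their own dumps for the same symptom, here is a rough Python sketch that scans pasted "struct -o" output and flags any member whose offset is lower than that of a member listed before it. The function names and the trimmed GOOD/BAD samples are made up for illustration, and the check is only a heuristic: it assumes one member per line in the "[offset] declaration;" form shown above.

```python
import re

# Matches member lines such as "[0x13e40] int nr_zones;" (hex or decimal offset).
MEMBER_RE = re.compile(r"^\s*\[(0x[0-9a-fA-F]+|\d+)\]\s+(.+?);\s*$")


def parse_members(text):
    """Return (offset, declaration) pairs found in "struct -o" output."""
    members = []
    for line in text.splitlines():
        match = MEMBER_RE.match(line)
        if match:
            raw = match.group(1)
            base = 16 if raw.lower().startswith("0x") else 10
            members.append((int(raw, base), match.group(2)))
    return members


def suspicious_members(text):
    """Return declarations whose offset is below an earlier member's offset."""
    highest = -1
    flagged = []
    for offset, decl in parse_members(text):
        if offset < highest:
            flagged.append(decl)
        highest = max(highest, offset)
    return flagged


# Trimmed samples taken from the two outputs above (illustration only).
GOOD = """\
[0x0] struct zone node_zones[4];
[0x1c00] struct zonelist node_zonelists[2];
[0x13e40] int nr_zones;
[0x13e44] spinlock_t node_size_lock;
"""

BAD = """\
[0x0] struct zone node_zones[4];
[0x1c00] struct zonelist node_zonelists[2];
[0x0] int nr_zones;
[0x0] spinlock_t node_size_lock;
"""

if __name__ == "__main__":
    print(suspicious_members(GOOD))
    print(suspicious_members(BAD))
```

With these samples nothing is flagged for the gcc-4.5.1 output, while nr_zones and node_size_lock are flagged for the gcc-4.6.0 output. Members that legitimately share an offset are not flagged, since only a strictly lower offset counts.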
Taking the crash utility out of the picture, the problem can be seen
by simply running "gdb vmlinux".
For example, using the good kernel from the first example above:
$ gdb vmlinux-2.6.38-rc4
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/vmlinux-2.6.38-rc4...done.
(gdb) ptype struct pglist_data
type = struct pglist_data {
struct zone node_zones[4];
struct zonelist node_zonelists[2];
int nr_zones;
spinlock_t node_size_lock;
long unsigned int node_start_pfn;
long unsigned int node_present_pages;
long unsigned int node_spanned_pages;
int node_id;
wait_queue_head_t kswapd_wait;
struct task_struct *kswapd;
int kswapd_max_order;
enum zone_type classzone_idx;
}
(gdb) p &((struct pglist_data *)(0x0)).node_zonelists[0]
$1 = (struct zonelist *) 0x1c00
(gdb) p &((struct pglist_data *)(0x0)).nr_zones
$2 = (int *) 0x13e40
(gdb) p &((struct pglist_data *)(0x0)).node_size_lock
$3 = (spinlock_t *) 0x13e44
(gdb) p &((struct pglist_data *)(0x0)).node_start_pfn
$4 = (long unsigned int *) 0x13e48
(gdb) p &((struct pglist_data *)(0x0)).node_present_pages
$5 = (long unsigned int *) 0x13e50
(gdb) p &((struct pglist_data *)(0x0)).node_spanned_pages
$6 = (long unsigned int *) 0x13e58
(gdb) p &((struct pglist_data *)(0x0)).node_id
$7 = (int *) 0x13e60
(gdb) p &((struct pglist_data *)(0x0)).kswapd
$8 = (struct task_struct **) 0x13e80
(gdb) p &((struct pglist_data *)(0x0)).kswapd_max_order
$9 = (int *) 0x13e88
(gdb) p &((struct pglist_data *)(0x0)).classzone_idx
$10 = (enum zone_type *) 0x13e8c
(gdb)
And then with the kernel compiled with gcc-4.6.0:
# gdb vmlinux-2.6.38.2-9.fc15
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-48.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/vmlinux-2.6.38.2-9.fc15...done.
(gdb) ptype struct pglist_data
type = struct pglist_data {
struct zone node_zones[4];
struct zonelist node_zonelists[2];
int nr_zones;
spinlock_t node_size_lock;
long unsigned int node_start_pfn;
long unsigned int node_present_pages;
long unsigned int node_spanned_pages;
int node_id;
wait_queue_head_t kswapd_wait;
struct task_struct *kswapd;
int kswapd_max_order;
enum zone_type classzone_idx;
}
(gdb) p &((struct pglist_data *)(0x0)).node_zonelists[0]
$1 = (struct zonelist *) 0x1c00
(gdb) p &((struct pglist_data *)(0x0)).nr_zones
$2 = (int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).node_size_lock
$3 = (spinlock_t *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).node_start_pfn
$4 = (long unsigned int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).node_present_pages
$5 = (long unsigned int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).node_spanned_pages
$6 = (long unsigned int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).node_id
$7 = (int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).kswapd_wait
$8 = (wait_queue_head_t *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).kswapd
$9 = (struct task_struct **) 0x0
(gdb) p &((struct pglist_data *)(0x0)).kswapd_max_order
$10 = (int *) 0x0
(gdb) p &((struct pglist_data *)(0x0)).classzone_idx
$11 = (enum zone_type *) 0x0
(gdb)
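To repeat the comparison above against other vmlinux files without typing each expression by hand, something along these lines might work. It is only a sketch: the helper names are invented here, it relies on gdb's -batch and -ex command-line options, and it assumes every expression prints exactly one "$N = (type) 0x..." line, so a failed expression would misalign the results.

```python
import re
import subprocess


def offset_expr(struct_name, member):
    """Build the same null-pointer expression used in the sessions above."""
    return "p &((struct %s *)(0x0)).%s" % (struct_name, member)


def gdb_argv(vmlinux, struct_name, members):
    """Build a gdb command line that prints one offset per member."""
    argv = ["gdb", "-batch"]
    for member in members:
        argv += ["-ex", offset_expr(struct_name, member)]
    argv.append(vmlinux)
    return argv


# Matches value lines such as "$2 = (int *) 0x13e40".
VALUE_RE = re.compile(r"^\$\d+ = \(.*?\) (0x[0-9a-fA-F]+)")


def parse_offsets(gdb_output, members):
    """Pair each member with the address gdb printed for it, in order."""
    values = []
    for line in gdb_output.splitlines():
        match = VALUE_RE.match(line)
        if match:
            values.append(int(match.group(1), 16))
    return dict(zip(members, values))


def query_offsets(vmlinux, struct_name, members):
    """Run gdb on a vmlinux file and return {member: offset}."""
    result = subprocess.run(gdb_argv(vmlinux, struct_name, members),
                            capture_output=True, text=True, check=True)
    return parse_offsets(result.stdout, members)
```

Calling query_offsets() with a vmlinux path, "pglist_data", and a member list such as ["nr_zones", "node_id", "kswapd"] on both kernels and comparing the two dictionaries would show the same zeroed offsets as the manual session.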
Anyway, given that the pglist_data structure is crucial to the
crash utility, the bogus offset data leads to errors such as
an incorrect MEMORY value, as shown here on a 4GB system:
And there may be other problems that I'm not aware of that are associated
with the pglist_data data structure members specifically -- and perhaps with
other data structures as well?
I filed a bugzilla with gdb, although it may well be a bug with
the debuginfo data created by gcc-4.6.0. We'll see what happens...
In the meantime, I do have a workaround kludge for pglist_data members that
will be included in the upcoming crash-5.1.5 release.
Annoyed to no end,
Dave
--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility