02-07-2008, 07:45 PM
Andrew Hecox

determining a "valid" vmcore

On Thu, 2008-02-07 at 15:38 -0500, Dave Anderson wrote:
> Andrew Hecox wrote:
> >
> > I get the same:
> >
> > (/boot/System.map-2.6.9-67.0.1.ELhugemem)
> >
> > 02323bd8 d log_buf_len
> >
> > (/usr/lib/debug/lib/modules/2.6.9-67.0.1.ELhugemem/vmlinux)
> >
> > $1 = (int *) 0x2323bd8
> >
> > -Andrew
>
> So, as Takao suggested, can you dump the incoming vaddr and
> resultant pfn values in diskdumpmsg.c:read_buffer()?
>

The vaddr value is: 36846552.

-Andrew

> Dave
>
>

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
02-07-2008, 08:04 PM
Dave Anderson

determining a "valid" vmcore

Andrew Hecox wrote:
> On Thu, 2008-02-07 at 15:38 -0500, Dave Anderson wrote:
>> Andrew Hecox wrote:
>>> I get the same:
>>>
>>> (/boot/System.map-2.6.9-67.0.1.ELhugemem)
>>>
>>> 02323bd8 d log_buf_len
>>>
>>> (/usr/lib/debug/lib/modules/2.6.9-67.0.1.ELhugemem/vmlinux)
>>>
>>> $1 = (int *) 0x2323bd8
>>>
>>> -Andrew
>>
>> So, as Takao suggested, can you dump the incoming vaddr and
>> resultant pfn values in diskdumpmsg.c:read_buffer()?
>
> The vaddr value is: 36846552.
>
> -Andrew
>
>> Dave

OK, so the incoming vaddr is 36846552, which is 0x2323bd8.
To get a pfn, that hugemem kernel virtual address is passed
through vtop() and the result is then divided by 4096:

static int read_buffer(DumpFile *dump, addr_t vaddr, size_t len, void *buf)
{
    addr_t paddr;
    int block_size = get_page_size();
    unsigned long pfn;
    int ret;
    size_t copy_len, offs;
    void *page_data;

    paddr = vtop(dump, vaddr);
    pfn = paddr / block_size;
    offs = paddr % block_size;

When 0x2323bd8 is run through vtop(), it simply strips off the
hugemem unity-map identifier:

addr_t vtop(DumpFile *dump, addr_t vaddr)
{
    if (strstr("hugemem", dump->utsname->release))
        return vaddr - 0x02000000L;
    else
        return vaddr - 0xc0000000L;
}

leaving 0x323bd8 -- which gets divided by the page size of 4096, leaving
a pfn of 0x323.

But you see that the pfn was 271139 (0x42323). If that is expanded
to a physical address it would be 0x42323000. It looks like it's
using the non-hugemem value in vtop(), i.e., subtracting 0xc0000000 from
the incoming vaddr. In other words, 0x2323bd8 - 0xc0000000 wraps
around in 32-bit arithmetic to 0x42323bd8. If that is divided by
4096, you get the funky pfn of 271139 (0x42323).

Print out the dump->utsname->release string in vtop(). It must
not contain "hugemem".

Dave









 
02-07-2008, 08:38 PM
Andrew Hecox

determining a "valid" vmcore

On Thu, 2008-02-07 at 16:04 -0500, Dave Anderson wrote:
> Print out the dump->utsname->release string in vtop(). It must
> not contain "hugemem".
>

Dave,

I get:

(gdb) print dump->utsname->release
$19 = "2.6.9-67.0.1.ELhugemem", '\0' <repeats 42 times>

but then

(gdb) s
16 return vaddr - 0xc0000000L;

! oh uh.

man strstr

...
char *strstr(const char *haystack, const char *needle);
...

It looks like

if (strstr("hugemem", dump->utsname->release))

should be:

if (strstr(dump->utsname->release,"hugemem"))

I patched, recompiled, tested and it works:

[root@ibm-x3455-1 ~]# diskdumpmsg -f -p /var/crash/vmcore
Jan 31 05:43:08 elabhost012 kernel: --- salvaged messages from crash dump start
Jan 31 05:43:08 elabhost012 kernel: 0218b9c0 0232d363 0232d3e0 0215aff6 df954fac f6db4000 eaa756c0 fffffff7
Jan 31 05:43:08 elabhost012 kernel: f6db4000 df954000 0215b0c0 df954fac 00000000 00000000 00000000 df954fc4
Jan 31 05:43:08 elabhost012 kernel: Call Trace:
Jan 31 05:43:08 elabhost012 kernel: [<0220c46a>] __handle_sysrq+0x58/0xc6
Jan 31 05:43:08 elabhost012 kernel: [<0218b9c0>] write_sysrq_trigger+0x37/0x3e
Jan 31 05:43:08 elabhost012 kernel: [<0215aff6>] vfs_write+0xb6/0xe2
Jan 31 05:43:08 elabhost012 kernel: [<0215b0c0>] sys_write+0x3c/0x62
Jan 31 05:43:08 elabhost012 kernel: Code: 11 02 c7 05 10 fd 44 02 00 00 00 00 c7 05 38 fd 44 02 00 00 00 00 c7 05 2c fd 44 02 6e ad 87 4b 89 15 28 fd 44 02 e9 8b 41 f2 ff <c6> 05 00 00 00 00 00 c3 e9 0a ff f4 ff e9 a2 48 f5 ff 85 d2 89
Jan 31 05:43:08 elabhost012 kernel: --- salvaged messages from crash dump end

Thanks much for all the help! Should I open a bz against the issue? It
looks like all i386 hugemem kernels would be similarly affected.

-Andrew


 
 
02-07-2008, 08:46 PM
Dave Anderson

determining a "valid" vmcore

Andrew Hecox wrote:

> It looks like
>
> if (strstr("hugemem", dump->utsname->release))
>
> should be:
>
> if (strstr(dump->utsname->release,"hugemem"))

Bingo -- like the man page says:

char *strstr(const char *haystack, const char *needle);

> I patched, recompiled, tested and it works:
>
> [root@ibm-x3455-1 ~]# diskdumpmsg -f -p /var/crash/vmcore
> Jan 31 05:43:08 elabhost012 kernel: --- salvaged messages from crash dump start
> Jan 31 05:43:08 elabhost012 kernel: 0218b9c0 0232d363 0232d3e0 0215aff6 df954fac f6db4000 eaa756c0 fffffff7
> Jan 31 05:43:08 elabhost012 kernel: f6db4000 df954000 0215b0c0 df954fac 00000000 00000000 00000000 df954fc4
> Jan 31 05:43:08 elabhost012 kernel: Call Trace:
> Jan 31 05:43:08 elabhost012 kernel: [<0220c46a>] __handle_sysrq+0x58/0xc6
> Jan 31 05:43:08 elabhost012 kernel: [<0218b9c0>] write_sysrq_trigger+0x37/0x3e
> Jan 31 05:43:08 elabhost012 kernel: [<0215aff6>] vfs_write+0xb6/0xe2
> Jan 31 05:43:08 elabhost012 kernel: [<0215b0c0>] sys_write+0x3c/0x62
> Jan 31 05:43:08 elabhost012 kernel: Code: 11 02 c7 05 10 fd 44 02 00 00 00 00 c7 05 38 fd 44 02 00 00 00 00 c7 05 2c fd 44 02 6e ad 87 4b 89 15 28 fd 44 02 e9 8b 41 f2 ff <c6> 05 00 00 00 00 00 c3 e9 0a ff f4 ff e9 a2 48 f5 ff 85 d2 89
> Jan 31 05:43:08 elabhost012 kernel: --- salvaged messages from crash dump end
>
> Thanks much for all the help! Should I open a bz against the issue? It
> looks like all i386 hugemem kernels would be similarly affected.

Yep -- definitely open a BZ against component "diskdumputils".

Dave


 
02-07-2008, 09:46 PM
Andrew Hecox

determining a "valid" vmcore

On Thu, 2008-02-07 at 16:46 -0500, Dave Anderson wrote:
> Yep -- definitely open a BZ against component "diskdumputils".
>

I've opened bz431937 for the strstr change and bz431943 for the more
general lack of input validation that caused the FPE. I separated them
because one actually fixes an issue for production users, while the other
just provides a better error message without making anything work.

-Andrew

 

All times are GMT.