10-08-2010, 12:32 PM
Sami Liedes

crash on a KVM-generated dump

Hi,

There's a bug in Debian bugzilla on crash crashing:

http://bugs.debian.org/599353

Attached is a message I sent to that bug which contains a patch that
fixes the problem (but in a non-beautiful way).

Is there a redhat bugzilla entry for crash, by the way? Finding
applications there was kind of hard, especially given that the query
would be "crash".

Sami


----- Forwarded message from Sami Liedes <sliedes@cc.hut.fi> -----

Date: Thu, 7 Oct 2010 21:50:22 +0300
From: Sami Liedes <sliedes@cc.hut.fi>
To: 599353@bugs.debian.org
Subject: [patch] Hack to fix this crash
User-Agent: Mutt/1.5.20 (2009-06-14)

Hi,

The crashing is pretty nondeterministic; today the existence of $HOME
does not seem to have an effect (confirmed by Timo).

It seems to be caused by heap corruption. The code at fault is in
x86_64.c: on some core files (produced by KVM), the interrupt stack
size (machdep->machspec->stkinfo.isize) is somehow calculated to be 0,
and 0 is passed to malloc() in x86_64.c:342. Data is later written
through that pointer.

Here's a minimal patch (crude hack, not a real fix for the underlying
problem) to make this work:

------------------------------------------------------------
diff -ur crash-5.0.7/x86_64.c crash-5.0.7.patched//x86_64.c
--- crash-5.0.7/x86_64.c	2010-08-27 20:36:18.000000000 +0300
+++ crash-5.0.7.patched//x86_64.c	2010-10-07 21:23:16.079119657 +0300
@@ -339,6 +339,9 @@
 		x86_64_per_cpu_init();
 		x86_64_ist_init();
 		machdep->in_alternate_stack = x86_64_in_alternate_stack;
+		/* HACK */
+		if (machdep->machspec->stkinfo.isize == 0)
+			machdep->machspec->stkinfo.isize = 65536;
 		if ((machdep->machspec->irqstack = (char *)
 		    malloc(machdep->machspec->stkinfo.isize)) == NULL)
 			error(FATAL, "cannot malloc irqstack space.");
------------------------------------------------------------
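
As an illustration only, the same guard can be written as a small helper that also tells the caller when the fallback was used, so a warning can be printed instead of silently continuing. This is a sketch, not code from crash: the names alloc_irqstack() and IRQSTACK_FALLBACK_SIZE are hypothetical, and 65536 is simply the value from the hack above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fallback size; 65536 is the value used in the hack above. */
#define IRQSTACK_FALLBACK_SIZE 65536

/*
 * Allocate an IRQ stack buffer of isize bytes.
 * A size of 0 is treated as "never computed": the fallback size is used
 * instead, and *used_fallback is set to 1 so the caller can warn.
 * *actual_size receives the number of bytes requested from malloc().
 * Returns NULL only if malloc() itself fails.
 */
char *alloc_irqstack(size_t isize, size_t *actual_size, int *used_fallback)
{
    *used_fallback = (isize == 0);
    *actual_size = *used_fallback ? IRQSTACK_FALLBACK_SIZE : isize;
    return malloc(*actual_size);
}
```

A caller would check the flag after the call and report that isize had not been initialized. Like the hack, this only hides the symptom; it does not explain why the size was 0 in the first place.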

Here are the valgrind warnings produced (search for "invalid write" to
find the fault causing this; not that the other problems would not be
worth fixing):

------------------------------------------------------------
$ valgrind crash vmlinux new.core
==10013== Memcheck, a memory error detector
==10013== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==10013== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==10013== Command: crash vmlinux new.core
==10013==

crash 5.0.7
Copyright (C) 2002-2010 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x5079290: inflateReset2 (inflate.c:157)
==10013== by 0x507937F: inflateInit2_ (inflate.c:193)
==10013== by 0x4DB05B: read_in_kernel_config (kernel.c:6708)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4C26BB7: __GI___rawmemchr (mc_replace_strmem.c:729)
==10013== by 0x577D1FF: _IO_str_init_static_internal (strops.c:45)
==10013== by 0x57613E4: __isoc99_vsscanf (isoc99_vsscanf.c:42)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013==
==10013== Use of uninitialised value of size 8
==10013== at 0x5758FFF: _IO_vfscanf (vfscanf.c:600)
==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x5759014: _IO_vfscanf (vfscanf.c:602)
==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x577B789: _IO_sputbackc (genops.c:730)
==10013== by 0x5759042: _IO_vfscanf (vfscanf.c:602)
==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4C26BAA: __GI___rawmemchr (mc_replace_strmem.c:729)
==10013== by 0x577D1FF: _IO_str_init_static_internal (strops.c:45)
==10013== by 0x57613E4: __isoc99_vsscanf (isoc99_vsscanf.c:42)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013==
==10013== Use of uninitialised value of size 8
==10013== at 0x575B66C: _IO_vfscanf (vfscanf.c:2734)
==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013==
==10013== Use of uninitialised value of size 8
==10013== at 0x575B70B: _IO_vfscanf (vfscanf.c:2734)
==10013== by 0x57613F9: __isoc99_vsscanf (isoc99_vsscanf.c:44)
==10013== by 0x5761377: __isoc99_sscanf (isoc99_sscanf.c:33)
==10013== by 0x4DB12B: read_in_kernel_config (kernel.c:6733)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x46318F: whitespace (tools.c:222)
==10013== by 0x4DB1A4: read_in_kernel_config (kernel.c:6743)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x463195: whitespace (tools.c:222)
==10013== by 0x4DB1A4: read_in_kernel_config (kernel.c:6743)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4DB1B2: read_in_kernel_config (kernel.c:6747)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4C2536A: __GI_strchr (mc_replace_strmem.c:144)
==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4C25380: __GI_strchr (mc_replace_strmem.c:144)
==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Conditional jump or move depends on uninitialised value(s)
==10013== at 0x4C2537A: __GI_strchr (mc_replace_strmem.c:144)
==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
==10013== by 0x45D82B: main_loop (main.c:552)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
WARNING: cannot determine how modules are linked
WARNING: no kernel module access

==10013== Invalid write of size 1
==10013== at 0x4C26A88: memset (mc_replace_strmem.c:602)
==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
==10013== by 0x473D3F: readmem (memory.c:1842)
==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
==10013== by 0x45D871: main_loop (main.c:563)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== Address 0x5b183e0 is 0 bytes after a block of size 0 alloc'd
==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
==10013== by 0x45D83A: main_loop (main.c:554)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Invalid write of size 1
==10013== at 0x4C26A8C: memset (mc_replace_strmem.c:602)
==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
==10013== by 0x473D3F: readmem (memory.c:1842)
==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
==10013== by 0x45D871: main_loop (main.c:563)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== Address 0x5b183e1 is 1 bytes after a block of size 0 alloc'd
==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
==10013== by 0x45D83A: main_loop (main.c:554)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Invalid write of size 1
==10013== at 0x4C26A94: memset (mc_replace_strmem.c:602)
==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
==10013== by 0x473D3F: readmem (memory.c:1842)
==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
==10013== by 0x45D871: main_loop (main.c:563)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== Address 0x5b183e2 is 2 bytes after a block of size 0 alloc'd
==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
==10013== by 0x45D83A: main_loop (main.c:554)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Invalid write of size 1
==10013== at 0x4C26A99: memset (mc_replace_strmem.c:602)
==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
==10013== by 0x473D3F: readmem (memory.c:1842)
==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
==10013== by 0x45D871: main_loop (main.c:563)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== Address 0x5b183e3 is 3 bytes after a block of size 0 alloc'd
==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
==10013== by 0x45D83A: main_loop (main.c:554)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
==10013== Invalid write of size 1
==10013== at 0x4C26AA9: memset (mc_replace_strmem.c:602)
==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
==10013== by 0x473D3F: readmem (memory.c:1842)
==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
==10013== by 0x45D871: main_loop (main.c:563)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== Address 0x5b183e8 is 8 bytes after a block of size 0 alloc'd
==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
==10013== by 0x45D83A: main_loop (main.c:554)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
==10013== by 0x45D78E: main (main.c:525)
==10013==
KERNEL: vmlinux
DUMPFILE: new.core
CPUS: 1
DATE: Fri Oct 1 21:26:15 2010
UPTIME: 00:00:56
LOAD AVERAGE: 0.14, 0.05, 0.02
TASKS: 45
NODENAME: fstest
RELEASE: 2.6.35.6
VERSION: #2 Wed Sep 29 15:05:49 EEST 2010
MACHINE: x86_64 (2394 Mhz)
==10013== Source and destination overlap in strcpy(0x7fefffae2, 0x7fefffae4)
==10013== at 0x4C25918: strcpy (mc_replace_strmem.c:311)
==10013== by 0x46E9DE: pages_to_size (tools.c:4640)
==10013== by 0x49393F: get_memory_size (memory.c:11145)
==10013== by 0x4CFFC5: display_sys_stats (kernel.c:3927)
==10013== by 0x45D934: main_loop (main.c:581)
==10013== by 0x584413: current_interp_command_loop (interps.c:288)
==10013== by 0x584DD2: captured_command_loop (main.c:226)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585ECB: captured_main (main.c:924)
==10013== by 0x583E33: catch_errors (exceptions.c:520)
==10013== by 0x585F10: gdb_main (main.c:939)
==10013== by 0x585F65: gdb_main_entry (main.c:959)
==10013==
MEMORY: 1 GB
PANIC: ""
PID: 0
COMMAND: "swapper"
TASK: ffffffff81a13040 [THREAD_INFO: ffffffff81a00000]
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
WARNING: panic task not found

crash> q
==10013==
==10013== HEAP SUMMARY:
==10013== in use at exit: 53,444,536 bytes in 10,730 blocks
==10013== total heap usage: 396,156 allocs, 385,426 frees, 2,187,205,021 bytes allocated
==10013==
==10013== LEAK SUMMARY:
==10013== definitely lost: 6,414 bytes in 35 blocks
==10013== indirectly lost: 24 bytes in 1 blocks
==10013== possibly lost: 42,174,127 bytes in 8,022 blocks
==10013== still reachable: 11,263,971 bytes in 2,672 blocks
==10013== suppressed: 0 bytes in 0 blocks
==10013== Rerun with --leak-check=full to see details of leaked memory
==10013==
==10013== For counts of detected and suppressed errors, rerun with: -v
==10013== Use --track-origins=yes to see where uninitialised values come from
==10013== ERROR SUMMARY: 6710 errors from 21 contexts (suppressed: 4 from 4)
------------------------------------------------------------

Sami

----- End forwarded message -----
--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
10-08-2010, 01:31 PM
Dave Anderson

crash on a KVM-generated dump

----- "Sami Liedes" <sliedes@cc.hut.fi> wrote:

> Hi,
>
> There's a bug in Debian bugzilla on crash crashing:
>
> http://bugs.debian.org/599353
>
> Attached is a message I sent to that bug which contains a patch that
> fixes the problem (but in a non-beautiful way).
>
> Is there a redhat bugzilla entry for crash, by the way? Finding
> applications there was kind of hard, especially given that the query
> would be "crash".

Yes, its bugzilla component is "crash", but it's pretty much for issues
associated with running crash against RHEL kernels, and I have not seen
this before (even with an Ubuntu vmlinux-2.6.31-17-server dumpfile
I have on hand). Reporting it here is the best thing to do.

I don't think that this is associated with KVM, but rather with the kernel
version used. It should be pretty easy to debug on your end, because it
boils down to these initializations at the top of x86_64_per_cpu_init():

irq_sp = per_cpu_symbol_search("per_cpu__irq_stack_union");
cpu_sp = per_cpu_symbol_search("per_cpu__cpu_number");

If it's a UP kernel, and if "irq_sp" does not get set, then isize would
be left uninitialized.

If it's an SMP kernel, and if either "irq_sp" or "cpu_sp" does not get
set, then isize would be left uninitialized.

But I can't understand why they wouldn't get initialized.
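
The two cases above can be modeled with a short sketch. It is illustrative only and is not the crash source: struct sym and isize_would_be_set() are hypothetical names, and a NULL pointer stands in for a per-cpu symbol lookup that found nothing.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-cpu symbol lookup result;
 * a NULL pointer means the symbol was not found. */
struct sym {
    unsigned long value;
};

/*
 * Returns 1 if isize would be set, 0 if it would be left untouched.
 * irq_sp and cpu_sp mirror the variable names quoted above.
 */
int isize_would_be_set(int is_smp, const struct sym *irq_sp,
                       const struct sym *cpu_sp)
{
    if (irq_sp == NULL)
        return 0;   /* required on both UP and SMP kernels */
    if (is_smp && cpu_sp == NULL)
        return 0;   /* SMP additionally requires cpu_number */
    return 1;
}
```

Under this model, a failed lookup of either symbol on an SMP kernel is enough to leave isize at whatever value it had before.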

In a 2.6.36-rc1 kernel KVM dumpfile, I see this for their per-cpu
symbol values:

crash> sym irq_stack_union
0 (V) irq_stack_union
crash> sym cpu_number
e320 (V) cpu_number
crash>

Do you see something different with that kernel?

Dave


> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x46318F: whitespace (tools.c:222)
> ==10013== by 0x4DB1A4: read_in_kernel_config (kernel.c:6743)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x463195: whitespace (tools.c:222)
> ==10013== by 0x4DB1A4: read_in_kernel_config (kernel.c:6743)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x4DB1B2: read_in_kernel_config (kernel.c:6747)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x4C2536A: __GI_strchr (mc_replace_strmem.c:144)
> ==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x4C25380: __GI_strchr (mc_replace_strmem.c:144)
> ==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Conditional jump or move depends on uninitialised value(s)
> ==10013== at 0x4C2537A: __GI_strchr (mc_replace_strmem.c:144)
> ==10013== by 0x4DB218: read_in_kernel_config (kernel.c:6755)
> ==10013== by 0x45D82B: main_loop (main.c:552)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> WARNING: cannot determine how modules are linked
> WARNING: no kernel module access
>
> ==10013== Invalid write of size 1
> ==10013== at 0x4C26A88: memset (mc_replace_strmem.c:602)
> ==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
> ==10013== by 0x473D3F: readmem (memory.c:1842)
> ==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
> ==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
> ==10013== by 0x45D871: main_loop (main.c:563)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== Address 0x5b183e0 is 0 bytes after a block of size 0 alloc'd
> ==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
> ==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
> ==10013== by 0x45D83A: main_loop (main.c:554)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Invalid write of size 1
> ==10013== at 0x4C26A8C: memset (mc_replace_strmem.c:602)
> ==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
> ==10013== by 0x473D3F: readmem (memory.c:1842)
> ==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
> ==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
> ==10013== by 0x45D871: main_loop (main.c:563)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== Address 0x5b183e1 is 1 bytes after a block of size 0 alloc'd
> ==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
> ==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
> ==10013== by 0x45D83A: main_loop (main.c:554)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Invalid write of size 1
> ==10013== at 0x4C26A94: memset (mc_replace_strmem.c:602)
> ==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
> ==10013== by 0x473D3F: readmem (memory.c:1842)
> ==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
> ==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
> ==10013== by 0x45D871: main_loop (main.c:563)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== Address 0x5b183e2 is 2 bytes after a block of size 0 alloc'd
> ==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
> ==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
> ==10013== by 0x45D83A: main_loop (main.c:554)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Invalid write of size 1
> ==10013== at 0x4C26A99: memset (mc_replace_strmem.c:602)
> ==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
> ==10013== by 0x473D3F: readmem (memory.c:1842)
> ==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
> ==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
> ==10013== by 0x45D871: main_loop (main.c:563)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== Address 0x5b183e3 is 3 bytes after a block of size 0 alloc'd
> ==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
> ==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
> ==10013== by 0x45D83A: main_loop (main.c:554)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> ==10013== Invalid write of size 1
> ==10013== at 0x4C26AA9: memset (mc_replace_strmem.c:602)
> ==10013== by 0x561F36: read_kvmdump (kvmdump.c:174)
> ==10013== by 0x473D3F: readmem (memory.c:1842)
> ==10013== by 0x4EC125: x86_64_post_init (x86_64.c:1062)
> ==10013== by 0x4E8E56: x86_64_init (x86_64.c:415)
> ==10013== by 0x45D871: main_loop (main.c:563)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== Address 0x5b183e8 is 8 bytes after a block of size 0 alloc'd
> ==10013== at 0x4C244E8: malloc (vg_replace_malloc.c:236)
> ==10013== by 0x4E8AF3: x86_64_init (x86_64.c:342)
> ==10013== by 0x45D83A: main_loop (main.c:554)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013== by 0x4DBA55: gdb_main_loop (gdb_interface.c:78)
> ==10013== by 0x45D78E: main (main.c:525)
> ==10013==
> KERNEL: vmlinux
> DUMPFILE: new.core
> CPUS: 1
> DATE: Fri Oct 1 21:26:15 2010
> UPTIME: 00:00:56
> LOAD AVERAGE: 0.14, 0.05, 0.02
> TASKS: 45
> NODENAME: fstest
> RELEASE: 2.6.35.6
> VERSION: #2 Wed Sep 29 15:05:49 EEST 2010
> MACHINE: x86_64 (2394 Mhz)
> ==10013== Source and destination overlap in strcpy(0x7fefffae2, 0x7fefffae4)
> ==10013== at 0x4C25918: strcpy (mc_replace_strmem.c:311)
> ==10013== by 0x46E9DE: pages_to_size (tools.c:4640)
> ==10013== by 0x49393F: get_memory_size (memory.c:11145)
> ==10013== by 0x4CFFC5: display_sys_stats (kernel.c:3927)
> ==10013== by 0x45D934: main_loop (main.c:581)
> ==10013== by 0x584413: current_interp_command_loop (interps.c:288)
> ==10013== by 0x584DD2: captured_command_loop (main.c:226)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585ECB: captured_main (main.c:924)
> ==10013== by 0x583E33: catch_errors (exceptions.c:520)
> ==10013== by 0x585F10: gdb_main (main.c:939)
> ==10013== by 0x585F65: gdb_main_entry (main.c:959)
> ==10013==
> MEMORY: 1 GB
> PANIC: ""
> PID: 0
> COMMAND: "swapper"
> TASK: ffffffff81a13040 [THREAD_INFO: ffffffff81a00000]
> CPU: 0
> STATE: TASK_RUNNING (ACTIVE)
> WARNING: panic task not found
>
> crash> q
> ==10013==
> ==10013== HEAP SUMMARY:
> ==10013== in use at exit: 53,444,536 bytes in 10,730 blocks
> ==10013== total heap usage: 396,156 allocs, 385,426 frees, 2,187,205,021 bytes allocated
> ==10013==
> ==10013== LEAK SUMMARY:
> ==10013== definitely lost: 6,414 bytes in 35 blocks
> ==10013== indirectly lost: 24 bytes in 1 blocks
> ==10013== possibly lost: 42,174,127 bytes in 8,022 blocks
> ==10013== still reachable: 11,263,971 bytes in 2,672 blocks
> ==10013== suppressed: 0 bytes in 0 blocks
> ==10013== Rerun with --leak-check=full to see details of leaked memory
> ==10013==
> ==10013== For counts of detected and suppressed errors, rerun with: -v
> ==10013== Use --track-origins=yes to see where uninitialised values come from
> ==10013== ERROR SUMMARY: 6710 errors from 21 contexts (suppressed: 4 from 4)
> ------------------------------------------------------------
>
> Sami
>
> ----- End forwarded message -----
>
> --
> Crash-utility mailing list
> Crash-utility@redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
 
Old 10-08-2010, 02:07 PM
Sami Liedes
 
Default crash on a KVM-generated dump

On Fri, Oct 08, 2010 at 09:31:02AM -0400, Dave Anderson wrote:
> I don't think that this is associated with KVM, but rather the kernel
> version used. It should be pretty easy to debug on your end, because it
> boils down to these initializations at the top of x86_64_per_cpu_init()
>
> irq_sp = per_cpu_symbol_search("per_cpu__irq_stack_union");
> cpu_sp = per_cpu_symbol_search("per_cpu__cpu_number");
>
> If it's a UP kernel, and if "irq_sp" does not get set, then isize would
> be left uninitialized.

It's a uniprocessor amd64 kernel. Neither irq_sp nor cpu_sp gets set.
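
That fits the heap corruption: with the interrupt stack size still 0, the
buffer ends up being allocated with malloc(0) and is then written through.
A minimal sketch of a guard against that, assuming a zero size should be
treated as "never initialized"; alloc_stack_buffer() is a hypothetical
helper of mine, not something in crash:

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of crash: allocate a zeroed stack buffer,
 * but refuse a zero size. malloc(0) may return a non-NULL pointer that
 * must not be written through, so a zero size is reported as failure.
 */
void *alloc_stack_buffer(size_t isize)
{
    if (isize == 0)
        return NULL;    /* size was never set; caller must handle this */
    return calloc(1, isize);
}
```

A caller would check for NULL and bail out or fall back to a known stack
size instead of writing into a zero-length block.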

I have

crash> sym irq_stack_union
ffffffff81a1c000 (D) irq_stack_union
crash> sym cpu_number
symbol not found: cpu_number

irq_stack_union is not accepted by per_cpu_symbol_search() because its
type is 'D' rather than 'V' and because its address is not between
__per_cpu_start and __per_cpu_end. Those two are equal here, so the
per-cpu range is empty; I don't know whether that is a problem in itself.

(gdb) b x86_64_per_cpu_init
Breakpoint 1 at 0x4eb49c: file x86_64.c, line 823.
(gdb) r
[...]
Breakpoint 1, x86_64_per_cpu_init () at x86_64.c:823
823 ms = machdep->machspec;
(gdb) n
825 irq_sp = per_cpu_symbol_search("per_cpu__irq_stack_union");
(gdb) s
per_cpu_symbol_search (symbol=0x8a46d7 "per_cpu__irq_stack_union") at symbols.c:4106
4106 if (STRNEQ(symbol, "per_cpu__")) {
(gdb) n
4107 if ((sp = symbol_search(symbol)))
(gdb)
4109 new = symbol + strlen("per_cpu__");
(gdb)
4110 if ((sp = symbol_search(new))) {
(gdb) print new
$1 = 0x8a46e0 "irq_stack_union"
(gdb) n
4111 if ((sp->type == 'V') ||
(gdb) l
4106 if (STRNEQ(symbol, "per_cpu__")) {
4107 if ((sp = symbol_search(symbol)))
4108 return sp;
4109 new = symbol + strlen("per_cpu__");
4110 if ((sp = symbol_search(new))) {
4111 if ((sp->type == 'V') ||
4112 ((sp->value >= st->__per_cpu_start) &&
4113 (sp->value < st->__per_cpu_end)))
4114 return sp;
4115 }
(gdb) print sp->type
$2 = 68 'D'
(gdb) print sp->value
$3 = 18446744071589445632
(gdb) p/x sp->value
$4 = 0xffffffff81a1c000
(gdb) p/x st->__per_cpu_start
$5 = 0xffffffff81ae7000
(gdb) p/x st->__per_cpu_end
$6 = 0xffffffff81ae7000
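
To spell out why the lookup fails, here is a minimal standalone sketch of
the condition listed above, to be fed with the values from the gdb
session. struct sym and per_cpu_candidate_ok() are names I made up for
illustration; they are not crash's own definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the symbol entry examined in gdb. */
struct sym {
    char type;          /* nm-style type letter, e.g. 'V' or 'D' */
    uint64_t value;     /* symbol address */
};

/*
 * Same shape as the condition listed at symbols.c:4111-4113: accept the
 * symbol if it is type 'V', or if its address lies in the half-open range
 * [per_cpu_start, per_cpu_end).
 */
bool per_cpu_candidate_ok(const struct sym *sp,
                          uint64_t per_cpu_start,
                          uint64_t per_cpu_end)
{
    return sp->type == 'V' ||
           (sp->value >= per_cpu_start && sp->value < per_cpu_end);
}
```

With type 'D', value 0xffffffff81a1c000 and both bounds equal to
0xffffffff81ae7000, the range is empty and the check returns false, so
the symbol is rejected.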

Sami
 
Old 10-08-2010, 02:21 PM
Dave Anderson
 
Default crash on a KVM-generated dump

----- "Sami Liedes" <sliedes@cc.hut.fi> wrote:

> On Fri, Oct 08, 2010 at 09:31:02AM -0400, Dave Anderson wrote:
> > I don't think that this is associated with KVM, but rather the kernel
> > version used. It should be pretty easy to debug on your end, because it
> > boils down to these initializations at the top of x86_64_per_cpu_init()
> >
> > irq_sp = per_cpu_symbol_search("per_cpu__irq_stack_union");
> > cpu_sp = per_cpu_symbol_search("per_cpu__cpu_number");
> >
> > If it's a UP kernel, and if "irq_sp" does not get set, then isize would
> > be left uninitialized.
>
> It's a uniprocessor amd64 kernel. Neither irq_sp nor cpu_sp gets set.
>
> I have
>
> crash> sym irq_stack_union
> ffffffff81a1c000 (D) irq_stack_union
> crash> sym cpu_number
> symbol not found: cpu_number
>
> irq_stack_union is not accepted by per_cpu_symbol_search() because its
> type is 'D' rather than 'V' and because its address is not between
> __per_cpu_start and __per_cpu_end. Those two are equal here, so the
> per-cpu range is empty; I don't know whether that is a problem in itself.

Try the attached patch.

Dave

 
Old 10-08-2010, 03:01 PM
Sami Liedes
 
Default crash on a KVM-generated dump

On Fri, Oct 08, 2010 at 10:21:30AM -0400, Dave Anderson wrote:
> Try the attached patch.

Yup, that seems to fix the problem.

FWIW, I also added support for the "slirp" section in some
qemu-produced qcow2 images I had. I didn't read qemu source to
determine whether the section size is constant, so it might not be
correct; however the attached patch works for me in this one case.

Sami
 
Old 10-08-2010, 03:07 PM
Dave Anderson
 
Default crash on a KVM-generated dump

----- "Sami Liedes" <sliedes@cc.hut.fi> wrote:

> On Fri, Oct 08, 2010 at 10:21:30AM -0400, Dave Anderson wrote:
> > Try the attached patch.
>
> Yup, that seems to fix the problem.
>
> FWIW, I also added support for the "slirp" section in some
> qemu-produced qcow2 images I had. I didn't read qemu source to
> determine whether the section size is constant, so it might not be
> correct; however the attached patch works for me in this one case.
>
> Sami

How did you come up with the "131" size?

Dave


 
Old 10-08-2010, 03:12 PM
Sami Liedes
 
Default crash on a KVM-generated dump

On Fri, Oct 08, 2010 at 11:07:10AM -0400, Dave Anderson wrote:
> How did you come up with the "131" size?

Just by adding debugging prints and inspecting the qcow2 image. So it
may be very much incorrect, except in this one case.
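
As a rough illustration of what treating the section as fixed-size
amounts to (this is not the attached patch; the table and function names
are hypothetical), the size is simply looked up by section name and
anything unknown is rejected:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a device section name and its assumed size. */
struct fixed_section {
    const char *name;
    long size;
};

/*
 * The 131 for "slirp" is the value observed in one qcow2 image, not a
 * number taken from the qemu sources.
 */
static const struct fixed_section fixed_sections[] = {
    { "slirp", 131 },
};

/* Return the number of bytes to skip for a known section, or -1 if unknown. */
long fixed_section_size(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(fixed_sections) / sizeof(fixed_sections[0]); i++) {
        if (strcmp(fixed_sections[i].name, name) == 0)
            return fixed_sections[i].size;
    }
    return -1;
}
```

If the real section turns out to be variable-length, a constant like
this will misalign everything that follows it in the file.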

Sami
 
Old 10-08-2010, 03:26 PM
Dave Anderson
 
Default crash on a KVM-generated dump

----- "Sami Liedes" <sliedes@cc.hut.fi> wrote:

> On Fri, Oct 08, 2010 at 11:07:10AM -0400, Dave Anderson wrote:
> > How did you come up with the "131" size?
>
> Just by adding debugging prints and inspecting the qcow2 image. So it
> may be very much incorrect, except in this one case.

Damn -- this "borrowed usage" of the savevm format for virsh dump is
really getting to be a pain in the ass...

Looking at the qemu-kvm sources, it's not obvious to me what the size
of the "slirp" device would be in the dumpfile. And apparently
Red Hat kernels don't use that device or somebody else would have
bumped into it, but I'll check with Paolo Bonzini to verify the number.

Thanks,
Dave


 
Old 10-08-2010, 03:29 PM
Sami Liedes
 
Default crash on a KVM-generated dump

On Fri, Oct 08, 2010 at 11:26:35AM -0400, Dave Anderson wrote:
> Looking at the qemu-kvm sources, it's not obvious to me what the size
> of the "slirp" device would be in the dumpfile. And apparently
> Red Hat kernels don't use that device or somebody else would have
> bumped into it, but I'll check with Paolo Bonzini to verify the number.

I actually ran into it with KVM under virsh. The section disappears if
kvm is run without the -net user option.

Sami
--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
Old 10-08-2010, 06:48 PM
Dave Anderson
 
Default crash on a KVM-generated dump

----- "Sami Liedes" <sliedes@cc.hut.fi> wrote:

> On Fri, Oct 08, 2010 at 11:26:35AM -0400, Dave Anderson wrote:
> > Looking at the qemu-kvm sources, it's not obvious to me what the size
> > of the "slirp" device would be in the dumpfile. And apparently
> > Red Hat kernels don't use that device or somebody else would have
> > bumped into it, but I'll check with Paolo Bonzini to verify the number.
>
> I actually ran into it with KVM under virsh. The section disappears if
> kvm is run without the -net user option.
>
> Sami

Can you send me the -d1 output from that dumpfile session with
your slirp-patch applied? Like this:

# crash -d1 vmlinux dumpfile > /tmp/junk
q
#

Thanks,
Dave
 
