03-28-2008, 12:56 PM
Chandru

crash aborts with cannot determine idle task

While running crash-4.0-6.1 on a vmcore, crash aborts during startup with:

--------
crash: cannot determine idle task addresses from init_tasks[] or runqueues[]

crash: cannot resolve "init_task_union"
--------

The kernel is later than 2.6.18. The changelog at
http://people.redhat.com/anderson/crash.changelog.html mentions that
this was possibly fixed in version 4.0-3.1. Could you please point
me to the patch that fixed this problem?


thanks,
Chandru

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
03-28-2008, 03:24 PM
Dave Anderson

> While running crash-4.0-6.1 on a vmcore , crash is aborting with
>
> --------
> crash: cannot determine idle task addresses from init_tasks[] or runqueues[]
>
> crash: cannot resolve "init_task_union"
> -------
>
>
> during startup. The kernel is later than 2.6.18 . The changelog
> http://people.redhat.com/anderson/crash.changelog.html mentions that this
> is possibly fixed in version 4.0-3.1 . Hence could you pls point me to the
> patch that fixed this problem.
>
> thanks,
> Chandru

That particular two-year-old patch simply recognized and dealt with the kernel
name change from "struct runqueue" to "struct rq":

--- kernel.c 2 Aug 2006 14:34:35 -0000 1.140
+++ kernel.c 2 Aug 2006 18:35:31 -0000 1.141
@@ -55,6 +55,7 @@
         int i;
         char *p1, *p2, buf[BUFSIZE];
         struct syment *sp1, *sp2;
+        char *rqstruct;
 
         if (pc->flags & KERNEL_DEBUG_QUERY)
                 return;
@@ -158,7 +159,15 @@
                         &kt->__per_cpu_offset[0]);
                 kt->flags |= PER_CPU_OFF;
         }
-        MEMBER_OFFSET_INIT(runqueue_cpu, "runqueue", "cpu");
+        if (STRUCT_EXISTS("runqueue"))
+                rqstruct = "runqueue";
+        else if (STRUCT_EXISTS("rq"))
+                rqstruct = "rq";
+
+        MEMBER_OFFSET_INIT(runqueue_cpu, rqstruct, "cpu");
+        /*
+         * 'cpu' does not exist in 'struct rq'.
+         */
         if (VALID_MEMBER(runqueue_cpu) &&
             (get_array_length("runqueue.cpu", NULL, 0) > 0)) {
                 MEMBER_OFFSET_INIT(cpu_s_curr, "cpu_s", "curr");
@@ -183,17 +192,17 @@
                         "runq_siblings: %d: __cpu_idx and __rq_idx arrays don't exist?\n",
                         kt->runq_siblings);
         } else {
-                MEMBER_OFFSET_INIT(runqueue_idle, "runqueue", "idle");
-                MEMBER_OFFSET_INIT(runqueue_curr, "runqueue", "curr");
+                MEMBER_OFFSET_INIT(runqueue_idle, rqstruct, "idle");
+                MEMBER_OFFSET_INIT(runqueue_curr, rqstruct, "curr");
                 ASSIGN_OFFSET(runqueue_cpu) = INVALID_OFFSET;
         }
-        MEMBER_OFFSET_INIT(runqueue_active, "runqueue", "active");
-        MEMBER_OFFSET_INIT(runqueue_expired, "runqueue", "expired");
-        MEMBER_OFFSET_INIT(runqueue_arrays, "runqueue", "arrays");
+        MEMBER_OFFSET_INIT(runqueue_active, rqstruct, "active");
+        MEMBER_OFFSET_INIT(runqueue_expired, rqstruct, "expired");
+        MEMBER_OFFSET_INIT(runqueue_arrays, rqstruct, "arrays");
         MEMBER_OFFSET_INIT(prio_array_queue, "prio_array", "queue");
         MEMBER_OFFSET_INIT(prio_array_nr_active, "prio_array",
                 "nr_active");
-        STRUCT_SIZE_INIT(runqueue, "runqueue");
+        STRUCT_SIZE_INIT(runqueue, rqstruct);
         STRUCT_SIZE_INIT(prio_array, "prio_array");
 
         /*

So that patch was required for 2.6.18.
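
The name-selection step in that patch can be sketched on its own. This is a minimal, self-contained illustration and not crash's code: struct_in_list() is a hypothetical stand-in for STRUCT_EXISTS(), and the "debug data" is just a list of names. In the hunk as quoted, rqstruct is only assigned when one of the two names is found, so the sketch returns NULL to make the "neither exists" case explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for crash's STRUCT_EXISTS(): reports whether a
 * struct name appears in a NULL-terminated list of known names.
 */
int struct_in_list(const char *const *known, const char *name)
{
    assert(known != NULL && name != NULL);
    for (size_t i = 0; known[i] != NULL; i++)
        if (strcmp(known[i], name) == 0)
            return 1;
    return 0;
}

/*
 * Return the first candidate name that exists, or NULL if none does.
 * Candidates are tried in order, so older names can be listed first.
 */
const char *pick_struct_name(const char *const *known,
                             const char *const *candidates)
{
    assert(candidates != NULL);
    for (size_t i = 0; candidates[i] != NULL; i++)
        if (struct_in_list(known, candidates[i]))
            return candidates[i];
    return NULL;
}
```

Called with the candidates { "runqueue", "rq", NULL }, this picks "runqueue" for a kernel that still has the old name and "rq" for one that only has the new name.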

When you say that the "kernel is later than 2.6.18", well, that doesn't
help me much.

Look at the crash function get_idle_threads() in task.c, which is where
you're failing. It runs through the history of the symbols that Linux
has used over the years for the run queues. For the most recent kernels,
it looks for the "per_cpu__runqueues" symbol. At least on 2.6.25-rc2,
the kernel still defines them in kernel/sched.c like this:

static DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);

So if you do an "nm -Bn vmlinux | grep runqueues", you should see:

# nm -Bn vmlinux-2.6.25-rc1-ext4-1 | grep runqueues
ffffffff8082b700 d per_cpu__runqueues
#

I'm guessing that's not the problem -- so presuming that the symbol *does*
exist, find out why it's failing to increment "cnt" in this part of
get_idle_threads():

        if (symbol_exists("per_cpu__runqueues") &&
            VALID_MEMBER(runqueue_idle)) {
                runqbuf = GETBUF(SIZE(runqueue));
                for (i = 0; i < nr_cpus; i++) {
                        if ((kt->flags & SMP) && (kt->flags & PER_CPU_OFF)) {
                                runq = symbol_value("per_cpu__runqueues") +
                                        kt->__per_cpu_offset[i];
                        } else
                                runq = symbol_value("per_cpu__runqueues");

                        readmem(runq, KVADDR, runqbuf,
                                SIZE(runqueue), "runqueues entry (per_cpu)",
                                FAULT_ON_ERROR);
                        tasklist[i] = ULONG(runqbuf + OFFSET(runqueue_idle));
                        if (IS_KVADDR(tasklist[i]))
                                cnt++;
                }
        }

Determine whether it even makes it to the inner for loop, whether
the pre-determined nr_cpus value makes sense, whether the SMP flag
reflects whether the kernel was compiled for SMP, whether the PER_CPU_OFF
flag was set, what address was calculated, etc...
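
Those checks can be rehearsed outside crash with a small model of the loop. The sketch below is an illustration under stated assumptions, not crash code: runq_model, runq_address(), count_kernel_pointers() and the kvbase threshold are all hypothetical names, and the real IS_KVADDR() test is architecture-specific rather than a single comparison.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified model of the state the loop consults. */
struct runq_model {
    uint64_t runqueues_sym;          /* value of per_cpu__runqueues */
    const uint64_t *per_cpu_offset;  /* one offset per CPU */
    int nr_cpus;
    int smp;                         /* stands in for kt->flags & SMP */
    int per_cpu_off;                 /* stands in for kt->flags & PER_CPU_OFF */
};

/* Address the loop would read the run queue from for the given CPU. */
uint64_t runq_address(const struct runq_model *m, int cpu)
{
    assert(m != NULL && cpu >= 0 && cpu < m->nr_cpus);
    if (m->smp && m->per_cpu_off)
        return m->runqueues_sym + m->per_cpu_offset[cpu];
    return m->runqueues_sym;
}

/*
 * Count the idle pointers that look like kernel addresses, printing each
 * one. ASSUMPTION: "at or above kvbase" is used as the validity test here.
 */
int count_kernel_pointers(const uint64_t *idle, int nr_cpus, uint64_t kvbase)
{
    int cnt = 0;
    assert(idle != NULL);
    for (int i = 0; i < nr_cpus; i++) {
        int ok = idle[i] >= kvbase;
        printf("cpu %d: idle=0x%016" PRIx64 " %s\n",
               i, idle[i], ok ? "ok" : "not a kernel address");
        if (ok)
            cnt++;
    }
    return cnt;
}
```

Printing the computed address and the value read for every CPU shows at a glance whether the flags, the per-CPU offsets, or the data at the address are what is off.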

Dave
 
04-02-2008, 04:28 PM
Chandru

> Look at the crash function get_idle_threads() in task.c, which is where
> you're failing. It runs through the history of the symbols that Linux
> has used over the years for the run queues. For the most recent kernels,
> it looks for the "per_cpu__runqueues" symbol. At least on 2.6.25-rc2,
> the kernel still defines them in kernel/sched.c like this:
>
> static DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
>
> So if you do an "nm -Bn vmlinux | grep runqueues", you should see:
>
> # nm -Bn vmlinux-2.6.25-rc1-ext4-1 | grep runqueues
> ffffffff8082b700 d per_cpu__runqueues
> #
>
> I'm guessing that's not the problem -- so presuming that the symbol *does*
> exist, find out why it's failing to increment "cnt" in this part of
> get_idle_threads():
>
>         if (symbol_exists("per_cpu__runqueues") &&
>             VALID_MEMBER(runqueue_idle)) {
>                 runqbuf = GETBUF(SIZE(runqueue));
>                 for (i = 0; i < nr_cpus; i++) {
>                         if ((kt->flags & SMP) && (kt->flags & PER_CPU_OFF)) {
>                                 runq = symbol_value("per_cpu__runqueues") +
>                                         kt->__per_cpu_offset[i];
>                         } else
>                                 runq = symbol_value("per_cpu__runqueues");
>
>                         readmem(runq, KVADDR, runqbuf,
>                                 SIZE(runqueue), "runqueues entry (per_cpu)",
>                                 FAULT_ON_ERROR);
>                         tasklist[i] = ULONG(runqbuf + OFFSET(runqueue_idle));
>                         if (IS_KVADDR(tasklist[i]))
>                                 cnt++;
>                 }
>         }
>
> Determine whether it even makes it to the inner for loop, whether
> the pre-determined nr_cpus value makes sense, whether the SMP flag
> reflects whether the kernel was compiled for SMP, whether the PER_CPU_OFF
> flag was set, what address was calculated, etc...
>
> Dave

Thanks for the reply, Dave. The code does make it to the inner for loop, but
the condition if (IS_KVADDR(tasklist[i])) fails, which is why 'cnt' doesn't
get incremented. tasklist[i] ends up with the value 0x3d60657870722024.


I ran gdb on the vmcore file and printed the memory contents:

(gdb) print per_cpu__runqueues
$1 = {lock = {raw_lock = {slock = 1431524419}}, nr_running = 5283422954284598606,
  raw_weighted_load = 5064663116585906736,
  cpu_load = {2316051155752670036, 5929356451801411872, 2613857225664584019},
  nr_switches = 5644502509443686462, nr_uninterruptible = 2316072106569976142,
  expired_timestamp = 5142904381182533935, timestamp_last_tick = 7235439831918129227,
  curr = 0x5f66696c650a5243,
  idle = 0x3d60657870722024,   <<<-----
  prev_mm = 0x5243202b20243f60, active = 0xa247b4155535443, expired = 0x5352434449527d2f,



Does this mean that the kernel data was corrupted when the vmcore was
collected?


Thanks,
Chandru
