11-10-2009, 09:26 PM
Bob Montgomery

invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"

I have a dump from a 2.6.31-based x86_64 system where the number of
"possible" cpus equals the system's NR_CPUS (32).

On that system, the __per_cpu_offset table in the kernel consists of 32
valid offset pointers.

When crash loads this table into its __per_cpu_offset[NR_CPUS=4096]
array in struct kernel_table, it knows the length of the kernel's array
(32*sizeof(long)), and copies the 32 pointers, leaving the rest of its
(much longer) array full of 0x0s.

(This happens in kernel.c)

    if (symbol_exists("__per_cpu_offset")) {
        if (LKCD_KERNTYPES())
            i = get_cpus_possible();
        else
            i = get_array_length("__per_cpu_offset", NULL, 0);
        get_symbol_data("__per_cpu_offset",
            sizeof(long)*((i && (i <= NR_CPUS)) ? i : NR_CPUS),
            &kt->__per_cpu_offset[0]);
        kt->flags |= PER_CPU_OFF;
    }
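
A minimal sketch of the load step described above, not crash's actual code. The struct, the function name and the offset values are hypothetical stand-ins; only the sizes (32 kernel entries, 4096 crash-side entries) come from the post. It shows that after the copy every crash-side entry past the kernel's length is still 0x0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CRASH_NR_CPUS 4096  /* crash's own x86_64 NR_CPUS, per the post */

/* Hypothetical stand-in for the one field of struct kernel_table used here. */
struct mock_kernel_table {
    unsigned long per_cpu_offset[CRASH_NR_CPUS];
};

/*
 * Model of the load step: copy the kernel's kernel_len offsets into the
 * larger crash-side array.  The post describes the rest of crash's array
 * as full of 0x0s; the memset models that.  Returns the entries copied.
 */
size_t load_per_cpu_offsets(struct mock_kernel_table *kt,
                            const unsigned long *kernel_table,
                            size_t kernel_len)
{
    assert(kernel_len > 0 && kernel_len <= CRASH_NR_CPUS);
    memset(kt, 0, sizeof(*kt));
    memcpy(kt->per_cpu_offset, kernel_table,
           kernel_len * sizeof(unsigned long));
    return kernel_len;
}
```

Calling it with a 32-entry table leaves per_cpu_offset[32] through per_cpu_offset[4095] at zero, which is the state the later loops have to cope with.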

Later, in a couple of places, crash checks for the maximum valid
__per_cpu_offset by reading the cpu_number value out of each per_cpu
area and comparing it to the expected number until the comparison fails.
(Remember NR_CPUS in crash is much larger than the kernel's NR_CPUS, and
that's OK).

From x86_64.c:

    for (i = cpus = 0; i < NR_CPUS; i++) {
        readmem(symbol_value("per_cpu__cpu_number") +
            kt->__per_cpu_offset[i], KVADDR,
            &cpunumber, sizeof(int),
            "cpu number (per_cpu)", FAULT_ON_ERROR);
        if (cpunumber != cpus)
            break;
        cpus++;
    }

This works well when the kernel's array has fewer real per_cpu_offsets
than its own NR_CPUS, because the kernel preloads the whole array with a
pointer (BOOT_PERCPU_OFFSET). When this loop runs past the real
per_cpu_offset pointers and tries to use BOOT_PERCPU_OFFSET, it reads a
bogus value for cpunumber and terminates.

But when the kernel's table is full of valid per_cpu_offset pointers,
this loop continues off the end of that into the part of crash's
__per_cpu_offset array that has the 0x0 initial values, and dies with:

crash: invalid kernel virtual address: cc08 type: "cpu number (per_cpu)"

The cc08 comes from the symbol_value of per_cpu__cpu_number:
000000000000cc08 D per_cpu__cpu_number
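
To make the two cases concrete, here is a small simulation of the counting loop against a mock memory. It is a sketch under assumptions, not crash code: AREA_BASE, AREA_STRIDE, BOOT_OFFSET, the value stored in the boot area, and all function names are invented for illustration. Only the symbol value cc08 and the shape of the loop come from the post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CPU_NUMBER_SYM 0xcc08UL    /* per_cpu__cpu_number, from the post */
#define AREA_BASE      0x100000UL  /* hypothetical offset of cpu 0's area */
#define AREA_STRIDE    0x10000UL   /* hypothetical spacing between areas */
#define BOOT_OFFSET    0x900000UL  /* hypothetical BOOT_PERCPU_OFFSET */

struct scan_result {
    int cpus;                  /* cpus counted before the loop stopped */
    bool faulted;              /* true if a read hit an invalid address */
    unsigned long fault_addr;  /* the address that failed, if any */
};

/*
 * Mock memory read.  Readable locations are the cpu_number slot of each of
 * the ncpus real per-cpu areas (holding that cpu's number) and the slot in
 * the boot area (assumed here to hold 0).  Everything else is invalid.
 */
static bool mock_read_cpu_number(unsigned long addr, size_t ncpus, int *out)
{
    if (addr == BOOT_OFFSET + CPU_NUMBER_SYM) {
        *out = 0;
        return true;
    }
    if (addr < AREA_BASE + CPU_NUMBER_SYM)
        return false;
    unsigned long rel = addr - (AREA_BASE + CPU_NUMBER_SYM);
    if (rel % AREA_STRIDE != 0 || rel / AREA_STRIDE >= ncpus)
        return false;
    *out = (int)(rel / AREA_STRIDE);
    return true;
}

/* Same structure as the x86_64.c loop, with the fault recorded, not raised. */
struct scan_result scan_cpus(const unsigned long *offsets, size_t n_offsets,
                             size_t ncpus)
{
    struct scan_result r = { 0, false, 0 };

    assert(offsets != NULL);
    for (size_t i = 0; i < n_offsets; i++) {
        unsigned long addr = CPU_NUMBER_SYM + offsets[i];
        int cpunumber;

        if (!mock_read_cpu_number(addr, ncpus, &cpunumber)) {
            r.faulted = true;  /* crash would stop here with the error */
            r.fault_addr = addr;
            break;
        }
        if (cpunumber != r.cpus)
            break;
        r.cpus++;
    }
    return r;
}
```

With 4 real offsets followed by boot offsets, the scan stops cleanly at 4. With all 32 kernel entries valid and zeros after them, it counts 32 and then faults on address cc08, matching the error message.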

Bottom line: crash assumes the kernel's __per_cpu_offset array always ends
with a terminator (a pointer that leads to an invalid cpu_number), and
that assumption fails when every entry in the array is valid.

The included patch adds an additional loop termination so that crash
doesn't run off the end of what it loaded from the dump. It just checks
for a NULL 0x0 value in kt->__per_cpu_offset[i].
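
The patch itself is not reproduced in this archive, so the following is only a sketch of the idea under assumptions: a mock read function, invented AREA_BASE and AREA_STRIDE values, and a hypothetical function name. It shows the extra zero-offset test stopping the loop before any invalid address is read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CPU_NUMBER_SYM 0xcc08UL    /* per_cpu__cpu_number, from the post */
#define AREA_BASE      0x100000UL  /* hypothetical offset of cpu 0's area */
#define AREA_STRIDE    0x10000UL   /* hypothetical spacing between areas */

static int invalid_reads;  /* how many reads hit an invalid address */

/* Mock read: only the cpu_number slots of the ncpus real areas are valid. */
static bool mock_read_cpu_number(unsigned long addr, size_t ncpus, int *out)
{
    unsigned long first = AREA_BASE + CPU_NUMBER_SYM;

    if (addr < first || (addr - first) % AREA_STRIDE != 0 ||
        (addr - first) / AREA_STRIDE >= ncpus) {
        invalid_reads++;
        return false;
    }
    *out = (int)((addr - first) / AREA_STRIDE);
    return true;
}

int count_cpus_with_zero_check(const unsigned long *offsets,
                               size_t n_offsets, size_t ncpus)
{
    int cpus = 0;

    assert(offsets != NULL);
    invalid_reads = 0;
    for (size_t i = 0; i < n_offsets; i++) {
        int cpunumber;

        if (offsets[i] == 0)  /* added termination: nothing was loaded here */
            break;
        if (!mock_read_cpu_number(CPU_NUMBER_SYM + offsets[i], ncpus,
                                  &cpunumber))
            break;
        if (cpunumber != cpus)
            break;
        cpus++;
    }
    return cpus;
}
```

For a table with 32 valid entries followed by zeros, this returns 32 and never attempts a read at cc08.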

Bob Montgomery,
Working at HP






--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
11-11-2009, 05:24 PM
Bob Montgomery

On Wed, 2009-11-11 at 14:52 +0000, Dave Anderson wrote:
> ----- "Bob Montgomery" <bob.montgomery@hp.com> wrote:
>
> > I have a dump from a 2.6.31-based x86_64 system where the number of
> > "possible" cpus equals the system's NR_CPUS (32).
> > On that system, the __per_cpu_offset table in the kernel consists of 32
> > valid offset pointers.

> I have a similar-but-different fix queued for this, but instead of
> checking for a NULL kt->__per_cpu_offset[i] entry, it changes the
> readmem() call to RETURN_ON_ERROR|QUIET instead of FAULT_ON_ERROR
> like this:
>
>     if (!readmem(symbol_value("per_cpu__cpu_number") +
>         kt->__per_cpu_offset[i],
>         KVADDR, &cpunumber, sizeof(int),
>         "cpu number (per_cpu)", QUIET|RETURN_ON_ERROR))
>             break;

> That should prevent the failure you're seeing.

I did that first, and thought it was sort of cheating :-)
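
For comparison, a sketch of the RETURN_ON_ERROR approach against the same kind of mock memory. Everything here is an assumption made for illustration: the MOCK_* flag values, the fatal_raised flag that stands in for crash aborting on a FAULT_ON_ERROR read, and the area layout. It is not crash's readmem().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CPU_NUMBER_SYM 0xcc08UL    /* per_cpu__cpu_number, from the post */
#define AREA_BASE      0x100000UL  /* hypothetical offset of cpu 0's area */
#define AREA_STRIDE    0x10000UL   /* hypothetical spacing between areas */

/* Mock flags; the names echo crash's but the values are made up. */
enum { MOCK_FAULT_ON_ERROR = 1, MOCK_RETURN_ON_ERROR = 2, MOCK_QUIET = 4 };

static bool fatal_raised;  /* set when a failed read would be fatal */

static bool mock_readmem(unsigned long addr, size_t ncpus, int *out, int flags)
{
    unsigned long first = AREA_BASE + CPU_NUMBER_SYM;

    if (addr < first || (addr - first) % AREA_STRIDE != 0 ||
        (addr - first) / AREA_STRIDE >= ncpus) {
        if (flags & MOCK_FAULT_ON_ERROR)
            fatal_raised = true;
        return false;
    }
    *out = (int)((addr - first) / AREA_STRIDE);
    return true;
}

int count_cpus(const unsigned long *offsets, size_t n_offsets,
               size_t ncpus, int flags)
{
    int cpus = 0;

    assert(offsets != NULL);
    fatal_raised = false;
    for (size_t i = 0; i < n_offsets; i++) {
        int cpunumber;

        if (!mock_readmem(CPU_NUMBER_SYM + offsets[i], ncpus,
                          &cpunumber, flags))
            break;  /* with RETURN_ON_ERROR this is a clean loop exit */
        if (cpunumber != cpus)
            break;
        cpus++;
    }
    return cpus;
}
```

Run over a full 32-entry table, the FAULT_ON_ERROR variant sets fatal_raised when it reaches the first zero entry, while QUIET|RETURN_ON_ERROR simply returns 32.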


> But another question is in the (extremely) rare circumstance of a
> non-CONFIG_SMP kernel. In that case, the kt->__per_cpu_offset[] array
> would be all NULL, and the symbol_value("per_cpu__cpu_number")
> call would return the qualified unity-mapped address. So the
> virtual address calculation should work in x86_64_per_cpu_init(),
> and the loop wouldn't even be entered in x86_64_get_smp_cpus()
>
> That being said, I don't think I've seen a recent x86_64 kernel
> that was not compiled CONFIG_SMP, so I can't confirm that it's
> ever been tested.
>
> So for sanity's sake, maybe your patch should also be applied,
> but should also check if the "i" index is non-zero?

So like this?
+        if (i && (kt->__per_cpu_offset[i] == NULL))
+            break;

So it's always ok to try the readmem on the first element of
the array. And the RETURN_ON_ERROR would deal with something going
wrong with that, although that case would presumably be a real problem
with the dump, right? (cpus == 0)

Thanks,
Bob M.

 
11-11-2009, 09:32 PM
Bob Montgomery

On Wed, 2009-11-11 at 18:54 +0000, Dave Anderson wrote:

> > > But another question is in the (extremely) rare circumstance of a
> > > non-CONFIG_SMP kernel. In that case, the kt->__per_cpu_offset[] array
> > > would be all NULL, and the symbol_value("per_cpu__cpu_number")
> > > call would return the qualified unity-mapped address. So the
> > > virtual address calculation should work in x86_64_per_cpu_init(),
> > > and the loop wouldn't even be entered in x86_64_get_smp_cpus()
> > >
> > > That being said, I don't think I've seen a recent x86_64 kernel
> > > that was not compiled CONFIG_SMP, so I can't confirm that it's
> > > ever been tested.
> > >
> > > So for sanity's sake, maybe your patch should also be applied,
> > > but should also check if the "i" index is non-zero?

Now I'm thinking that test won't be needed for the non-CONFIG_SMP
kernel. If the array is full of 0x0s, the loop will compute the first
address as (0x0 + symbol_value("per_cpu__cpu_number")) and read a
cpunumber of 0. Then on the next iteration, it will calculate the very
same address again, and read the same cpunumber of 0. But now the test
is against cpus==1, so that test will fail and we'll drop out of the
loop, right?
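
The non-SMP reasoning above can be traced with a tiny simulation. This is a sketch under assumptions: UP_CPU_NUMBER_ADDR is an invented address standing in for whatever symbol_value("per_cpu__cpu_number") returns on such a kernel, and the mock read and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical address of cpu_number in a non-SMP kernel, where the symbol
 * is an ordinary variable rather than a small per-cpu offset. */
#define UP_CPU_NUMBER_ADDR 0x80a0cc08UL

static int reads_done;  /* number of read attempts made by the loop */

/* Mock read: only the single cpu_number variable is readable; it holds 0. */
static bool mock_read_cpu_number(unsigned long addr, int *out)
{
    reads_done++;
    if (addr != UP_CPU_NUMBER_ADDR)
        return false;
    *out = 0;
    return true;
}

/* The counting loop with no zero-offset check, run over all-zero offsets. */
int count_cpus_non_smp(const unsigned long *offsets, size_t n_offsets)
{
    int cpus = 0;

    assert(offsets != NULL);
    reads_done = 0;
    for (size_t i = 0; i < n_offsets; i++) {
        int cpunumber;

        if (!mock_read_cpu_number(UP_CPU_NUMBER_ADDR + offsets[i],
                                  &cpunumber))
            break;
        if (cpunumber != cpus)  /* second pass: 0 != 1, so stop */
            break;
        cpus++;
    }
    return cpus;
}
```

Over an all-zero offset array the loop makes exactly two reads of the same address and returns 1, which is the behavior argued for above.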

In the real SMP case, we'll still try to read the small offset (cc08)
like an address, but be spared any embarrassment by the
QUIET|RETURN_ON_ERROR fix.

At least that's how it looked when I tried to explain it at lunch :-)

Bob M.



 
11-18-2009, 07:13 PM
Bob Montgomery

On Thu, 2009-11-12 at 13:39 +0000, Dave Anderson wrote:
> ----- "Bob Montgomery" <bob.montgomery@hp.com> wrote:

>
> > In the real smp case, we'll still try to read the small offset (cc08)
> > like an address, but be spared any embarrassment by the
> > QUIET|RETURN_ON_ERROR fix.
>
> Just to be clear, I think that we agree that:
>
> (1) the QUIET|RETURN_ON_ERROR be applied in both functions,
> (2) the kt->__per_cpu_offset[] NULL-check should be completely dropped
> in x86_64_per_cpu_init(), and
> (3) the kt->__per_cpu_offset[] NULL-check should still be applied in
> x86_64_get_smp_cpus() since that loop pre-requires that it's SMP.

I think (3) makes it apparent what we're trying to prevent, but even
without the NULL-check, if we go ahead and access cc08, the
QUIET|RETURN_ON_ERROR fix alone should save us. Either way my problem
goes away :-)

Is the next version getting close, or do we need to patch 4.1.0
internally for a while?

Bob Montgomery

 
