Old 12-11-2009, 03:39 PM
Dave Anderson
 
Request for ppc64 help from IBM

Somewhere between the RHEL5 (2.6.18-based) and RHEL6 timeframes,
the ppc64 architecture started using a virtual memmap scheme
for the arrays of page structures used to describe/handle
each physical page of memory.

In RHEL5, the page structures in the memmap array were unity-mapped
(i.e., the physical address is or'd with c000000000000000), as
"kmem -n" shows below in the sparsemem data breakdown under MEM_MAP:

crash> kmem -n
... [ snip ] ...
NR SECTION CODED_MEM_MAP MEM_MAP PFN
0 c000000000750000 c000000000760000 c000000000760000 0
1 c000000000750008 c000000000760000 c000000000763800 256
2 c000000000750010 c000000000760000 c000000000767000 512
3 c000000000750018 c000000000760000 c00000000076a800 768
4 c000000000750020 c000000000760000 c00000000076e000 1024
5 c000000000750028 c000000000760000 c000000000771800 1280
6 c000000000750030 c000000000760000 c000000000775000 1536
7 c000000000750038 c000000000760000 c000000000778800 1792
8 c000000000750040 c000000000760000 c00000000077c000 2048
9 c000000000750048 c000000000760000 c00000000077f800 2304
10 c000000000750050 c000000000760000 c000000000783000 2560
11 c000000000750058 c000000000760000 c000000000786800 2816
12 c000000000750060 c000000000760000 c00000000078a000 3072
...

also shown via the memmap page structure listing displayed by
"kmem -p":

crash> kmem -p
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
c000000000760000 0 0 0 1 400
c000000000760038 10000 0 0 1 400
c000000000760070 20000 0 0 1 400
c0000000007600a8 30000 0 0 1 400
c0000000007600e0 40000 0 0 1 400
c000000000760118 50000 0 0 1 400
c000000000760150 60000 0 0 1 400
c000000000760188 70000 0 0 1 400
c0000000007601c0 80000 0 0 1 400
c0000000007601f8 90000 0 0 1 400
...

In RHEL6 (2.6.31-38.el6) the memmap page array is apparently
virtually mapped -- using a heretofore-unseen virtual address
range starting at f000000000000000:

crash> kmem -n
... [ snip ] ...
NR SECTION CODED_MEM_MAP MEM_MAP PFN
0 c000000002160000 f000000000000000 f000000000000000 0
1 c000000002160020 f000000000000000 f000000000006800 256
2 c000000002160040 f000000000000000 f00000000000d000 512
3 c000000002160060 f000000000000000 f000000000013800 768
4 c000000002160080 f000000000000000 f00000000001a000 1024
5 c0000000021600a0 f000000000000000 f000000000020800 1280
6 c0000000021600c0 f000000000000000 f000000000027000 1536
7 c0000000021600e0 f000000000000000 f00000000002d800 1792
8 c000000002160100 f000000000000000 f000000000034000 2048
9 c000000002160120 f000000000000000 f00000000003a800 2304
10 c000000002160140 f000000000000000 f000000000041000 2560
... [ snip ] ...
crash> kmem -p
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
f000000000000000 0 0 0 0 0
f000000000000068 10000 0 0 0 0
f0000000000000d0 20000 0 0 0 0
f000000000000138 30000 0 0 0 0
f0000000000001a0 40000 0 0 0 0
f000000000000208 50000 0 -4611686016392006416 0 0
f000000000000270 60000 0 0 0 0
f0000000000002d8 70000 0 0 0 0
f000000000000340 80000 0 0 0 0
f0000000000003a8 90000 0 -4611686016730798344 0 0
f000000000000410 a0000 0 0 0 0
f000000000000478 b0000 0 0 0 0
f0000000000004e0 c0000 0 0 0 c0000000651534e0
f000000000000548 d0000 0 0 0 0
...

But as can be seen in the "kmem -p" output, and when using other
commands that actually read the data in the page structures, the
data read is either bogus, or the readmem() of the address simply fails
the virtual address translation and reports that the page is not mapped.

Because the page structures' virtual address is not unity-mapped,
the page address gets translated via page table walk-through in the
same manner as vmalloc()'d addresses. In the ppc64 architecture,
the vmalloc range starts at d000000000000000:

crash> mach
...
KERNEL VIRTUAL BASE: c000000000000000
KERNEL VMALLOC BASE: d000000000000000
...

Since the ppc64 virtual-to-physical translation of these
f000000000000000-based addresses returns either a bogus physical
address or fails entirely, it in turn causes bizarre errors in
crash commands that actually read the contents of page
structures -- such as "kmem -s", where slub data is
stored in the page structure.

So my speculation (guess?) is that the ppc64.c ppc64_vtop()
function needs updating to properly translate these addresses.

Since the ppc64 stuff in the crash utility was written by, and
has been maintained by IBM (and since I am ppc64-challenged),
can you guys take a look at what needs to be done?

Thanks,
Dave



--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
Old 12-15-2009, 03:08 PM
Dave Anderson
 
Request for ppc64 help from IBM

----- "Dave Anderson" <anderson@redhat.com> wrote:

> Somewhere between the RHEL5 (2.6.18-based) and RHEL6 timeframe,
> the ppc64 architecture has started using a virtual memmap scheme
> for the arrays of page structures used to describe/handle
> each physical page of memory.

... [ snip ] ...

> So my speculation (guess?) is that the ppc64.c ppc64_vtop()
> function needs updating to properly translate these addresses.
>
> Since the ppc64 stuff in the crash utility was written by, and
> has been maintained by IBM (and since I am ppc64-challenged),
> can you guys take a look at what needs to be done?

[ sound of crickets... ]

Well that request apparently fell on deaf ears...

Here's my understanding of the situation.

In 2.6.26 the ppc64 architecture started using a new kernel virtual
memory region to map the kernel's page structure array(s), so that
now there are three kernel virtual memory regions:

KERNEL 0xc000000000000000
VMALLOC 0xd000000000000000
VMEMMAP 0xf000000000000000

The KERNEL region is the unity-mapped region, where the underlying
physical address can be determined by manipulating the virtual address
itself.

The VMALLOC region requires a page-table walk-through to find
the underlying physical address in a PTE.

The new VMEMMAP region is mapped in ppc64 firmware, where a
physical address of a given size is mapped to a VMEMMAP virtual
address. So for example, the page structure for physical page 0
is at VMEMMAP address 0xf000000000000000, the page structure for
physical page 1 is at f000000000000068, and so on. Once mapped in
the firmware TLB (?), the virtual-to-physical translation is done
automatically while running in kernel mode.

The problem is that the physical-to-vmemmap address/size mapping
information is not stored in the kernel proper, so there is
no way for the crash utility to make the translation. That
being the case, any crash command that needs to read the contents
of any page structure will fail.

The kernel mapping is performed here in 2.6.26 through 2.6.31:

int __meminit vmemmap_populate(struct page *start_page,
                               unsigned long nr_pages, int node)
{
        unsigned long start = (unsigned long)start_page;
        unsigned long end = (unsigned long)(start_page + nr_pages);
        unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;

        /* Align to the page size of the linear mapping. */
        start = _ALIGN_DOWN(start, page_size);

        for (; start < end; start += page_size) {
                int mapped;
                void *p;

                if (vmemmap_populated(start, page_size))
                        continue;

                p = vmemmap_alloc_block(page_size, node);
                if (!p)
                        return -ENOMEM;

                pr_debug("vmemmap %08lx allocated at %p, physical %08lx.\n",
                         start, p, __pa(p));

                mapped = htab_bolt_mapping(start, start + page_size, __pa(p),
                                           pgprot_val(PAGE_KERNEL),
                                           mmu_vmemmap_psize, mmu_kernel_ssize);
                BUG_ON(mapped < 0);
        }

        return 0;
}

So if the pr_debug() statement is turned on, it shows on my test system:

vmemmap f000000000000000 allocated at c000000003000000, physical 03000000

This would make for an extremely simple virtual-to-physical translation
for the crash utility, but note that neither the unity-mapped virtual address
of 0xc000000003000000 nor its associated physical address of 0x3000000 is
stored anywhere, since "p" is a stack variable. The htab_bolt_mapping()
function does not store the mapping information in the kernel either; it
just uses temporary stack variables before calling the ppc_md.hpte_insert()
function, which eventually leads to a machine-dependent (directly-to-firmware)
function.

So unless I'm missing something, the VTOP address/size particulars are not
stored anywhere along the vmemmap call-chain -- say, for example, in a
/proc/iomem-like "resource" data structure.

(FWIW, I note that in 2.6.32, CONFIG_PPC_BOOK3E arches still use the normal page
tables to map the memmap array(s). I don't know whether BOOK3E arch is the
most common or not...)

In any case, not being able to read page structure contents has a
significant effect on the crash utility. About the only thing that can
be done for these kernels is to print a warning during initialization;
any command that attempts to read a page structure will subsequently fail:

# crash vmlinux vmcore

crash 4.1.2p1
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "powerpc64-unknown-linux-gnu"...

WARNING: cannot translate vmemmap kernel virtual addresses:
commands requiring page structure contents will fail

KERNEL: vmlinux
DUMPFILE: vmcore
CPUS: 2
DATE: Thu Dec 10 05:40:35 2009
UPTIME: 21:44:59
LOAD AVERAGE: 0.11, 0.03, 0.01
TASKS: 196
NODENAME: ibm-js20-04.lab.bos.redhat.com
RELEASE: 2.6.31-38.el6.ppc64
VERSION: #1 SMP Sun Nov 22 08:15:30 EST 2009
MACHINE: ppc64 (unknown Mhz)
MEMORY: 2 GB
PANIC: "Oops: Kernel access of bad area, sig: 11 [#1]" (check log for details)
PID: 10656
COMMAND: "runtest.sh"
TASK: c000000072156420 [THREAD_INFO: c000000072058000]
CPU: 0
STATE: TASK_RUNNING (PANIC)

crash> kmem -i
kmem: cannot translate vmemmap address: f000000000000000
crash> kmem -p
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
kmem: cannot translate vmemmap address: f000000000000000
crash> kmem -s
CACHE NAME OBJSIZE ALLOCATED TOTAL SLABS SSIZE
kmem: cannot translate vmemmap address: f00000000030db44
crash>

Can any of the IBM engineers on this list (or any ppc64 user)
confirm my findings? Maybe I'm missing something, but I don't
see it.

And if you agree, perhaps you can work on an upstream solution to
store the vmemmap-to-physical data information?

Dave

 
