ARM: crash registers might be read from the wrong physical address
These are the same lines in my case.
<readmem: c0d2af6c, KVADDR, "crash_notes", 4, (ROE), 85ba84c>
<read_kdump: addr: c0d2af6c paddr: 80f2af6c cnt: 4>
<readmem: f9fe0000, KVADDR, "note_buf_t", 560, (ROE), 85bac40> <--- !!
<readmem: c0004000, KVADDR, "pgd page", 16384, (FOE), 914e8d0>

I have never seen this problem before, so the behavior you see is exactly what I have seen before. However, with a fairly new kernel I did not get the correct crash_notes. The investigation led to the patch for the problem described in my previous mail.

I have not investigated whether any patch in newer kernels changes this behavior and, if so, where it comes from (it could be a patch by us). However, since the algorithm for reading crash_notes is wrong, as it depends on a variable that is not yet initialized, I think it should be corrected anyhow. I have tested my patch with both newer and older kernels and it works as intended.

Jan

Jan Karlsson
Senior Software Engineer
MIB * Sony Mobile Communications
Tel: +46703062174
sonymobile.com

-----Original Message-----
From: crash-utility-bounces@redhat.com [mailto:crash-utility-bounces@redhat.com] On Behalf Of Dave Anderson
Sent: Wednesday, 18 July 2012 15:13
To: Discussion list for crash utility usage, maintenance and development
Cc: Fänge, Thomas
Subject: Re: [Crash-utility] ARM: crash registers might be read from the wrong physical address

----- Original Message -----
> Hi Dave
>
> I found a problem in arm.c that arm_get_crash_notes() is called too
> early. This has never been a problem until now.
>
> arm_get_crash_notes() in arm.c
>   calls readmem(, KVADDR, )
>     which calls kvtop()
>       which calls machdep->kvtop that is arm_kvtop which uses
>       vt->vmalloc_start
> vt->vmalloc_start is initialized in vm_init
>
> From main_loop:
>
>   machdep_init(POST_GDB);
>   vm_init();
>   machdep_init(POST_VM);
>
> arm_get_crash_notes() is currently called in the POST_GDB section of
> machdep_init, but should be moved to the POST_VM section. I put the
> comment and the code just before:
>
>   if (init_unwind_tables()) {
>
> and then it works fine. Without this fix the crash registers might be
> read from the wrong physical address.
>
> Jan

Looking at the 2.6.38-based SMP ARM sample kernel I have, arm_get_crash_notes() does not make any readmem() calls of a vmalloc address, only unity-mapped calls:

$ crash -d7 vmlinux vmcore
...
<readmem: c0b04230, KVADDR, "crash_notes", 4, (ROE), 85be9e0>
<read_diskdump: addr: c0b04230 paddr: 80b04230 cnt: 4>
<readmem: c0d5194c, KVADDR, "note_buf_t", 180, (ROE), 85bede0>
<read_diskdump: addr: c0d5194c paddr: 80d5194c cnt: 180>
...

Have newer ARM kernels changed how percpu addresses are translated such that the notes_ptrs[] entries become vmalloc addresses here in arm_get_crash_notes()?

  if (symbol_exists("__per_cpu_offset")) {
          /* Add __per_cpu_offset for each cpu to form the pointer to the notes */
          for (i = 0; i<kt->cpus; i++)
                  notes_ptrs[i] = notes_ptrs[kt->cpus-1] + kt->__per_cpu_offset[i];
  }

Dave

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
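The ordering problem reported in this message can be illustrated with a small toy model. This is only a sketch and not the code in arm.c: every name prefixed with toy_ is hypothetical, the constants (page offset 0xc0000000, physical base 0x80000000, vmalloc start 0xf0000000) are assumptions chosen to resemble the debug output in the thread, and the model assumes that the translation routine treats every kernel address as unity-mapped for as long as the vmalloc start is still zero.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout constants, for illustration only. */
#define TOY_PAGE_OFFSET 0xc0000000u  /* start of the unity-mapped region */
#define TOY_PHYS_BASE   0x80000000u  /* physical address backing TOY_PAGE_OFFSET */

/* Stand-in for vt->vmalloc_start: zero until toy_vm_init() has run. */
static uint32_t toy_vmalloc_start;

enum toy_path { TOY_UNITY, TOY_PAGE_TABLES };

/* Stand-in for vm_init(): this is where the vmalloc boundary becomes known. */
static void toy_vm_init(void)
{
    toy_vmalloc_start = 0xf0000000u;  /* assumed value */
}

/* Decide which translation path a kernel virtual address would take. */
static enum toy_path toy_kvtop_path(uint32_t kvaddr)
{
    /* With the boundary still unset, nothing is recognised as vmalloc. */
    if (toy_vmalloc_start == 0 || kvaddr < toy_vmalloc_start)
        return TOY_UNITY;
    return TOY_PAGE_TABLES;
}

/* Unity-mapped translation: a fixed offset between virtual and physical. */
static uint32_t toy_unity_vtop(uint32_t kvaddr)
{
    assert(kvaddr >= TOY_PAGE_OFFSET);
    return kvaddr - TOY_PAGE_OFFSET + TOY_PHYS_BASE;
}
```

In this model, toy_kvtop_path(0xf9fe0000) returns TOY_UNITY when it is called before toy_vm_init(), so the address would be run through toy_unity_vtop() and come out as 0xb9fe0000, a physical address that has nothing to do with the note buffer. After toy_vm_init() the same call returns TOY_PAGE_TABLES, which corresponds to the "pgd page" read that follows the note_buf_t read in the debug output.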
ARM: crash registers might be read from the wrong physical address
What I see is the following:
crash> p crash_notes
crash_notes = $29 = (note_buf_t *) 0xf662e000
crash> p/x __per_cpu_offset
$31 = {0x39b2000, 0x39ba000, 0x39c2000, 0x39ca000}

0xf662e000 + 0x39b2000 = 0xf9fe0000, which is the address seen in readmem.

These are the interesting lines I see in the source code (both newer and older kernels):

  note_buf_t *crash_notes;
  crash_notes = alloc_percpu(note_buf_t);

I do not really understand this in detail, but it seems that alloc_percpu uses "chunks" and may allocate new chunks if there is not enough memory in the currently available chunks. So what might have happened is that in the older cases there was space in the first(??) chunk, while in the newer case a new chunk had to be allocated.

Jan

Jan Karlsson
Senior Software Engineer
MIB * Sony Mobile Communications
Tel: +46703062174
sonymobile.com

-----Original Message-----
From: crash-utility-bounces@redhat.com [mailto:crash-utility-bounces@redhat.com] On Behalf Of Dave Anderson
Sent: Thursday, 19 July 2012 14:42
To: Discussion list for crash utility usage, maintenance and development
Cc: Fänge, Thomas
Subject: Re: [Crash-utility] ARM: crash registers might be read from the wrong physical address

----- Original Message -----
> These are the same lines in my case.
>
> <readmem: c0d2af6c, KVADDR, "crash_notes", 4, (ROE), 85ba84c>
> <read_kdump: addr: c0d2af6c paddr: 80f2af6c cnt: 4>
> <readmem: f9fe0000, KVADDR, "note_buf_t", 560, (ROE), 85bac40> <--- !!
> <readmem: c0004000, KVADDR, "pgd page", 16384, (FOE), 914e8d0>
>
> I have never seen this problem before, so the behavior you see is
> exactly what I have seen before. However with a fairly new kernel I
> did not get the correct crash_notes. The investigation led to the
> patch for the problem described in my previous mail.
>
> I have not investigated if there is any patch in newer kernels that
> changes this behavior and in that case where it comes from (it could
> be a patch by us). However as the algorithm for reading crash_notes is
> wrong, as it depends on a variable that is not yet initialized, I
> think it should be corrected anyhow. I have tested my patch with both
> newer and older kernels and it works as intended.

OK, good. And so apparently the per-cpu region has been moved up into vmalloc space. I'll queue the change into crash-6.0.9.

For curiosity's sake, can you show me the per-cpu symbol list? In my sample ARM kernel, it's located in the unity-mapped region just below the .text section, and can be seen like this:

crash> sym -l
... [ cut ] ...
c004e000 (d) .data..percpu
c004e000 (D) __per_cpu_load
c004e000 (D) __per_cpu_start
c004e000 (D) cpu_data
c004e040 (d) percpu_clockevent
c004e098 (D) current_kprobe
c004e09c (D) kprobe_ctlblk
c004e130 (d) bp_on_reg
c004e170 (d) wp_on_reg
c004e1b0 (D) mmu_gathers
c004e1c0 (D) current_mm
c004e1e0 (D) kstat
... [ cut ] ...
c004f0b4 (d) xmit_recursion
c004f0b8 (d) rt_cache_stat
c004f100 (d) runqueues
c004f620 (d) gcwq_nr_running
c004f640 (d) cfd_data
c004f660 (d) call_single_queue
c004f6a0 (d) csd_data
c004f6c0 (D) softnet_data
c004f7a0 (D) __per_cpu_end
c0050000 (t) .text
...

Your newer kernel must move it up to ~fxxxxxxx?

Thanks,
Dave

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
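The address arithmetic in this message (crash_notes plus the per-CPU offset) can be reproduced with a short sketch. It only mirrors the sum quoted above and is not taken from crash or the kernel: the function names are hypothetical, and the boundary passed to toy_is_above() is an assumed stand-in for wherever the vmalloc region starts on a given system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute notes[i] = base + offsets[i] for each CPU.
 * uint32_t arithmetic wraps modulo 2^32, like 32-bit pointer arithmetic.
 */
static void toy_percpu_note_addrs(uint32_t base, const uint32_t *offsets,
                                  size_t ncpus, uint32_t *notes)
{
    assert(offsets != NULL && notes != NULL);
    for (size_t i = 0; i < ncpus; i++)
        notes[i] = base + offsets[i];
}

/* Returns 1 if addr lies at or above an assumed region boundary, else 0. */
static int toy_is_above(uint32_t addr, uint32_t boundary)
{
    return addr >= boundary;
}
```

With the base 0xf662e000 and the four offsets printed by p/x __per_cpu_offset, the first entry is 0xf9fe0000, the address flagged in the readmem output, and the remaining entries follow at 0x8000 intervals. All four lie far above the c0xxxxxx range in which the sample kernel keeps its static per-cpu symbols.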
ARM: crash registers might be read from the wrong physical address
I forgot to say that the __per_cpu_start symbol is placed at an address similar to the one in your example. So there is no change in the handling of the basic per_cpu area.
Jan

-----Original Message-----
From: Karlsson, Jan
Sent: Friday, 20 July 2012 09:49
To: 'Discussion list for crash utility usage, maintenance and development'
Cc: Fänge, Thomas
Subject: RE: [Crash-utility] ARM: crash registers might be read from the wrong physical address

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility