FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > Redhat > Crash Utility

 
 
LinkBack Thread Tools
 
Old 06-18-2012, 04:01 PM
Dave Anderson
 
Default Handle the NT_PRSTATUS lost for the "bt" command

----- Original Message -----
> The purpose of this patch is to work out "bt" command for the diskdump
> which NT_PRSTATUS note could not be saved by IPI lost.
> I think IPI is possibly lost by panic under the serious crashed condition.
>
> I noticed that "bt" failed in my ppc environment
> when the NT_PRSTATUS notes are lost on some CPUs while IPI delivery.
> Then, I made CPU map for prstatus in diskdump more correctable
> by checking a validation of crash_notes field.
>
> I've tested this problem by patching kernel like, kernel/kexec.c
> void crash_save_cpu(struct pt_regs *regs, int cpu)
> {
> + if (current->pid == 0)
> + /* this cpu was idle; nothing to capture */
> + return;
>
> It looks terrible and impractical test case but actually I met this code
> in my using distro's kernel. I couldn't reproduce actual IPI lost case,
> then fortunately, use this as a example of the causes if IPI could not be
> delivered to other CPUs.
>
> => Taking diskdump by sysrq+c and makedumpfile.
>
> crash> help -D | grep notes
> num_prstatus_notes: 1
> notes_buf: 10ba91a8
> notes[0]: 10ba91a8
> crash> help -k | grep cpus
> cpus: 8
> cpus_override: (null)
> crash> bt
> PID: 1001 TASK: ea62b000 CPU: 2 COMMAND: "bash"
> Segmentation fault
>
> Since seven idle cpus did not save NT_PRSTATUS note,
> crash could not handle CPU#2's note where is located as CPU#0's.
>
> With this patch, crash get to work out with correct CPU map to
> prstatus.
>
> WARNING: catch lost crash_notes at cpu#0
> WARNING: catch lost crash_notes at cpu#1
> WARNING: catch lost crash_notes at cpu#3
> WARNING: catch lost crash_notes at cpu#4
> WARNING: catch lost crash_notes at cpu#5
> WARNING: catch lost crash_notes at cpu#6
> WARNING: catch lost crash_notes at cpu#7
> crash.fix> help -D | grep notes
> num_prstatus_notes: 1
> notes_buf: 107a3378
> notes[2]: 107a3378
> crash.fix> help -k | grep cpus
> cpus: 8
> cpus_override: (null)
> crash.fix> bt
> PID: 1001 TASK: ea62b000 CPU: 2 COMMAND: "bash"
>
> R0: 00000001 R1: eb793e60 R2: ea62b000 R3: 00000063
> R4: 00000000 R5: ffffffff R6: c043ba2c R7: 00000000
> R8: 00008000 R9: 00000000 R10: 00000000 R11: eb793e70
> R12: 28242444 R13: 100b8448 R14: 100b07b8 R15: 100b0894
> R16: 00000000 R17: 00000000 R18: 00000000 R19: 1006d270
> R20: 00000000 R21: 100f0430 R22: 00000000 R23: 00000001
> R24: c08f1ac8 R25: 00029002 R26: c08f1bac R27: c08d0000
> R28: 00000000 R29: c09ada48 R30: 00000063 R31: eb793e60
> NIP: c0423378 MSR: 00021002 OR3: c09ada48 CTR: c0423344
> LR: c0423d8c XER: 00000000 CCR: 28242444 MQ: 00008000
> DAR: 00000000 DSISR: 00800000 Syscall Result: eb793e60
> NIP [00000000c0423378] sysrq_handle_crash
> LR [00000000c0423d8c] __handle_sysrq
>
> #0 [eb793e60] sysrq_handle_crash at c0423378
> : snip
>
> Thanks,
> Toshi

Toshi,

I don't want to add any new initialization-time code -- especially if
it's related to the NT_PRSTATUS notes -- that can abort a crash session
unnecessarily. In your new crash_was_lost_crash_note() function, there
are these two FAULT_ON_ERROR readmem() calls:

readmem(symbol_value("crash_notes"), KVADDR, &crash_notes_ptr,
sizeof(ulong), "crash_notes", FAULT_ON_ERROR);
and

readmem(crash_notes_ptr, KVADDR, buf, SIZE(note_buf),
"cpu crash_notes", FAULT_ON_ERROR);

Although they are highly unlikely to fail, can you please make
both of them RETURN_ON_ERROR, and if the readmem() fails, have
it bail out and return FALSE? And then, if necessary, make any
adjustments to map_cpus_to_prstatus_kdump_cmprs() to handle that
remote possibility. You should be able to test it with your
patched kernel.

Also, I don't understand the wording of this error message
at the end of crash_was_lost_crash_note():

error(WARNING, "catch lost crash_notes at cpu#%d
", cpu);

Can you re-word that? The notes were not "lost", but rather were
"not saved" by the crashing system.

Lastly, in __diskdump_memory_dump(), you just skip the "lost"
notes sections:

for (i = 0, j = 0; j < dd->num_prstatus_notes; i++) {
if (dd->nt_prstatus_percpu[i] == NULL)
continue;
fprintf(fp, " notes[%d]: %lx
",
i, (ulong)dd->nt_prstatus_percpu[i]);
j++;
}

Can you make it more obvious, say, by displaying something like:

notes[6]: (not saved)

Thanks,
Dave


--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
Old 06-19-2012, 08:47 AM
Toshikazu Nakayama
 
Default Handle the NT_PRSTATUS lost for the "bt" command

Hi Dave,

Thanks for your comments.

(2012/06/19 1:01), Dave Anderson wrote:

> I don't want to add any new initialization-time code -- especially if
> it's related to the NT_PRSTATUS notes -- that can abort a crash session
> unnecessarily. In your new crash_was_lost_crash_note() function, there
> are these two FAULT_ON_ERROR readmem() calls:
>
> readmem(symbol_value("crash_notes"), KVADDR,&crash_notes_ptr,
> sizeof(ulong), "crash_notes", FAULT_ON_ERROR);
> and
>
> readmem(crash_notes_ptr, KVADDR, buf, SIZE(note_buf),
> "cpu crash_notes", FAULT_ON_ERROR);
>
> Although they are highly unlikely to fail, can you please make
> both of them RETURN_ON_ERROR, and if the readmem() fails, have
> it bail out and return FALSE?

I see, I'll make consideration about initialization-time rule from now,
to keep crash session, should use readmem() with QUIET|RETURN_ON_ERROR.

> And then, if necessary, make any
> adjustments to map_cpus_to_prstatus_kdump_cmprs() to handle that
> remote possibility. You should be able to test it with your
> patched kernel.

I'm not able to test for unexpected, deliberate readmem() failure
because my core file contains "crash_notes" memory field...

Although I've probably misunderstood your advice,
I had better add more debugging messages into
map_cpus_to_prstatus_kdump_cmprs() to handle conditions
where imply "not saved NT_PRSTATUS" or "nt_prstatus_percpu[] is remapped".

> Also, I don't understand the wording of this error message
> at the end of crash_was_lost_crash_note():
>
> error(WARNING, "catch lost crash_notes at cpu#%d
", cpu);
>
> Can you re-word that? The notes were not "lost", but rather were
> "not saved" by the crashing system.

I re-word all "lost" keywords, so too function name with "saved".
And also make this warning valid only when CRASHDEBUG() is enable
because we can check it later by using "help -D".

> Lastly, in __diskdump_memory_dump(), you just skip the "lost"
> notes sections:
>
> for (i = 0, j = 0; j< dd->num_prstatus_notes; i++) {
> if (dd->nt_prstatus_percpu[i] == NULL)
> continue;
> fprintf(fp, " notes[%d]: %lx
",
> i, (ulong)dd->nt_prstatus_percpu[i]);
> j++;
> }
>
> Can you make it more obvious, say, by displaying something like:
>
> notes[6]: (not saved)

Looks better, thanks for giving good hint.

[updates by attached patch]

- display messages only if CRASHDEBUG() >= 1
crash -d 1
=> display about NT_PRSTATUS mapping messages as below.
----------------
WARNING: cpu#0: not saved crash_notes
WARNING: cpu#1: not saved crash_notes
crash: NT_PRSTATUS: cpu#2 = 107a34a8 (remapped from cpu#0)
WARNING: cpu#3: not saved crash_notes
WARNING: cpu#4: not saved crash_notes
WARNING: cpu#5: not saved crash_notes
WARNING: cpu#6: not saved crash_notes
WARNING: cpu#7: not saved crash_notes
----------------

- help message is enhanced
crash> help -D | grep notes
num_prstatus_notes: 1
notes_buf: 107a34a8
notes[0]: (not saved)
notes[1]: (not saved)
notes[2]: 107a34a8
notes[3]: (not saved)
notes[4]: (not saved)
notes[5]: (not saved)
notes[6]: (not saved)
notes[7]: (not saved)

Thanks,
Toshi

> Thanks,
> Dave
>
>
> --
> Crash-utility mailing list
> Crash-utility@redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
>

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
Old 06-19-2012, 04:20 PM
Dave Anderson
 
Default Handle the NT_PRSTATUS lost for the "bt" command

----- Original Message -----
> Hi Dave,
>
> Thanks for your comments.
>
> (2012/06/19 1:01), Dave Anderson wrote:
>
> > I don't want to add any new initialization-time code -- especially if
> > it's related to the NT_PRSTATUS notes -- that can abort a crash session
> > unnecessarily. In your new crash_was_lost_crash_note() function, there
> > are these two FAULT_ON_ERROR readmem() calls:
> >
> > readmem(symbol_value("crash_notes"), KVADDR,&crash_notes_ptr,
> > sizeof(ulong), "crash_notes", FAULT_ON_ERROR);
> > and
> >
> > readmem(crash_notes_ptr, KVADDR, buf, SIZE(note_buf),
> > "cpu crash_notes", FAULT_ON_ERROR);
> >
> > Although they are highly unlikely to fail, can you please make
> > both of them RETURN_ON_ERROR, and if the readmem() fails, have
> > it bail out and return FALSE?
>
> I see, I'll make consideration about initialization-time rule from now,
> to keep crash session, should use readmem() with QUIET|RETURN_ON_ERROR.
>
> > And then, if necessary, make any adjustments to map_cpus_to_prstatus_kdump_cmprs()
> > to handle that remote possibility. You should be able to test it with your
> > patched kernel.
>
> I'm not able to test for unexpected, deliberate readmem() failure
> because my core file contains "crash_notes" memory field...
>
> Although I've probably misunderstood your advice,
> I had better add more debugging messages into
> map_cpus_to_prstatus_kdump_cmprs() to handle conditions
> where imply "not saved NT_PRSTATUS" or "nt_prstatus_percpu[] is
> remapped".
>
> > Also, I don't understand the wording of this error message
> > at the end of crash_was_lost_crash_note():
> >
> > error(WARNING, "catch lost crash_notes at cpu#%d
", cpu);
> >
> > Can you re-word that? The notes were not "lost", but rather were
> > "not saved" by the crashing system.
>
> I re-word all "lost" keywords, so too function name with "saved".
> And also make this warning valid only when CRASHDEBUG() is enable
> because we can check it later by using "help -D".
>
> > Lastly, in __diskdump_memory_dump(), you just skip the "lost"
> > notes sections:
> >
> > for (i = 0, j = 0; j< dd->num_prstatus_notes; i++) {
> > if (dd->nt_prstatus_percpu[i] == NULL)
> > continue;
> > fprintf(fp, " notes[%d]: %lx
",
> > i, (ulong)dd->nt_prstatus_percpu[i]);
> > j++;
> > }
> >
> > Can you make it more obvious, say, by displaying something like:
> >
> > notes[6]: (not saved)
>
> Looks better, thanks for giving good hint.
>
> [updates by attached patch]
>
> - display messages only if CRASHDEBUG() >= 1
> crash -d 1
> => display about NT_PRSTATUS mapping messages as below.
> ----------------
> WARNING: cpu#0: not saved crash_notes
> WARNING: cpu#1: not saved crash_notes
> crash: NT_PRSTATUS: cpu#2 = 107a34a8 (remapped from cpu#0)
> WARNING: cpu#3: not saved crash_notes
> WARNING: cpu#4: not saved crash_notes
> WARNING: cpu#5: not saved crash_notes
> WARNING: cpu#6: not saved crash_notes
> WARNING: cpu#7: not saved crash_notes
> ----------------
>
> - help message is enhanced
> crash> help -D | grep notes
> num_prstatus_notes: 1
> notes_buf: 107a34a8
> notes[0]: (not saved)
> notes[1]: (not saved)
> notes[2]: 107a34a8
> notes[3]: (not saved)
> notes[4]: (not saved)
> notes[5]: (not saved)
> notes[6]: (not saved)
> notes[7]: (not saved)
>
> Thanks,
> Toshi

OK, now I'm getting confused...

I note that currently __diskdump_memory_dump() does this:

for (i = 0; i < dd->num_prstatus_notes; i++) {
fprintf(fp, " notes[%d]: %lx
",
i, (ulong)dd->nt_prstatus_percpu[i]);
}

But your patch does this:

nrcpus = kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS;

for (i = 0; i < nrcpus; i++) {
if (dd->nt_prstatus_percpu[i] == NULL) {
fprintf(fp,
" notes[%d]: (not saved)
",
i);
continue;
}
fprintf(fp, " notes[%d]: %lx
",
i, (ulong)dd->nt_prstatus_percpu[i]);
}

Now surely we don't want to dump "NR_CPUS" notes pointers,
unless there actually are that many cpus in the system.
For example, in RHEL6 sets CONFIG_NR_CPUS to 4096 for x86_64,
but rarely do we see a dumpfile with that many cpus.
So I would prefer that the for loop continue to be bounded
by "dd->num_prstatus_notes".

However, given that process_elf32_notes() and process_elf64_notes()
do a cursory test of each note by checking for the NT_PRSTATUS
type, I'm presuming that in your dumpfile, dd->num_prstatus_notes
must be equal to 1, correct?

void
process_elf64_notes(void *note_buf, unsigned long size_note)
{
Elf64_Nhdr *nt;
size_t index, len = 0;
int num = 0;

for (index = 0; index < size_note; index += len) {
nt = note_buf + index;

if(nt->n_type == NT_PRSTATUS) {
dd->nt_prstatus_percpu[num] = nt;
num++;
}
len = sizeof(Elf64_Nhdr);
len = roundup(len + nt->n_namesz, 4);
len = roundup(len + nt->n_descsz, 4);
}

if (num > 0) {
pc->flags2 |= ELF_NOTES;
dd->num_prstatus_notes = num;
}
return;
}

But then map_cpus_to_prstatus_kdump_cmprs() remaps the
notes without modifying the dd->num_prstatus_notes counter.

So then there's this:

void *
diskdump_get_prstatus_percpu(int cpu)
{
if ((cpu < 0) || (cpu >= dd->num_prstatus_notes))
return NULL;

return dd->nt_prstatus_percpu[cpu];
}

So in a case such as your example, where cpu 2 was the only cpu that saved
its notes, the function above would incorrectly return NULL when called
with diskdump_get_prstatus_percpu(2).

What am I missing?

Dave

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
Old 06-19-2012, 06:25 PM
Dave Anderson
 
Default Handle the NT_PRSTATUS lost for the "bt" command

----- Original Message -----

>
> OK, now I'm getting confused...
>

The more I look at this patch, the more confused I get...

During initialization, the ELF notes contained in the
dumpfile file header are scanned, and if an NT_PRSTATUS
note is seen, a pointer to its location in the dumpfile
is saved in dd->nt_prstatus_percpu[num] and the "num"
of valid notes is kept in dd->num_prstatus_notes.

If the dd->num_prstatus_notes is equal to the online cpu
count, then it is presumed that there is a one-to-one
relationship, where the cpu number can be used as the
index into the dd->nt_prstatus_percpu[num] array.

If the number of notes is not equal to the number of online
cpus, then the "mapping" function is called, where if a
cpu is found to be offline, then its (incorrectly) associated
entry in the dd->nt_prstatus_percpu[num] array is "pushed up"
to the next higher entry. But the dd->num_prstatus_notes
does not seem to get incremented to reflect that move, so
then it's seems like diskdump_get_prstatus_percpu() can
possibly return NULL when there actually is a relevant
NT_PRSTATUS note.

That seems to be a bug (?), but it's not particularly important,
because for x86 and x86_64, the data in the NT_PRSTATUS notes is
only used if the starting point for backtraces if the PC/SP pair
cannot be determined otherwise, which is the case virtually all of
the time. So the registers found in the NT_PRSTATUS notes are
pretty much useless...

Now, to complicate matters, your patch does not look at the
NT_PRSTATUS notes in the dumpfile header, but instead looks
at the base kernel's original notes, and verifies their
contents, and correlates what's found there against what was
found in the dumpfile? So I don't understand what you are
attempting to do -- what is the difference between the notes
that are copied into the dumpfile vs. what you are looking at
in the base kernel?

I'm also wondering what would happen in your case if there
were a combination of "lost" notes *and* offline cpus? How
would that work?

So at this point I really don't want to add this patch
at all because it touches common code, and I don't want to
risk breaking the other arches. Nobody has ever reported
any "lost" cpus so far, probably because the kdump facility
uses non-maskable NMI's to shutdown the non-panicking cpus.
This is such a highly-unlikely corner case, that it does
not even seem worth addressing for fear of breaking something
else.

I didn't look at the reasoning behind why you ran into a
segmentation violation, but since the PPC code path would be:

...
back_trace()
get_diskdump_regs()
get_diskdump_regs_ppc()

perhaps you can rework your patch so that it is segregated
to PPC only?

Dave

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
Old 06-20-2012, 12:31 AM
Toshikazu Nakayama
 
Default Handle the NT_PRSTATUS lost for the "bt" command

(2012/06/20 1:20), Dave Anderson wrote:

> OK, now I'm getting confused...
>
> I note that currently __diskdump_memory_dump() does this:
>
> for (i = 0; i< dd->num_prstatus_notes; i++) {
> fprintf(fp, " notes[%d]: %lx
",
> i, (ulong)dd->nt_prstatus_percpu[i]);
> }
>
> But your patch does this:
>
> nrcpus = kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS;
>
> for (i = 0; i< nrcpus; i++) {
> if (dd->nt_prstatus_percpu[i] == NULL) {
> fprintf(fp,
> " notes[%d]: (not saved)
",
> i);
> continue;
> }
> fprintf(fp, " notes[%d]: %lx
",
> i, (ulong)dd->nt_prstatus_percpu[i]);
> }
>
> Now surely we don't want to dump "NR_CPUS" notes pointers,
> unless there actually are that many cpus in the system.
> For example, in RHEL6 sets CONFIG_NR_CPUS to 4096 for x86_64,
> but rarely do we see a dumpfile with that many cpus.
> So I would prefer that the for loop continue to be bounded
> by "dd->num_prstatus_notes".

I also bothered about condition whether loop should break
at dd->nt_prstatus_percpu or not, even though remaining online cpu's
notes can't display as "not saved".

However, I choised nrcpus way since my environment could break
with online cpu nums, not NR_CPUS.
Now, I check kernel_NR_CPUS validation.
if (STREQ(ln, "CONFIG_NR_CPUS")) {
kt->kernel_NR_CPUS = atoi(val);
I'm aware if ikconfig is not valid, break condition of loop is always
NR_CPUS without doubt and display gets noisy like as your counsel.

> However, given that process_elf32_notes() and process_elf64_notes()
> do a cursory test of each note by checking for the NT_PRSTATUS
> type, I'm presuming that in your dumpfile, dd->num_prstatus_notes
> must be equal to 1, correct?

Yes, dd->num_prstatus_notes in my dumpfile is 1.

> void
> process_elf64_notes(void *note_buf, unsigned long size_note)
> {
> Elf64_Nhdr *nt;
> size_t index, len = 0;
> int num = 0;
>
> for (index = 0; index< size_note; index += len) {
> nt = note_buf + index;
>
> if(nt->n_type == NT_PRSTATUS) {
> dd->nt_prstatus_percpu[num] = nt;
> num++;
> }
> len = sizeof(Elf64_Nhdr);
> len = roundup(len + nt->n_namesz, 4);
> len = roundup(len + nt->n_descsz, 4);
> }
>
> if (num> 0) {
> pc->flags2 |= ELF_NOTES;
> dd->num_prstatus_notes = num;
> }
> return;
> }
>
> But then map_cpus_to_prstatus_kdump_cmprs() remaps the
> notes without modifying the dd->num_prstatus_notes counter.
>
> So then there's this:
>
> void *
> diskdump_get_prstatus_percpu(int cpu)
> {
> if ((cpu< 0) || (cpu>= dd->num_prstatus_notes))
> return NULL;
>
> return dd->nt_prstatus_percpu[cpu];
> }
>
> So in a case such as your example, where cpu 2 was the only cpu that saved
> its notes, the function above would incorrectly return NULL when called
> with diskdump_get_prstatus_percpu(2).
>
> What am I missing?

No, you are entirely right and I could not condier about x86 parts.
I'm getting the feeling that there might be more affects for any other arches.
I reconsider different approach to fix NT_PRSTATUS up.

Thanks,
Toshi

> Dave
>
> --
> Crash-utility mailing list
> Crash-utility@redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
>

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
Old 06-20-2012, 07:36 AM
Toshikazu Nakayama
 
Default Handle the NT_PRSTATUS lost for the "bt" command

(2012/06/20 3:25), Dave Anderson wrote:
>
>
> ----- Original Message -----
>
>>
>> OK, now I'm getting confused...
>>
>
> The more I look at this patch, the more confused I get...
>
> During initialization, the ELF notes contained in the
> dumpfile file header are scanned, and if an NT_PRSTATUS
> note is seen, a pointer to its location in the dumpfile
> is saved in dd->nt_prstatus_percpu[num] and the "num"
> of valid notes is kept in dd->num_prstatus_notes.
>
> If the dd->num_prstatus_notes is equal to the online cpu
> count, then it is presumed that there is a one-to-one
> relationship, where the cpu number can be used as the
> index into the dd->nt_prstatus_percpu[num] array.
>
> If the number of notes is not equal to the number of online
> cpus, then the "mapping" function is called, where if a
> cpu is found to be offline, then its (incorrectly) associated
> entry in the dd->nt_prstatus_percpu[num] array is "pushed up"
> to the next higher entry. But the dd->num_prstatus_notes
> does not seem to get incremented to reflect that move, so
> then it's seems like diskdump_get_prstatus_percpu() can
> possibly return NULL when there actually is a relevant
> NT_PRSTATUS note.

A dd->num_prstatus_notes is equal to number of NT_PRSTATUS notes in dump file
and unless nt_prstatus_percpu[] is continuous from 0,
diskdump_get_prstatus_percpu() can possibly return NULL.

> That seems to be a bug (?), but it's not particularly important,
> because for x86 and x86_64, the data in the NT_PRSTATUS notes is
> only used if the starting point for backtraces if the PC/SP pair
> cannot be determined otherwise, which is the case virtually all of
> the time. So the registers found in the NT_PRSTATUS notes are
> pretty much useless...

I seem to be minor bug, if some of cpus are offline while panic,
can't backtrace the same number of cpus from tail because of returning NULL.
I think current task's PC/SP can not obtain from thread_struct corectly,
then NT_PRSTATUS notes need to be stored for them instead.
But anyway, panic under offline is very unlikely condition
as same as the NMI failed.

> Now, to complicate matters, your patch does not look at the
> NT_PRSTATUS notes in the dumpfile header, but instead looks
> at the base kernel's original notes, and verifies their
> contents, and correlates what's found there against what was
> found in the dumpfile? So I don't understand what you are
> attempting to do -- what is the difference between the notes
> that are copied into the dumpfile vs. what you are looking at
> in the base kernel?

At first, my patch is trying to find out tainted note area in base kernel.
Such notes can not be found out from dumpfile, however, found out other ones
are set continuously into nt_prstatus_percpu[from 0 to found out notes].
I just want to repaire nt_prstatus_percpu[#index] corresponding to cpu#index.
[example]
cpu#0 cpu#1 cpu#2 cpu#3
saved tainted saved tainted [by looking at base kernel]

=> nt_prstatus_percpu[0] = cpu#0's note, nt_prstatus_percpu[1] = cpu#3's note
I want to repaire array like,
=> nt_prstatus_percpu[0] = cpu#0's note, nt_prstatus_percpu[3] = cpu#3's note

Since ELF note format dose not contain information which cpu saved me,
I think there is no way to identify mapping between cpu and
nt_prstatus_percpu[] from ELF note when both are not equal to.
That's why I try to resolve from base kernel area.

> I'm also wondering what would happen in your case if there
> were a combination of "lost" notes *and* offline cpus? How
> would that work?

I think offline cpus never ack IPI, thus lost notes everytime(not concerned).
I only assume the case that cpu was online but could not ack IPI caused
by any reasons in interrupts off.

> So at this point I really don't want to add this patch
> at all because it touches common code, and I don't want to
> risk breaking the other arches. Nobody has ever reported
> any "lost" cpus so far, probably because the kdump facility
> uses non-maskable NMI's to shutdown the non-panicking cpus.
> This is such a highly-unlikely corner case, that it does
> not even seem worth addressing for fear of breaking something
> else.

I can agree, x86 architecture own NMI which is most reliable way to
deliver interrupts to other cpus while powerpc may not own NMI.
I am going to rework patch so that it can be effective only in ppc arch.

> I didn't look at the reasoning behind why you ran into a
> segmentation violation, but since the PPC code path would be:
>
> ...
> back_trace()
> get_diskdump_regs()
> get_diskdump_regs_ppc()

get_diskdump_regs_ppc() is touching dd->nt_prstatus_percpu[bt->tc->processor]
when "bt" for panic_task.
---------
if (KDUMP_CMPRS_VALID() &&
(bt->task == tt->panic_task ||
(is_task_active(bt->task) && (dd->num_prstatus_notes > 1)))) {
note = (Elf32_Nhdr*) dd->nt_prstatus_percpu[bt->tc->processor];
if (!note)
error(FATAL,
"cannot determine NT_PRSTATUS ELF note "
"for %s task: %lx
",
(bt->task == tt->panic_task) ?
"panic" : "active", bt->task);
len = sizeof(Elf32_Nhdr);
len = roundup(len + note->n_namesz, 4);
bt->machdep = (void *)((char *)note + len +
MEMBER_OFFSET("elf_prstatus", "pr_reg"));
}

machdep->get_stack_frame(bt, eip, esp);
---------

In my dumpfile case, bt->tc->processor is 2 and its note is contained at
dd->nt_prstatus_percpu[0].
note = (Elf32_Nhdr*) dd->nt_prstatus_percpu[2];
* Here is no note instance which "bt" expects.

A dd->nt_prstatus_percpu[] is allocated by malloc() and not initialied with 0.
When note->n_namesz is executed, it is cause of segmentation violation.
This malloc() is moved to calloc() in my first patch.

Other problem which is also appeared in get_netdump_regs_ppc().
A dd->num_prstatus_notes is 1 in my dumpfile although there were 8 online cpus.
If next condition "is_task_active(bt->task)" is required by "bt",
get_diskdump_regs_ppc() unfortunately do machdep->get_stack_frame() for
active tasks and surely failed.
"cannot determine NT_PRSTATUS ELF note ..." should be proper.

> perhaps you can rework your patch so that it is segregated
> to PPC only?

Yes I'm going to rework, and thanks for your reviews.
I'm modifying difficult parts this time, very helpful.

I'll do more tests before sending reworked PPC only patch.
I also have to study about kexectools or makedumpfile more about
how NT_PRSTATUS or others are treated, haven't understood enough.

Toshi

> Dave
>
> --
> Crash-utility mailing list
> Crash-utility@redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
>

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 
Old 06-20-2012, 02:30 PM
Dave Anderson
 
Default Handle the NT_PRSTATUS lost for the "bt" command

----- Original Message -----

> > That seems to be a bug (?), but it's not particularly important,
> > because for x86 and x86_64, the data in the NT_PRSTATUS notes is
> > only used if the starting point for backtraces if the PC/SP pair
> > cannot be determined otherwise, which is the case virtually all of
> > the time. So the registers found in the NT_PRSTATUS notes are
> > pretty much useless...
>
> I seem to be minor bug, if some of cpus are offline while panic,
> can't backtrace the same number of cpus from tail because of returning NULL.
> I think current task's PC/SP can not obtain from thread_struct corectly,
> then NT_PRSTATUS notes need to be stored for them instead.

For the active tasks, the PC/SP starting points can typically be found
by looking at the stack contents. The NT_PRSTATUS notes are a relatively
recent addition to compressed kdumps, i.e., have only existed since
header version 4. Only in very rare circumstances do the x86 and
x86_64 arches need to utilize the registers in the compressed kdump
NT_PRSTATUS notes.

That being said, I will fix the invalid value that is stored in
dd->num_prstatus_notes so that diskdump_get_prstatus_percpu()
will work as expected if there are offline cpus. That function is
currently used only by x86 and x86_64, so there should be no issues
with the other arches.

> Yes I'm going to rework, and thanks for your reviews.
> I'm modifying difficult parts this time, very helpful.
>
> I'll do more tests before sending reworked PPC only patch.
> I also have to study about kexectools or makedumpfile more about
> how NT_PRSTATUS or others are treated, haven't understood enough.

OK thanks,
Dave

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility
 

Thread Tools




All times are GMT. The time now is 06:45 AM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org