Handle lost NT_PRSTATUS notes for the "bt" command
The purpose of this patch is to make the "bt" command work for diskdump
dumps in which the NT_PRSTATUS note of some CPUs could not be saved
because an IPI was lost. I think an IPI can be lost during a panic when
the system has crashed badly. I noticed that "bt" failed in my ppc
environment when the NT_PRSTATUS notes were lost on some CPUs during IPI
delivery. So I made the CPU map for prstatus in diskdump more accurate by
checking the validity of the crash_notes field.

I tested this problem by patching the kernel like this:

- kernel/kexec.c
 void crash_save_cpu(struct pt_regs *regs, int cpu)
 {
+       if (current->pid == 0)
+               /* this cpu was idle; nothing to capture */
+               return;

It looks like a terrible and impractical test case, but I actually found
this code in the kernel of the distro I use. I couldn't reproduce an
actual lost IPI, so fortunately I could use this as an example of what
happens when the IPI cannot be delivered to other CPUs.

=> Taking diskdump by sysrq+c and makedumpfile.

crash> help -D | grep notes
  num_prstatus_notes: 1
           notes_buf: 10ba91a8
            notes[0]: 10ba91a8
crash> help -k | grep cpus
          cpus: 8
 cpus_override: (null)
crash> bt
PID: 1001   TASK: ea62b000  CPU: 2   COMMAND: "bash"
Segmentation fault

Since the seven idle CPUs did not save an NT_PRSTATUS note, crash could
not handle CPU#2's note, which was stored in the slot for CPU#0. With
this patch, crash works with the correct CPU map to prstatus.
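
As a rough illustration of the mapping idea, here is a minimal sketch in C.
It is not the patch itself: struct prstatus_note, the cpu_has_note array
and map_notes_to_cpus() are hypothetical names made up for this example,
and the validity flags are assumed to come from checking each CPU's
crash_notes field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a saved NT_PRSTATUS note (not crash's type). */
struct prstatus_note {
    int pid;
};

/*
 * Hand out the notes found in the dump, in order, to the CPUs that are
 * flagged as having saved one. CPUs without a note get NULL instead of
 * silently taking a note that belongs to another CPU.
 *
 * cpu_has_note[i] is nonzero if CPU i saved a note (assumption: the
 * caller derived this from validating that CPU's crash_notes field).
 * Returns the number of notes that were assigned.
 */
size_t map_notes_to_cpus(struct prstatus_note *notes, size_t num_notes,
                         const int *cpu_has_note, size_t num_cpus,
                         struct prstatus_note **per_cpu)
{
    size_t next = 0;

    assert(notes != NULL && cpu_has_note != NULL && per_cpu != NULL);

    for (size_t cpu = 0; cpu < num_cpus; cpu++) {
        if (cpu_has_note[cpu] && next < num_notes) {
            per_cpu[cpu] = &notes[next++];
        } else {
            per_cpu[cpu] = NULL;
            fprintf(stderr, "WARNING: no prstatus note for cpu#%zu\n", cpu);
        }
    }
    return next;
}
```

With 8 CPUs, one saved note and only CPU 2 flagged, the single note ends up
in per_cpu[2] and every other entry is NULL, so a lookup for CPU 2 no
longer reads the slot for CPU 0.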
Output with the patch applied:

WARNING: catch lost crash_notes at cpu#0
WARNING: catch lost crash_notes at cpu#1
WARNING: catch lost crash_notes at cpu#3
WARNING: catch lost crash_notes at cpu#4
WARNING: catch lost crash_notes at cpu#5
WARNING: catch lost crash_notes at cpu#6
WARNING: catch lost crash_notes at cpu#7

crash.fix> help -D | grep notes
  num_prstatus_notes: 1
           notes_buf: 107a3378
            notes[2]: 107a3378
crash.fix> help -k | grep cpus
          cpus: 8
 cpus_override: (null)
crash.fix> bt
PID: 1001   TASK: ea62b000  CPU: 2   COMMAND: "bash"

 R0:  00000001    R1:  eb793e60    R2:  ea62b000    R3:  00000063
 R4:  00000000    R5:  ffffffff    R6:  c043ba2c    R7:  00000000
 R8:  00008000    R9:  00000000    R10: 00000000    R11: eb793e70
 R12: 28242444    R13: 100b8448    R14: 100b07b8    R15: 100b0894
 R16: 00000000    R17: 00000000    R18: 00000000    R19: 1006d270
 R20: 00000000    R21: 100f0430    R22: 00000000    R23: 00000001
 R24: c08f1ac8    R25: 00029002    R26: c08f1bac    R27: c08d0000
 R28: 00000000    R29: c09ada48    R30: 00000063    R31: eb793e60
 NIP: c0423378    MSR: 00021002    OR3: c09ada48    CTR: c0423344
 LR:  c0423d8c    XER: 00000000    CCR: 28242444    MQ:  00008000
 DAR: 00000000    DSISR: 00800000    Syscall Result: eb793e60
 NIP [00000000c0423378] sysrq_handle_crash
 LR  [00000000c0423d8c] __handle_sysrq

 #0 [eb793e60] sysrq_handle_crash at c0423378
 :
 snip

Thanks,
Toshi

--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility