> Hi Dave,
>
> I got an s390x dump of a Linux 2.6.36 system, where a task (kmcheck, pid=44) is
> missing in the ps output. I debugged the problem and I think that I found the
> reason:
>
> It looks like crash does not walk the linked list of the pid hash table
> to the end, if it finds a NULL pointer in the pid.tasks[PIDTYPE_PID=0]
> array. Unfortunately, for the struct pid that is before our lost task in the
> linked list this condition is true. Therefore crash does not find our task.

That sounds similar to the fix Bob Montgomery made in 5.0.7:

- Fix for the potential to miss one or more tasks in 2.6.23 and earlier
kernels, presumably due to catching an entry in the kernel's pid_hash[]
chain in transition. Without the patch, the task will simply not be
seen in the gathered task list.
(bob.montgomery@hp.com)

where this was his patch posting -- which fixed refresh_hlist_task_table_v2():

[Crash-utility] Missing PID 1 is crash problem with losing tasks
https://www.redhat.com/archives/crash-utility/2010-August/msg00049.html

and where your patch fixes refresh_hlist_task_table_v3().

> ----- "Michael Holzheu" <holzheu@linux.vnet.ibm.com> wrote:
>
> > Hi Dave,
> >
> > I got an s390x dump of a Linux 2.6.36 system, where a task (kmcheck, pid=44) is
> > missing in the ps output. I debugged the problem and I think that I found the
> > reason:
> >
> > It looks like crash does not walk the linked list of the pid hash table
> > to the end, if it finds a NULL pointer in the pid.tasks[PIDTYPE_PID=0]
> > array. Unfortunately, for the struct pid that is before our lost task in the
> > linked list this condition is true. Therefore crash does not find our task.
>
> That sounds similar to the fix Bob Montgomery made in 5.0.7:
>
> - Fix for the potential to miss one or more tasks in 2.6.23 and earlier
> kernels, presumably due to catching an entry in the kernel's pid_hash[]
> chain in transition. Without the patch, the task will simply not be
> seen in the gathered task list.
> (bob.montgomery@hp.com)
>
> where this was his patch posting -- which fixed refresh_hlist_task_table_v2():
>
> [Crash-utility] Missing PID 1 is crash problem with losing tasks
> https://www.redhat.com/archives/crash-utility/2010-August/msg00049.html
>
> and where your patch fixes refresh_hlist_task_table_v3().
>
> I'll give it a test run...
>
> Thanks,
> Dave

Hi Michael,

Works well -- it's a rare occurrence, but the patch uncovered a total of
seven missing tasks in a test run on a sample set of 50 "v3" dumpfiles.
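
For anyone reading this thread later, the failure mode Michael describes can be sketched roughly as below. This is only an illustration and not the actual patch to refresh_hlist_task_table_v3(): struct pid_entry, struct task, and both walk functions are hypothetical stand-ins, and the real kernel struct pid and crash's chain-walking code look different. The point is simply that an entry whose tasks[PIDTYPE_PID] slot is NULL should be skipped, rather than treated as the end of the hash chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins; the real kernel layouts differ. */
enum { PIDTYPE_PID = 0, PIDTYPE_MAX = 3 };

struct task {
    int pid;
    const char *comm;
};

struct pid_entry {
    struct task *tasks[PIDTYPE_MAX]; /* tasks[PIDTYPE_PID] may be NULL */
    struct pid_entry *next;          /* next entry in the same hash chain */
};

/* Problematic walk: abandons the rest of the chain at the first entry
 * whose tasks[PIDTYPE_PID] pointer is NULL. */
int walk_stop_on_null(const struct pid_entry *head)
{
    int found = 0;
    const struct pid_entry *p;

    for (p = head; p != NULL; p = p->next) {
        if (p->tasks[PIDTYPE_PID] == NULL)
            break;
        found++;
    }
    return found;
}

/* Corrected walk: skips the empty entry but follows the chain to its end. */
int walk_skip_null(const struct pid_entry *head)
{
    int found = 0;
    const struct pid_entry *p;

    for (p = head; p != NULL; p = p->next) {
        if (p->tasks[PIDTYPE_PID] == NULL)
            continue;
        found++;
    }
    return found;
}

/* Builds a three-entry chain whose middle entry has no PIDTYPE_PID task
 * and returns how many tasks the problematic walk misses. */
int missed_tasks_demo(void)
{
    struct task first_task = { 1, "init" };
    struct task lost_task = { 44, "kmcheck" };
    struct pid_entry last = { { &lost_task, NULL, NULL }, NULL };
    struct pid_entry empty = { { NULL, NULL, NULL }, &last };
    struct pid_entry first = { { &first_task, NULL, NULL }, &empty };

    int stopped = walk_stop_on_null(&first);
    int skipped = walk_skip_null(&first);

    assert(skipped >= stopped);
    printf("stop-on-NULL walk: %d task(s), skip-NULL walk: %d task(s)\n",
           stopped, skipped);
    return skipped - stopped;
}
```

In this made-up chain the task behind the empty entry (kmcheck, pid 44) is reachable only by the second walk, which matches the symptom of a task missing from the ps output.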