Ok, both of the patches look sane to me, but it would really be nice
to hear from somebody with the actual affected architectures, and get
a tested-by.
Testing it on hacked-up x86 sounds fine, but doesn't quite have the
same kind of "yes, this fixes the actual problem" feel to it.
Also, can you clarify: does the second patch make the first patch just
an "irrelevant safety net", or are there possible callers of
topology_add_dev() that could cause problems? I'm just wondering
whether maybe the safety net ends up then possibly hiding some future
bug where we (once more) don't register a cpu and then never really
notice?
Or am I just being difficult?
Linus
On Sun, Jan 8, 2012 at 12:48 PM, Ben Hutchings <ben@decadent.org.uk> wrote:
> Commit ccbc60d3e19a1b6ae66ca0d89b3da02dde62088b ('topology: Provide
> CPU topology in sysfs in !SMP configurations') causes a crash at boot
> on a several architectures. *The topology sysfs code assumes that
> there is a CPU device for each online CPU whereas some architectures
> that do not support SMP or cpufreq do not register any CPU devices.
> Check for this before trying to use a device.
>
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
> ---
> *drivers/base/topology.c | * *5 ++++-
> *1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/base/topology.c b/drivers/base/topology.c
> index ae989c5..4467c85 100644
> --- a/drivers/base/topology.c
> +++ b/drivers/base/topology.c
> @@ -147,6 +147,8 @@ static int __cpuinit topology_add_dev(unsigned int cpu)
> *{
> * * * *struct device *dev = get_cpu_device(cpu);
>
> + * * * if (!dev)
> + * * * * * * * return -ENODEV;
> * * * *return sysfs_create_group(&dev->kobj, &topology_attr_group);
> *}
>
> @@ -154,7 +156,8 @@ static void __cpuinit topology_remove_dev(unsigned int cpu)
> *{
> * * * *struct device *dev = get_cpu_device(cpu);
>
> - * * * sysfs_remove_group(&dev->kobj, &topology_attr_group);
> + * * * if (dev)
> + * * * * * * * sysfs_remove_group(&dev->kobj, &topology_attr_group);
> *}
>
> *static int __cpuinit topology_cpu_callback(struct notifier_block *nfb,
> --
> 1.7.8.2
>
>
>
--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: CA+55aFwwnJPpVY++Vgr6p3s47YjdYWV3sgLZzvCtiJp6DEu78 Q@mail.gmail.com">http://lists.debian.org/CA+55aFwwnJPpVY++Vgr6p3s47YjdYWV3sgLZzvCtiJp6DEu78 Q@mail.gmail.com
01-09-2012, 12:06 AM
richard -rw- weinberger
topology: Check for missing CPU devices
On Mon, Jan 9, 2012 at 1:18 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Ok, both of the patches look sane to me, but it would really be nice
> to hear from somebody with the actual affected architectures, and get
> a tested-by.
UML is affected:
https://lkml.org/lkml/2012/1/8/186
I wasted an hour finding out why it is crashing.
Instead of testing kernels I really should read more LKML. ;-)
--
Thanks,
//richard
--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: CAFLxGvwJWrSRSNxHE620XZkc4w0ayzeqqxZOR1efkh1V+VBO1 w@mail.gmail.com">http://lists.debian.org/CAFLxGvwJWrSRSNxHE620XZkc4w0ayzeqqxZOR1efkh1V+VBO1 w@mail.gmail.com
01-09-2012, 12:29 AM
Linus Torvalds
topology: Check for missing CPU devices
On Sun, Jan 8, 2012 at 5:06 PM, richard -rw- weinberger
<richard.weinberger@gmail.com> wrote:
> On Mon, Jan 9, 2012 at 1:18 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> Ok, both of the patches look sane to me, but it would really be nice
>> to hear from somebody with the actual affected architectures, and get
>> a tested-by.
>
> UML is affected:
> https://lkml.org/lkml/2012/1/8/186
>
> I wasted an hour finding out why it is crashing.
> Instead of testing kernels I really should read more LKML. ;-)
Hmm.
Ben - how about that
static DEFINE_PER_CPU(struct cpu, cpu_devices);
approach that Richard uses in his patch, instead of the kcalloc? And
clearly UM should also do that CONFIG_GENERIC_CPU_DEVICES thing with
your patch.
Richard - does Ben's patch work for you too if you just add "select
GENERIC_CPU_DEVICES" in the UM Kconfig too (Kconfig.common, probably)?
Linus
--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: CA+55aFwVnLE-DLRzS8G=8NxQZjF5M_O4kgrPoX+AOOEH-OZgog@mail.gmail.com">http://lists.debian.org/CA+55aFwVnLE-DLRzS8G=8NxQZjF5M_O4kgrPoX+AOOEH-OZgog@mail.gmail.com
01-09-2012, 01:47 AM
Ben Hutchings
topology: Check for missing CPU devices
On Sun, 2012-01-08 at 16:18 -0800, Linus Torvalds wrote:
> Ok, both of the patches look sane to me, but it would really be nice
> to hear from somebody with the actual affected architectures, and get
> a tested-by.
>
> Testing it on hacked-up x86 sounds fine, but doesn't quite have the
> same kind of "yes, this fixes the actual problem" feel to it.
Indeed.
> Also, can you clarify: does the second patch make the first patch just
> an "irrelevant safety net", or are there possible callers of
> topology_add_dev() that could cause problems? I'm just wondering
> whether maybe the safety net ends up then possibly hiding some future
> bug where we (once more) don't register a cpu and then never really
> notice?
[...]
driver_init() doesn't check that cpu_dev_init() - or any of the other
functions it calls - is successful. So in theory at least we could boot
and still have no CPU devices after the first patch.
Ben.
--
Ben Hutchings
Life is what happens to you while you're busy making other plans.
- John Lennon
01-09-2012, 01:52 AM
Ben Hutchings
topology: Check for missing CPU devices
On Sun, 2012-01-08 at 17:29 -0800, Linus Torvalds wrote:
> On Sun, Jan 8, 2012 at 5:06 PM, richard -rw- weinberger
> <richard.weinberger@gmail.com> wrote:
> > On Mon, Jan 9, 2012 at 1:18 AM, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >> Ok, both of the patches look sane to me, but it would really be nice
> >> to hear from somebody with the actual affected architectures, and get
> >> a tested-by.
> >
> > UML is affected:
> > https://lkml.org/lkml/2012/1/8/186
> >
> > I wasted an hour finding out why it is crashing.
> > Instead of testing kernels I really should read more LKML. ;-)
>
> Hmm.
>
> Ben - how about that
>
> static DEFINE_PER_CPU(struct cpu, cpu_devices);
>
> approach that Richard uses in his patch, instead of the kcalloc?
That seems perfectly good.
> And
> clearly UM should also do that CONFIG_GENERIC_CPU_DEVICES thing with
> your patch.
>
> Richard - does Ben's patch work for you too if you just add "select
> GENERIC_CPU_DEVICES" in the UM Kconfig too (Kconfig.common, probably)?
Sorry, I meant to cover UM as well but I couldn't see how its Kconfig
files were organised.
Ben.
--
Ben Hutchings
Life is what happens to you while you're busy making other plans.
- John Lennon
01-09-2012, 01:56 AM
Ben Hutchings
topology: Check for missing CPU devices
On Mon, 2012-01-09 at 02:47 +0000, Ben Hutchings wrote:
> On Sun, 2012-01-08 at 16:18 -0800, Linus Torvalds wrote:
> > Ok, both of the patches look sane to me, but it would really be nice
> > to hear from somebody with the actual affected architectures, and get
> > a tested-by.
> >
> > Testing it on hacked-up x86 sounds fine, but doesn't quite have the
> > same kind of "yes, this fixes the actual problem" feel to it.
>
> Indeed.
>
> > Also, can you clarify: does the second patch make the first patch just
> > an "irrelevant safety net", or are there possible callers of
> > topology_add_dev() that could cause problems? I'm just wondering
> > whether maybe the safety net ends up then possibly hiding some future
> > bug where we (once more) don't register a cpu and then never really
> > notice?
> [...]
>
> driver_init() doesn't check that cpu_dev_init() - or any of the other
> functions it calls - is successful. So in theory at least we could boot
> and still have no CPU devices after the first patch.
I mean to say that we could have no CPU devices after the *second*
patch. So the first patch is an extra defence against that. (Though we
could just as well panic if register_cpu() fails at boot time.)
Ben.
--
Ben Hutchings
Life is what happens to you while you're busy making other plans.
- John Lennon
01-09-2012, 01:19 PM
Richard Weinberger
topology: Check for missing CPU devices
Am 09.01.2012 03:52, schrieb Ben Hutchings:
> On Sun, 2012-01-08 at 17:29 -0800, Linus Torvalds wrote:
>> On Sun, Jan 8, 2012 at 5:06 PM, richard -rw- weinberger
>> <richard.weinberger@gmail.com> wrote:
>>> On Mon, Jan 9, 2012 at 1:18 AM, Linus Torvalds
>>> <torvalds@linux-foundation.org> wrote:
>>>> Ok, both of the patches look sane to me, but it would really be nice
>>>> to hear from somebody with the actual affected architectures, and get
>>>> a tested-by.
>>>
>>> UML is affected:
>>> https://lkml.org/lkml/2012/1/8/186
>>>
>>> I wasted an hour finding out why it is crashing.
>>> Instead of testing kernels I really should read more LKML. ;-)
>>
>> Hmm.
>>
>> Ben - how about that
>>
>> static DEFINE_PER_CPU(struct cpu, cpu_devices);
>>
>> approach that Richard uses in his patch, instead of the kcalloc?
>
> That seems perfectly good.
>
>> And
>> clearly UM should also do that CONFIG_GENERIC_CPU_DEVICES thing with
>> your patch.
>>
>> Richard - does Ben's patch work for you too if you just add "select
>> GENERIC_CPU_DEVICES" in the UM Kconfig too (Kconfig.common, probably)?
>
Yeah, Ben's patch solves the problem too. :-)
Ben, please add the attached patch snippet to your patch.
On Sun, Jan 8, 2012 at 21:48, Ben Hutchings <ben@decadent.org.uk> wrote:
> Commit ccbc60d3e19a1b6ae66ca0d89b3da02dde62088b ('topology: Provide
> CPU topology in sysfs in !SMP configurations') causes a crash at boot
> on a several architectures. *The topology sysfs code assumes that
> there is a CPU device for each online CPU whereas some architectures
> that do not support SMP or cpufreq do not register any CPU devices.
> Check for this before trying to use a device.
>
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Thanks!
Works on its own or combined with "[PATCH 2/2] cpu: Register a generic
CPU device on architectures that currently do not".
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
* * * * * * * * * * * * * ** ** -- Linus Torvalds
--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/CAMuHMdV614Hb9dFykFNbkwf-6hqx36qg_WftNqsexEHQ8d0A@mail.gmail.com
01-09-2012, 07:15 PM
Geert Uytterhoeven
topology: Check for missing CPU devices
On Mon, Jan 9, 2012 at 02:06, richard -rw- weinberger
<richard.weinberger@gmail.com> wrote:
> Instead of testing kernels I really should read more LKML. ;-)
As an architecture maintainer, you want to read at least linux-arch.
Gr{oetje,eeting}s,
* * * * * * * * * * * * Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
* * * * * * * * * * * * * ** ** -- Linus Torvalds
--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: CAMuHMdWrj5qGvnwaXL62btSuSJF1OD3gjiyRD1KVc2bWWgD4u w@mail.gmail.com">http://lists.debian.org/CAMuHMdWrj5qGvnwaXL62btSuSJF1OD3gjiyRD1KVc2bWWgD4u w@mail.gmail.com
01-09-2012, 07:23 PM
Linus Torvalds
topology: Check for missing CPU devices
On Sun, Jan 8, 2012 at 6:56 PM, Ben Hutchings <ben@decadent.org.uk> wrote:
>
> I mean to say that we could have no CPU devices after the *second*
> patch. *So the first patch is an extra defence against that. *(Though we
> could just as well panic if register_cpu() fails at boot time.)
I think I'd rather just panic - if you have allocation issues during
early boot, the machine is hosed anyway - and then get rid of the
first patch?
Willing to send out a new patch along those lines (and with UML added)?
Thanks,
Linus
--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: CA+55aFzz3FUsKRQ33NiDghkA3ugV65sBrc+N-h8NLuDRUSBAHQ@mail.gmail.com">http://lists.debian.org/CA+55aFzz3FUsKRQ33NiDghkA3ugV65sBrc+N-h8NLuDRUSBAHQ@mail.gmail.com