> On Tue, 27 Sep 2011 00:45:01 -0500
> Jonathan Nieder <jrnieder@gmail.com> wrote:
> > John O'Hagan wrote:
> >
> > > I run daily-updated testing on an HP Mini 5102 with N10 graphics
> > > card (uses i915) running XFCE4.8. When 3.0.0-1 kernel came in I
> > > installed both RT and non-RT versions. On non-RT version, both
> > > suspend-to-ram and resume work as expected. On RT kernel they work
> > > with only XFCE desktop running, however if other programs are
> > > running as well (I've tried Chromium and Sylpheed), when suspend is
> > > attempted, the screen goes black, fan runs high, and keyboard
> > > becomes unresponsive, requiring a hard shutdown every time. I've
> > > reproduced this about a dozen times.
> > >
> > > Nothing unusual in pm-suspend log that I could see.
> >
> > Excellent detective work; thank you. It would be nice to pin this on
> > a particular driver, if possible. So:
> >
> > 1. Can you reproduce it with suspend-to-disk, too? (That means
> > "echo disk >/sys/power/state".)
>
> Yes, behaviour is the same.
>
> > 2. Can you reproduce it without the graphics driver loaded (e.g., if
> > you can reproduce it by inducing some network activity with programs
> > like "w3m" from a terminal in X, can you do the same by booting with
> > "single" on the kernel command line and trying the same in the
> > console)? I realize this might be hard to check.
>
> This is confusing: I can reproduce it in single-user mode with nothing
> additional running with echo mem > /sys/power/state. Not every time
> but most times.
>
> > 3. Since hey, one can be lucky sometimes: is it possible to catch the
> > failure as it happens, for example by not suspending the console
> > and suspending everything else? See
> >
> > https://raw.github.com/torvalds/linux/master/Documentation/power/basic-pm-debugging.txt
> > https://raw.github.com/torvalds/linux/master/Documentation/networking/netconsole.txt
> >
> > (or the analogously named files in the linux-doc-3.0.0 package)
> > for hints in that direction.
>
> This is a little out of my comfort zone, but I did follow the
> procedure in the first link, running the sequential tests with
> /sys/power/pm_test. "freezer", "devices" and "platform" tests all
> work, the failure occurs with "processors", both for STR and STD.
> AFAICT from the link, this means the problem is not with a driver but
> with processor states? But my laptop only has a single processor (even
> though in /sys/devices/system/cpu/ there is cpu0 and cpu1, only the
> latter has a file "online" mentioned in the link, is this relevant?).
Did you try adding
no_console_suspend
to the kernel commandline?
> > 4. Does Alt+Sysrq work in the broken state? If so, the following
> > could be useful.
>
> No, totally frozen.
That means you tried Alt+Sysrq+b and nothing happened? But sysrq works
before the crash?
> > > ** Tainted: C (1024)
> > > * Module from drivers/staging has been loaded.
> > [...]
> > > [ 8.960305] brcmutil: module is from the staging directory, the quality is unknown, you have been warned.
> > > [ 9.019973] brcmsmac: module is from the staging directory, the quality is unknown, you have been warned.
> >
> > Is it possible to reproduce this without brcmsmac? E.g., does the b43
> > driver support your card (I haven't checked)?
>
> With this kernel AFAIK I can only use brcmsmac, but I can unload it and still reproduce the problem.
Better not to load it at all. I don't think it's responsible for the
problem but ruling this out should be easy enough.
--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/20110927125953.GK20550@pengutronix.de
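
For anyone following along, the "Tainted: C (1024)" line quoted above is a bitmask, and 1024 is bit 10, the staging-driver flag. A minimal Python sketch of decoding such a value follows. The table is an assumption in the sense that it only covers the low bits as I recall them from the kernel's tainted-kernels documentation, and the descriptions are paraphrased; `decode_taint` is a made-up helper name, not a kernel or Debian tool.

```python
# Taint flag letters by bit position (bit 0 first). Only the low bits are
# listed; the descriptions are paraphrased, not quoted from the kernel.
TAINT_FLAGS = [
    ("P", "proprietary module loaded"),
    ("F", "module force-loaded"),
    ("S", "unsafe SMP configuration"),
    ("R", "module force-unloaded"),
    ("M", "machine check exception occurred"),
    ("B", "bad page referenced"),
    ("U", "taint requested by userspace"),
    ("D", "kernel died recently (OOPS or BUG)"),
    ("A", "ACPI table overridden"),
    ("W", "kernel issued a warning"),
    ("C", "staging driver loaded"),
]


def decode_taint(value):
    """Return the (letter, description) pairs whose bits are set in value."""
    return [flag for bit, flag in enumerate(TAINT_FLAGS) if value & (1 << bit)]


if __name__ == "__main__":
    # 1024 is the value reported in this bug.
    for letter, meaning in decode_taint(1024):
        print(letter, "-", meaning)
```

On a running system the current value can be read from /proc/sys/kernel/tainted and passed to the same function.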
> Hi John,
>
> > On Tue, 27 Sep 2011 00:45:01 -0500
> > Jonathan Nieder <jrnieder@gmail.com> wrote:
[...]
> > > 3. Since hey, one can be lucky sometimes: is it possible to
> > > catch the failure as it happens, for example by not suspending
> > > the console and suspending everything else? See
> > >
> > > https://raw.github.com/torvalds/linux/master/Documentation/power/basic-pm-debugging.txt
> > > https://raw.github.com/torvalds/linux/master/Documentation/networking/netconsole.txt
> > >
> > > > (or the analogously named files in the linux-doc-3.0.0
> > > package) for hints in that direction.
> >
> > This is a little out of my comfort zone, but I did follow the
> > procedure in the first link, running the sequential tests with
> > /sys/power/pm_test. "freezer", "devices" and "platform" tests all
> > work, the failure occurs with "processors", both for STR and STD.
> > AFAICT from the link, this means the problem is not with a driver but
> > with processor states? But my laptop only has a single processor
> > (even though in /sys/devices/system/cpu/ there is cpu0 and cpu1,
> > only the latter has a file "online" mentioned in the link, is this
> > relevant?).
> Did you try adding
>
> no_console_suspend
>
> to the kernel commandline?
I have now: for STR, the screen goes blank so if the freeze occurs I don't get to see any messages. For STD, all messages are normal up to and including the one that says the image is being saved (didn't note the exact wording), and there it stops.
> > > 4. Does Alt+Sysrq work in the broken state? If so, the following
> > > could be useful.
> >
> > No, totally frozen.
> That means you tried Alt+Sysrq+b and nothing happened? But sysrq works
> before the crash?
Yes, and yes.
[...]
> > > Is it possible to reproduce this without brcmsmac? E.g., does
> > > the b43 driver support your card (I haven't checked)?
> >
> > With this kernel AFAIK I can only use brcmsmac, but I can unload it
> > and still reproduce the problem.
> Better not to load it at all. I don't think it's responsible for the
> problem but ruling this out should be easy enough.
[...]
OK, now I have tried it with brcmsmac blacklisted so it doesn't get loaded; the behaviour is the same.
As Jonathan has advised, I tried the Sid rt kernel and got the same results.
I would add though that the behaviour is not as consistent as it seemed yesterday: sometimes (today) I can suspend successfully several times in a row, sometimes there are many consecutive failures. It _seems_ to be more likely when biggish programs are running, although yesterday it failed in "processors" test mode every time (maybe ten times) even in single user mode. Hard to say, too, if the unstable kernel behaves a little differently or not. Seems to be random, but without doing hundreds of tests (and hard shutdowns and kernel swaps) I can't be sure.
Should I go ahead with a report to linux-rt-users as Jonathan has advised?
Best regards,
John
--
Archive: http://lists.debian.org/20110929000438.1729759d0e560ae036cdc58a@gmail.com
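
The pm_test procedure discussed in this message steps through the test levels from shallowest to deepest; per basic-pm-debugging.txt, the "processors" level adds the disabling of non-boot CPUs on top of what "platform" already exercises. Below is a small Python sketch of reading the file's format and picking out the first failing level. `parse_pm_test` and `first_failing_level` are hypothetical helper names, and the results dictionary is filled in by hand from what John reported, not from any log.

```python
# Test levels in the order the debugging document suggests trying them,
# from the shallowest to the deepest.
TEST_ORDER = ["freezer", "devices", "platform", "processors", "core"]


def parse_pm_test(text):
    """Split the contents of /sys/power/pm_test into (active, levels).

    The file lists every level on one line and marks the selected one
    with square brackets, for example:
    "[none] core processors platform devices freezer"
    """
    active = None
    levels = []
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            word = word[1:-1]
            active = word
        levels.append(word)
    return active, levels


def first_failing_level(results):
    """Return the shallowest level in TEST_ORDER recorded as failing, or None.

    `results` maps a level name to True (test passed) or False (test hung).
    Levels missing from `results` count as untested and are skipped.
    """
    for level in TEST_ORDER:
        if results.get(level) is False:
            return level
    return None


if __name__ == "__main__":
    # Outcome as described in this thread, entered by hand.
    observed = {"freezer": True, "devices": True,
                "platform": True, "processors": False}
    print(first_failing_level(observed))
```

Since the failures turned out to be intermittent, a single pass per level is weak evidence; repeating each level several times before recording True would make the result more trustworthy.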
09-29-2011, 07:43 AM
Uwe Kleine-König
Bug#643301:
Hello John,
On Thu, Sep 29, 2011 at 12:04:38AM +1000, John O'Hagan wrote:
> On Tue, 27 Sep 2011 14:59:53 +0200
> Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote:
> > > On Tue, 27 Sep 2011 00:45:01 -0500
> > > Jonathan Nieder <jrnieder@gmail.com> wrote:
> > > > 3. Since hey, one can be lucky sometimes: is it possible to
> > > > catch the failure as it happens, for example by not suspending
> > > > the console and suspending everything else? See
> > > >
> > > > https://raw.github.com/torvalds/linux/master/Documentation/power/basic-pm-debugging.txt
> > > > https://raw.github.com/torvalds/linux/master/Documentation/networking/netconsole.txt
> > > >
> > > > > (or the analogously named files in the linux-doc-3.0.0
> > > > package) for hints in that direction.
> > >
> > > This is a little out of my comfort zone, but I did follow the
> > > procedure in the first link, running the sequential tests with
> > > /sys/power/pm_test. "freezer", "devices" and "platform" tests all
> > > work, the failure occurs with "processors", both for STR and STD.
> > > > AFAICT from the link, this means the problem is not with a driver but
> > > with processor states? But my laptop only has a single processor
> > > (even though in /sys/devices/system/cpu/ there is cpu0 and cpu1,
> > > only the latter has a file "online" mentioned in the link, is this
> > > relevant?).
> > Did you try adding
> >
> > no_console_suspend
> >
> > to the kernel commandline?
>
> I have now: for STR, the screen goes blank so if the freeze occurs I
> don't get to see any messages. For STD, all messages are normal up to
> and including the one that says the image is being saved (didn't note
> the exact wording), and there it stops.
As netconsole doesn't work on -rt IIRC, can you try using a serial
console?
> Should I go ahead with a report to linux-rt-users as Jonathan has
> advised?
You can try, but I already asked in the linux-rt irc channel and Thomas
has no idea and doesn't seem to be interested in looking deeper.
--
Archive: http://lists.debian.org/20110929074352.GF6949@pengutronix.de
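
On the still-open question of cpu0 and cpu1 under /sys/devices/system/cpu/: as far as I know, on x86 the boot CPU usually cannot be taken offline, which would explain why only cpu1 has an "online" file, and two entries on a single-processor laptop may simply be two hardware threads of one core. That last part is an assumption, not something confirmed in this thread. A short Python sketch for listing which CPUs expose the file (`hotplug_status` is a made-up name):

```python
import os
import re


def hotplug_status(cpu_root="/sys/devices/system/cpu"):
    """Map each cpuN directory to its 'online' value, or None if the file is absent."""
    status = {}
    for name in sorted(os.listdir(cpu_root)):
        if not re.fullmatch(r"cpu\d+", name):
            continue  # skip cpufreq, cpuidle and other non-CPU entries
        online_path = os.path.join(cpu_root, name, "online")
        if os.path.exists(online_path):
            with open(online_path) as handle:
                status[name] = handle.read().strip()
        else:
            status[name] = None
    return status


if __name__ == "__main__":
    root = "/sys/devices/system/cpu"
    if os.path.isdir(root):
        for cpu, state in hotplug_status(root).items():
            print(cpu, "no online file" if state is None else "online=" + state)
```

A CPU reported with no online file is one the "processors" test would leave running; the others are the non-boot CPUs it tries to disable.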
10-05-2011, 09:35 AM
Uwe Kleine-König
Bug#643301:
tag 643301 + upstream fixed-upstream
quit
Hello,
this should be fixed with 3.0.6-rt16 that will be included in 3.0.0-5.
--
Archive: http://lists.debian.org/20111005093515.GP6949@pengutronix.de
10-05-2011, 03:08 PM
John O'Hagan
Bug#643301:
On Wed, 5 Oct 2011 11:35:15 +0200
Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote:
> tag 643301 + upstream fixed-upstream
> quit
>
> Hello,
>
> this should be fixed with 3.0.6-rt16 that will be included in 3.0.0-5.
Does this mean I can forget about the more complex tests Jonathan has suggested? Has the problem already been isolated?
--
Archive: http://lists.debian.org/20111006020842.ee4892ad0d1af4ec69f0cb1c@gmail.com
10-05-2011, 07:07 PM
Uwe Kleine-König
Bug#643301:
On Thu, Oct 06, 2011 at 02:08:42AM +1100, John O'Hagan wrote:
> On Wed, 5 Oct 2011 11:35:15 +0200
> Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote:
> > this should be fixed with 3.0.6-rt16 that will be included in 3.0.0-5.
>
> Does this mean I can forget about the more complex tests Jonathan has suggested? Has the problem already been isolated?
You can try my private build which includes -rt16. It's available at
--
Archive: http://lists.debian.org/20111005190718.GR6949@pengutronix.de