12-06-2011, 10:00 PM
Jonathan Nieder

Bug#644550: indefinite soft lockup on rm

tags 644550 + upstream
found 644550 linux-2.6/3.0.0-5
quit

Egon Eckert wrote:

> Attached. It makes no sense for me. And it doesn't seem to contain the _CST
> table, maybe because these being disabled in BIOS?

Yep, I meant acpidump with c-states enabled. But don't worry about it.

> As of Shyam_Iyer's answers in the list, I'm probably being hit by known
> Nehalem/Westmere C-states transition trouble (documented in Intel's errata).
> Maybe another workaround (instead of disabling the C-states completely in
> BIOS) would be booting with
>
> processor.max_cstate=2

Sounds sensible. I hope people more familiar with the problem will
work on a patch upstream, so it can make its way to the 2.6.32.y
stable kernel and all distributors benefit.

Here's the best article on such trouble I can find from a quick
search: [1]. Do you happen to know the name of the erratum, or an
Intel document describing it?

I don't see any relevant fixes upstream recently, but please confirm
the problem with a 3.2 release candidate from experimental, and then
we should take this upstream (that means the linux-pm@vger.kernel.org
list, cc-ing Len Brown <lenb@kernel.org>,
linux-kernel@vger.kernel.org, and either me or this bug log so we can
track it).

Thanks a lot,
Jonathan

[1] http://support.citrix.com/article/CTX127395



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/20111206230034.GB5821@elie.hsd1.il.comcast.net
 
12-09-2011, 08:26 AM
Egon Eckert

Bug#644550: indefinite soft lockup on rm

> > Attached. It makes no sense for me. And it doesn't seem to contain the _CST
> > table, maybe because these being disabled in BIOS?
>
> Yep, I meant acpidump with c-states enabled. But don't worry about it.

It's at

http://joni.heaven-industries.com/~egon/tornado/acpidump-cst-enabled.txt

(much bigger now with C-states enabled)

> Here's the best article on such trouble I can find from a quick
> search: [1]. Do you happen to know the name of the erratum, or an
> Intel document describing it?

No, nothing beyond what an ordinary guy can find on the web:

http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-8800-4800-2800-families-specification-update.html

...quite a few of the errata there mention C-states.

> I don't see any relevant fixes upstream recently, but please confirm
> the problem with a 3.2 release candidate from experimental, and then
> we should take this upstream (that means the linux-pm@vger.kernel.org
> list, cc-ing Len Brown <lenb@kernel.org>,
> linux-kernel@vger.kernel.org, and either me or this bug log so we can
> track it).

Actually, 3.2.0rc4 seems to run well! The big (and relevant) change since
the squeeze kernel is that the new one has the intel_idle driver (which kicks
in instead of acpi_idle):

(2.6.32)
root@tornado:/sys/devices/system/cpu/cpu0/cpuidle# grep . */*
state0/desc:CPUIDLE CORE POLL IDLE
state0/latency:0
state0/name:C0
state0/power:4294967295
state0/time:735633
state0/usage:85
state1/desc:ACPI FFH INTEL MWAIT 0x0
state1/latency:1
state1/name:C1
state1/power:1000
state1/time:457743320
state1/usage:49228
state2/desc:ACPI FFH INTEL MWAIT 0x10
state2/latency:64
state2/name:C2
state2/power:500
state2/time:86053513
state2/usage:47278
state3/desc:ACPI FFH INTEL MWAIT 0x20
state3/latency:96
state3/name:C3
state3/power:350
state3/time:1759785634
state3/usage:365191

(3.2.0rc4)
root@tornado:/sys/devices/system/cpu/cpu0/cpuidle# grep . */*
state0/desc:CPUIDLE CORE POLL IDLE
state0/latency:0
state0/name:POLL
state0/power:4294967295
state0/time:1985644
state0/usage:9221
state1/desc:MWAIT 0x00
state1/latency:3
state1/name:C1-NHM
state1/power:4294967294
state1/time:411021632
state1/usage:9459157
state2/desc:MWAIT 0x10
state2/latency:20
state2/name:C3-NHM
state2/power:4294967293
state2/time:1632399690
state2/usage:2649168
state3/desc:MWAIT 0x20
state3/latency:200
state3/name:C6-NHM
state3/power:4294967292
state3/time:51728887664
state3/usage:8613936

...and it possibly handles the transitions better, even though it (as it
seems) lets the CPU enter the deeper states too (the ones that are
problematic on Nehalem/Westmere?). However, I can trigger the lockup only
about once a day, so this needs more testing. So far I only know that the
experimental kernel has survived one run of the test case that reliably
crashes the squeeze kernel (with no kernel cmdline tweaks regarding idle
behavior).
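
For anyone comparing the two listings above, here is a minimal Python sketch (not part of the original report) that parses `grep . */*` output from a cpuidle directory and computes each state's share of the recorded time. The function names and the trimmed SAMPLE text are illustrative assumptions; the `time` values are copied from the 3.2.0rc4 listing and are cumulative microseconds as exposed by the cpuidle sysfs interface.

```python
import re

# Trimmed copy of the 3.2.0rc4 listing above (name and time lines only).
SAMPLE = """\
state0/name:POLL
state0/time:1985644
state1/name:C1-NHM
state1/time:411021632
state2/name:C3-NHM
state2/time:1632399690
state3/name:C6-NHM
state3/time:51728887664
"""

# Matches lines such as "state3/latency:200"; shell prompt lines are skipped.
LINE = re.compile(r"^(state\d+)/(\w+):(.*)$")


def parse_cpuidle_dump(text):
    """Turn `grep . */*` output into {"state0": {"name": ..., "time": ...}, ...}."""
    states = {}
    for raw in text.splitlines():
        match = LINE.match(raw.strip())
        if match is None:
            continue
        state, attr, value = match.groups()
        states.setdefault(state, {})[attr] = value
    return states


def residency_share(states):
    """Return {state name: fraction of the total recorded time}."""
    total = sum(int(attrs["time"]) for attrs in states.values())
    if total == 0:
        return {attrs["name"]: 0.0 for attrs in states.values()}
    return {attrs["name"]: int(attrs["time"]) / total for attrs in states.values()}


if __name__ == "__main__":
    for name, share in residency_share(parse_cpuidle_dump(SAMPLE)).items():
        print(f"{name}: {share:.1%}")
```

On the 3.2.0rc4 numbers this puts roughly 96% of the recorded time in C6-NHM, which fits the observation that intel_idle does let the CPU enter the deep states. The counters are cumulative since boot, so the shares say nothing about when a lockup would occur.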

As for an eventual 2.6.32.x patch, I'm not sure what it should look like;
an ACPI blacklist update, perhaps?

Now I'm switching back to the stable kernel (it's the one the company wants
to run in the long term) and I will try to confirm the processor.max_cstate
workaround (the acpi_idle driver's cmdline option). I'll be back.
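
When testing the workaround, it helps to confirm that the option really reached the running kernel. The following is a minimal Python sketch, assuming the option is spelled `processor.max_cstate=N` as earlier in this thread; the function name is hypothetical, and treating a later occurrence as overriding an earlier one is an assumption of this sketch.

```python
import re

# Exact-token match for the option as written in this thread.
PARAM = re.compile(r"processor\.max_cstate=(\d+)")


def max_cstate_from_cmdline(cmdline):
    """Return N from processor.max_cstate=N, or None if the option is absent."""
    value = None
    for token in cmdline.split():
        match = PARAM.fullmatch(token)
        if match:
            # Assumption of this sketch: the last occurrence wins.
            value = int(match.group(1))
    return value


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as handle:
            running = handle.read()
    except OSError:
        running = ""  # not on Linux, or /proc is unavailable
    print("processor.max_cstate:", max_cstate_from_cmdline(running))
```

This only reports what was passed at boot. Whether the deeper states are actually gone is better checked against the cpuidle directory listing shown above.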

Thanks,

Egon



--
Archive: http://lists.debian.org/20111209092630.GA6728@heaven-industries.com
 
