10-01-2011, 09:20 AM
Jonathan Nieder

Bug#538158: BUG: soft lockup - CPU#5 stuck for 62s with 2.6.26-2-686-bigmem kernel

Hi,

Juan Miguel Corral Cano wrote:

> Hi. I had the same problem on several machines of my network (all of them
> using nfs-mounted homes). We installed kernel 2.6.32-2 from lenny-backports
> a couple months ago, and no hang since then.
>
> Hope it is helpful. Best regards. Juan.

Thanks, that _is_ helpful.

Unfortunately none of the patches between 2.6.26 and 2.6.32 jumps out
at me as likely to have fixed this. So I would like more help, if you
are still interested (and please let me know if you are not). Quick
questions:

- Arcady, are you still experiencing the soft lockups in default_idle?
If so, can you confirm Juan's finding that 2.6.32 from
lenny-backports avoids trouble?

- Juan, do you still have machines that can be convinced to produce
the soft lockups again? Could you send a log snippet (starting at
BUG and ending at =======================) for such a soft lockup,
either by reproducing it again or by extracting it from 2010's
kernel logs if you still have them?

- Assuming the lenny kernel is still broken and the lenny-backports
kernel fixed, a useful next step would be to try an intermediate
kernel (2.6.30 or so) from snapshot.debian.org to help narrow down
the search for the fix.
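
For anyone digging through old kernel logs for the snippet requested above, here is a minimal Python sketch. It assumes each report begins on a line containing "BUG: soft lockup" and ends on a line that closes with a run of "=" characters, as described in the second question. The function names and the log path argument are hypothetical, not part of any Debian tool.

```python
import re
from typing import Iterable, Iterator, List

# Assumption: a report opens with this text and closes with a line of '=' signs.
START = re.compile(r"BUG: soft lockup")
END = re.compile(r"={8,}\s*$")


def soft_lockup_snippets(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each block of lines from a 'BUG: soft lockup' line to its '=====' line."""
    block: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if START.search(line):
            if block:
                yield block      # previous report never reached its terminator
            block = [line]
        elif block:
            block.append(line)
            if END.search(line):
                yield block
                block = []
    if block:
        yield block              # log ended before the terminator line


def print_snippets(path: str) -> None:
    """Print every soft lockup report found in the log file at `path`."""
    with open(path, errors="replace") as log:
        for snippet in soft_lockup_snippets(log):
            print("\n".join(snippet))
            print()
```

Calling `print_snippets()` on a saved kernel log (the location depends on the local syslog setup) prints each report separated by a blank line, ready to paste into a reply.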

Cheers,
Jonathan



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/20111001091900.GA14119@elie
 
10-02-2011, 07:58 PM
Jonathan Nieder

Bug#538158: BUG: soft lockup - CPU#5 stuck for 62s with 2.6.26-2-686-bigmem kernel

Juan Miguel Corral Cano wrote:

> I am not 100% sure, but I think we installed every kernel update on these
> machines until the problem was fixed. So maybe the first place to look is
> the 2.6.31/32 changelog.

Just for reference: I assume you were on i386, tracking
lenny-backports, and that "the 2.6.32-2 kernel" means something like

linux-image-2.6.32-bpo.2-686-bigmem 2.6.32-8~bpo50

Based on [1] and surrounding pages, I think the kernel before that in
lenny-backports was

linux-image-2.6.30-bpo.2-686-bigmem 2.6.30-8~bpo50+2

> The only clue I can give is that both machines had NFS4+kerberos mounted
> homes, so maybe it was related to bug 552706.

That reminds me: do you remember what exactly indicated to you that
the problem on your machines was the same as bug#538158? Was it just
the same message ("soft lockup - CPU#5 stuck") or did your machines
show the same shallow backtrace with only default_idle and cpu_idle in
it?
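
One way to check the second possibility mechanically is sketched below in Python. It assumes call-trace lines look like ` [<c0102bd2>] default_idle+0x0/0x53`, optionally with a "? " before the symbol. That layout is an assumption about the log format, and the helper names are hypothetical.

```python
import re
from typing import Iterable, List

# Assumed frame layout: "[<address>] symbol+0xoff/0xsize", optionally "? symbol".
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x")
IDLE_ONLY = {"default_idle", "cpu_idle"}


def trace_functions(lines: Iterable[str]) -> List[str]:
    """Return the symbol names found in the call-trace lines, in order."""
    return [m.group(1) for line in lines for m in FRAME.finditer(line)]


def is_shallow_idle_trace(lines: Iterable[str]) -> bool:
    """True if the trace has at least one frame and names only the idle functions."""
    funcs = set(trace_functions(lines))
    return bool(funcs) and funcs <= IDLE_ONLY
```

A trace that also names NFS or Kerberos functions would return False here, which would suggest a different problem from the one in this bug.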

Thanks for some useful clarifications,
Jonathan

[1] http://lists.debian.org/debian-backports-changes/2010/02/subject.html



Archive: http://lists.debian.org/20111002195827.GC17331@elie


From: Chí-Thanh Christopher Nguyễn <chithanh@gentoo.org>
Date: Sun, 02 Oct 2011 22:00:30 +0200
To: gentoo-dev@lists.gentoo.org, Mike Frysinger <vapier@gentoo.org>, Samuli Suominen <ssuominen@gentoo.org>
Subject: Re: [gentoo-dev] Re: [gentoo-commits] gentoo-x86 commit in net-im/qutecom: metadata.xml ChangeLog qutecom-2.2_p20110210.ebuild
Mike Frysinger wrote:
> On Sunday, October 02, 2011 08:58:19 Chí-Thanh Christopher Nguyễn wrote:
>> Samuli Suominen wrote:
>>>> Please point to existing authoritative documentation which says that
>>>> downgrades are unacceptable.
>>>>
>>>>> It is NOT gentoo-x86 compatible package in it's current form.
>>>>
>>>> It sets correct dependency on an existing ebuild in tree. The dependency
>>>> is only build time, users can upgrade linux-headers again afterwards.
>>>> The application itself is v4l2 compatible.
>>>
>>> common sense...
>>>
>>> http://bugs.gentoo.org/show_bug.cgi?id=311241#c2
>>> http://bugs.gentoo.org/show_bug.cgi?id=311241#c5
>>
>> linux-headers is not a library, it is strictly a build time dependency
>> for all packages which I am aware of.
>
> forcing downgrades of random packages is extremely poor behavior. it doesn't
> matter if it's DEPEND or RDEPEND behavior. if your one package is the last
> thing to get installed, then you leave the system in a poor state.

I agree that a downgrade is a bit inconvenient for users. But if another
package is built later with DEPEND on newer linux-headers or emerge
--deep option, then it will get upgraded again. As no package runtime
depends on it, the system will not function any worse with old
linux-headers.

I propose that if linux-headers maintainers want to make downgrades
"illegal" (as the new summary to bug 361181 suggests) or restricted in
some way, they do so in policy or code. Not by surprise treecleaning of
packages.

> further, when the newer version gets stabilized and then the older ones
> dropped, what then ? your package is broken.

Yes, when the older one is dropped _that_ would be reason for
masking+removal. However I have not seen any plans of doing so. Actually
the current amd64 stable 2.6 versions are 35, 26 and 10 months old
respectively, I wouldn't expect that to happen any time soon.

> skipping 30 days is a bit premature, but re-adding it at this point doesn't
> make sense. fix it and re-add it, or don't re-add it at all.

That seems to be the only sensible solution at this point. I will
probably find time next week or so to create and test a new ebuild.

Though I must say that allowing this clear policy violation to stand
while I don't see any policy violation in my ebuild leaves a bad impression.


Best regards,
Chí-Thanh Christopher Nguyễn
 
11-24-2011, 02:15 AM
Jonathan Nieder

Bug#538158: BUG: soft lockup - CPU#5 stuck for 62s with 2.6.26-2-686-bigmem kernel

Jonathan Nieder wrote:

> - Arcady, are you still experiencing the soft lockups in default_idle?
> If so, can you confirm Juan's finding that 2.6.32 from
> lenny-backports avoids trouble?

Ping. If you no longer have access to a system that produced these
problems or time to debug it, that's fine, but please let us know so
we can stop tracking it.



Archive: http://lists.debian.org/20111124031506.GA17354@elie.hsd1.il.comcast.net
 
01-01-2012, 09:02 AM
Jonathan Nieder

Bug#538158: BUG: soft lockup - CPU#5 stuck for 62s with 2.6.26-2-686-bigmem kernel

Jonathan Nieder wrote:
> Jonathan Nieder wrote:

>> - Arcady, are you still experiencing the soft lockups in default_idle?
>> If so, can you confirm Juan's finding that 2.6.32 from
>> lenny-backports avoids trouble?
>
> Ping. If you no longer have access to a system that produced these
> problems or time to debug it, that's fine, but please let us know so
> we can stop tracking it.

Trying again with a different email address.



Archive: http://lists.debian.org/20120101100237.GA5558@elie.hsd1.il.comcast.net
 
01-05-2012, 01:09 PM
Arcady Genkin

Bug#538158: BUG: soft lockup - CPU#5 stuck for 62s with 2.6.26-2-686-bigmem kernel

Jonathan Nieder <jrnieder@gmail.com> writes:

> Ping. If you no longer have access to a system that produced these
> problems or time to debug it, that's fine, but please let us know so
> we can stop tracking it.

Hi, sorry for being unresponsive.

We have not seen this bug in a very long while now. I can't tell for
sure, but it feels like at least a year.

The servers are currently running the linux-image-2.6-686-bigmem
2.6.26+17+lenny1 kernel.

That said, these servers are always quite busy, even during the
holidays, so if the bug is related to prolonged idle periods, as was
hypothesized before, then it is no wonder it is not happening.
--
Arcady Genkin : CDF Systems Administrator
http://www.cdf.toronto.edu/~agenkin/



Archive: http://lists.debian.org/uosjjudyw5.fsf@cdf.toronto.edu
 

All times are GMT.