02-02-2012, 07:00 PM
Aman Gupta

Bug#620297: base: vmstat and /proc/loadavg disagree

I'm seeing these issues as well.

The Ubuntu guys pulled the two upstream patches into their 2.6.32
build, and the RedHat guys did the same in their 2.6.35 and 2.6.38
builds. Is there anything preventing us from pulling in the patches
too?

Aman



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/CAK=uwuzLojGY0bCWHYfpvtu5ZmGJNKQuW_v7DvDsff-0YWKQuQ@mail.gmail.com
 
02-02-2012, 08:43 PM
Jonathan Nieder

Bug#620297: base: vmstat and /proc/loadavg disagree

found 620297 linux-2.6/2.6.32-30
found 620297 linux-2.6/2.6.32-36
tags 620297 + patch moreinfo
quit

Hi,

Aman Gupta wrote:

> The Ubuntu guys pulled the two upstream patches into their 2.6.32
> build, and the RedHat guys did the same in their 2.6.35 and 2.6.38
> builds. Is there anything preventing us from pulling in the patches
> too?

I'm coming in late; please forgive my ignorance.

I assume the two patches in question are

  v2.6.35-rc1~521^2~16 (sched: Cure load average vs NO_HZ woes, 2010-04-22)
  v2.6.37-rc7~13^2~5 (sched: Cure more NO_HZ load average woes, 2010-11-30)

The first does not apply cleanly to the 2.6.32.y tree but Lesław Kopeć
backported it. The other applies cleanly on top (and can be retrieved
in patch form with "git show v2.6.37-rc7~13^2~5").

So what would be most useful is:

 1. Try a pristine 2.6.32.y kernel, to make sure it reproduces the problem.
    If you already have a git checkout of the kernel:

        cd linux
        git remote add stable \
          git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
        git fetch stable

    Otherwise:

        git clone -o stable \
          git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git \
          linux
        cd linux

    Then build and test:

        git checkout stable/linux-2.6.32.y
        cp /boot/config-$(uname -r) .config; # stock configuration
        make localmodconfig; # optional: minimize configuration
        make deb-pkg; # optionally with -j<n> for parallel build
        dpkg -i ../<name of package>
        reboot

 2. Try the patches, to make sure they fix it.

        cd linux
        git apply --index patch1
        git apply --index patch2
        make deb-pkg; # maybe with -j4
        dpkg -i ../<name of package>
        reboot

If the problem is reproducible with current 2.6.32.y and the patches fix it,
we can submit them upstream so everyone benefits.

Thanks and hope that helps,
Jonathan



Archive: http://lists.debian.org/20120202214311.GB23410@burratino
 
02-03-2012, 04:24 PM
Aman Gupta

Bug#620297: base: vmstat and /proc/loadavg disagree

I tried a number of kernels on the lenny machine where I'm seeing this
issue, and my results match Lesław's. The load average issue still
exists in 2.6.32.y. The two upstream patches correct the load average
calculation, but only if I also set CONFIG_NO_HZ=n. See the attached
image.

Incorrect load average:
2.6.32-bpo.5-amd64
2.6.32.55
2.6.32.55-620297patch
2.6.32.55-620297patch (nohz=off)

Correct load average:
2.6.32.55-620297patch (CONFIG_NO_HZ=n)

Since CONFIG_NO_HZ=y is default on the debian kernels, the two
upstream patches alone are not going to be enough to fix this issue. I
tried to find more upstream commits related to load averages and
NO_HZ, but came up empty.

If someone can find other commits that might fix load averages in the
NO_HZ=y case, I am happy to try them on my setup.

Aman
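
Since the results above hinge on whether a kernel was built tickless, it can help to record that setting next to each test run. What follows is a minimal sketch, assuming the kernel configuration is available as a file (Debian installs it as /boot/config-<release>). The function names no_hz_setting and booted_with_nohz_off are hypothetical helpers, not part of any tool mentioned in this thread.

```sh
#!/bin/sh
# Report how CONFIG_NO_HZ is set in a kernel configuration file.
# Prints "y", "n", or "unknown" (option not mentioned in the file).
no_hz_setting() {
    config_file=$1
    if grep -q '^CONFIG_NO_HZ=y$' "$config_file"; then
        echo y
    elif grep -q '^# CONFIG_NO_HZ is not set$' "$config_file"; then
        echo n
    else
        echo unknown
    fi
}

# Succeed if the given kernel command line contains nohz=off.
booted_with_nohz_off() {
    case " $1 " in
        *" nohz=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (assumes the Debian /boot/config-<release> convention):
#   no_hz_setting "/boot/config-$(uname -r)"
#   booted_with_nohz_off "$(cat /proc/cmdline)" && echo "nohz=off in effect"
```

Keeping the build-time option and the boot parameter separate matters here, because the two gave different results in the tests listed above.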

 
02-03-2012, 06:05 PM
Jonathan Nieder

Bug#620297: base: vmstat and /proc/loadavg disagree

tags 620297 + upstream
quit

Aman Gupta wrote:

> Incorrect load average:
> 2.6.32-bpo.5-amd64
> 2.6.32.55
> 2.6.32.55-620297patch
> 2.6.32.55-620297patch (nohz=off)
>
> Correct load average:
> 2.6.32.55-620297patch (CONFIG_NO_HZ=n)
>
> Since CONFIG_NO_HZ=y is default on the debian kernels, the two
> upstream patches alone are not going to be enough to fix this issue.

Thanks, Aman. So it looks like this will need more investigation.

I assume kernels from sid do not have the same bug, right? If so,
here's a quick way to narrow the problem down, if you'd like.

 1. Try the upstream kernel that introduced the "Cure more NO_HZ load
    average woes" fix:

        cd linux
        git checkout v2.6.37-rc7~13^2~5
        make silentoldconfig; # reuse configuration
        make deb-pkg; # optionally with -j8 or so
        dpkg -i ../<name of package>
        reboot

 2. Hopefully it does not reproduce the problem. So try its parent:

        cd linux
        git checkout HEAD^
        make silentoldconfig; # reuse configuration
        make -j8 deb-pkg
        dpkg -i ../<name of package>
        reboot

    Hopefully it reproduces the problem.

If so, another test to try:

        cd linux
        git checkout v2.6.35-rc1~521^2~16
        git cherry-pick -x -s v2.6.37-rc7~13^2~5
        make silentoldconfig
        make -j8 deb-pkg
        dpkg -i ../<name of package>
        reboot

If that also works fine, the problem was introduced in backporting the
fix from 2.6.35 to 2.6.32.y --- either it has a missing prerequisite,
or there might be some small textual error.



Archive: http://lists.debian.org/20120203190501.GE29532@burratino
 
02-04-2012, 05:11 PM
Aman Gupta

Bug#620297: base: vmstat and /proc/loadavg disagree

I tested the following kernels in tickless mode (CONFIG_NO_HZ=y), and
they all contain the same load average reporting issue:

2.6.37-rc5-cure-more
2.6.39.4
3.2.2
3.2.4

Lesław, can you confirm that newer kernels still surface the bug in
your environment?

Aman

 
02-04-2012, 08:09 PM
Jonathan Nieder

Bug#620297: base: vmstat and /proc/loadavg disagree

found 620297 linux-2.6/3.2.2-1
quit

Aman Gupta wrote:

> I tested the following kernels in tickless mode (CONFIG_NO_HZ=y), and
> they all contain the same load average reporting issue:
>
> 2.6.37-rc5-cure-more
> 2.6.39.4
> 3.2.2
> 3.2.4

Ah, excellent. ;-)

Please report this to linux-kernel@vger.kernel.org, cc-ing Peter
Zijlstra <peterz@infradead.org>, Chase Douglas
<chase.douglas@canonical.com>, Damien Wyart <damien.wyart@free.fr>,
Kyle McMartin <kyle@redhat.com>, Venkatesh Pallipadi
<venki@google.com>, and either me or this bug log so we can track it.

Be sure to mention:

 - steps to reproduce
 - expected results, actual results, and how the difference indicates
   a bug
 - which kernel versions and configurations you have tested and
   results for each
 - "dmesg" output from booting an affected kernel, as an attachment
 - any other thoughts or weird observations
 - a link to http://bugs.debian.org/620297 for the full story

They may have ideas for commands or patches to try to help track the
problem down.
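
One way to avoid missing an item from the checklist above is to start the report from a skeleton. The sketch below only prints headings to fill in; report_skeleton is a hypothetical helper name, and the headings are one possible wording of the list, not a format required by linux-kernel.

```sh
#!/bin/sh
# Print a plain-text skeleton for the upstream report.
#   $1: kernel release string (for example from `uname -r`)
#   $2: CONFIG_NO_HZ value for that kernel (y or n)
report_skeleton() {
    cat <<EOF
Steps to reproduce:

Expected results:

Actual results:

Kernels tested:
  $1 (CONFIG_NO_HZ=$2):

Other observations:

Full history: http://bugs.debian.org/620297
EOF
}

# Example: fill in the running kernel and save dmesg for attaching.
#   report_skeleton "$(uname -r)" y > report.txt
#   dmesg > dmesg.txt
```

Add one "Kernels tested" entry for each kernel and configuration tried, with its result.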

Sorry for the trouble, and hope that helps,
Jonathan



Archive: http://lists.debian.org/20120204210951.GA3278@burratino
 
02-14-2012, 04:46 AM
Jonathan Nieder

Bug#620297: base: vmstat and /proc/loadavg disagree

(cc-ing the bug log. I hope that's okay.)

Aman Gupta wrote:

> Hey Jonathan. I emailed lkml more than a week ago

Yep, it went through. You can see a link to your message in the
"Forwarded to" field of <http://bugs.debian.org/620297>. As for why
no one responded, I don't know --- probably no one was interested.
(Peter in particular seems to be a busy sort of person that is always
doing useful things and does not pay much attention to mail from
people not named Linus as far as I can tell. ;-))

If you send a reply upstream, someone might be able to give hints
about what information would be useful in order to garner more
responses. Following up with additional information when a message
seems to have been forgotten (once every week and a half, say) is
considered to be a useful way to avoid falling off the radar without
annoying people too much.

The LKML FAQ[*] also tells me:

> One hint: if you attach to your mail a genuinely useful piece of good
> quality code that you wrote, there are good chances that it will be
> answered (choose a good subject line, too).

So if you can come up with a test script or some hacky kernel patch
that demonstrates where the problem seems to lie, that might help.

[*] http://www.tux.org/lkml/
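
A test script along these lines might start by logging the 1-minute load average next to the instantaneous runnable count, both taken from /proc/loadavg. This is only a sketch: summarize_loadavg and sample_loadavg are hypothetical names, and the runnable count normally includes the sampling process itself.

```sh
#!/bin/sh
# Print "<1-minute load> <runnable now>" from one /proc/loadavg line.
# The fourth field of that file has the form runnable/total.
summarize_loadavg() {
    echo "$1" | awk '{ split($4, rq, "/"); print $1, rq[1] }'
}

# Print one timestamped sample every $1 seconds (default 5) until interrupted.
sample_loadavg() {
    interval=${1:-5}
    while :; do
        read -r line < /proc/loadavg
        echo "$(date '+%H:%M:%S') $(summarize_loadavg "$line")"
        sleep "$interval"
    done
}

# Example: log a sample every 5 seconds while a fixed number of
# CPU-bound processes is running.
#   sample_loadavg 5 > loadavg.log
```

With a known number of CPU-bound processes running, the 1-minute figure would be expected to approach that number over time. On the affected tickless kernels described in this thread it reportedly stays low, and a log like this would make the gap visible.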



Archive: http://lists.debian.org/20120214054623.GA9262@burratino
 
02-23-2012, 03:08 PM
Lesław Kopeć

Bug#620297: base: vmstat and /proc/loadavg disagree

On 02/04/2012 07:11 PM, Aman Gupta wrote:
> I tested the following kernels in tickless mode (CONFIG_NO_HZ=y), and
> they all contain the same load average reporting issue:
>
> 2.6.37-rc5-cure-more
> 2.6.39.4
> 3.2.2
> 3.2.4
>
> Lesław, can you confirm that newer kernels still surface the bug in
> your environment?

My apologies for the huge delay in responding. I finally finished
testing my batch of kernels and I can confirm your results - load is
still low on tickless kernels.

I have tested the 2.6.32 and 2.6.37 vanilla kernels with combinations of
CONFIG_NO_HZ set to 'y' and 'n'. Each kernel was compiled without any
patches, with just 74f5187ac8 and finally with 74f5187ac8 and 0f004f5a69
applied.

I have replied to your message to linux-kernel mailing list with a more
detailed description of my findings. I hope it will pique someone's
interest.

--
Lesław Kopeć
 
