SRU: KVM: add schedule check to napi_enable call
SRU Justification:
Impact: Under heavy network I/O load virtio-net driver crashes making VM guest unusable.

Testcase: I left a current Lucid VM running two concurrent "scp -r" of > 200 GB from NFS read-only source to a physical remote host overnight. VM quickly started emitting "page allocation errors" in the system log. Next morning when I checked the VM I could still ping it but could not establish an SSH connection.

Fix: This patch from Bruce Rogers at Novell

 * [PATCH] KVM: add schedule check to napi_enable call
   - http://kerneltrap.org/mailarchive/linux-netdev/2010/6/4/6278660

BugLink: https://bugs.launchpad.net/bugs/579276

-- 
kernel-team mailing list
kernel-team@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/kernel-team
SRU: KVM: add schedule check to napi_enable call
On 02/05/2011 09:20 PM, Ken Stailey wrote:
> SRU Justification:
>
> Impact: Under heavy network I/O load virtio-net driver crashes making VM guest unusable.
>
> Testcase: I left a current Lucid VM running two concurrent "scp -r" of > 200 GB from NFS read-only source to a physical remote host overnight. VM quickly started emitting "page allocation errors" in the system log. Next morning when I checked the VM I could still ping it but could not establish an SSH connection.
>
> Fix: This patch from Bruce Rogers at Novell
>
>  * [PATCH] KVM: add schedule check to napi_enable call
>    - http://kerneltrap.org/mailarchive/linux-netdev/2010/6/4/6278660
>
> BugLink: https://bugs.launchpad.net/bugs/579276
>

The patch itself looks reasonable. But this has not made its way upstream. The mail thread seems to be reasonably old, so the question would be why it is still missing. We need patches upstream before they can be SRUed. Have you tried contacting Bruce or Olaf to ask what happened there?

-Stefan
SRU: KVM: add schedule check to napi_enable call
Hi Bruce,
I would like to thank you for your contribution to virtio-net, specifically the "[PATCH] KVM: add schedule check to napi_enable call" as it appears to stabilize virtio-net on Ubuntu Lucid 10.04 LTS.

Stefan Bader is curious to know why that patch is not appearing in upstream linux kernels. Can you offer any explanation?

Thank you,
Ken Stailey

--- On Mon, 2/7/11, Stefan Bader <stefan.bader@canonical.com> wrote:

> From: Stefan Bader <stefan.bader@canonical.com>
> Subject: Re: SRU: [PATCH] KVM: add schedule check to napi_enable call
> To: kernel-team@lists.ubuntu.com
> Date: Monday, February 7, 2011, 7:50 AM
>
> On 02/05/2011 09:20 PM, Ken Stailey wrote:
> > SRU Justification:
> >
> > Impact: Under heavy network I/O load virtio-net driver crashes making VM guest unusable.
> >
> > Testcase: I left a current Lucid VM running two concurrent "scp -r" of > 200 GB from NFS read-only source to a physical remote host overnight. VM quickly started emitting "page allocation errors" in the system log. Next morning when I checked the VM I could still ping it but could not establish an SSH connection.
> >
> > Fix: This patch from Bruce Rogers at Novell
> >
> >  * [PATCH] KVM: add schedule check to napi_enable call
> >    - http://kerneltrap.org/mailarchive/linux-netdev/2010/6/4/6278660
> >
> > BugLink: https://bugs.launchpad.net/bugs/579276
> >
>
> The patch itself looks reasonable. But this has not made its way upstream. The
> mail thread seems to be reasonably old, so the question would be why it is still
> missing. We need patches upstream before they can be SRUed. Have you tried
> contacting Bruce or Olaf to ask what happened there?
>
> -Stefan
SRU: KVM: add schedule check to napi_enable call
--- On Mon, 2/7/11, Stefan Bader <stefan.bader@canonical.com> wrote:
> From: Stefan Bader <stefan.bader@canonical.com>
> Subject: Re: SRU: [PATCH] KVM: add schedule check to napi_enable call
> To: kernel-team@lists.ubuntu.com
> Date: Monday, February 7, 2011, 7:50 AM
>
> On 02/05/2011 09:20 PM, Ken Stailey wrote:
> > SRU Justification:
> >
> > Impact: Under heavy network I/O load virtio-net driver crashes making VM guest unusable.
> >
> > Testcase: I left a current Lucid VM running two concurrent "scp -r" of > 200 GB from NFS read-only source to a physical remote host overnight. VM quickly started emitting "page allocation errors" in the system log. Next morning when I checked the VM I could still ping it but could not establish an SSH connection.
> >
> > Fix: This patch from Bruce Rogers at Novell
> >
> >  * [PATCH] KVM: add schedule check to napi_enable call
> >    - http://kerneltrap.org/mailarchive/linux-netdev/2010/6/4/6278660
> >
> > BugLink: https://bugs.launchpad.net/bugs/579276
> >
>
> The patch itself looks reasonable. But this has not made its way upstream. The
> mail thread seems to be reasonably old, so the question would be why it is
> still missing.

I have reason to believe that the absence of this patch in upstream kernels is a critical oversight.

I used "apt-add-repository ppa:kernel-ppa/ppa" to put the "Natty" kernel on my Lucid test VM

$ uname -a
Linux dubnium 2.6.38-2-server #29~lucid1-Ubuntu SMP Mon Feb 7 15:09:10 UTC 2011 x86_64 GNU/Linux

The stress test crashed the VM's network driver after copying only 63 GB.

The test consists of running "scp -r /nfs_read_only/1 remote:/dir/1" concurrently with "scp -r /nfs_read_only/2 remote:/dir/2"

The NFS mount options on the client are:
ro,tcp,hard,intr,sloppy,addr=10.1.1.1

> We need patches upstream before they can be SRUed.
> Have you tried contacting Bruce or Olaf to ask what happened there?
>
> -Stefan

Who is Olaf?

Thanks,
Ken
SRU: KVM: add schedule check to napi_enable call
>>> On 2/7/2011 at 09:23 AM, Ken Stailey <kstailey@yahoo.com> wrote:
> Hi Bruce,
>
> I would like to thank you for your contribution to virtio-net, specifically
> the "[PATCH] KVM: add schedule check to napi_enable call" as it appears to
> stabilize virtio-net on Ubuntu Lucid 10.04 LTS.
>
> Stefan Bader is curious to know why that patch is not appearing in upstream
> linux kernels. Can you offer any explanation?
>
> Thank you,
> Ken Stailey
>

I thought it had gone upstream, but apparently not. I was working with Greg K.H. on this and it must have fallen through the cracks between the two of us.

I apologize for not following through with that better.

Bruce
SRU: KVM: add schedule check to napi_enable call
--- On Tue, 2/8/11, Bruce Rogers <brogers@novell.com> wrote:
> From: Bruce Rogers <brogers@novell.com>
> Subject: Re: SRU: [Lucid] KVM: add schedule check to napi_enable call
> To: "Ken Stailey" <kstailey@yahoo.com>
> Cc: "Stefan Bader" <stefan.bader@canonical.com>, kernel-team@lists.ubuntu.com
> Date: Tuesday, February 8, 2011, 1:24 PM
>
> >>> On 2/7/2011 at 09:23 AM, Ken Stailey <kstailey@yahoo.com> wrote:
> > Hi Bruce,
> >
> > I would like to thank you for your contribution to virtio-net, specifically
> > the "[PATCH] KVM: add schedule check to napi_enable call" as it appears to
> > stabilize virtio-net on Ubuntu Lucid 10.04 LTS.
> >
> > Stefan Bader is curious to know why that patch is not appearing in upstream
> > linux kernels. Can you offer any explanation?
> >
> > Thank you,
> > Ken Stailey
> >
>
> I thought it had gone upstream, but apparently not. I was
> working with Greg K.H. on this and it must have fallen
> through the cracks between the two of us.
>
> I apologize for not following through with that better.
>
> Bruce

Thank you very much for looking into this issue.

Ken
SRU: KVM: add schedule check to napi_enable call
--- On Tue, 2/8/11, Bruce Rogers <brogers@novell.com> wrote:
> From: Bruce Rogers <brogers@novell.com>
> Subject: Re: SRU: [Lucid] KVM: add schedule check to napi_enable call
> To: "Ken Stailey" <kstailey@yahoo.com>
> Cc: "Stefan Bader" <stefan.bader@canonical.com>, kernel-team@lists.ubuntu.com
> Date: Tuesday, February 8, 2011, 1:24 PM
>
> >>> On 2/7/2011 at 09:23 AM, Ken Stailey <kstailey@yahoo.com> wrote:
> > Hi Bruce,
> >
> > I would like to thank you for your contribution to virtio-net, specifically
> > the "[PATCH] KVM: add schedule check to napi_enable call" as it appears to
> > stabilize virtio-net on Ubuntu Lucid 10.04 LTS.
> >
> > Stefan Bader is curious to know why that patch is not appearing in upstream
> > linux kernels. Can you offer any explanation?
> >
> > Thank you,
> > Ken Stailey
> >
>
> I thought it had gone upstream, but apparently not. I was
> working with Greg K.H. on this and it must have fallen
> through the cracks between the two of us.
>
> I apologize for not following through with that better.
>
> Bruce

I touched up the patch for 2.6.38

--- drivers/net/virtio_net.c.orig	2011-02-08 14:34:51.444099190 -0500
+++ drivers/net/virtio_net.c	2011-02-08 14:18:00.484400134 -0500
@@ -446,6 +446,20 @@
 	}
 }
 
+static void virtnet_napi_enable(struct virtnet_info *vi)
+{
+	napi_enable(&vi->napi);
+
+	/* If all buffers were filled by other side before we napi_enabled, we
+	 * won't get another interrupt, so process any outstanding packets
+	 * now.  virtnet_poll wants re-enable the queue, so we disable here.
+	 * We synchronize against interrupts via NAPI_STATE_SCHED */
+	if (napi_schedule_prep(&vi->napi)) {
+		virtqueue_disable_cb(vi->rvq);
+		__napi_schedule(&vi->napi);
+	}
+}
+
 static void refill_work(struct work_struct *work)
 {
 	struct virtnet_info *vi;
@@ -454,7 +468,7 @@
 	vi = container_of(work, struct virtnet_info, refill.work);
 	napi_disable(&vi->napi);
 	still_empty = !try_fill_recv(vi, GFP_KERNEL);
-	napi_enable(&vi->napi);
+	virtnet_napi_enable(vi);
 
 	/* In theory, this can happen: if we don't get any buffers in
 	 * we will *never* try to fill again. */
@@ -638,16 +652,7 @@
 {
 	struct virtnet_info *vi = netdev_priv(dev);
 
-	napi_enable(&vi->napi);
-
-	/* If all buffers were filled by other side before we napi_enabled, we
-	 * won't get another interrupt, so process any outstanding packets
-	 * now.  virtnet_poll wants re-enable the queue, so we disable here.
-	 * We synchronize against interrupts via NAPI_STATE_SCHED */
-	if (napi_schedule_prep(&vi->napi)) {
-		virtqueue_disable_cb(vi->rvq);
-		__napi_schedule(&vi->napi);
-	}
+	virtnet_napi_enable(vi);
 
 	return 0;
 }
SRU: KVM: add schedule check to napi_enable call
On 02/08/2011 07:24 PM, Bruce Rogers wrote:
> >>> On 2/7/2011 at 09:23 AM, Ken Stailey <kstailey@yahoo.com> wrote:
>> Hi Bruce,
>>
>> I would like to thank you for your contribution to virtio-net, specifically
>> the "[PATCH] KVM: add schedule check to napi_enable call" as it appears to
>> stabilize virtio-net on Ubuntu Lucid 10.04 LTS.
>>
>> Stefan Bader is curious to know why that patch is not appearing in upstream
>> linux kernels. Can you offer any explanation?
>>
>> Thank you,
>> Ken Stailey
>>
>
> I thought it had gone upstream, but apparently not. I was working with Greg K.H. on this and it must have fallen through the cracks between the two of us.
>
> I apologize for not following through with that better.
>
> Bruce
>

Hi Bruce,

thanks a lot for checking up on this again (I know getting Greg's attention sometimes is hard). Btw, when reading through the bug report that Ken was working on, I also saw this patch mentioned as being helpful for older releases (which seems to be valid at least for 2.6.32) that apparently had the same fate:

http://article.gmane.org/gmane.comp.emulators.kvm.devel/53653

Btw, thank you Ken for driving this.

-Stefan
SRU: KVM: add schedule check to napi_enable call
On 02/08/2011 12:50 AM, Ken Stailey wrote:
> --- On Mon, 2/7/11, Stefan Bader <stefan.bader@canonical.com> wrote:
>
>> From: Stefan Bader <stefan.bader@canonical.com>
>> Subject: Re: SRU: [PATCH] KVM: add schedule check to napi_enable call
>> To: kernel-team@lists.ubuntu.com
>> Date: Monday, February 7, 2011, 7:50 AM
>>
>> On 02/05/2011 09:20 PM, Ken Stailey wrote:
>>> SRU Justification:
>>>
>>> Impact: Under heavy network I/O load virtio-net driver crashes making VM guest unusable.
>>>
>>> Testcase: I left a current Lucid VM running two concurrent "scp -r" of > 200 GB from NFS read-only source to a physical remote host overnight. VM quickly started emitting "page allocation errors" in the system log. Next morning when I checked the VM I could still ping it but could not establish an SSH connection.
>>>
>>> Fix: This patch from Bruce Rogers at Novell
>>>
>>>  * [PATCH] KVM: add schedule check to napi_enable call
>>>    - http://kerneltrap.org/mailarchive/linux-netdev/2010/6/4/6278660
>>>
>>> BugLink: https://bugs.launchpad.net/bugs/579276
>>>
>>
>> The patch itself looks reasonable. But this has not made its way upstream. The
>> mail thread seems to be reasonably old, so the question would be why it is
>> still missing.
>
> I have reason to believe that the absence of this patch in upstream kernels is a critical oversight.
>
> I used "apt-add-repository ppa:kernel-ppa/ppa" to put the "Natty" kernel on my Lucid test VM
>
> $ uname -a
> Linux dubnium 2.6.38-2-server #29~lucid1-Ubuntu SMP Mon Feb 7 15:09:10 UTC 2011 x86_64 GNU/Linux
>
> The stress test crashed the VM's network driver after copying only 63 GB.
>
> The test consists of running "scp -r /nfs_read_only/1 remote:/dir/1" concurrently with "scp -r /nfs_read_only/2 remote:/dir/2"
>
> The NFS mount options on the client are:
> ro,tcp,hard,intr,sloppy,addr=10.1.1.1
>
>> We need patches upstream before they can be SRUed.
>> Have you tried contacting Bruce or Olaf to ask what happened there?
>>
>> -Stefan
>
> Who is Olaf?
>
> Thanks,
> Ken
>

Hi Ken,

sorry, late reply. I was referring to Olaf Kirch (the other guy in the signed off by). But Bruce already had been responding. As you expected this is some oversight and it is good that we get that fixed upstream and on the related stable trees. Thanks again for helping in driving this home.

-Stefan
SRU: KVM: add schedule check to napi_enable call
On 02/05/2011 09:20 PM, Ken Stailey wrote:
> SRU Justification:
>
> Impact: Under heavy network I/O load virtio-net driver crashes making VM guest unusable.
>
> Testcase: I left a current Lucid VM running two concurrent "scp -r" of > 200 GB from NFS read-only source to a physical remote host overnight. VM quickly started emitting "page allocation errors" in the system log. Next morning when I checked the VM I could still ping it but could not establish an SSH connection.
>
> Fix: This patch from Bruce Rogers at Novell
>
>  * [PATCH] KVM: add schedule check to napi_enable call
>    - http://kerneltrap.org/mailarchive/linux-netdev/2010/6/4/6278660
>
> BugLink: https://bugs.launchpad.net/bugs/579276
>

The change is now upstream as

commit 3e9d08ec0a68f6faf718d5a7e050fe5ca0ba004f
Author: Bruce Rogers <brogers@novell.com>
Date:   Thu Feb 10 11:03:31 2011 -0800

    virtio_net: Add schedule check to napi_enable call

The change looks reasonable and has been verified. So I would add this to Maverick and Lucid.

Acked-by: Stefan Bader <stefan.bader@canonical.com>