Bug#625914: linux-image-2.6.38-2-amd64: bridging is not interacting well with multicast in 2.6.38-4
Package: linux-2.6
Version: 2.6.38-3
Severity: normal
Hi. I've got a system that hosts several KVM virtual hosts. The VMs
access the network via tap devices bridged with a physical interface.
After upgrading to linux-image-2.6.38-2-amd64_2.6.38-4, I noticed that
the virtual hosts were not autoconfiguring their IPv6 interfaces.
Debugging revealed that no multicast traffic was passing over the bridge.
The bridge configuration is:
bridge name     bridge id               STP enabled     interfaces
br0             8000.0002e3080eb5       no              eth1
                                                        tap0
                                                        tap1
                                                        tap2
If I attach tcpdump to br0, I can see multicast (e.g. IPv6 Neighbor
Solicitation) packets. However, if I attach tcpdump to eth1, I do not
see multicast packets sourced from one of the VMs.
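
For anyone trying to reproduce the capture, it can help to know which Ethernet destination a Neighbor Solicitation should carry. The following is only a sketch of the standard IPv6 mappings (solicited-node group ff02::1:ffXX:XXXX, Ethernet prefix 33:33). The guest address used here is hypothetical and does not come from the report.

```python
import ipaddress


def solicited_node_multicast(addr: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast group (ff02::1:ffXX:XXXX) for addr."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


def multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Map an IPv6 multicast group to its Ethernet address (33:33 + low 32 bits)."""
    low32 = int(group) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)


if __name__ == "__main__":
    # Hypothetical guest address, chosen only for illustration.
    group = solicited_node_multicast("2001:db8::202:e3ff:fe08:eb5")
    print(group, multicast_mac(group))
```

With the resulting MAC, a capture such as "tcpdump -e -n -i eth1 ether dst 33:33:ff:08:0e:b5" on the physical port and the same filter on br0 would show whether the frame reaches one interface but not the other.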
Downgrading to 2.6.38-3 solves the problem.
noah
-- Package-specific info:
** Version:
Linux version 2.6.38-2-amd64 (Debian 2.6.38-3) (ben@decadent.org.uk) (gcc version 4.4.5 (Debian 4.4.5-15) ) #1 SMP Thu Apr 7 06:43:20 UTC 2011
** Model information
sys_vendor: System manufacturer
product_name: System Product Name
product_version: System Version
chassis_vendor: Chassis Manufacture
chassis_version: Chassis Version
bios_vendor: American Megatrends Inc.
bios_version: 0206
board_vendor: ASUSTeK Computer INC.
board_name: M2A74-AM
board_version: Rev X.0x
01:05.0 VGA compatible controller [0300]: ATI Technologies Inc Radeon 2100 [1002:796e] (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. Device [1043:835b]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 64, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 18
Region 0: Memory at f0000000 (64-bit, prefetchable) [size=128M]
Region 2: Memory at fbdf0000 (64-bit, non-prefetchable) [size=64K]
Region 4: I/O ports at c000 [size=256]
Region 5: Memory at fbc00000 (32-bit, non-prefetchable) [size=1M]
Expansion ROM at <unassigned> [disabled]
Capabilities: <access denied>
Kernel driver in use: radeon
02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller [10ec:8168] (rev 01)
Subsystem: ASUSTeK Computer Inc. Device [1043:8385]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 41
Region 0: I/O ports at d800 [size=256]
Region 2: Memory at fbeff000 (64-bit, non-prefetchable) [size=4K]
Expansion ROM at fbec0000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: r8169
03:06.0 Ethernet controller [0200]: Lite-On Communications Inc LNE100TX [11ad:0002] (rev 20)
Subsystem: Netgear FA310TX [1385:f004]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 64
Interrupt: pin A routed to IRQ 20
Region 0: I/O ports at e800 [size=256]
Region 1: Memory at fbfffc00 (32-bit, non-prefetchable) [size=256]
Expansion ROM at fbf80000 [disabled] [size=256K]
Kernel driver in use: tulip
** USB devices:
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 007 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 002: ID 0472:0065 Chicony Electronics Co., Ltd PFU-65 Keyboard
Bus 004 Device 002: ID 046d:c00e Logitech, Inc. M-BJ58/M-BJ69 Optical Wheel Mouse
Bus 003 Device 003: ID 0472:0065 Chicony Electronics Co., Ltd PFU-65 Keyboard
Kernel: Linux 2.6.38-2-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages linux-image-2.6.38-2-amd64 depends on:
ii debconf [debconf-2.0] 1.5.39 Debian configuration management sy
ii initramfs-tools [linux-initra 0.98.8 tools for generating an initramfs
ii linux-base 3.2 Linux image base package
ii module-init-tools 3.12-1 tools for managing Linux kernel mo
Versions of packages linux-image-2.6.38-2-amd64 recommends:
ii firmware-linux-free 3 Binary firmware for various driver
ii libc6-i686 2.13-2 Embedded GNU C Library: Shared lib
Versions of packages linux-image-2.6.38-2-amd64 suggests:
ii grub-pc 1.99~rc1-13 GRand Unified Bootloader, version
pn linux-doc-2.6.38 <none> (no description available)
Versions of packages linux-image-2.6.38-2-amd64 is related to:
pn firmware-bnx2 <none> (no description available)
pn firmware-bnx2x <none> (no description available)
pn firmware-ipw2x00 <none> (no description available)
pn firmware-ivtv <none> (no description available)
pn firmware-iwlwifi <none> (no description available)
ii firmware-linux 0.29 Binary firmware for various driver
ii firmware-linux-nonfree 0.29 Binary firmware for various driver
pn firmware-qlogic <none> (no description available)
pn firmware-ralink <none> (no description available)
pn xen-hypervisor <none> (no description available)
--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/20110506201234.6297.70279.reportbug@ip6-localhost
05-10-2011, 02:38 AM
Ben Hutchings
Bug#625914: linux-image-2.6.38-2-amd64: bridging is not interacting well with multicast in 2.6.38-4
On Fri, 2011-05-06 at 13:12 -0700, Noah Meyerhans wrote:
> Package: linux-2.6
> Version: 2.6.38-3
> Severity: normal
>
> Hi. I've got a system that hosts several kvm virtual hosts. The VMs
> access the network via tap devices bridged with a physical interface.
> After upgrading to linux-image-2.6.38-2-amd64_2.6.38-4, I noticed that
> the virtualhosts were not autoconfiguring their IPv6 interfaces.
> Debugging revealed that no multicast was passing over the bridge.
>
> The bridge configuration is:
> bridge name     bridge id               STP enabled     interfaces
> br0             8000.0002e3080eb5       no              eth1
>                                                         tap0
>                                                         tap1
>                                                         tap2
>
> If I attach tcpdump to br0, I can see multicast (e.g. IPv6 Neighbor
> Solicitation) packets. However, if I attach tcpdump to eth1, I do not
> see multicast packets sourced from one of the VMs.
>
> Downgrading to 2.6.38-3 solves the problem.
This is pretty weird. Debian version 2.6.38-3 has a few bridging
changes from stable 2.6.38.3 and 2.6.38.4, but they don't look like they
would cause this.
Ben.
--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
05-10-2011, 03:15 AM
Stephen Hemminger
Bug#625914: linux-image-2.6.38-2-amd64: bridging is not interacting well with multicast in 2.6.38-4
On Tue, 10 May 2011 03:38:44 +0100
Ben Hutchings <ben@decadent.org.uk> wrote:
> On Fri, 2011-05-06 at 13:12 -0700, Noah Meyerhans wrote:
> > Package: linux-2.6
> > Version: 2.6.38-3
> > Severity: normal
> >
> > Hi. I've got a system that hosts several kvm virtual hosts. The VMs
> > access the network via tap devices bridged with a physical interface.
> > After upgrading to linux-image-2.6.38-2-amd64_2.6.38-4, I noticed that
> > the virtualhosts were not autoconfiguring their IPv6 interfaces.
> > Debugging revealed that no multicast was passing over the bridge.
> >
> > The bridge configuration is:
> > bridge name     bridge id               STP enabled     interfaces
> > br0             8000.0002e3080eb5       no              eth1
> >                                                         tap0
> >                                                         tap1
> >                                                         tap2
> >
> > If I attach tcpdump to br0, I can see multicast (e.g. IPv6 Neighbor
> > Solicitation) packets. However, if I attach tcpdump to eth1, I do not
> > see multicast packets sourced from one of the VMs.
> >
> > Downgrading to 2.6.38-3 solves the problem.
>
> This is pretty weird. Debian version 2.6.38-3 has a few bridging
> changes from stable 2.6.38.3 and 2.6.38.4, but they don't look like they
> would cause this.
>
> Ben.
There are two possible explanations:
1. In 2.6.37 and later kernels the bridge uses IGMP snooping; there were
several fixes to that in the stable kernel, especially related to IPv6.
2. There was also a recent change to block link-local multicast
addresses. But that should not impact what you are doing.
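
To check whether a given frame would even be a candidate for the second explanation, one can test its destination MAC. This is a rough sketch that assumes "link local multicast address" refers to the IEEE 802.1D reserved group range 01:80:c2:00:00:00 through 01:80:c2:00:00:0f (the range used by STP and similar protocols). That reading is an assumption, not something stated in this thread, and the helper names are made up for the example.

```python
def parse_mac(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    raw = bytes(int(part, 16) for part in mac.split(":"))
    if len(raw) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return raw


def is_multicast(mac: str) -> bool:
    """The group bit is the least significant bit of the first octet."""
    return bool(parse_mac(mac)[0] & 0x01)


def is_reserved_link_local(mac: str) -> bool:
    """True for 01:80:c2:00:00:00 through 01:80:c2:00:00:0f (assumed range)."""
    raw = parse_mac(mac)
    return raw[:5] == bytes.fromhex("0180c20000") and raw[5] <= 0x0F


if __name__ == "__main__":
    # An STP BPDU destination and a typical IPv6 Neighbor Solicitation destination.
    for mac in ("01:80:c2:00:00:00", "33:33:ff:08:0e:b5"):
        print(mac, is_multicast(mac), is_reserved_link_local(mac))
```

Under that assumption, IPv6 neighbor discovery frames (33:33:...) are multicast but fall outside the reserved range.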
--
Archive: http://lists.debian.org/20110509201528.52d87ec4@nehalam
05-10-2011, 04:38 AM
Noah Meyerhans
Bug#625914: linux-image-2.6.38-2-amd64: bridging is not interacting well with multicast in 2.6.38-4
On Tue, May 10, 2011 at 03:38:44AM +0100, Ben Hutchings wrote:
> This is pretty weird. Debian version 2.6.38-3 has a few bridging
> changes from stable 2.6.38.3 and 2.6.38.4, but they don't look like they
> would cause this.
I have apparently filed the bug against the wrong version of Debian's
kernel. 2.6.38-3 is not affected, and works as expected. The change
was introduced in -4. That may have been clear from the report itself,
but the report was filed against -3. I've fixed that in the BTS.
I've also confirmed that -5 is affected, to no great surprise.
I'll investigate further.
noah
05-10-2011, 12:42 PM
Ben Hutchings
Bug#625914: linux-image-2.6.38-2-amd64: bridging is not interacting well with multicast in 2.6.38-4
On Mon, 2011-05-09 at 21:38 -0700, Noah Meyerhans wrote:
> On Tue, May 10, 2011 at 03:38:44AM +0100, Ben Hutchings wrote:
> > This is pretty weird. Debian version 2.6.38-3 has a few bridging
> > changes from stable 2.6.38.3 and 2.6.38.4, but they don't look like they
> > would cause this.
>
> I have apparently filed the bug against the wrong version of Debian's
> kernel. 2.6.38-3 is not affected, and works as expected. The change
> was introduced in -4. That may have been clear from the report itself,
> but the report was filed against -3. I've fixed that in the BTS.
I gathered that, and then made the same mistake in writing the above!
The version with the regression, 2.6.38-4, includes the changes from
stable 2.6.38.3 and 2.6.38.4.
Ben.
> I've also confirmed that -5 is affected, to no great surprise.
>
> I'll investigate further.
>
> noah
>
--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
05-10-2011, 06:05 PM
Noah Meyerhans
Bug#625914: linux-image-2.6.38-2-amd64: bridging is not interacting well with multicast in 2.6.38-4
On Tue, May 10, 2011 at 01:42:49PM +0100, Ben Hutchings wrote:
> > > This is pretty weird. Debian version 2.6.38-3 has a few bridging
> > > changes from stable 2.6.38.3 and 2.6.38.4, but they don't look like they
> > > would cause this.
> >
> > I have apparently filed the bug against the wrong version of Debian's
> > kernel. 2.6.38-3 is not affected, and works as expected. The change
> > was introduced in -4. That may have been clear from the report itself,
> > but the report was filed against -3. I've fixed that in the BTS.
>
> I gathered that, and then made the same mistake in writing the above!
> The version with the regression, 2.6.38-4, includes the changes from
> stable 2.6.38.3 and 2.6.38.4
With a little help from git bisect, I've tracked this regression down to
the following commit to the stable-2.6.38.y tree:
commit 5f1c356a3fadc0c19922d660da723b79bcc9aad7
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Fri Mar 18 05:27:28 2011 +0000

    bridge: Reset IPCB when entering IP stack on NF_FORWARD

    Whenever we enter the IP stack proper from bridge netfilter we
    need to ensure that the skb is in a form the IP stack expects
    it to be in.

    The entry point on NF_FORWARD did not meet the requirements of
    the IP stack, therefore leading to potential crashes/panics.

    This patch fixes the problem.

    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Acked-by: Stephen Hemminger <shemminger@vyatta.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

The diff is
diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 4b5b66d..49d50ea 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -741,6 +741,9 @@ static unsigned int br_nf_forward_ip(unsigned int hook, struct sk_buff *skb,
 		nf_bridge->mask |= BRNF_PKT_TYPE;
 	}
 
+	if (br_parse_ip_options(skb))
+		return NF_DROP;
+
 	/* The physdev module checks on this */
 	nf_bridge->mask |= BRNF_BRIDGED;
 	nf_bridge->physoutdev = skb->dev;

If I revert this change, network connectivity functions as expected for
the VMs on this host.
I don't know enough about this change or the problem it was supposed to
solve to be able to guess about what's going wrong.
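
For readers unfamiliar with the bisection step mentioned above: it is essentially a binary search between a known-good and a known-bad revision. The sketch below illustrates the idea on a simplified, strictly linear history. The function name, the commit labels, and the is_bad callback are hypothetical, and real git history is a graph rather than a list.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Find the first bad commit by binary search.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is
    also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


if __name__ == "__main__":
    # Hypothetical history in which the regression arrives with "c5".
    history = [f"c{i}" for i in range(8)]
    print(first_bad(history, lambda c: int(c[1:]) >= 5))
```

In practice each is_bad call corresponds to building and booting the candidate kernel and checking whether multicast from a VM shows up on eth1, with "git bisect start", "git bisect good" and "git bisect bad" choosing the next revision to test.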