Bug#375149: Linux kernel IPv6 : random TCP connection failure
On Fri, Jun 23, 2006 at 09:16:44PM +0200, Benoit Branciard wrote:
> Package: linux-source-2.6.16
> Version: 2.6.16-2
> When a great number of IPv6 TCP connections are initiated from the Linux
> machine at high rate, some of them get stalled in SYN_SENT state and
> eventually time out after tcp_syn_retries (about 3 minutes).
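For reference, the "about 3 minutes" figure matches what exponential SYN backoff would give. This is only a rough sketch under assumptions that the report does not state: an initial retransmission timeout of 3 seconds and tcp_syn_retries=5, which I believe were the usual defaults for 2.6 kernels of that era. The function name is just illustrative.

```shell
# Rough estimate of how long a connect() stays in SYN_SENT before giving up.
# Assumed (not from the report): initial timeout 3 s, tcp_syn_retries=5.
syn_timeout_secs() {
    local rto="$1" retries="$2"
    local total=0 n=0
    # One wait per transmitted SYN (the initial one plus each retry),
    # with the timeout doubling after every transmission.
    while [ "$n" -le "$retries" ]; do
        total=$((total + rto))
        rto=$((rto * 2))
        n=$((n + 1))
    done
    echo "$total"
}

syn_timeout_secs 3 5   # 3+6+12+24+48+96 = 189 seconds
```

The actual values on the affected machine can be checked with "sysctl net.ipv4.tcp_syn_retries".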
> The remote server does NOT seem to see the connection at all (no
> SYN_RECV report with netstat).
> This behaviour was noticed initially using LDAP queries. Further
> investigations reported the same problem with SMTP requests, but NOT
> with HTTP (maybe related to the short-lived TIME_WAIT state of HTTP
> connections?).
> The failure rate is about 1 or 2 in 5000 connections on a busy machine
> (for example one hosting a web server); the problem is harder to
> trigger on a quiet one.
> How to reproduce :
> - have a dual-stack LDAP or SMTP server ready, on an IPv6-enabled network
> (let's call it myserver)
> - on the Linux client to be tested, launch a loop of quick TCP
> connections to myserver :
> --> example 1 : loop of 5000 anonymous LDAP searches from a bash shell :
> $ i=0; while [ $i -lt 5000 ] ; do
> ldapsearch -H ldap://myserver -x -b dc=mydomain,dc=myroot '(uid=someuid)' > /dev/null ;
> i=$((i+1)) ; [ $((i%100)) -eq 0 ] && echo $i ; done
> --> example 2 : loop of 5000 SMTP connections from a bash shell (uses the
> echoping package) :
> $ i=0; while [ $i -lt 5000 ] ; do echoping -6 -S myserver >/dev/null ;
> i=$((i+1)) ; [ $((i%100)) -eq 0 ] && echo $i ; done
> Both examples should print the query number every hundred connections.
> If a connection gets stalled, the query count hangs, and a netstat
> command (in another shell) should display the SYN_SENT stalled connection :
> tcp6 0 0 myclient.mydomain:51930 myserver.mydomain:ldap TIME_WAIT
> (.. a bunch of other TIME_WAIT closing connections ..)
> tcp6 0 1 myclient.mydomain:51940 myserver.mydomain:ldap SYN_SENT
> The number of TIME_WAIT connections in our case is a few hundred,
> so the tcp_max_tw_buckets value should not be an issue.
> The same experiments have NOT shown any stalling connections when using
> IPv4 in the same conditions (either by explicitly specifying the IPv4
> address of myserver, or by means of the "-4" option of echoping).
> We are using Debian GNU/Linux 3.1, libc6 2.3.2.ds1-22sarge3, and a
> compiled linux-source-2.6.16 (2.6.16-2) kernel with the stock
> 2.6.16-1-686-smp (or amd64-k8) unmodified config file.
> The same results have been obtained using several physical Debian client
> machines with similar config and different ethernet adapters (e1000 and
> tg3), against several LDAP or SMTP servers, and with various ethernet
> Also noted on a Mandriva Linux 2006.0 client with 2.6.12-18mdk kernel
> and glibc-2.3.5-5mdk.
> So this sounds like a general bug in the Linux 2.6 IPv6 TCP stack.
Does this error still occur with more recent kernel versions?
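If it helps with retesting, here is a minimal sketch of the reproduction loop from the report, changed so that a stalled connection is counted instead of hanging the loop for minutes. It is not the reporter's script: the function name run_probes is made up, the 10 second limit in the usage example is arbitrary, and it assumes timeout(1) from GNU coreutils is installed on the client.

```shell
# run_probes COUNT LIMIT_SECS CMD [ARGS...]
# Runs CMD COUNT times, giving up on each attempt after LIMIT_SECS.
# Progress goes to stderr; the number of failed or timed-out attempts
# is printed on stdout at the end.
run_probes() {
    local count="$1" limit="$2"
    shift 2
    local i=0 failed=0
    while [ "$i" -lt "$count" ]; do
        # timeout(1) kills the probe if it runs longer than $limit seconds,
        # which is how a connection stuck in SYN_SENT shows up here.
        if ! timeout "$limit" "$@" >/dev/null 2>&1; then
            failed=$((failed + 1))
            echo "attempt $i failed or timed out" >&2
        fi
        i=$((i + 1))
        # Same progress report as in the original loops: every 100 attempts.
        if [ $((i % 100)) -eq 0 ]; then
            echo "$i" >&2
        fi
    done
    echo "$failed"
}
```

With the echoping invocations from the report, the IPv6 and IPv4 runs can then be compared directly:

    run_probes 5000 10 echoping -6 -S myserver
    run_probes 5000 10 echoping -4 -S myserver

Running "netstat -tn | grep SYN_SENT" in another shell while the loop is going should still show the stalled socket, as described above.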