Bug#498141: general network corruption running linux-image-2.6.26-1-686
Package: linux-image-2.6.26-1-686
Version: 2.6.26-3
Severity: normal
I installed linux-image-2.6.26-1-686 on my etch system to try and help vet
it a bit. I immediately had two problems that look network related. The
first was that xterms I opened on another local machine were garbled so
badly that I could not use them. The fonts were all garbled, the prompt
was messed up, etc.
The second issue was that my IMAP connection to a local machine was also
having serious corruption issues. It would show the headers, but when I
went to look at the messages, everything would go blank. (This was in
kmail.) Reverting back to 2.6.22-6~bpo40+2 made everything happy again.
So I realize I'm assuming two things - 1) that the kernel was at fault - it
was the only thing that changed, and 2) that it was the networking that was
wonky, and not something else.
Here is some info about the hardware:
AMD Athlon(tm) 64 X2 Dual Core Processor 3600+
~> lspci
00:00.0 Host bridge: nVidia Corporation nForce3 250Gb Host Bridge (rev a1)
00:01.0 ISA bridge: nVidia Corporation nForce3 250Gb LPC Bridge (rev a2)
00:01.1 SMBus: nVidia Corporation nForce 250Gb PCI System Management (rev a1)
00:02.0 USB Controller: nVidia Corporation CK8S USB Controller (rev a1)
00:02.1 USB Controller: nVidia Corporation CK8S USB Controller (rev a1)
00:02.2 USB Controller: nVidia Corporation nForce3 EHCI USB 2.0 Controller (rev a2)
00:06.0 Multimedia audio controller: nVidia Corporation nForce3 250Gb AC'97 Audio Controller (rev a1)
00:08.0 IDE interface: nVidia Corporation CK8S Parallel ATA Controller (v2.5) (rev a2)
00:0a.0 IDE interface: nVidia Corporation CK8S Serial ATA Controller (v2.5) (rev a2)
00:0b.0 PCI bridge: nVidia Corporation nForce3 250Gb AGP Host to PCI Bridge (rev a2)
00:0e.0 PCI bridge: nVidia Corporation nForce3 250Gb PCI-to-PCI Bridge (rev a2)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 VGA compatible controller: ATI Technologies Inc RV350 AP [Radeon 9600]
01:00.1 Display controller: ATI Technologies Inc RV350 AP [Radeon 9600] (Secondary)
02:07.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 80)
02:08.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 13)
-- System Information:
Debian Release: 4.0
APT prefers testing
APT policy: (500, 'testing'), (500, 'stable')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.22-4-k7
Locale: LANG=en_US, LC_CTYPE=en_US (charmap=ISO-8859-1)
Versions of packages linux-image-2.6.26-1-686 depends on:
ii debconf [debconf-2.0] 1.5.11etch2 Debian configuration management sy
ii initramfs-tools [linux-initr 0.85i tools for generating an initramfs
ii module-init-tools 3.3-pre4-2 tools for managing Linux kernel mo
Versions of packages linux-image-2.6.26-1-686 recommends:
ii libc6-i686 2.3.6.ds1-13etch7 GNU C Library: Shared libraries [i
--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
09-07-2008, 08:26 PM
Bastian Blank
tags 498141 moreinfo
thanks
On Sun, Sep 07, 2008 at 10:15:21AM -0400, Dale E. Martin wrote:
> I installed linux-image-2.6.26-1-686 on my etch system to try and help vet
> it a bit. I immediately had two problems that look network related. The
> first was that xterms I opened on another local machine were garbled so
> badly that I could not use them. The fonts were all garbled, the prompt
> was messed up, etc.
Please use a protocol that uses cryptographic checksums, such as ssh, and
see if something pops up. The packets pass the TCP checksum, which is
computed over the complete TCP header and text, so TCP itself is _not_
broken. That means either a higher layer introduces the corruption, which
would show up on many more machines, or the device uses checksum
offloading that is broken.
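
As a rough illustration of why a passing TCP checksum does not rule out corruption, here is a minimal Python sketch of the 16-bit one's-complement sum in the style of RFC 1071. It is a toy under stated assumptions: it sums only a made-up payload (the real TCP checksum also covers the TCP header and a pseudo-header), and the names `internet_checksum`, `payload`, `swapped`, and `flipped` are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Made-up payload, for illustration only.
payload = b"From: dale\r\nSubject: test\r\n"

# Swap two aligned 16-bit words: the content changes, the sum does not.
swapped = payload[2:4] + payload[0:2] + payload[4:]

# Flip a single bit: the sum changes, so a receiver would reject this.
flipped = bytes([payload[0] ^ 0x01]) + payload[1:]

print(internet_checksum(payload) == internet_checksum(swapped))
print(internet_checksum(payload) == internet_checksum(flipped))
```

A single flipped bit changes the sum, but reordering aligned 16-bit words does not, so some kinds of damage can slip past this check. A cryptographic checksum, as used by ssh, would be expected to catch both.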
> The second issue was that my IMAP connection to a local machine was also
> having serious corruption issues. It would show the headers, but when I
> went to look at the messages, everything would go blank. (This was in
> kmail.) Reverting back to 2.6.22-6~bpo40+2 made everything happy again.
> So I realize I'm assuming two things - 1) that the kernel was at fault - it
> was the only thing that changed, and 2) that it was the networking that was
> wonky, and not something else.
Can you please use wireshark and get a dump from both sides of such a
corrupted connection?
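
Once both sides have a capture, one way to compare them is to save the reassembled TCP payload of the same connection from each capture as raw bytes and look for the first differing byte. The sketch below assumes that export has already been done by hand (for example with Wireshark's "Follow TCP Stream" view saved as raw data); the function names and the sample bytes are hypothetical.

```python
from typing import Optional


def first_mismatch(sent: bytes, received: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(sent, received)):
        if a != b:
            return offset
    if len(sent) != len(received):
        # One stream is a prefix of the other, i.e. one side is truncated.
        return min(len(sent), len(received))
    return None


def describe(sent: bytes, received: bytes, context: int = 8) -> str:
    """Summarize where two payload dumps diverge, with some hex context."""
    offset = first_mismatch(sent, received)
    if offset is None:
        return "streams are identical"
    lo = max(0, offset - context)
    hi = offset + context
    return (f"first difference at byte {offset}: "
            f"sent {sent[lo:hi].hex()} vs received {received[lo:hi].hex()}")


# Sample data standing in for the two exported payload dumps.
server_side = b"* OK IMAP ready\r\n"
client_side = b"* OK IMAP reaey\r\n"
print(describe(server_side, client_side))
```

If the two dumps are identical, the corruption is happening above the network on one of the hosts; if they differ, the offset shows where to look in the packet capture.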
Bastian
--
Punishment becomes ineffective after a certain point. Men become insensitive.
-- Eneg, "Patterns of Force", stardate 2534.7
09-07-2008, 09:31 PM
"Dale E. Martin"
The xterm was running over an "ssh -X" connection:
ssh -X -P -C -f $HOST "xterm -ls -n $HOST"
The NIC is hooked to a gigabit switch but the machine I was connected to
(for both the xterm and imap) is on a 100Mbit switch.
The client machine is a Shuttle SN95G5. (V2.0 I think.)
I'll do the wireshark trick next time I see this.
Thanks,
Dale
--
Dale E. Martin - dale@the-martins.org
http://the-martins.org/~dmartin
09-07-2008, 10:04 PM
Bastian Blank
On Sun, Sep 07, 2008 at 05:31:35PM -0400, Dale E. Martin wrote:
> The xterm was running over an "ssh -X" connection:
> ssh -X -P -C -f $HOST "xterm -ls -n $HOST"
Then I doubt that it is a kernel problem at all. The ssh connection
would not survive corruption in deeper layers.
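
The reasoning is that ssh attaches a message authentication code (MAC) to every packet, so damage below ssh is detected instead of being displayed. The Python sketch below illustrates the idea with HMAC-SHA256 as a stand-in; it is not ssh's actual packet format, and the key, the message, and the names `seal` and `open_sealed` are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def seal(key: bytes, message: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to the message."""
    return message + hmac.new(key, message, hashlib.sha256).digest()


def open_sealed(key: bytes, packet: bytes) -> bytes:
    """Return the message if its tag verifies, else raise ValueError."""
    message, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC mismatch: packet was altered in transit")
    return message


key = b"hypothetical session key"
packet = seal(key, b"xterm output line\n")

# Flip one bit, as a corrupting lower layer might.
corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]

print(open_sealed(key, packet))
try:
    open_sealed(key, corrupted)
except ValueError as err:
    print(err)
```

An ssh implementation that sees a MAC failure typically drops the connection, which is why garbled output arriving over a session that stays up points away from the network path.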
Please run memtest86 to check whether your memory is broken.
Bastian
--
Those who hate and fight must stop themselves -- otherwise it is not stopped.
-- Spock, "Day of the Dove", stardate unknown
09-07-2008, 11:05 PM
"Dale E. Martin"
> Then I doubt that it is a kernel problem at all. The ssh connection
> would not survive corruption in deeper layers.
>
> Please run memtest86 to check whether your memory is broken.
I'm not going to say it's not possible - I've been using Linux a long time,
and know about compiling floppy.c as a torture test. On the other hand,
this machine has been stable for 3 years with a variety of kernels, both 64
bit and 32 bit, and under Windows, and it frequently runs at high CPU load.
It has been rock solid.
I installed a new kernel and immediately experienced problems; I booted back
to the old kernel and the problems went away. Occam points at the new kernel.
(And yes, I get that the new kernel could be exposing an underlying issue,
could be heating the CPU differently, or something - but it doesn't feel like
the most likely scenario to me.)
It could also be that it's not in the network stack but some other kernel
issue.
Thanks,
Dale
--
Dale E. Martin - dale@the-martins.org
http://the-martins.org/~dmartin