04-07-2008, 02:11 AM
Lou Poppler

Bug#419950: eth ini. vs. ide ini.

For as long as I have had this computer, starting with some 2.6.8 sarge kernel,
I have had occasional problems where the network goes bad, with these lines
repeating forever in the syslog:
Feb 14 06:44:10 legba kernel: NETDEV WATCHDOG: eth0: transmit timed out
Feb 14 06:44:10 legba kernel: 0000:00:0f.0: tulip_stop_rxtx() failed
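To see on which days the problem appeared, the watchdog lines quoted above can be counted per day. This is only a rough sketch: the function name `count_watchdog_timeouts` is made up for illustration, and it assumes the traditional syslog line layout shown in the two sample lines (month, day, time, hostname, "kernel:").

```python
import re
from collections import Counter

# Matches lines such as:
#   Feb 14 06:44:10 legba kernel: NETDEV WATCHDOG: eth0: transmit timed out
# The layout is assumed from the sample lines in this report.
WATCHDOG = re.compile(
    r"^(\w{3}\s+\d+) \d\d:\d\d:\d\d \S+ kernel: "
    r"NETDEV WATCHDOG: (\w+): transmit timed out"
)

def count_watchdog_timeouts(syslog_text):
    """Count watchdog timeouts per (date, interface) in syslog text."""
    counts = Counter()
    for line in syslog_text.splitlines():
        m = WATCHDOG.match(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts

sample = (
    "Feb 14 06:44:10 legba kernel: NETDEV WATCHDOG: eth0: transmit timed out\n"
    "Feb 14 06:44:10 legba kernel: 0000:00:0f.0: tulip_stop_rxtx() failed\n"
)
timeouts = count_watchdog_timeouts(sample)
```

Feeding it the contents of the syslog file gives a quick list of the days on which a boot went bad.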

Sometimes the problem does not occur, and everything runs just fine until
I reboot the system, even if I pound on the network, trying to make it fail.
Sometimes the problem shows up, even with moderate network load, and the
network is _very_ sluggish until I reboot.

So far, this does not seem to depend on the kernel version. Every kernel
I've tried is bad on some boots and fine on others.
After combing through the logs, I have found a pattern which correlates
with my problems. When I have the problem, the initialization messages
for hda and for eth0 overlap in the log; when the machine boots OK and
goes on to have no problem, these initialization messages are separate.
Here are some sample logs:

Here is an extract from dmesg for the boot on 2008-04-01, which is still
running today with no problems:

Linux Tulip driver version 1.1.13-NAPI (May 11, 2002)
PCI: Enabling device 0000:00:0f.0 (0114 -> 0117)
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt 0000:00:0f.0[A] -> Link [LNKC] -> GSI 10 (level, low) -> IRQ 10
tulip0: MII transceiver #1 config 1000 status 786d advertising 05e1.
eth0: ADMtek Comet rev 17 at 00011400, 00:14:BF:5C:E1:35, IRQ 10.
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX4: IDE controller at PCI slot 0000:00:07.1
PIIX4: chipset revision 1
PIIX4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0x1000-0x1007, BIOS settings: hda:pio, hdb:DMA
ide1: BM-DMA at 0x1008-0x100f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
usb 1-2: new full speed USB device using uhci_hcd and address 2
usb 1-2: configuration #1 chosen from 1 choice
hub 1-2:1.0: USB hub found
hub 1-2:1.0: 4 ports detected
hda: WDC WD800JB-00CRA1, ATA DISK drive
Time: acpi_pm clocksource has been installed.
hdb: ST3250623A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: Hewlett-Packard CD-Writer Plus 9100, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: 156301488 sectors (80026 MB) w/8192KiB Cache, CHS=65535/16/63, UDMA(33)
hda: cache flushes not supported
hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 hda10 >
hdb: max request size: 512KiB
hdb: 488397168 sectors (250059 MB) w/16384KiB Cache, CHS=30401/255/63, UDMA(33)
hdb: cache flushes supported
hdb: hdb1 hdb2


For contrast, here is a similar extract from the 2008-03-16 dmesg, after which
the network went bad under a light BitTorrent load:

Linux Tulip driver version 1.1.13-NAPI (May 11, 2002)
hda: WDC WD800JB-00CRA1, ATA DISK drive
Time: acpi_pm clocksource has been installed.
hdb: ST3250623A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: Hewlett-Packard CD-Writer Plus 9100, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 9
PCI: setting IRQ 9 as level-triggered
ACPI: PCI Interrupt 0000:00:07.2[D] -> Link [LNKD] -> GSI 9 (level, low) -> IRQ 9
uhci_hcd 0000:00:07.2: UHCI Host Controller
uhci_hcd 0000:00:07.2: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:07.2: irq 9, io base 0x00001020
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
hda: max request size: 128KiB
hda: 156301488 sectors (80026 MB) w/8192KiB Cache, CHS=65535/16/63, UDMA(33)
hda: cache flushes not supported
hda:PCI: Enabling device 0000:00:0f.0 (0114 -> 0117)
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI Interrupt 0000:00:0f.0[A] -> Link [LNKC] -> GSI 10 (level, low) -> IRQ 10
tulip0: MII transceiver #1 config 1000 status 786d advertising 05e1.
eth0: ADMtek Comet rev 17 at 00011400, 00:14:BF:5C:E1:35, IRQ 10.
hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 hda10 >
hdb: max request size: 512KiB
hdb: 488397168 sectors (250059 MB) w/16384KiB Cache, CHS=30401/255/63, UDMA(33)
hdb: cache flushes supported
hdb: hdb1 hdb2


What I notice here is that the log message which should be a single line like this:

hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 hda10 >

is split after the " hda:" in every case of an unsuccessful boot,
with some of the ethernet initialization messages printed before the
remaining part of the hda message:
" hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 hda10 >"


From what I have observed, the split line and the bad network behavior
always occur together.
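The check for the split line can be automated instead of reading dmesg by eye. The following is a minimal sketch, not a diagnosis tool: the function name `partition_line_is_split` is hypothetical, and it relies only on the symptom described above, namely that the partition list shows up on a line of its own without the leading "hda:".

```python
import re

def partition_line_is_split(dmesg_text, disk="hda"):
    """Return True if the partition list of `disk` appears on a line
    by itself, i.e. cut off from its '<disk>:' prefix."""
    # One token of the partition list: hda1, hda10, '<' or '>'.
    token = rf"(?:{re.escape(disk)}\d+|<|>)"
    # A line made up of nothing but such tokens is an orphaned tail.
    orphan = re.compile(rf"^\s*{token}(?:\s+{token})*\s*$")
    return any(orphan.match(line) for line in dmesg_text.splitlines())

# Lines taken from the two extracts in this report.
GOOD = (
    "hda: cache flushes not supported\n"
    "hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 hda10 >\n"
    "hdb: max request size: 512KiB\n"
)
BAD = (
    "hda:PCI: Enabling device 0000:00:0f.0 (0114 -> 0117)\n"
    "eth0: ADMtek Comet rev 17 at 00011400, 00:14:BF:5C:E1:35, IRQ 10.\n"
    "hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 hda10 >\n"
)

good_is_split = partition_line_is_split(GOOD)
bad_is_split = partition_line_is_split(BAD)
```

Run against the saved output of dmesg after each boot, this would show right away whether the boot is one of the bad ones. It looks only for the orphaned partition list, so it says nothing about which other messages were printed in between.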

The kernel version currently running here is:
Linux version 2.6.18-6-686 (Debian 2.6.18.dfsg.1-18etch1) (waldi@debian.org)

I'm willing to try other kernel versions or parameters, and willing to
provide any other info that might help someone understand this problem.

For now, I at least have a clumsy workaround of rebooting until I see that
the eth0 and hda initializations are not intermingled in dmesg.



