03-15-2010, 04:20 PM
stephen mulcahy

Bug#572201: Further queries

Hi,

Any further thoughts on this?

In the ethtool output, I notice the following

rx_pause: 46798
rx_drop_frame: 46798

I've checked some other machines and I don't see either stat there,
possibly because these are specific to some NIC drivers. Anyway, is it
normal for those numbers to be the same?


As I said, I'm not seeing the behaviour with the 2.6.30 kernel - so
wondering what has changed.


I see Linux 2.6.32.10 was just released, is it worth my while building
that and seeing if I can reproduce the problem?


-stephen

--
Stephen Mulcahy Atlantic Linux http://www.atlanticlinux.ie
Registered in Ireland, no. 376591 (144 Ros Caoin, Roscam, Galway)



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/4B9E6C60.7030300@atlanticlinux.ie
 
03-15-2010, 05:22 PM
Ben Hutchings

Bug#572201: Further queries

On Mon, Mar 15, 2010 at 05:20:32PM +0000, stephen mulcahy wrote:
> Hi,
>
> Any further thoughts on this?
>
> In the ethtool output, I notice the following
>
> rx_pause: 46798
> rx_drop_frame: 46798
>
> I've checked some other machines and I don't see any of either stat -
> possibly because these are specific to some nic drivers?

The statistics available through ethtool are entirely driver-dependent.
There is a small set of standard statistics which are shown in
/proc/net/dev and under /sys/class/net/<name>/statistics/.
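As a rough illustration of the sysfs location mentioned above, the standard
counters can be read with a few lines of Python. This is only a sketch: the
function name read_standard_stats and the sysfs_root parameter are made up
for this example, and it assumes each file under statistics/ holds a single
decimal integer.

```python
from pathlib import Path


def read_standard_stats(iface, sysfs_root="/sys/class/net"):
    """Return {counter_name: value} read from <sysfs_root>/<iface>/statistics/.

    Hypothetical helper for illustration; sysfs_root is a parameter only so
    the function can be pointed at a test directory.
    """
    stats_dir = Path(sysfs_root) / iface / "statistics"
    stats = {}
    for entry in sorted(stats_dir.iterdir()):
        if entry.is_file():
            # Assumption: one decimal integer per file.
            stats[entry.name] = int(entry.read_text().strip())
    return stats
```

Calling read_standard_stats("eth0") on a live system would return the
standard counters for eth0, which can then be compared with the
driver-specific numbers from ethtool -S.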

> Anyway, is it normal for those numbers to be the same?

All pause frames should be dropped, either by the hardware or the driver.
So it's not unexpected that these are equal.

It might be interesting to see what happens if you disable pause frame
handling with this command:

ethtool -A eth0 autoneg off rx off tx off

> As I said, I'm not seeing the behaviour with the 2.6.30 kernel - so
> wondering what has changed.

I can't see any major changes in the forcedeth driver since 2.6.30.

> I see Linux 2.6.32.10 was just released, is it worth my while building
> that and seeing if I can reproduce the problem?

We will shortly update the official kernel packages to incorporate this
release, so you could just wait a day or two and update. However I'm not
aware of any changes in 2.6.32.10 that would fix this sort of bug.

Ben.

--
Ben Hutchings
It is a miracle that curiosity survives formal education. - Albert Einstein



--
Archive: http://lists.debian.org/20100315182220.GQ2763@decadent.org.uk
 
03-16-2010, 09:33 AM
stephen mulcahy

Bug#572201: Further queries

Ben Hutchings wrote:
> On Mon, Mar 15, 2010 at 05:20:32PM +0000, stephen mulcahy wrote:
> All pause frames should be dropped, either by the hardware or the driver.
> So it's not unexpected that these are equal.


Ok, thanks for the clarification.


> It might be interesting to see what happens if you disable pause frame
> handling with this command:
>
> ethtool -A eth0 autoneg off rx off tx off


I tried this and re-ran my Hadoop test, and I'm seeing the same drop-outs
from systems as with pause frame handling enabled. Running ethtool -S eth0
on a dropped-out system gives the following output.


NIC statistics:
tx_bytes: 45900034824
tx_zero_rexmt: 40968086
tx_one_rexmt: 0
tx_many_rexmt: 0
tx_late_collision: 0
tx_fifo_errors: 0
tx_carrier_errors: 0
tx_excess_deferral: 0
tx_retry_error: 0
rx_frame_error: 0
rx_extra_byte: 0
rx_late_collision: 0
rx_runt: 0
rx_frame_too_long: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_align_error: 0
rx_length_error: 0
rx_unicast: 42104294
rx_multicast: 897
rx_broadcast: 564
rx_packets: 42105755
rx_errors_total: 0
tx_errors_total: 0
tx_deferral: 0
tx_packets: 40968086
rx_bytes: 48159336484
tx_pause: 0
rx_pause: 0
rx_drop_frame: 0
tx_unicast: 3322
tx_multicast: 4392
tx_broadcast: 23998478524

and no messages in the system logs.
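
One way to sanity-check a dump like this is to compare the per-type packet
counters with the totals. The sketch below parses "name: value" lines and
reports the gap. The counter names are the ones shown in the output above;
the helper names, and the assumption that unicast + multicast + broadcast
should add up to the packet total, are mine and not something the driver
documents here.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines from ethtool -S output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skips the "NIC statistics:" header and any non-numeric lines.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def breakdown_gap(stats, direction):
    """Return (unicast + multicast + broadcast) - packets for 'rx' or 'tx'.

    Assumption: the three per-type counters are expected to sum to the total.
    """
    kinds = ("unicast", "multicast", "broadcast")
    parts = sum(stats[f"{direction}_{kind}"] for kind in kinds)
    return parts - stats[f"{direction}_packets"]


# Subset of the counters quoted in the message above.
SAMPLE = """\
NIC statistics:
rx_unicast: 42104294
rx_multicast: 897
rx_broadcast: 564
rx_packets: 42105755
tx_packets: 40968086
tx_unicast: 3322
tx_multicast: 4392
tx_broadcast: 23998478524
"""

stats = parse_ethtool_stats(SAMPLE)
print("rx gap:", breakdown_gap(stats, "rx"))  # prints "rx gap: 0"
print("tx gap:", breakdown_gap(stats, "tx"))
```

With the numbers above, the rx counters add up exactly while the tx ones do
not. That may only reflect how the driver labels or orders its counters, so
it is a lead to follow up rather than evidence of a fault.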

These systems are running with DHCP (and have Avahi installed). Is it
possible these are related to the problem? (But again, why would it only
show up when running the 2.6.32 kernel?)



> I can't see any major changes in the forcedeth driver since 2.6.30.


I scanned what changelogs I could find also and nothing jumped out at me
that could be the cause of this.



> We will shortly update the official kernel packages to incorporate this
> release, so you could just wait a day or two and update. However I'm not
> aware of any changes in 2.6.32.10 that would fix this sort of bug.


Again, I scanned the changelogs and nothing jumped out at me. I'll try
the updated package when you release it to see if it makes a difference.


Let me know if there's any further testing I can do before I roll the
systems back to 2.6.30 and put them back into production.


Thanks,

-stephen

--
Stephen Mulcahy Atlantic Linux http://www.atlanticlinux.ie
Registered in Ireland, no. 376591 (144 Ros Caoin, Roscam, Galway)



--
Archive: http://lists.debian.org/4B9F5E5E.2060209@atlanticlinux.ie
 
04-04-2010, 03:12 PM
Ben Hutchings

Bug#572201: Further queries

On Tue, 2010-03-16 at 10:33 +0000, stephen mulcahy wrote:
[...]
> > We will shortly update the official kernel packages to incorporate this
> > release, so you could just wait a day or two and update. However I'm not
> > aware of any changes in 2.6.32.10 that would fix this sort of bug.
>
> Again, I scanned the changelogs and nothing jumped out at me. I'll try
> the updated package when you release it to see if it makes a difference.
[...]

Have you done this and did it help?

Ben.

--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
 
04-07-2010, 03:15 PM
stephen mulcahy

Bug#572201: Further queries

Ben Hutchings wrote:
> On Tue, 2010-03-16 at 10:33 +0000, stephen mulcahy wrote:
> [...]
> > > We will shortly update the official kernel packages to incorporate this
> > > release, so you could just wait a day or two and update. However I'm not
> > > aware of any changes in 2.6.32.10 that would fix this sort of bug.
> >
> > Again, I scanned the changelogs and nothing jumped out at me. I'll try
> > the updated package when you release it to see if it makes a difference.
> [...]
>
> Have you done this and did it help?


Hi,

Just tried this now and it doesn't help. Still getting nodes dropping
out after running the Hadoop Terasort (behaviour which doesn't happen
with the 2.6.30 kernel).


Still no messages in the logs - but as usual, ifdown followed by ifup
makes things right.


Anything else I can run for diagnostics?

-stephen

--
Stephen Mulcahy Atlantic Linux http://www.atlanticlinux.ie
Registered in Ireland, no. 376591 (144 Ros Caoin, Roscam, Galway)



--
Archive: http://lists.debian.org/4BBCA19C.5080204@atlanticlinux.ie
 
All times are GMT.