07-30-2012, 01:43 PM
Andrew Peng

PCIe debugging error messages - kern.log

I've been working with the Intel E1000 development team in trying to
find the cause of a hardware hang in my kern.log:

Jul 24 02:49:45 gaia kernel: [806292.204500] e1000e 0000:02:00.0: eth1: Detected Hardware Unit Hang:
Jul 24 02:49:45 gaia kernel: [806292.204503] TDH <8c>
Jul 24 02:49:45 gaia kernel: [806292.204504] TDT <8f>
Jul 24 02:49:45 gaia kernel: [806292.204505] next_to_use <8f>
Jul 24 02:49:45 gaia kernel: [806292.204506] next_to_clean <8c>
Jul 24 02:49:45 gaia kernel: [806292.204508] buffer_info[next_to_clean]:
Jul 24 02:49:45 gaia kernel: [806292.204509] time_stamp <10c029ca3>
Jul 24 02:49:45 gaia kernel: [806292.204510] next_to_watch <8c>
Jul 24 02:49:45 gaia kernel: [806292.204511] jiffies <10c029dc2>
Jul 24 02:49:45 gaia kernel: [806292.204512] next_to_watch.status <0>
Jul 24 02:49:45 gaia kernel: [806292.204513] MAC Status <80383>
Jul 24 02:49:45 gaia kernel: [806292.204514] PHY Status <792d>
Jul 24 02:49:45 gaia kernel: [806292.204516] PHY 1000BASE-T Status <3800>
Jul 24 02:49:45 gaia kernel: [806292.204517] PHY Extended Status <3000>
Jul 24 02:49:45 gaia kernel: [806292.204518] PCI Status <10>
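
Two figures can be read straight off the dump above: how many transmit descriptors sit between the head (TDH) and tail (TDT) pointers, and how many kernel ticks passed between time_stamp and jiffies. A minimal Python sketch of that arithmetic follows. It assumes the values are hexadecimal, ignores ring wrap-around, and assumes a tick rate of 250 Hz, which is not stated anywhere in this thread.

```python
# Values copied from the "Detected Hardware Unit Hang" dump (hexadecimal).
dump = {
    "TDH": "8c",
    "TDT": "8f",
    "time_stamp": "10c029ca3",
    "jiffies": "10c029dc2",
}

ASSUMED_HZ = 250  # assumption: kernel tick rate; not given in the thread

values = {name: int(text, 16) for name, text in dump.items()}

# Descriptors handed to the NIC (tail) that it has not yet processed (head).
pending_descriptors = values["TDT"] - values["TDH"]

# Ticks between the time stamp of the oldest uncleaned buffer and "now".
stalled_ticks = values["jiffies"] - values["time_stamp"]
stalled_seconds = stalled_ticks / ASSUMED_HZ

print(f"descriptors between head and tail: {pending_descriptors}")
print(f"ticks since time_stamp: {stalled_ticks}")
print(f"roughly {stalled_seconds:.2f} s at an assumed HZ of {ASSUMED_HZ}")
```

With a different tick rate only the last line changes; the tick count itself comes directly from the log.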


One of the steps to find the cause of this is to enable extended error
reporting by using ethtool:
sudo ethtool -s eth1 msglvl 0x2c01

This will tell the driver to dump extended debugging info (a PCIe ring
dump) to the kernel log when another error is detected. However,
after enabling this extended logging, the next time an error occurs, I
still don't get the debug dump in the kern.log. I get basically the
same info as above.
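
For reference, the msglvl value is a bitmask of the kernel's NETIF_MSG_* message classes. The Python sketch below is not from the original post; it decodes 0x2c01 using the bit values from the kernel's include/linux/netdevice.h and the flag names that ethtool accepts. The helper name decode_msglvl is made up for illustration.

```python
# NETIF_MSG_* bits from include/linux/netdevice.h, keyed by bit value,
# with the lowercase names that ethtool uses for its msglvl flags.
NETIF_MSG_BITS = {
    0x0001: "drv",
    0x0002: "probe",
    0x0004: "link",
    0x0008: "timer",
    0x0010: "ifdown",
    0x0020: "ifup",
    0x0040: "rx_err",
    0x0080: "tx_err",
    0x0100: "tx_queued",
    0x0200: "intr",
    0x0400: "tx_done",
    0x0800: "rx_status",
    0x1000: "pktdata",
    0x2000: "hw",
    0x4000: "wol",
}


def decode_msglvl(level):
    """Return the names of the message classes enabled in a msglvl bitmask."""
    return [name for bit, name in NETIF_MSG_BITS.items() if level & bit]


print(decode_msglvl(0x2C01))  # ['drv', 'tx_done', 'rx_status', 'hw']
```

So 0x2c01 enables the drv, tx_done, rx_status and hw classes. Which messages a driver actually emits for each class is up to the driver.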

Is there anything Debian-specific that would cause this to not get
logged? I've tried searching all of the log files in /var/log for the
dump information in case it gets logged somewhere else, but could not
find anything. I have checked to make sure that the debug message
level is set correctly:

Settings for eth1:
    Supported ports: [ TP ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Supports auto-negotiation: Yes
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: 1000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: off
    Supports Wake-on: pumbag
    Wake-on: g
    Current message level: 0x00002c01 (11265)
    Link detected: yes


Any help would be appreciated.

Thanks

--Andrew


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/CABrZ_K007dSpo9pRdQ-A1=6=2OCJRgjCvF0cc=DrM+304nM+uQ@mail.gmail.com
 
07-30-2012, 04:11 PM
Camaleón

On Mon, 30 Jul 2012 08:43:56 -0500, Andrew Peng wrote:

> I've been working with the Intel E1000 development team in trying to
> find the cause of a hardware hang in my kern.log:

(...)

I've read the report you posted to the "e1000-devel" mailing list.

> One of the steps to find the cause of this is to enable extended error
> reporting by using ethtool:
> sudo ethtool -s eth1 msglvl 0x2c01
>
> This will tell the driver to dump extended debugging (a PCIe Ring Dump)
> info to the kernel log when another error is detected. However, after
> enabling this extended logging, the next time an error occurs, I still
> don't get the debug dump in the kern.log. I get basically the same info
> as above.

(...)

It could be something specific to the kernel module in use (e1000e).
Consider also sending this question to "debian-kernel", where the Debian
kernel hackers usually are.

Greetings,

--
Camaleón


Archive: http://lists.debian.org/jv6bns$ltk$12@dough.gmane.org
 
07-30-2012, 05:51 PM
Henrique de Moraes Holschuh

On Mon, 30 Jul 2012, Andrew Peng wrote:
> I've been working with the Intel E1000 development team in trying to
> find the cause of a hardware hang in my kern.log:
>
> Jul 24 02:49:45 gaia kernel: [806292.204500] e1000e 0000:02:00.0:
> eth1: Detected Hardware Unit Hang:

Meh, we've just seen those in one of our servers at work, for the first
time ever.

> This will tell the driver to dump extended debugging (a PCIe Ring
> Dump) info to the kernel log when another error is detected. However,
> after enabling this extended logging, the next time an error occurs, I
> still don't get the debug dump in the kern.log. I get basically the
> same info as above.

Check whether loglevel DEBUG is being redirected somewhere, in
/etc/rsyslog.conf or /etc/syslog.conf, etc.
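
One way to run that check is to list every rule whose selector could match kernel messages at debug priority. The Python sketch below is a rough filter, not a full parser. It only understands traditional "facility.priority action" rules, it does not evaluate exclusions such as kern.none or !debug, and the sample configuration and function names are hypothetical; feed it the contents of the real config files instead.

```python
import re

# Hypothetical sample in traditional syslog selector syntax; substitute the
# contents of /etc/rsyslog.conf (and any included files) on the real system.
SAMPLE_CONFIG = """\
# sample rules
$ModLoad imklog
kern.*                          -/var/log/kern.log
*.=debug;\\
        auth,authpriv.none;\\
        news.none;mail.none     -/var/log/debug
mail.info                       -/var/log/mail.info
"""


def parse_rules(config_text):
    """Return (selector, action) pairs, joining backslash-continued lines."""
    joined = re.sub(r"\\\n\s*", "", config_text)
    rules = []
    for raw in joined.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and $-prefixed rsyslog directives.
        if not line or line[0] in "#$":
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and "." in parts[0]:
            rules.append((parts[0], parts[1]))
    return rules


def may_match_kernel_debug(selector):
    """Rough check: some element names kern or * with debug, =debug, or *."""
    for element in selector.split(";"):
        facilities, _, priority = element.rpartition(".")
        if priority.lstrip("=") not in ("debug", "*"):
            continue
        if {"kern", "*"} & set(facilities.split(",")):
            return True
    return False


matches = [(s, a) for s, a in parse_rules(SAMPLE_CONFIG)
           if may_match_kernel_debug(s)]
for selector, action in matches:
    print(f"{action}  <-  {selector}")
```

Each action printed is a place where kernel debug output may end up, so those are the files worth searching.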

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh


Archive: http://lists.debian.org/20120730175125.GA18813@khazad-dum.debian.net
 
08-01-2012, 07:47 PM
Andrew Peng

I have checked to make sure debug output hasn't been redirected; I will
ask the debian-kernel list and see if I can find anything there. Thanks
for the help, folks.

--Andrew



Archive: http://lists.debian.org/CABrZ_K2P+zCHF1GGEYLEjJ4Mr8415hLO2dn4pJ6A45305V9GpQ@mail.gmail.com
 
08-06-2012, 08:24 PM
Andrew Peng

I've been working with the Intel E1000 development team in trying to
find the cause of a hardware hang in my kern.log. They suggested
contacting the Debian User list for extra help, who in turn suggested
that I ask the kernel list to see if I could get any insight.

This is the error in my kern.log:

Jul 24 02:49:45 gaia kernel: [806292.204500] e1000e 0000:02:00.0: eth1: Detected Hardware Unit Hang:
Jul 24 02:49:45 gaia kernel: [806292.204503] TDH <8c>
Jul 24 02:49:45 gaia kernel: [806292.204504] TDT <8f>
Jul 24 02:49:45 gaia kernel: [806292.204505] next_to_use <8f>
Jul 24 02:49:45 gaia kernel: [806292.204506] next_to_clean <8c>
Jul 24 02:49:45 gaia kernel: [806292.204508] buffer_info[next_to_clean]:
Jul 24 02:49:45 gaia kernel: [806292.204509] time_stamp <10c029ca3>
Jul 24 02:49:45 gaia kernel: [806292.204510] next_to_watch <8c>
Jul 24 02:49:45 gaia kernel: [806292.204511] jiffies <10c029dc2>
Jul 24 02:49:45 gaia kernel: [806292.204512] next_to_watch.status <0>
Jul 24 02:49:45 gaia kernel: [806292.204513] MAC Status <80383>
Jul 24 02:49:45 gaia kernel: [806292.204514] PHY Status <792d>
Jul 24 02:49:45 gaia kernel: [806292.204516] PHY 1000BASE-T Status <3800>
Jul 24 02:49:45 gaia kernel: [806292.204517] PHY Extended Status <3000>
Jul 24 02:49:45 gaia kernel: [806292.204518] PCI Status <10>


One of the steps to find the cause of this is to enable extended error
reporting by using ethtool:
sudo ethtool -s eth1 msglvl 0x2c01

This will tell the driver to dump extended debugging (a PCIe Ring
Dump) info to the kernel log when another error is detected. However,
after enabling this extended logging, the next time an error occurs, I
still don't get the debug dump in the kern.log. I get basically the
same info as above.

Is there anything Debian-specific that would cause this to not get
logged? I've tried searching all of the log files in /var/log for the
dump information in case it gets logged somewhere else, but could not
find anything. I have checked to make sure that the debug message
level is set correctly:

Settings for eth1:
    Supported ports: [ TP ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Supports auto-negotiation: Yes
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: 1000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: off
    Supports Wake-on: pumbag
    Wake-on: g
    Current message level: 0x00002c01 (11265)
    Link detected: yes


Any help would be appreciated.

Thanks


--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/CABrZ_K2orMgc-3oDgM8mm1Ew=8YTUVRDWO95JXLB2qPB5q-Qtw@mail.gmail.com
 
08-12-2012, 11:53 PM
Ben Hutchings

On Mon, 2012-08-06 at 15:24 -0500, Andrew Peng wrote:
> I've been working with the Intel E1000 development team in trying to
> find the cause of a hardware hang in my kern.log. They suggested
> contacting the Debian User list for extra help, who in turn suggested
> that I ask the kernel list to see if I could get any insight.
>
> This is the error in my kern.log:
>
> Jul 24 02:49:45 gaia kernel: [806292.204500] e1000e 0000:02:00.0:
> eth1: Detected Hardware Unit Hang:
[...]
> One of the steps to find the cause of this is to enable extended error
> reporting by using ethtool:
> sudo ethtool -s eth1 msglvl 0x2c01
>
> This will tell the driver to dump extended debugging (a PCIe Ring
> Dump) info to the kernel log when another error is detected. However,
> after enabling this extended logging, the next time an error occurs, I
> still don't get the debug dump in the kern.log. I get basically the
> same info as above.
>
> Is there anything Debian specific that would cause this to not get
> logged?
[...]

The ring dump is only shown when the driver resets the chip, and it
doesn't do that in the case of a Hardware Unit Hang. So I think whichever
developer told you this was confused.

Ben.

--
Ben Hutchings
I say we take off; nuke the site from orbit. It's the only way to be sure.
 
