So, I've spent all weekend looking into this, and I'm still no closer to
I've tried replacing the NICs, swapping the switches, removing the
switches, isolating the machines, replacing the wiring, and logging the
That last one was quite interesting actually. I added log rules for ICMP
traffic to the nat table's prerouting and postrouting chains, and the
filter table's input, forward, and output chains.
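
For anyone wanting to reproduce the logging, here is a rough sketch of rules of that kind, not the exact ones used. It only prints the iptables commands so they can be checked first; pipe the output to sh as root to apply them. The log prefix text and the use of -I (insert at the top of the chain) are assumptions for illustration.

```shell
#!/bin/sh
# Prints (does not run) one ICMP LOG rule for each table/chain pair
# mentioned above. Review the output, then pipe it to `sh` as root.
icmp_log_rules() {
    for spec in nat:PREROUTING nat:POSTROUTING \
                filter:INPUT filter:FORWARD filter:OUTPUT; do
        table=${spec%%:*}   # part before the colon
        chain=${spec##*:}   # part after the colon
        # -I inserts at the top of the chain so earlier rules cannot
        # accept or drop the packet before it is logged.
        # The prefix is a made-up label to grep for in the kernel log.
        echo "iptables -t $table -I $chain -p icmp -j LOG --log-prefix \"icmp-$table-$chain: \""
    done
}

icmp_log_rules
```

One caveat, as far as I know: the nat table is only consulted for the first packet of each tracked connection, so a running ping may not log every echo request there. A LOG rule in the raw table's PREROUTING chain sees packets earlier and per packet, which could help confirm whether they reach netfilter at all.
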
When an outage is occurring, pings to the internal NIC from my
workstation show up in the logs. Pings to the external NIC from my
workstation don't show up at all. The packets don't even seem to be
reaching the prerouting chain. Once the outage finishes, they start
appearing as normal. The server is able to ping both its interfaces at
all times, as you'd expect. A laptop on the external network is able to
ping the external NIC (but obviously not the internal one).
I get no reported dropped packets anywhere. I did notice some
rx_crc_errors on the internal NIC (using ethtool), which is why I tried
replacing the wiring, but these don't seem to go up when the problem
occurs (i.e. they didn't increment at all during outages), so I'm going to
hazard a guess that they're another issue entirely. ~1000 errors out of
~100000000 good receives (about 0.001%) doesn't suggest anything major.
NIC-wise, I tried swapping to both being r8169, and both being e1000.
Identical results regardless of the hardware involved.
My next port of call I guess has to be trying older kernels and seeing
if I get the same symptoms.