Wojciech Ziniewicz 12-08-2010 02:26 PM

Weird routing / arp / ppp problem - low upload after debian upgrade
 
Hi,
After upgrading from an old patched etch, my clients cannot browse the internet anymore (upload is OK, but download is no more than a few kbps). The problem occurs randomly; other services that use small packets, like VoIP, work perfectly.



Here's my detailed problem.

I have a PPPoE concentrator serving several hundred computers. Every user can have either a public IP (directly on the PPPoE tunnel, without SNAT/DNAT) or a SNATed private IP. Clients with a public IP are proxy_arp'ed so the world can see them. Incoming traffic goes through imq0 and outgoing through eth0; traffic shaping looks fine.


This is a typical example of the firewall rules generated for a public IP:

iptables:
iptables -t filter -A FORWARD -i ppp+ -s 217.17.10.250 -j ACCEPT
iptables -t filter -A FORWARD -d 217.17.10.250 -j ACCEPT

shaping:
iptables -t mangle -A UPLOAD -p all -o eth0 -s 217.17.10.250 -j CLASSIFY --set-class 2:246
tc filter add dev imq0 parent 1: protocol ip u32 match ip dst 217.17.38.250 flowid 1:246
tc class add dev imq0 parent 1:2 classid 1:246 htb rate 128kbit ceil 4096kbit burst 4096kbit prio 5 quantum 8
tc qdisc add dev imq0 parent 1:246 handle 246:0 sfq perturb 10
tc class add dev eth0 parent 2:2 classid 2:246 htb rate 128kbit ceil 4096kbit burst 4096kbit prio 5 quantum 8
tc qdisc add dev eth0 parent 2:246 handle 246:0 sfq perturb 10



For a private IP we have almost the same, but there's SNAT in the iptables part. Every client's rules are generated from the same formula.
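
As an illustration only, the per-client formula could be sketched like this. The generator itself is not shown in this thread, so the function name client_rules and its parameters are hypothetical. The sketch also assumes the same address is used in every rule, whereas the tc filter line in the example above shows 217.17.38.250. It only builds and prints the command strings; it does not run them.

```python
def client_rules(ip, class_id, rate="128kbit", ceil="4096kbit", burst="4096kbit"):
    """Return the iptables/tc command lines for one public-IP client (hypothetical helper)."""
    # HTB parameters copied from the example in the post.
    htb = f"htb rate {rate} ceil {ceil} burst {burst} prio 5 quantum 8"
    return [
        f"iptables -t filter -A FORWARD -i ppp+ -s {ip} -j ACCEPT",
        f"iptables -t filter -A FORWARD -d {ip} -j ACCEPT",
        f"iptables -t mangle -A UPLOAD -p all -o eth0 -s {ip} -j CLASSIFY --set-class 2:{class_id}",
        f"tc filter add dev imq0 parent 1: protocol ip u32 match ip dst {ip} flowid 1:{class_id}",
        f"tc class add dev imq0 parent 1:2 classid 1:{class_id} {htb}",
        f"tc qdisc add dev imq0 parent 1:{class_id} handle {class_id}:0 sfq perturb 10",
        f"tc class add dev eth0 parent 2:2 classid 2:{class_id} {htb}",
        f"tc qdisc add dev eth0 parent 2:{class_id} handle {class_id}:0 sfq perturb 10",
    ]


# Dry run: print the rules for the client used as the example in the post.
for line in client_rules("217.17.10.250", 246):
    print(line)
```

Printing the rules for one affected and one unaffected client and diffing the two outputs is a quick way to confirm that the generator treats them identically.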

My problem is the following: totally random clients are having problems with download. If I use the MikroTik bandwidth tester from the internet to their computer, it shows a transfer of X Mbit/s upload (from their side) and 10-15 kbps in the direction of the client. The problem ONLY occurs when they are behind their client router. If they connect via PPPoE directly to my server, the problem disappears.



The bandwidth tester uses big packets, so they are fragmented. If I use small packets like ping, they have nice transfer and everything is reachable from them. On the client side it looks like they can browse the internet, but google.com takes something like 20 minutes to load, while VoIP works fine. Moreover, if I take the client router and place it somewhere else in my "LAN" (my LAN is 100% bridged with MikroTik), it usually works.



What I've triple-checked:
- generation of iptables/tc rules
- PPPoE MTU (1480 or 1492 - both working)
- MSS - path MTU discovery packets are not blocked, everything looks fine
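
For reference, the MTU/MSS arithmetic behind these checks can be sketched as follows. This is a minimal illustration, assuming IPv4 and TCP headers without options (20 bytes each) and the usual 8 bytes of PPPoE overhead; the helper names are made up for the example. The values 1492 and 1480 come from the checklist above, 1460 from the pppoe-server options attached below, and 1430 is the payload length of the segments seen in the attached tcpdump.

```python
IP_HEADER = 20       # bytes, IPv4 header without options (assumption)
TCP_HEADER = 20      # bytes, TCP header without options (assumption)
PPPOE_OVERHEAD = 8   # 6-byte PPPoE header + 2-byte PPP protocol field


def pppoe_mtu(ethernet_mtu=1500):
    """Largest IP packet that fits in one PPPoE frame on plain Ethernet."""
    return ethernet_mtu - PPPOE_OVERHEAD


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented packet of this MTU."""
    return mtu - IP_HEADER - TCP_HEADER


def fits(tcp_payload, mtu):
    """True if a segment with this payload needs no fragmentation at this MTU."""
    return tcp_payload + IP_HEADER + TCP_HEADER <= mtu


print("PPPoE MTU on 1500-byte Ethernet:", pppoe_mtu())
for mtu in (1492, 1480, 1460):
    print(f"MTU {mtu}: MSS {mss_for_mtu(mtu)}, 1430-byte segment fits: {fits(1430, mtu)}")
```

Under these assumptions a 1430-byte segment fits within an MTU of 1492 or 1480 but not within 1460, so it may be worth confirming which MTU is actually negotiated on the affected links.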

What I suspect:
- some ARP problem, maybe?

The problem began right after I replaced my old Etch server on 2.6.15, with iptables and kernel patched using patch-o-matic, with a clean 2.6.32 squeeze with everything from apt. My sysctl.conf, along with pppoe-server-options, is attached at the end of this message.


I've also run tcpdump on the clients' interfaces many times, and nothing catches my attention.

Typical arp entry for snated IP is :
? (10.100.0.25) at <incomplete> on eth1
? (10.100.0.25) at <from_interface> PERM PUB on eth1



Typical arp entry for public IP looks like :
? (217.17.10.250) at <from_interface> PERM PUB on eth0

Any clues will be VERY much appreciated.
debian-firewall: please Cc me, as I'm not subscribed.



Regards
WZ

------------sysctl.conf---------------
kernel.panic = 3
net.core.rmem_max = 131071
net.core.wmem_max = 131071
net.ipv4.conf.all.arp_announce = 0
net.ipv4.conf.all.arp_ignore = 0
net.ipv4.conf.all.proxy_arp = 1
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.arp_announce = 0
net.ipv4.conf.default.arp_filter = 0
net.ipv4.conf.default.arp_ignore = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.eth0.arp_announce = 0
net.ipv4.conf.eth0.arp_ignore = 0
net.ipv4.conf.eth0.proxy_arp = 1
net.ipv4.conf.eth1.arp_announce = 0
net.ipv4.conf.eth1.arp_ignore = 0
net.ipv4.conf.eth1.proxy_arp = 0
net.ipv4.conf.eth2.proxy_arp = 0
net.ipv4.ip_forward = 1
net.ipv4.ip_local_port_range = 1024 4999
net.ipv4.neigh.default.base_reachable_time = 1036800
net.ipv4.neigh.default.gc_thresh1 = 1024
net.ipv4.neigh.default.gc_thresh2 = 8192
net.ipv4.neigh.default.gc_thresh3 = 32768
net.ipv4.neigh.default.ucast_solicit = 4
net.ipv4.neigh.eth0.base_reachable_time = 1036800
net.ipv4.neigh.eth0.ucast_solicit = 4
net.ipv4.neigh.eth1.ucast_solicit = 4
net.ipv4.neigh.eth2.ucast_solicit = 4
net.ipv4.neigh.imq0.ucast_solicit = 4
net.ipv4.neigh.imq1.ucast_solicit = 4
net.ipv4.neigh.lo.ucast_solicit = 4
net.ipv4.netfilter.ip_conntrack_max = 132760
net.ipv4.netfilter.ip_conntrack_tcp_timeout_close_wait = 10
net.ipv4.netfilter.ip_conntrack_tcp_timeout_close = 5
net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 43200
net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait = 30
net.ipv4.netfilter.ip_conntrack_tcp_timeout_last_ack = 30
net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_recv = 60
net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_sent = 120
net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait = 20
net.ipv4.tcp_dsack = 0
net.ipv4.tcp_ecn = 0
net.ipv4.tcp_fack = 1
net.ipv4.tcp_mem = 393216 524288 786432
net.ipv4.tcp_rmem = 4096 87380 174760
net.ipv4.tcp_sack = 0
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_wmem = 4096 16384 131072
-----------------------------------tcpdump of client -------------(myptr is the client's public IP)------------
23:11:45.110682 IP ew-in-f104.1e100.net.www > 190.myptr.com.33773: Flags [.], seq 966809112:966810542, ack 3627825205, win 122, length 1430
23:11:49.509890 IP 190.myptr.com.32876 > sip.voice.gtsenergis.pl.sip: SIP, length: 369
23:11:49.512774 IP sip.voice.gtsenergis.pl.sip > 190.myptr.com.32876: SIP, length: 366
23:11:53.811644 IP 190.myptr.com.49186 > 94.245.115.184.3544: UDP, length 61
23:11:53.853959 IP 94.245.115.184.3544 > 190.myptr.com.49186: UDP, length 109
23:11:55.110798 IP ew-in-f104.1e100.net.www > 190.myptr.com.33773: Flags [.], seq 0:1430, ack 1, win 122, length 1430
23:11:56.383790 IP 151.59.26.182.46119 > 190.myptr.com.33260: Flags [F.], seq 3286477939, ack 3609663927, win 65364, length 0
23:11:56.622196 IP 158.129.20.136.35137 > 190.myptr.com.32923: Flags [F.], seq 840311840, ack 4022529708, win 65373, length 0
23:12:00.277498 IP 190.myptr.com.isakmp > ip-89.171.11.42.static.crowley.pl.isakmp: isakmp: phase 1 I ident
23:12:04.529779 IP 190.myptr.com.32876 > sip.voice.gtsenergis.pl.sip: SIP, length: 369
23:12:04.532416 IP sip.voice.gtsenergis.pl.sip > 190.myptr.com.32876: SIP, length: 366
23:12:05.111138 IP ew-in-f104.1e100.net.www > 190.myptr.com.33773: Flags [.], seq 0:1430, ack 1, win 122, length 1430
23:12:05.659030 IP 130pc240.sshunet.nl.https > 190.myptr.com.32786: Flags [R.], seq 3389053229, ack 421162466, win 0, length 0
23:12:09.608646 IP 190.myptr.com.isakmp > ip-89.171.11.42.static.crowley.pl.isakmp: isakmp: phase 1 I ident
23:12:10.174398 IP 190.myptr.com.33580 > 10.10.123.30.www: Flags [S], seq 3710035758, win 8192, options [mss 1452,nop,wscale 8,nop,nop,sackOK], length 0
23:12:13.187047 IP 190.myptr.com.33580 > 10.10.123.30.www: Flags [S], seq 3710035758, win 8192, options [mss 1452,nop,wscale 8,nop,nop,sackOK], length 0
23:12:15.111714 IP ew-in-f104.1e100.net.www > 190.myptr.com.33773: Flags [.], seq 0:1430, ack 1, win 122, length 1430
23:12:19.548671 IP 190.myptr.com.32876 > sip.voice.gtsenergis.pl.sip: SIP, length: 369
23:12:19.551750 IP sip.voice.gtsenergis.pl.sip > 190.myptr.com.32876: SIP, length: 366
23:12:23.548358 IP 190.myptr.com.33568 > 10.10.123.30.www: Flags [S], seq 2757323080, win 8192, options [mss 1452,nop,wscale 8,nop,nop,sackOK], length 0
23:12:25.112024 IP ew-in-f104.1e100.net.www > 190.myptr.com.33773: Flags [.], seq 0:1430, ack 1, win 122, length 1430
23:12:26.548632 IP 190.myptr.com.33568 > 10.10.123.30.www: Flags [S], seq 2757323080, win 8192, options [mss 1452,nop,wscale 8,nop,nop,sackOK], length 0
23:12:32.191530 IP 139.91.70.35.19609 > 190.myptr.com.33196: Flags [F.], seq 604234015, ack 3123515410, win 17520, length 0
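
One way to read a capture like the one above is to look for TCP segments that keep reappearing with the same sequence numbers. The sketch below is a rough helper for that, not part of the original report: the function name repeated_segments and the regular expression are assumptions that only cover the tcpdump line format shown here, and the timestamps are assumed not to wrap past midnight. The sample lines are copied from the capture.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches tcpdump TCP lines such as:
# 23:11:55.110798 IP a.www > b.33773: Flags [.], seq 0:1430, ack 1, ...
TCP_LINE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[[^\]]+\], seq (?P<seq>\d+(?::\d+)?),"
)


def repeated_segments(lines):
    """Map (src, dst, seq) to the gaps in seconds between repeated sightings."""
    seen = defaultdict(list)
    for line in lines:
        m = TCP_LINE.match(line.strip())
        if m is None:
            continue  # UDP, SIP, isakmp and other non-matching lines are skipped
        stamp = datetime.strptime(m.group("ts"), "%H:%M:%S.%f")
        seen[(m.group("src"), m.group("dst"), m.group("seq"))].append(stamp)
    return {
        key: [round((b - a).total_seconds(), 3) for a, b in zip(stamps, stamps[1:])]
        for key, stamps in seen.items()
        if len(stamps) > 1
    }


capture = [
    "23:11:55.110798 IP ew-in-f104.1e100.net.www > 190.myptr.com.33773: Flags [.], seq 0:1430, ack 1, win 122, length 1430",
    "23:12:04.529779 IP 190.myptr.com.32876 > sip.voice.gtsenergis.pl.sip: SIP, length: 369",
    "23:12:05.111138 IP ew-in-f104.1e100.net.www > 190.myptr.com.33773: Flags [.], seq 0:1430, ack 1, win 122, length 1430",
    "23:12:10.174398 IP 190.myptr.com.33580 > 10.10.123.30.www: Flags [S], seq 3710035758, win 8192, options [mss 1452,nop,wscale 8,nop,nop,sackOK], length 0",
    "23:12:13.187047 IP 190.myptr.com.33580 > 10.10.123.30.www: Flags [S], seq 3710035758, win 8192, options [mss 1452,nop,wscale 8,nop,nop,sackOK], length 0",
    "23:12:15.111714 IP ew-in-f104.1e100.net.www > 190.myptr.com.33773: Flags [.], seq 0:1430, ack 1, win 122, length 1430",
    "23:12:25.112024 IP ew-in-f104.1e100.net.www > 190.myptr.com.33773: Flags [.], seq 0:1430, ack 1, win 122, length 1430",
]

for key, gaps in repeated_segments(capture).items():
    print(key, gaps)
```

On these lines the same 1430-byte segment from the web server shows up again roughly every 10 seconds, and the client's SYN to 10.10.123.30 is repeated after about 3 seconds. That pattern would be consistent with full-size segments not being acknowledged, although the capture alone does not show where they are lost.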


-----------------pppoe server options-----------------------
plugin radius.so
plugin radattr.so
auth
require-chap
lcp-echo-interval 10
lcp-echo-failure 5
ms-dns 217.17.10.208
ms-dns 217.17.10.10
proxyarp
noipx
mtu 1460
mru 1460



--
Wojciech Ziniewicz
http://www.rfc-editor.org/rfc/rfc2324.txt

Brett Parker 12-08-2010 03:47 PM

Weird routing / arp / ppp problem - low upload after debian upgrade
 
On 08 Dec 16:26, Wojciech Ziniewicz wrote:
> Hi,
> After upgrade from old patched etch, my clients cannot browse internet
> anymore (upload is ok but download not bigger than few kbps ) - problem
> occurs randomly - other services that use small packets like voip work
> perfectly.

Between that system and the outside world, is there another
router/firewall?

My initial guess would be that you've hit the tcp window scale problem,
you can (quickly) check this by doing:
sysctl net.ipv4.tcp_window_scaling=0

On the box that they're going through - if that works then you've got a
box between that and the internet that doesn't watch the window scaling
flag as it goes past, and therefore mangles packets later on because it
doesn't know that they can get through.
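
To make the window-scaling idea concrete, here is a minimal sketch of the arithmetic. The helper name effective_window is made up for the example; the numbers are taken from the telnet capture posted later in this thread (wscale 2 in the SYN from 1.mydomain.com, wscale 0 in the SYN-ACK from 10.100.0.194). The window field in the SYN segments themselves is not scaled; the scale applies to the segments that follow.

```python
def effective_window(win_field, wscale):
    """Receive window in bytes: the 16-bit header field shifted left by the scale."""
    return win_field << wscale


sender_scale = 2  # "wscale 2" in the SYN from 1.mydomain.com
peer_scale = 0    # "wscale 0" in the SYN-ACK from 10.100.0.194

# After the handshake 1.mydomain.com advertises "win 1452".
print(effective_window(1452, sender_scale))  # 5808, the same as the unscaled SYN window
print(effective_window(5840, peer_scale))    # 5840

# A device that missed or ignored the option would read the raw field instead.
print(effective_window(1452, 0))             # 1452
```

A middlebox that tracks TCP windows without honouring the scale would therefore believe the window is four times smaller than it really is in this example, which is the kind of mismatch the test above is meant to rule in or out.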

Hope that helps,
--
Brett Parker http://www.sommitrealweird.co.uk/
PGP Fingerprint 1A9E C066 EDEE 6746 36CB BD7F 479E C24F 95C7 1D61



Wojciech Ziniewicz 12-08-2010 05:09 PM

Weird routing / arp / ppp problem - low upload after debian upgrade
 
2010/12/8 Brett Parker <iDunno@sommitrealweird.co.uk>


> On 08 Dec 16:26, Wojciech Ziniewicz wrote:
> > Hi,
> > After upgrade from old patched etch, my clients cannot browse internet
> > anymore (upload is ok but download not bigger than few kbps ) - problem
> > occurs randomly - other services that use small packets like voip work
> > perfectly.
>
> Between that system and the outside world, is there another
> router/firewall?

There's only my router, with a BGP session, acting as a gateway.

> My initial guess would be that you've hit the tcp window scale problem,
> you can (quickly) check this by doing:
>     sysctl net.ipv4.tcp_window_scaling=0

I did some tests with both settings:

1. Telnetting to a host behind my client's router:

listening on ppp296, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
18:58:39.988961 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [S], seq 3615516481, win 5808, options [mss 1452,nop,wscale 2], length 0
18:58:39.991914 IP 10.100.0.194.telnet > 1.mydomain.com.3718: Flags [S.], seq 1759632665, ack 3615516482, win 5840, options [mss 1452,nop,wscale 0], length 0
18:58:39.991975 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [.], ack 1, win 1452, length 0
18:58:39.992118 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [P.], seq 1:25, ack 1, win 1452, length 24
18:58:40.020668 IP 10.100.0.194.telnet > 1.mydomain.com.3718: Flags [P.], seq 1:13, ack 25, win 5840, length 12
18:58:40.020742 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [.], ack 13, win 1452, length 0
18:58:40.020847 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [P.], seq 25:28, ack 13, win 1452, length 3
18:58:40.023064 IP 10.100.0.194.telnet > 1.mydomain.com.3718: Flags [P.], seq 13:28, ack 25, win 5840, length 15
18:58:40.054093 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [.], ack 28, win 1452, length 0
18:58:40.056014 IP 10.100.0.194.telnet > 1.mydomain.com.3718: Flags [P.], seq 28:46, ack 28, win 5840, length 18
--- from now on my router tries to get a response from the box behind the firewall
18:58:40.056068 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [P.], seq 28:37, ack 46, win 1452, length 9
18:58:40.284102 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [P.], seq 28:37, ack 46, win 1452, length 9
18:58:40.744084 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [P.], seq 28:37, ack 46, win 1452, length 9
18:58:41.664196 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [P.], seq 28:37, ack 46, win 1452, length 9
18:58:43.504119 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [P.], seq 28:37, ack 46, win 1452, length 9
18:58:47.184092 IP 1.mydomain.com.3718 > 10.100.0.194.telnet: Flags [P.], seq 28:37, ack 46, win 1452, length 9
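
The six identical "seq 28:37" lines at the end of this capture can be checked for the usual retransmission pattern. The sketch below is only an illustration (the helper name gaps is made up); it takes the six timestamps from the capture and prints the gaps between them and the ratio of each gap to the previous one.

```python
from datetime import datetime

# Timestamps of the repeated "seq 28:37" segment, copied from the capture.
stamps = [
    "18:58:40.056068",
    "18:58:40.284102",
    "18:58:40.744084",
    "18:58:41.664196",
    "18:58:43.504119",
    "18:58:47.184092",
]


def gaps(timestamps):
    """Seconds between consecutive timestamps (same day assumed)."""
    parsed = [datetime.strptime(t, "%H:%M:%S.%f") for t in timestamps]
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]


intervals = gaps(stamps)
ratios = [later / earlier for earlier, later in zip(intervals, intervals[1:])]

for interval in intervals:
    print(f"{interval:.3f} s")
print("ratios:", [round(r, 2) for r in ratios])
```

Each gap is about twice the previous one, which looks like ordinary TCP exponential backoff: the sender keeps retransmitting the same 9 bytes because no acknowledgment for them shows up in the capture. It does not by itself say whether the data or the returning ACK is being lost.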



output of telnet is :

root@beta2:/home/wojtek# telnet 10.100.0.194
Trying 10.100.0.194...
Connected to 10.100.0.194.
Escape character is '^]'.

There should be a prompt for login and password here.




2. After making the TCP window scaling change, I repeated the telnet procedure, and here's another capture from my pppoe-server:

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ppp296, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
19:03:12.045452 IP 1.mydomain.com.1048 > 10.100.0.194.telnet: Flags [S], seq 3534691937, win 5808, options [mss 1452], length 0
19:03:12.047495 IP 10.100.0.194.4119 > 1.mydomain.com.1048: Flags [R.], seq 0, ack 3534691938, win 0, length 0
19:03:18.092283 IP 1.mydomain.com.1048 > 10.100.0.194.telnet: Flags [S], seq 3534691937, win 5808, options [mss 1452], length 0
19:03:18.094212 IP 10.100.0.194.4119 > 1.mydomain.com.1048: Flags [R.], seq 0, ack 1, win 0, length 0



SYN answered with a reset all the time - totally no connectivity.

So with TCP window scaling enabled on my server (case 1), packets go through the client's NAT, but big packets cannot get through. On the other hand, when I turn TCP window scaling off (case 2, where the SYN no longer carries a wscale option), I can't even connect (reset + SYN all the time). ICMP goes through in both case 1 and case 2.



frankly i have no clue why O_o

--
Wojciech Ziniewicz
http://www.rfc-editor.org/rfc/rfc2324.txt

