Munging the HTTP URLs since the list server claims they are on a spam list. Let's see if this gets through. I also switched to plain text since HTML didn't edit the URLs properly.
On 24-08-2023 12:50 CEST, Boudewijn Visser (nlnog) <bvisser-nlnog@xs4all.nl> wrote:
Hi Stefan,
While I'm quite old skool, I just never really got into IRC, so I missed the conversation.
I've had a look at your packet capture. It doesn't seem to be an MTU issue.
Filtering for the traffic captured on the server side:
  (ip.src_host == 192.46.232.6 && ip.dst_host == 84.28.119.251) || (ip.dst_host == 192.46.232.6 && ip.src_host == 84.28.119.251)
So it seems your Ziggo public IP is 84.28.119.251. And filtering for the capture from the inside (client) side:
  (ip.src_host == 192.46.232.6 && ip.dst_host == 192.168.0.107) || (ip.dst_host == 192.46.232.6 && ip.src_host == 192.168.0.107)
I see an OK session using source port 50006, and then a session using source port 50007 that seems to have severe packet loss issues.
See all the TCP retransmissions for the source-port 50007 session; only rarely does a packet get through.
If you can still use this client (same public IP), try:
  curl --local-port 50006 http://192.46.232.six
  curl --local-port 50007 http://192.46.232.six
That should replicate the problem exactly: the first one always OK, the second one always with major problems. Note: you may see some socket timeouts when trying multiple times shortly after each other (bind failure, socket already in use).
And - the specific local port that fails or works very likely also depends on the client source IP.
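For testing more than two ports, the same check can be scripted. The following is a minimal sketch in Python, not something that was run against the capture: the function name probe and its status strings are made up for illustration, and it assumes plain HTTP on port 80. It relies on the source_address argument of http.client.HTTPConnection to pin the local port, in the same way as curl --local-port.

```python
import errno
import http.client
import socket


def probe(host, local_port, port=80, timeout=10.0):
    """Fetch "/" over HTTP from a fixed local source port.

    Returns a short status string: "ok", "timeout", "port busy"
    or "error: ...". The name and return values are illustrative only.
    """
    conn = http.client.HTTPConnection(
        host, port, timeout=timeout, source_address=("", local_port)
    )
    try:
        conn.request("GET", "/")
        conn.getresponse().read()  # any HTTP status means packets got through
        return "ok"
    except socket.timeout:
        # Typical for a flow that crosses the lossy path.
        return "timeout"
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            # Same bind failure curl reports when a port is reused too quickly.
            return "port busy"
        return f"error: {exc}"
    except http.client.HTTPException as exc:
        return f"error: {exc!r}"
    finally:
        conn.close()


# Example sweep (server address taken from the capture; adjust as needed):
# for p in range(50000, 50016):
#     print(p, probe("192.46.232.6", p))
```

A "port busy" result only says the local port could not be reused yet; it is the "timeout" results that point at an affected flow. Keep in mind that NAT in the modem may rewrite the source port, so the port seen on the wire can differ from the one requested.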
Sabri's suggestion of a TCP traceroute is also valuable.
(Normally, traceroute is done using UDP (classic Unix, Cisco) or ICMP, but it can be done with TCP too.) With some luck, tcp-traceroute may give a hint for a node or path where the failure starts.
I've done a quick test (I happen to be behind Ziggo at the moment), but a TCP traceroute isn't too conclusive. Generally, load balancing within a network is deterministic, based on the IP/port combination for example.
IMO, the whole problem still looks like a network link that has severe issues (probably corrupting a large share of packets, which are then dropped at the neighbor node), with traffic load balanced over this link. So some session flows are impacted and others are not.
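To illustrate why deterministic per-flow load balancing fits these symptoms, here is a small Python sketch. It is an assumption-laden toy, not Ziggo's actual algorithm (which is unknown here): the SHA-256 hash, the function names pick_link and flows_on_link, and the idea of two parallel links are all made up for illustration. Real routers use vendor-specific hash functions over similar header fields.

```python
import hashlib


def pick_link(src_ip, src_port, dst_ip, dst_port, n_links, proto="tcp"):
    """Map a flow to one of n_links parallel links via a hash of its 5-tuple.

    Illustrative only: the hash function is an arbitrary stand-in.
    """
    key = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links


def flows_on_link(src_ip, dst_ip, dst_port, ports, n_links, link):
    """Return the source ports whose flows would land on the given link."""
    return [
        p for p in ports
        if pick_link(src_ip, p, dst_ip, dst_port, n_links) == link
    ]


# Hypothetical scenario: two parallel links, of which link 1 is faulty.
# Every flow hashed onto link 1 suffers, every time; the others never do.
# affected = flows_on_link("84.28.119.251", "192.46.232.6", 80,
#                          range(50000, 50016), n_links=2, link=1)
```

The point of the sketch is that the same 5-tuple always maps to the same link, so a given source port either always works or always fails, while changing the source IP reshuffles which ports are affected.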
Since it seems limited to Ziggo clients, it would likely be somewhere in the Ziggo network. Something at an exchange point is a more remote possibility; depending on which (other) destinations are impacted, it might just not have been noticed either.
(Some caveats: NAT in the Ziggo modem may change the source port, especially with repeated tests.)
I think that getting anything more will need a quite senior Ziggo network engineer to investigate further.
Best regards, Boudewijn
On 24-08-2023 08:01 CEST, Stefan van den Oord <stefan+nlnog@medicinemen.eu> wrote:
Thanks Boudewijn!
There was a lively conversation about this on #nlnog yesterday, so I forgot to respond to you. I tried changing the MTU to 1420, but that didn’t make a difference. I did a packet capture as well. This was between server 192.46.232.6 and client 192.168.0.107. The command used on the server was:
tcpdump -Aennvvi eth0 -w server.pcap port not 22
And on the client (because I was connected through VNC):
sudo tcpdump -Aennvvi en1 -w client.pcap port not 22 and port not 5900
During this capture I did two requests (using curl) to http://192.46.232.six; the first one succeeded and the second one I aborted after half a minute. The result is here: http://192.46.232.six/client+server.pcap
I lack the experience to properly analyse this. Does this contain any clues for you?
-- Stefan van den Oord CTO @ Medicine Men B.V.
Not in the office on Wednesdays
Regulierenring 22 3981 LB Bunnik The Netherlands