sirwifi
Regular Contributor
Posting here because I discovered this on Merlin, but the latest stock firmware has the same issue. See the bottom of the post for the firmware versions and flavors that do and do not have the issue.
Background:
I have a very simple typical setup, Asus RT-AC68U with a few wired clients and a few wireless clients. Nothing special, no extra routers, VLANs, etc. just as common of a setup as it gets. My main wireless client is a Mac laptop and there's a Linux server (wired) on the LAN that serves multiple purposes.
I discovered that ssh connections from the Mac to the Linux server would just die randomly. There were no other symptoms: strong wifi signal, VNC sessions were fine, no problems streaming media on the LAN or from various services, everything seemed fine. I blamed it on ssh settings, so I messed around with ~/.ssh/config settings, keep alives, etc. Nothing helped. It's very unusual for an ssh connection to be unstable on my LAN. It used to be OK and the connection could stay up for days, but I can't pinpoint exactly when the problem started.
Problem:
Trying to troubleshoot the above, I fired up Wireshark on my Mac to look for clues as to why the connection was dying. While sifting through the traffic, I noticed something unusual: once in a while the Linux server would respond to the Mac with an ICMP Redirect, which basically said that the Mac had sent the Linux server a packet that was not meant for it. The router itself would also send ICMP Redirect packets back to the Mac, because traffic that was meant for a host on the LAN was being sent through the router instead.
Example: Let's say the router is at .1, the Mac is at .10 and the Linux server is at .20 on the network (just a typical 192.168.1.0/24 subnet). The Mac has an https connection to some site on the Internet, let's assume gmail.com. All traffic from the Mac to gmail should go via the router at .1. But once in a while the Linux server at .20 would get a packet with the source IP of the Mac (.10) and the destination IP of gmail.com. Of course it would not know what to do with it, so it would respond with an ICMP Redirect: "ugh, next time route this packet via the gateway at .1". Same deal from the router: some packets from the Mac at .10 meant for the Linux server at .20 were sent to the router at .1 when they should have been sent directly, no router involved. Of course, the netmasks on each device were correct, otherwise all traffic would be broken. This would happen only for a small number of packets, say 5-6 every 5-10 min, and it doesn't look time-based; something else seems to trigger it.
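To make the expected behavior in that example concrete, here is a minimal sketch of the on-link vs. gateway decision a host normally makes for each outgoing packet. The function name next_hop and the address 203.0.113.5 (a documentation address standing in for gmail.com) are made up for illustration; this is not code from any firmware or OS.

```python
import ipaddress

def next_hop(src_ip, netmask, dst_ip, gateway):
    """Return the IP address a host should deliver the frame to.

    If the destination is inside the sender's own subnet, the packet
    goes straight to the destination; otherwise it goes to the gateway.
    """
    # strict=False lets us pass a host address and derive its network.
    network = ipaddress.ip_network(f"{src_ip}/{netmask}", strict=False)
    dst = ipaddress.ip_address(dst_ip)
    return dst if dst in network else ipaddress.ip_address(gateway)

# Mac (.10) -> Linux server (.20): same subnet, deliver directly.
print(next_hop("192.168.1.10", "255.255.255.0", "192.168.1.20", "192.168.1.1"))

# Mac (.10) -> Internet host (hypothetical address): deliver via the router (.1).
print(next_hop("192.168.1.10", "255.255.255.0", "203.0.113.5", "192.168.1.1"))
```

The ICMP Redirects described above show up when a packet arrives at a device other than the one this decision would pick.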
On a normal and simple setup you should never see these ICMP Redirect messages unless something is really messed up. This explains the unstable ssh connections I was seeing, they were breaking because some packets were not sent correctly.
If you want to play with Wireshark yourself, you can use this capture filter:
icmp[icmptype]==icmp-redirect
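If you'd rather inspect the redirects in a script than click through them, here is a rough sketch of how the ICMP Redirect message (type 5) is laid out and how to pull out the interesting fields. It assumes you already have the raw ICMP bytes with the outer IP header stripped off, it does not verify the checksum, and the function name parse_icmp_redirect is my own invention, not part of any library.

```python
import socket
import struct

ICMP_REDIRECT = 5  # ICMP type number for Redirect (RFC 792)

def parse_icmp_redirect(icmp: bytes):
    """Return the redirect details as a dict, or None if this is not a Redirect.

    Layout: type (1 byte), code (1), checksum (2), gateway address (4),
    followed by the IPv4 header of the packet that triggered the redirect.
    """
    if len(icmp) < 8 + 20:  # ICMP header plus a minimal embedded IPv4 header
        return None
    icmp_type, code, _checksum, gateway = struct.unpack("!BBH4s", icmp[:8])
    if icmp_type != ICMP_REDIRECT:
        return None
    # In the embedded IPv4 header, source is at offset 12 and destination at 16.
    orig_src, orig_dst = struct.unpack("!4s4s", icmp[8 + 12:8 + 20])
    return {
        "code": code,
        "use_gateway": socket.inet_ntoa(gateway),
        "orig_src": socket.inet_ntoa(orig_src),
        "orig_dst": socket.inet_ntoa(orig_dst),
    }
```

In the scenario from this post, use_gateway would be the router at .1, orig_src the Mac at .10, and orig_dst whatever the misdirected packet was really meant for.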
I swapped in a different, old N router and captured a bunch of traffic: no sign of these ICMP Redirects, and the ssh connection was stable. I ran the same experiment on a busy network at work and saw no ICMP Redirects there either, even though that's where I would be more likely to expect them because of the more complicated setup involving multiple routers, etc.
So I looked at some firmware versions.
Firmware versions that have this problem:
Merlin 384.8_2
Merlin 380.70_0 (tried this one thinking the problem may be introduced by moving to 384)
Stock 3.0.0.4_384_45149-g467037b
Firmware versions that do NOT have this problem:
Stock 3004_376_3626 (yes, very old, I think it's what my router came with)
john9527's fork 374.43_37EAj9527
DD-WRT Brainslayer build 01-10-2019-r38253
DD-WRT Kong build 2018-12-27
This is clearly a software, not a hardware issue.
I didn't try to bisect it and find exactly where it broke in the Merlin/Asus firmware. For now I'm using DD-WRT Kong, which seems stable with very good WiFi performance, but at some point I'd like to get back on Merlin.
Maybe @RMerlin can investigate and/or report it to Asus, since stock firmware is broken too.