
What can cause outgoing ssh to suddenly fail?


Pila

Regular Contributor
I have 3 separate networks that need unattended availability 24/7. For years now, they have been checking each other and the Internet connection in several ways and fixing all problems on their own.

lan.1 is the master; at a certain interval it connects by ssh to lan.2 and lan.3 and touches a file on each. This test is new and has been in use for 3 months. Regardless of this test, ssh is used into and out of each location hundreds of times every day. lan.1 and lan.2 are constantly connected by VPN (two-way), while lan.3 is only accessible by ssh to a nonstandard port using a key.
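The periodic check described above could look roughly like the following. This is a minimal sketch, not the actual script: the function names, the `host:port` target format, the key path and the marker path are all hypothetical. It uses only the `-p` and `-i` client options and sets no connection timeout, so a stalled connection would block it.

```shell
#!/bin/sh
# Heartbeat sketch: touch a marker file on each remote host over ssh.
# Hosts, ports, key path and marker path are hypothetical placeholders.

SSH_BIN="${SSH_BIN:-ssh}"   # ssh client to call; can be overridden for testing

# touch_remote HOST PORT KEY MARKER
# Returns the exit status of the ssh client (0 means the touch succeeded).
touch_remote() {
    host="$1"; port="$2"; key="$3"; marker="$4"
    "$SSH_BIN" -p "$port" -i "$key" "$host" "touch $marker"
}

# check_all KEY MARKER HOST:PORT [HOST:PORT ...]
# Prints one line per failed target and returns the number of failures.
check_all() {
    key="$1"; marker="$2"; shift 2
    failed=0
    for target in "$@"; do
        host="${target%:*}"      # part before the last colon
        port="${target##*:}"     # part after the last colon
        if ! touch_remote "$host" "$port" "$key" "$marker"; then
            echo "ssh heartbeat failed: $host port $port"
            failed=$((failed + 1))
        fi
    done
    return "$failed"
}
```

A call such as `check_all /jffs/keys/id_hypothetical /tmp/heartbeat lan2.example:22 lan3.example:2222` (all values invented for illustration) would then return non-zero when any target cannot be reached, which is the condition that raised the alarm here.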

What can cause outgoing ssh from the router at lan.1 towards the other two to stop working, after 3 months of working normally? The tests were repeated 10+ times over half an hour.

The only difference in the log at the time (repeated 200 times at 30-second intervals), which I cannot explain:

Code:
Sep 19 23:14:25 watchdog: start ddns.
Sep 19 23:14:25 rc_service: watchdog 445:notify_rc start_ddns
Sep 19 23:14:25 custom script: Running /jffs/scripts/ddns-start (args: 192.168.10.10)

ddns-start is my main management script. I ran it manually many times during the incident. It is what performs these ssh connections to the remote machines. It was working perfectly the whole time. Custom DDNS is the only router function related to it, and the one that runs it. But my complex ddns-start takes over and refuses to do anything unnecessary: it will not run two instances, and it clears any leftovers should an instance hang.
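The "no two instances, clean up leftovers" behaviour can be sketched with a lock directory plus a PID file. This is an assumption about one common way to do it, not the actual ddns-start code; `LOCK_DIR`, `acquire_lock` and `release_lock` are hypothetical names. It only clears locks whose owner process is gone. An instance that is hung but still running would need an additional age check, which is not shown.

```shell
#!/bin/sh
# Single-instance guard sketch. LOCK_DIR is a hypothetical location.

LOCK_DIR="${LOCK_DIR:-/tmp/ddns-start.lock}"

# acquire_lock: returns 0 if this process now holds the lock,
# 1 if another live instance holds it.
acquire_lock() {
    # mkdir is atomic: only one caller can create the directory.
    if mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "$$" > "$LOCK_DIR/pid"
        return 0
    fi
    # The lock exists: check whether its recorded owner is still alive.
    old_pid=$(cat "$LOCK_DIR/pid" 2>/dev/null)
    if [ -n "$old_pid" ] && kill -0 "$old_pid" 2>/dev/null; then
        return 1                  # a live instance holds the lock
    fi
    # Stale lock (owner gone or PID never written): clear it and retry once.
    # Note: there is a small race window right after another instance's mkdir.
    rm -rf "$LOCK_DIR"
    mkdir "$LOCK_DIR" 2>/dev/null || return 1
    echo "$$" > "$LOCK_DIR/pid"
    return 0
}

# release_lock: remove the lock when the work is done.
release_lock() {
    rm -rf "$LOCK_DIR"
}
```

A script would typically call `acquire_lock || exit 0` near the top and `release_lock` at the end, so that the frequent `start_ddns` triggers seen in the log above return immediately while an earlier run is still active.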

What was not an issue? The Internet was OK at all 3 locations. The VPNs were working. I was able to connect from lan.2 to both lan.1 (ssh via VPN) and lan.3 (ssh). Other local computers could ssh to the lan.1 router with no problems. So only the outgoing ssh from lan.1 had a problem. Free memory on the lan.1 router was at its regular level. None of my other tests noticed anything! Speedtest was normal.

Later I found out that less than 1 hour before the alarms were sent out to me, and before the log entries above, another user connected to the VPN there several times but was unable to connect to the .7 computer on that LAN. That never happened before, so it could be related. The funny thing is that the same .7 computer had no problems with ssh to the router during the incident!

As lan.1 must work, and half an hour of investigating turned up nothing, I rebooted the modem (not the router!) and everything went back to normal. Rebooting the modem obviously resets the DDNS (double NAT). Now, 6 days after the ssh incident, all is well and the router has been up for 91 days with 164 MB of free RAM and 18044 left in NVRAM.

The router at lan.1 is an Asus RT-AC68U with old firmware (380.59). Please do not suggest the firmware as the problem unless there is something related and precisely documented. I have an RT-AC66U_B1 on 380.70 and its ssh is VERY problematic. My other routers with 380.59 have not had any problems for the last 3 years.
 
Only if you think the 66U_B1 is causing the problem: you could try the same firmware on it as on your 68U. They are only named differently (which is irrelevant) but have used exactly the same firmware ever since it was published.
 

I prefer to adapt if possible :) 380.70 works very well and has some improvements (NTP server, nano, time, naming VPN clients). I have adapted to its quirks, and there are several. The worst is that occasionally the 66U_B1 on 380.70 refuses ssh connections for some time.

I did try the RT-AC68U_380.59 firmware on my first 66U_B1 (which died after a few days of use). This old firmware had some new problems on the B1 that do not exist on my 68 and 56 with the same firmware, so I decided against it quite fast.
 
