Mysterious loss of connectivity

Lynx
I have the following setup:
ASUS RT-AX86U (192.168.1.1) -> Huawei 4G modem (192.168.8.1)
On the Asus router I have set up NordVPN via OpenVPN.
After a few days (weeks?) of connectivity something eventually breaks and I lose internet connectivity until I reboot my router, which restores connectivity.
I am not sure whether this is caused by the VPN breaking, but I think the VPN client page shows 'Service state': OFF with public IP: 0.0.0.0. When I click the radio button in the GUI to set 'Service state' to ON, the following ovpn-client entries appear in syslog:
Jul 8 08:42:38 ovpn-client1[25257]: TCP/UDP: Preserving recently used remote address: [AF_INET]178.239.162.243:1194
Jul 8 08:42:38 ovpn-client1[25257]: Socket Buffers: R=[524288->524288] S=[524288->524288]
Jul 8 08:42:38 ovpn-client1[25257]: UDP link local: (not bound)
Jul 8 08:42:38 ovpn-client1[25257]: UDP link remote: [AF_INET]178.239.162.243:1194
Jul 8 08:43:38 ovpn-client1[25257]: TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
Jul 8 08:43:38 ovpn-client1[25257]: TLS Error: TLS handshake failed
Jul 8 08:43:38 ovpn-client1[25257]: SIGUSR1[soft,tls-error] received, process restarting
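For anyone reading along, the entries above can be summarised with a small shell filter. This is only a sketch: the function name tls_failure_summary is made up for illustration, and the match strings are copied from the excerpt above, so other OpenVPN builds may word these messages differently.

```shell
#!/bin/sh
# tls_failure_summary: read syslog text on stdin and report how many
# OpenVPN TLS handshakes failed and which remote endpoint was last tried.
# (Hypothetical helper; the patterns come from the log excerpt in this post.)
tls_failure_summary() {
    awk '
        /TLS Error: TLS handshake failed/ { fails++ }
        /UDP link remote: \[AF_INET\]/ {
            # Keep only the text after "[AF_INET]" as the remote endpoint.
            sub(/.*\[AF_INET\]/, "")
            remote = $0
        }
        END {
            printf "failures=%d remote=%s\n", fails, (remote == "" ? "unknown" : remote)
        }
    '
}
```

It would be used as `tls_failure_summary < syslog.log`, with the path adjusted to wherever the router keeps its log. Repeated failures against the same remote point at that one server or the path to it, which is worth knowing before changing any settings.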
In case this is relevant, I think I am in a kind of double NAT scenario because my 4G ISP employs CG-NAT. My ISP assigns my router a private WAN address in the range:
10.0.0.0 - 10.255.255.255.
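As a quick way to tell what kind of WAN address the router has been given, here is a minimal sketch. The function name wan_addr_class is hypothetical. It relies on two well-known ranges: the RFC 1918 private blocks (including 10.0.0.0/8, as here) and the 100.64.0.0/10 shared address space that RFC 6598 reserves for carrier-grade NAT.

```shell
#!/bin/sh
# wan_addr_class ADDR: classify a dotted-quad IPv4 address as
# "private" (RFC 1918), "cgnat-shared" (RFC 6598) or "public".
# (Hypothetical helper for illustration; no input validation.)
wan_addr_class() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    a=$1
    b=$2
    if [ "$a" -eq 10 ]; then
        echo private
    elif [ "$a" -eq 172 ] && [ "$b" -ge 16 ] && [ "$b" -le 31 ]; then
        echo private
    elif [ "$a" -eq 192 ] && [ "$b" -eq 168 ]; then
        echo private
    elif [ "$a" -eq 100 ] && [ "$b" -ge 64 ] && [ "$b" -le 127 ]; then
        echo cgnat-shared
    else
        echo public
    fi
}
```

A "private" or "cgnat-shared" result on the WAN side means the ISP is translating addresses upstream, so unsolicited inbound connections will not reach the router, while outbound connections such as an OpenVPN client session are normally unaffected.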
Without VPN, when I traceroute to 8.8.8.8 it goes via ISP's internal addresses and then external.
With VPN, I get something like:
traceroute to 8.8.8.8
10.8.1.1 (next hop, which is presumably the ISP gateway)
IP of NordVPN
etc.
Any idea what my problem might be? Could there be a DHCP issue or something?
If this happens again how can I diagnose when I ssh in? This happened sometime last night and I obtained syslog files, but I don't think they go back far enough to show the issue or, if they do, I am not sure what to look for.
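One practical answer to the last question is to save the evidence before rebooting. The sketch below is an assumption-laden example rather than a documented procedure: the function names are made up, the thread names the files syslog.log and syslog.log-1 but not their directory, and the choice of destination is up to you.

```shell
#!/bin/sh
# save_logs SRC_DIR DEST_DIR
# Copy syslog.log and its rotated siblings (e.g. syslog.log-1) from SRC_DIR
# into a new timestamped folder under DEST_DIR and print that folder's path.
# (Hypothetical helper; both directories are supplied by the caller.)
save_logs() {
    src=$1
    dest=$2/netdiag-$(date +%Y%m%d-%H%M%S)
    mkdir -p "$dest" || return 1
    for f in "$src"/syslog.log*; do
        [ -f "$f" ] && cp "$f" "$dest"/
    done
    echo "$dest"
}

# snapshot_net DIR
# Record interface and routing state with the same two commands whose
# output is posted later in this thread.
snapshot_net() {
    ip addr show > "$1/ip-addr.txt" 2>&1
    ip route show table all > "$1/ip-route.txt" 2>&1
}
```

Over ssh this might be run as `dir=$(save_logs /tmp /jffs) && snapshot_net "$dir"`, where /tmp as the log location and /jffs as storage that survives a reboot are both assumptions about the firmware. Doing this before the reboot keeps the log lines from the moment of failure, which is exactly what goes missing later in this thread.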
 
Thank you for your response ColinTaylor. I did capture the complete syslog as soon as I identified the issue this morning - here it is:
syslog.log:
syslog.log-1:
Is there anything present? Or is the problem that the syslog does not go back far enough?
 
Thanks. There's nothing to indicate any problems in the logs, but as you say they don't go back to when the problem first occurred. There's also nothing suggesting there's an issue with internet connectivity generally when you tried to reconnect.

My first guess would be that it's just an issue with that particular NordVPN server becoming unavailable for a short period of time; it happens. You could try configuring the NordVPN desktop client to connect to the same server as the router (UK #2220) so that the next time the router can't connect you can see whether the desktop client has the same problem.
 
Thank you. For the next time that my connection goes into this bad state, are there any diagnostic tests you think I should run? Perhaps traceroute 8.8.8.8 and anything else? When it goes into this state it seems that a reboot of my router fixes it. I wonder if it is not VPN related and rather related to my specific setup of having:
Asus RT-AX86U (192.168.1.1) -> Huawei 4G router in bridge mode (192.168.8.1).
If my ISP assigns my Asus RT-AX86U a local IP address, could it be that that goes 'stale' after a while, e.g. if the same address is assigned to another user? The fact that reboot fixes the issue makes me wonder if it is something DHCP related.
Or could it simply be to do with WAN IP changing as the ISP assigns the Asus router a different IP? Would the OpenVPN session handle a change in WAN IP properly?
I tried just now entering:
/sbin/service restart_wan
And I lost internet connectivity.
Does this syslog reveal anything useful:
This seems significant:
Jul 8 15:20:55 WAN_Connection: ISP's DHCP did not function properly.
 
Thank you. For the next time that my connection goes into this bad state, are there any diagnostic tests you think I should run?
Just check that you can browse the internet as normal, assuming you haven't configured the VPN client to block internet access if the tunnel goes down.

If my ISP assigns my Asus RT-AX86U a local IP address, could it be that that goes 'stale' after a while, e.g. if the same address is assigned to another user? The fact that reboot fixes the issue makes me wonder if it is something DHCP related.
Or could it simply be to do with WAN IP changing as the ISP assigns the Asus router a different IP? Would the OpenVPN session handle a change in WAN IP properly?
None of that should be a problem.

I tried just now entering:
/sbin/service restart_wan
And I lost internet connectivity.
Does this syslog reveal anything useful:
I can't access that file.

This seems significant:
Jul 8 15:20:55 WAN_Connection: ISP's DHCP did not function properly.
That's normal. It will be followed later by "WAN_Connection: WAN was restored."
 
Sorry link is here:
Could it be related to this:
There users also describe problems with being behind LTE 4G modem and needing to enable DHCP continuous mode?
 
Sorry link is here:
So is there a problem here? I can see you restarting the WAN interface, which breaks the VPN tunnel connection. I can also see that, as intended, the VPN client waits 180 seconds before determining that the tunnel is down and then attempts to reconnect. The first attempt appears to be unsuccessful, so it tries again 5 seconds later and that appears to succeed.

Could it be related to this:
There users also describe problems with being behind LTE 4G modem and needing to enable DHCP continuous mode?
That's not relevant. You don't have a problem with DHCP.
 
OK, perhaps not. I am just trying to determine what is causing my loss of internet connectivity roughly once a week.
 
Do you have the VPN client configured to block internet access if the tunnel goes down?
 
Do you have the VPN client configured to block internet access if the tunnel goes down?
I have 'Force Internet traffic through tunnel' set to Yes. I presume that this means the answer is yes? I believe that option forces everything on the router to go through the VPN tunnel, including DNS queries.
Mindful of your helpful suggestion:
My first guess would be that it's just an issue with that particular NordVPN server becoming unavailable for a short period of time; it happens.
should I perhaps set 'Connection retry attempts' from 15 to 0 for infinite? Would the short downtime be enough to result in all 15 connection retry attempts failing, resulting in giving up, and 'Force internet traffic through tunnel' then blocking internet access?
Perhaps this explains everything.
 
I have: 'Force Internet traffic through tunnel' to: Yes - I presume that has this effect?
I don't think so. IIRC internet access will only be blocked if you're using policy rules. Just make sure you're not dependent on NordVPN DNS servers.


Should I perhaps set 'Connection retry attempts' from 15 to 0 for infinite?
If you can capture the syslog at the time the problem occurs that should indicate whether something needs to be changed.
 
What does 'Force Internet traffic through tunnel: YES' do in that case? I thought it set up the primary routing table so that everything goes through the VPN, whereas PBR keeps the default routing table on the WAN and puts the VPN in a secondary table, which makes leaks more likely?
Any idea how much time would elapse between those connection retry attempts?
 
I managed to recreate (or perhaps hurry along) the issue by restarting my 4G modem in bridge mode. I lost connectivity until I rebooted the ASUS router. Unfortunately I managed to overwrite the syslog before posting it here. But does that tell us anything in and of itself? My gut feeling is that this has something to do with the WAN DHCP on my ASUS router. Frustratingly, I have now rebooted the modem several more times and cannot recreate the issue. So it presumably only happens when the modem reboots after a long uptime (the modem auto reboots every now and then)? I can try rebooting my modem tomorrow. Perhaps that will recreate this bad state.
Or could it be to do with the VPN not being able to get set up properly after a reboot of the modem and a change of WAN IP?
Should these OpenVPN settings (resulting from NordVPN config file for server I connect to) be able to tolerate my ISP altering the local WAN IP of my modem on its network (CG-NAT I think):
resolv-retry infinite
remote-random
tun-mtu 1500
tun-mtu-extra 32
mssfix 1450
ping 15
ping-restart 0
ping-timer-rem
remote-cert-tls server
pull
fast-io
cipher AES-256-CBC
route 192.168.8.1 255.255.255.255 net_gateway
The last line is just so that I can access my modem's GUI from my LAN. In the weird state I cannot access my modem's GUI, which I presume means the VPN tunnel does not get properly created. And I get the above-posted VPN errors.
 
Today I once again lost internet connectivity (despite no red light on the router and the GUI showing an assigned WAN IP), this time during a videoconference call with a business client - argh!
Here is the full syslog which covers the loss of internet connection:
Seems to me that the relevant portion (right before I force reboot router) is as follows:
Jul 16 18:08:11 kernel: eth0 (Int switch port: 3) (Logical Port: 3) (phyId: c) Link DOWN.
Jul 16 18:08:12 WAN_Connection: ISP's DHCP did not function properly.
Jul 16 18:08:15 kernel: eth0 (Int switch port: 3) (Logical Port: 3) (phyId: c) Link Up at 1000 mbps full duplex
Jul 16 18:08:18 WAN_Connection: WAN(0) link up.
Jul 16 18:08:18 rc_service: wanduck 1119:notify_rc restart_wan_if 0
Jul 16 18:08:18 lldpd[1128]: removal request for address of 10.241.204.243%11, but no knowledge of it
Modem log is as follows. Seems to me that this loss in internet connectivity coincides with my modem dropping connection and re-obtaining connection every 48 hours?
2021-07-16 18:08:10 SystemNotice
WAN connection INTERNET_R_UMTS1:IPv4 connected
2021-07-16 18:08:10 SecurityWarning
Detect UDP port scan attack, and the attack has been blocked, scan packet from 195.206.183.101
2021-07-16 18:08:09 SystemNotice
WAN connection INTERNET_R_UMTS1:IPv4 disconnected
2021-07-14 18:08:10 SystemNotice
WAN connection INTERNET_R_UMTS1:IPv4 connected
2021-07-14 18:08:08 SystemNotice
WAN connection INTERNET_R_UMTS1:IPv4 disconnected
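To check the 48-hour pattern against a longer modem log, the gap between consecutive disconnects can be computed with a short filter. This is a sketch under stated assumptions: disconnect_gaps is a made-up name, the input is assumed to alternate timestamp lines and message lines exactly as in the excerpt above, and `date -d` with a "YYYY-MM-DD hh:mm:ss" argument is assumed to be available (GNU and BusyBox date accept it; other implementations may not).

```shell
#!/bin/sh
# disconnect_gaps: read the modem log on stdin and print the whole hours
# between consecutive "IPv4 disconnected" events.
# (Hypothetical helper; log layout taken from the excerpt in this post.)
disconnect_gaps() {
    awk '
        # A timestamp line looks like "2021-07-16 18:08:09 SystemNotice".
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / { ts = $1 " " $2; next }
        # The message itself follows on the next line.
        /IPv4 disconnected/ { print ts }
    ' | sort | {
        prev=""
        while read -r stamp; do
            # Convert to seconds since the epoch; skip lines date cannot parse.
            now=$(date -u -d "$stamp" +%s) || continue
            if [ -n "$prev" ]; then
                echo "$(( (now - prev) / 3600 ))h between disconnects ending $stamp"
            fi
            prev=$now
        done
    }
}
```

With the eight lines above as input it prints `48h between disconnects ending 2021-07-16 18:08:09`. A steady 48h across many entries would suggest a scheduled session reset on the modem or the mobile network side rather than a random fault.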
So does this mean that my WAN lease is 48 hours and the Asus-Router is not properly picking up the new WAN IP? Does this mean I am using a stale WAN IP right at the moment? In case it is relevant, under WAN internet status at the moment it states that my WAN IP is: 10.241.204.243 [just internal IP on ISP network] and that lease expires in 21 hours 23 mins. Time now is 20:45.
Also, see:
admin@RT-AX86U-4168:/tmp/home/root# ip addr show
1: lo: <LOOPBACK,MULTICAST,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
valid_lft forever preferred_lft forever
inet 127.0.1.1/8 brd 127.255.255.255 scope host secondary lo:0
valid_lft forever preferred_lft forever
2: ifb0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 32
link/ether ee:f3:e5:6f:8d:3a brd ff:ff:ff:ff:ff:ff
3: ifb1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 32
link/ether 2e:c9:ac:db:1e:c6 brd ff:ff:ff:ff:ff:ff
4: imq0: <NOARP> mtu 16000 qdisc noop state DOWN group default qlen 11000
link/void
5: imq1: <NOARP> mtu 16000 qdisc noop state DOWN group default qlen 11000
link/void
6: imq2: <NOARP> mtu 16000 qdisc noop state DOWN group default qlen 11000
link/void
7: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default
link/sit 0.0.0.0 brd 0.0.0.0
8: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN group default
link/tunnel6 :: brd ::
9: ip6gre0@NONE: <NOARP> mtu 1448 qdisc noop state DOWN group default
link/gre6 :: brd ::
10: bcmsw: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether f0:2f:74:92:41:68 brd ff:ff:ff:ff:ff:ff
11: eth0: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether f0:2f:74:92:41:68 brd ff:ff:ff:ff:ff:ff
inet 10.241.204.243/8 brd 10.255.255.255 scope global eth0
valid_lft forever preferred_lft forever
12: eth1: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP group default qlen 1000
link/ether f0:2f:74:92:41:68 brd ff:ff:ff:ff:ff:ff
13: eth2: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP group default qlen 1000
link/ether f0:2f:74:92:41:68 brd ff:ff:ff:ff:ff:ff
14: eth3: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP group default qlen 1000
link/ether f0:2f:74:92:41:68 brd ff:ff:ff:ff:ff:ff
15: eth4: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP group default qlen 1000
link/ether f0:2f:74:92:41:68 brd ff:ff:ff:ff:ff:ff
16: eth5: <NO-CARRIER,BROADCAST,MULTICAST,ALLMULTI,UP> mtu 1500 qdisc pfifo_fast master br0 state DOWN group default qlen 1000
link/ether f0:2f:74:92:41:68 brd ff:ff:ff:ff:ff:ff
17: bcmswlpbk0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether f0:2f:74:92:41:68 brd ff:ff:ff:ff:ff:ff
18: spu_us_dummy: <NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100
link/none 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
19: spu_ds_dummy: <NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100
link/none 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
20: dpsta: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
21: eth6: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP group default qlen 1000
link/ether f0:2f:74:92:41:68 brd ff:ff:ff:ff:ff:ff
22: eth7: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP group default qlen 1000
link/ether f0:2f:74:92:41:6c brd ff:ff:ff:ff:ff:ff
23: br0: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether f0:2f:74:92:41:68 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.1/24 brd 192.168.1.255 scope global br0
valid_lft forever preferred_lft forever
admin@RT-AX86U-4168:/tmp/home/root# ip route show table all
default via 10.0.0.1 dev eth0
10.0.0.0/8 dev eth0 proto kernel scope link src 10.241.204.243
10.0.0.1 dev eth0 proto kernel scope link
127.0.0.0/8 dev lo scope link
192.168.1.0/24 dev br0 proto kernel scope link src 192.168.1.1
239.0.0.0/8 dev br0 scope link
broadcast 10.0.0.0 dev eth0 table local proto kernel scope link src 10.241.204.243
local 10.241.204.243 dev eth0 table local proto kernel scope host src 10.241.204.243
broadcast 10.255.255.255 dev eth0 table local proto kernel scope link src 10.241.204.243
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1
local 127.0.1.1 dev lo table local proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1
broadcast 192.168.1.0 dev br0 table local proto kernel scope link src 192.168.1.1
local 192.168.1.1 dev br0 table local proto kernel scope host src 192.168.1.1
broadcast 192.168.1.255 dev br0 table local proto kernel scope link src 192.168.1.1
unreachable default dev lo proto kernel metric 4294967295 error 4294967195 pref medium
unreachable default dev lo proto kernel metric 4294967295 error 4294967195 pref medium
multicast 239.255.255.250/32 from 192.168.1.74/32 table default proto 17
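The two listings above contain no tunnel interface and no route through one, which suggests the VPN tunnel was not up when they were captured. A minimal sketch for checking that automatically follows. The function name vpn_route_state is hypothetical, and it assumes the OpenVPN client's interface is named tun followed by a number, which is not confirmed anywhere in this thread.

```shell
#!/bin/sh
# vpn_route_state: read "ip route show table all" output on stdin and say
# whether any route uses a tunN device.
# (Hypothetical helper; the tunN interface naming is an assumption.)
vpn_route_state() {
    if grep -Eq '(^| )dev tun[0-9]+( |$)'; then
        echo "tunnel routes present"
    else
        echo "no tunnel routes"
    fi
}
```

It would be run as `ip route show table all | vpn_route_state`. A result of "no tunnel routes" while the GUI still reports a WAN IP would point at the VPN client not having come back, rather than at the WAN lease itself.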
 
Yes - seems to do this every 48 hours. Is this because the WAN lease expires from ISP or something? And then Asus router isn't properly updating on expired WAN IP? Why do I lose internet connectivity then until I reboot Asus router?
What is the significance of:
Jul 16 18:08:18 lldpd[1128]: removal request for address of 10.241.204.243%11, but no knowledge of it
Is this message from my modem relevant (or just a red herring?):
2021-07-16 18:08:10 SystemNotice
WAN connection INTERNET_R_UMTS1:IPv4 connected
2021-07-16 18:08:10 SecurityWarning
Detect UDP port scan attack, and the attack has been blocked, scan packet from 195.206.183.101
2021-07-16 18:08:09 SystemNotice
WAN connection INTERNET_R_UMTS1:IPv4 disconnected
2021-07-14 18:08:10 SystemNotice
WAN connection INTERNET_R_UMTS1:IPv4 connected
2021-07-14 18:08:08 SystemNotice
WAN connection INTERNET_R_UMTS1:IPv4 disconnected
Very confusing to me: 195.206.183.101 is the IP of my VPN (NordVPN). How would that get through CG-NAT? What is that?
... could this be to do with the VPN just pinging, my modem blocking it, the ping failing and the VPN getting screwed up? This is all just speculation. I hope the logs above reveal the true source of the problem.
If this is related to the problem, could the fix be as simple as disabling the firewall on the modem?
Hope some clever person here can make sense of all of this.
 
The lldpd messages can be ignored.

CG-NAT is not a problem for the VPN client.

The problem does indeed appear to be triggered by your modem resetting the link, which is not unusual. But the connection comes back a few seconds later so that's not a problem.

I would expect to see more processes (like the VPN) being restarted after the WAN comes back up. As it is, it restarts dnsmasq and then seems to pause. Unfortunately, you rebooted the router 28 seconds later, so we don't know whether it would have eventually recovered.

However, I do notice that dnsmasq is still trying to use the NordVPN DNS servers even though they're likely unreachable at this point. That might be the problem. As an experiment, I suggest that you stop using NordVPN's servers and use public servers such as Google's instead, to see whether that allows the router to recover the next time it happens.
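To see which DNS servers the router is actually using when it is in the bad state, something like the following could help. It is a sketch only: both function names are made up, the input is assumed to be a resolv.conf-style file ("nameserver IP" lines) or a dnsmasq-style file ("server=IP" lines), and the exit status of nslookup is assumed to reflect whether the query succeeded, which can vary between implementations.

```shell
#!/bin/sh
# list_dns_servers: read resolver configuration on stdin and print one
# upstream server address per line.
# (Hypothetical helper; handles "nameserver IP" and plain "server=IP" lines.)
list_dns_servers() {
    awk '
        $1 == "nameserver" { print $2 }
        /^server=/ { sub(/^server=/, ""); print }
    '
}

# probe_dns: read server addresses on stdin and try one lookup against each.
# Assumes nslookup returns non-zero when the server does not answer.
probe_dns() {
    while read -r srv; do
        if nslookup example.com "$srv" > /dev/null 2>&1; then
            echo "$srv ok"
        else
            echo "$srv FAILED"
        fi
    done
}
```

For example, `list_dns_servers < /tmp/resolv.conf | probe_dns`, where the file path is an assumption about this firmware. If only the VPN provider's servers fail while 8.8.8.8 answers, that supports the DNS explanation above.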
 
