Alexander K
New Around Here
Hello everyone,
I have an esoteric problem with my WAN. Due to circumstances outside my control, the network equipment on the other end of my WAN connection is power-cycled 3 times a day (as a clumsy "maybe it will go away" solution to another problem). Unfortunately, when it happens, and the rack is powered up afterwards, my RT-AC68U running asus-merlin 3.0.0.4.378.51_0 fails to get its WAN interface up properly. Same thing happened with stock ASUS firmware. I suspect some race condition with how the switches get back up..
The symptom is that WAN is reported as being back up but the routes table is empty save for LAN entry.
If I wait for about a minute and go to WAN settings in the UI and click [Apply], it goes back to normal. If I don't, the WAN stays in this abnormal state for over an hour before correcting itself.
Log output, where N.N.N.N is the censored-out IP of my OpenVPN server (RT-AC68U is set to connect to it at all times), and repeated dnsmasq entries shortened:
After this, the routing table is left in this state:
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.20.0 * 255.255.255.0 U 0 0 0 LAN
Eventually, it recovers. No idea what triggers it, log:
If I manually do an [Apply] of WAN configuration, it looks like this:
I would like to have a workaround. At the simplest, since power cycles occur at predefined times, I could schedule a script to run to restart WAN at those times +1 minute.
However, my problem is: I don't know what to put in the script to trigger the same effect as applying the WAN setting, basically how to trigger a "proper" WAN restart.
Of course, a more intelligent solution is welcome.
Please note that since I do not control the WAN equipment, I cannot live-test it more than 1-3 times per day. Also, sorry in advance for noobishness.
I have an esoteric problem with my WAN. Due to circumstances outside my control, the network equipment on the other end of my WAN connection is power-cycled 3 times a day (as a clumsy "maybe it will go away" solution to another problem). Unfortunately, when it happens, and the rack is powered up afterwards, my RT-AC68U running asus-merlin 3.0.0.4.378.51_0 fails to get its WAN interface up properly. Same thing happened with stock ASUS firmware. I suspect some race condition with how the switches get back up..
The symptom is that WAN is reported as being back up but the routes table is empty save for LAN entry.
If I wait for about a minute and go to WAN settings in the UI and click [Apply], it goes back to normal. If I don't, the WAN stays in this abnormal state for over an hour before correcting itself.
Log output, where N.N.N.N is the censored-out IP of my OpenVPN server (RT-AC68U is set to connect to it at all times), and repeated dnsmasq entries shortened:
Apr 16 04:59:59 WAN Connection: Ethernet link down.
Apr 16 05:00:59 WAN Connection: Ethernet link up.
Apr 16 05:00:59 rc_service: wanduck 30514:notify_rc restart_wan_if 0
Apr 16 05:00:59 dnsmasq[30517]: read /etc/hosts - 5 addresses
Apr 16 05:00:59 dnsmasq[30517]: using nameserver 8.8.4.4#53 for domain local
Apr 16 05:00:59 dnsmasq[30517]: using nameserver 8.8.8.8#53 for domain local
Apr 16 05:00:59 dnsmasq[30517]: using nameserver 8.8.8.8#53
Apr 16 05:00:59 dnsmasq[30517]: using nameserver 8.8.4.4#53
Apr 16 05:01:00 kernel: Attempt to kill tasklet from interrupt
Apr 16 05:01:00 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:00 kernel: device eth0 left promiscuous mode
Apr 16 05:01:00 kernel: device eth0 entered promiscuous mode
Apr 16 05:01:00 kernel: br0: topology change detected, propagating
Apr 16 05:01:00 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:00 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:00 [same dnsmasq output]
Apr 16 05:01:00 [same dnsmasq output]
Apr 16 05:01:02 rc_service: udhcpc 14948:notify_rc start_firewall
Apr 16 05:01:02 [same dnsmasq output]
Apr 16 05:01:03 start_nat_rules: apply the nat_rules(/tmp/nat_rules_eth0_eth0)!
Apr 16 05:01:03 wan: finish adding multi routes
Apr 16 05:01:03 rc_service: udhcpc 14948:notify_rc stop_upnp
Apr 16 05:01:03 rc_service: waitting "start_firewall" via udhcpc ...
Apr 16 05:01:05 admin: sh /opt/S50asuslighttpd.1 firewall-start
Apr 16 05:01:05 admin: sh /opt/S50downloadmaster.1 firewall-start
Apr 16 05:01:06 rc_service: udhcpc 14948:notify_rc start_upnp
Apr 16 05:01:06 rc_service: waitting "stop_upnp" via udhcpc ...
Apr 16 05:01:06 WAN Connection: Ethernet link up.
Apr 16 05:01:06 rc_service: wanduck 30514:notify_rc restart_wan_if 0
Apr 16 05:01:07 [same dnsmasq output]
Apr 16 05:01:07 kernel: Attempt to kill tasklet from interrupt
Apr 16 05:01:07 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:07 kernel: device eth0 left promiscuous mode
Apr 16 05:01:07 kernel: device eth0 entered promiscuous mode
Apr 16 05:01:07 kernel: br0: topology change detected, propagating
Apr 16 05:01:07 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:07 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:09 rc_service: udhcpc 14948:notify_rc stop_vpnclient1
Apr 16 05:01:09 rc_service: udhcpc 14948:notify_rc start_vpnclient1
Apr 16 05:01:09 rc_service: waitting "stop_vpnclient1" via udhcpc ...
Apr 16 05:01:11 [same dnsmasq output]
Apr 16 05:01:12 rc_service: udhcpc 15209:notify_rc start_firewall
Apr 16 05:01:12 rc_service: waitting "stop_vpnclient1" via udhcpc ...
Apr 16 05:01:12 openvpn[6172]: /usr/sbin/ip route del 10.8.0.1/32
Apr 16 05:01:12 openvpn[6172]: /usr/sbin/ip route del N.N.N.N/32
Apr 16 05:01:12 openvpn[6172]: ERROR: Linux route delete command failed: external program exited with error status: 2
Apr 16 05:01:12 openvpn[6172]: /usr/sbin/ip route del 0.0.0.0/1
Apr 16 05:01:12 openvpn[6172]: /usr/sbin/ip route del 128.0.0.0/1
Apr 16 05:01:12 openvpn[6172]: Closing TUN/TAP interface
Apr 16 05:01:12 openvpn[6172]: /usr/sbin/ip addr del dev tun11 local 10.8.0.6 peer 10.8.0.5
Apr 16 05:01:12 openvpn[6172]: SIGTERM[hard,] received, process exiting
Apr 16 05:01:13 dhcp client: bound 192.168.2.103 via 192.168.2.1 during 6000 seconds.
Apr 16 05:01:14 WAN Connection: ISP's DHCP did not function properly.
Apr 16 05:01:14 kernel: tun: Universal TUN/TAP device driver, 1.6
Apr 16 05:01:14 kernel: tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
Apr 16 05:01:15 [same dnsmasq output]
Apr 16 05:01:15 openvpn[15265]: OpenVPN 2.3.6 arm-unknown-linux-gnu [SSL (OpenSSL)] [LZO] [EPOLL] [MH] [IPv6] built on Mar 6 2015
Apr 16 05:01:15 openvpn[15265]: library versions: OpenSSL 1.0.0q 15 Jan 2015, LZO 2.08
Apr 16 05:01:15 openvpn[15265]: Socket Buffers: R=[122880->131072] S=[122880->131072]
Apr 16 05:01:15 openvpn[15266]: UDPv4 link local: [undef]
Apr 16 05:01:15 openvpn[15266]: UDPv4 link remote: [AF_INET]N.N.N.N:1194
Apr 16 05:01:15 openvpn[15266]: write UDPv4: Network is unreachable (code=101)
Apr 16 05:00:59 WAN Connection: Ethernet link up.
Apr 16 05:00:59 rc_service: wanduck 30514:notify_rc restart_wan_if 0
Apr 16 05:00:59 dnsmasq[30517]: read /etc/hosts - 5 addresses
Apr 16 05:00:59 dnsmasq[30517]: using nameserver 8.8.4.4#53 for domain local
Apr 16 05:00:59 dnsmasq[30517]: using nameserver 8.8.8.8#53 for domain local
Apr 16 05:00:59 dnsmasq[30517]: using nameserver 8.8.8.8#53
Apr 16 05:00:59 dnsmasq[30517]: using nameserver 8.8.4.4#53
Apr 16 05:01:00 kernel: Attempt to kill tasklet from interrupt
Apr 16 05:01:00 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:00 kernel: device eth0 left promiscuous mode
Apr 16 05:01:00 kernel: device eth0 entered promiscuous mode
Apr 16 05:01:00 kernel: br0: topology change detected, propagating
Apr 16 05:01:00 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:00 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:00 [same dnsmasq output]
Apr 16 05:01:00 [same dnsmasq output]
Apr 16 05:01:02 rc_service: udhcpc 14948:notify_rc start_firewall
Apr 16 05:01:02 [same dnsmasq output]
Apr 16 05:01:03 start_nat_rules: apply the nat_rules(/tmp/nat_rules_eth0_eth0)!
Apr 16 05:01:03 wan: finish adding multi routes
Apr 16 05:01:03 rc_service: udhcpc 14948:notify_rc stop_upnp
Apr 16 05:01:03 rc_service: waitting "start_firewall" via udhcpc ...
Apr 16 05:01:05 admin: sh /opt/S50asuslighttpd.1 firewall-start
Apr 16 05:01:05 admin: sh /opt/S50downloadmaster.1 firewall-start
Apr 16 05:01:06 rc_service: udhcpc 14948:notify_rc start_upnp
Apr 16 05:01:06 rc_service: waitting "stop_upnp" via udhcpc ...
Apr 16 05:01:06 WAN Connection: Ethernet link up.
Apr 16 05:01:06 rc_service: wanduck 30514:notify_rc restart_wan_if 0
Apr 16 05:01:07 [same dnsmasq output]
Apr 16 05:01:07 kernel: Attempt to kill tasklet from interrupt
Apr 16 05:01:07 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:07 kernel: device eth0 left promiscuous mode
Apr 16 05:01:07 kernel: device eth0 entered promiscuous mode
Apr 16 05:01:07 kernel: br0: topology change detected, propagating
Apr 16 05:01:07 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:07 kernel: br0: port 1(vlan1) entering forwarding state
Apr 16 05:01:09 rc_service: udhcpc 14948:notify_rc stop_vpnclient1
Apr 16 05:01:09 rc_service: udhcpc 14948:notify_rc start_vpnclient1
Apr 16 05:01:09 rc_service: waitting "stop_vpnclient1" via udhcpc ...
Apr 16 05:01:11 [same dnsmasq output]
Apr 16 05:01:12 rc_service: udhcpc 15209:notify_rc start_firewall
Apr 16 05:01:12 rc_service: waitting "stop_vpnclient1" via udhcpc ...
Apr 16 05:01:12 openvpn[6172]: /usr/sbin/ip route del 10.8.0.1/32
Apr 16 05:01:12 openvpn[6172]: /usr/sbin/ip route del N.N.N.N/32
Apr 16 05:01:12 openvpn[6172]: ERROR: Linux route delete command failed: external program exited with error status: 2
Apr 16 05:01:12 openvpn[6172]: /usr/sbin/ip route del 0.0.0.0/1
Apr 16 05:01:12 openvpn[6172]: /usr/sbin/ip route del 128.0.0.0/1
Apr 16 05:01:12 openvpn[6172]: Closing TUN/TAP interface
Apr 16 05:01:12 openvpn[6172]: /usr/sbin/ip addr del dev tun11 local 10.8.0.6 peer 10.8.0.5
Apr 16 05:01:12 openvpn[6172]: SIGTERM[hard,] received, process exiting
Apr 16 05:01:13 dhcp client: bound 192.168.2.103 via 192.168.2.1 during 6000 seconds.
Apr 16 05:01:14 WAN Connection: ISP's DHCP did not function properly.
Apr 16 05:01:14 kernel: tun: Universal TUN/TAP device driver, 1.6
Apr 16 05:01:14 kernel: tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
Apr 16 05:01:15 [same dnsmasq output]
Apr 16 05:01:15 openvpn[15265]: OpenVPN 2.3.6 arm-unknown-linux-gnu [SSL (OpenSSL)] [LZO] [EPOLL] [MH] [IPv6] built on Mar 6 2015
Apr 16 05:01:15 openvpn[15265]: library versions: OpenSSL 1.0.0q 15 Jan 2015, LZO 2.08
Apr 16 05:01:15 openvpn[15265]: Socket Buffers: R=[122880->131072] S=[122880->131072]
Apr 16 05:01:15 openvpn[15266]: UDPv4 link local: [undef]
Apr 16 05:01:15 openvpn[15266]: UDPv4 link remote: [AF_INET]N.N.N.N:1194
Apr 16 05:01:15 openvpn[15266]: write UDPv4: Network is unreachable (code=101)
After this, the routing table is left in this state:
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.20.0 * 255.255.255.0 U 0 0 0 LAN
Eventually, it recovers. No idea what triggers it, log:
Apr 15 06:26:31 dnsmasq[30517]: read /etc/hosts - 5 addresses
Apr 15 06:26:31 dnsmasq[30517]: using nameserver 8.8.4.4#53 for domain local
Apr 15 06:26:31 dnsmasq[30517]: using nameserver 8.8.8.8#53 for domain local
Apr 15 06:26:31 dnsmasq[30517]: using nameserver 8.8.8.8#53
Apr 15 06:26:31 dnsmasq[30517]: using nameserver 8.8.4.4#53
Apr 15 06:26:31 rc_service: udhcpc 20915:notify_rc start_firewall
Apr 15 06:26:31 dnsmasq[30517]: read /etc/hosts - 5 addresses
Apr 15 06:26:31 [same dnsmasq output]
Apr 15 06:26:32 start_nat_rules: apply the nat_rules(/tmp/nat_rules_eth0_eth0)!
Apr 15 06:26:32 wan: finish adding multi routes
Apr 15 06:26:32 rc_service: udhcpc 20915:notify_rc stop_upnp
Apr 15 06:26:32 rc_service: waitting "start_firewall" via udhcpc ...
Apr 15 06:26:32 openvpn[19423]: TLS: Initial packet from [AF_INET]N.N.N.N:1194, sid=d5a087ec 7528d23b
Apr 15 06:26:33 WAN Connection: WAN was restored.
Apr 15 06:26:31 dnsmasq[30517]: using nameserver 8.8.4.4#53 for domain local
Apr 15 06:26:31 dnsmasq[30517]: using nameserver 8.8.8.8#53 for domain local
Apr 15 06:26:31 dnsmasq[30517]: using nameserver 8.8.8.8#53
Apr 15 06:26:31 dnsmasq[30517]: using nameserver 8.8.4.4#53
Apr 15 06:26:31 rc_service: udhcpc 20915:notify_rc start_firewall
Apr 15 06:26:31 dnsmasq[30517]: read /etc/hosts - 5 addresses
Apr 15 06:26:31 [same dnsmasq output]
Apr 15 06:26:32 start_nat_rules: apply the nat_rules(/tmp/nat_rules_eth0_eth0)!
Apr 15 06:26:32 wan: finish adding multi routes
Apr 15 06:26:32 rc_service: udhcpc 20915:notify_rc stop_upnp
Apr 15 06:26:32 rc_service: waitting "start_firewall" via udhcpc ...
Apr 15 06:26:32 openvpn[19423]: TLS: Initial packet from [AF_INET]N.N.N.N:1194, sid=d5a087ec 7528d23b
Apr 15 06:26:33 WAN Connection: WAN was restored.
If I manually do an [Apply] of WAN configuration, it looks like this:
Apr 15 21:01:11 start_nat_rules: apply the nat_rules(/tmp/nat_rules_eth0_eth0)!
Apr 15 21:01:11 wan: finish adding multi routes
Apr 15 21:01:11 rc_service: udhcpc 5183:notify_rc stop_upnp
Apr 15 21:01:11 rc_service: waitting "start_firewall" via udhcpc ...
Apr 15 21:01:12 WAN Connection: WAN was restored.
Normal routing table looks like this:Apr 15 21:01:11 wan: finish adding multi routes
Apr 15 21:01:11 rc_service: udhcpc 5183:notify_rc stop_upnp
Apr 15 21:01:11 rc_service: waitting "start_firewall" via udhcpc ...
Apr 15 21:01:12 WAN Connection: WAN was restored.
Destination Gateway Genmask Flags Metric Ref Use Iface
10.8.0.5 * 255.255.255.255 UH 0 0 0 tun11
10.8.0.1 10.8.0.5 255.255.255.255 UGH 0 0 0 tun11
192.168.2.1 * 255.255.255.255 UH 0 0 0 WAN
N.N.N.N 192.168.2.1 255.255.255.255 UGH 0 0 0 WAN
192.168.20.0 * 255.255.255.0 U 0 0 0 LAN
192.168.2.0 * 255.255.255.0 U 0 0 0 WAN
default 10.8.0.5 128.0.0.0 UG 0 0 0 tun11
128.0.0.0 10.8.0.5 128.0.0.0 UG 0 0 0 tun11
default 192.168.2.1 0.0.0.0 UG 0 0 0 WAN
10.8.0.5 * 255.255.255.255 UH 0 0 0 tun11
10.8.0.1 10.8.0.5 255.255.255.255 UGH 0 0 0 tun11
192.168.2.1 * 255.255.255.255 UH 0 0 0 WAN
N.N.N.N 192.168.2.1 255.255.255.255 UGH 0 0 0 WAN
192.168.20.0 * 255.255.255.0 U 0 0 0 LAN
192.168.2.0 * 255.255.255.0 U 0 0 0 WAN
default 10.8.0.5 128.0.0.0 UG 0 0 0 tun11
128.0.0.0 10.8.0.5 128.0.0.0 UG 0 0 0 tun11
default 192.168.2.1 0.0.0.0 UG 0 0 0 WAN
I would like to have a workaround. At the simplest, since power cycles occur at predefined times, I could schedule a script to run to restart WAN at those times +1 minute.
However, my problem is: I don't know what to put in the script to trigger the same effect as applying the WAN setting, basically how to trigger a "proper" WAN restart.
Of course, a more intelligent solution is welcome.
Please note that since I do not control the WAN equipment, I cannot live-test it more than 1-3 times per day. Also, sorry in advance for noobishness.
Last edited: