Router not getting internet back after WAN drop

ChkWAN helped on my end as well! I've been using it for several days and immediately noticed results. I had an issue where 70-80% of WAN IP renewals got stuck, while the other 20-30% somehow succeeded. After multiple firmware versions and 2 months of testing, I have finally stopped checking network logs etc.

I'm using a 1-minute cron job:
Code:
cru a ChkWAN "*/1 * * * * /jffs/scripts/ChkWAN.sh nowait wan quiet forcesmall curlrate=100 ping=1.1.1.1,8.8.8.8 tries=2 fails=2"

If only there were a way to put the script into a completely quiet mode, with no log entries except WAN failures...

Besides that, I set up a HealthCheck cron job that runs every minute with a grace period of 5 minutes. It sends a ping to healthchecks.io, and if a ping is missed, the site notifies me via SMS.
I can finally set up my router the way I want (scripts, QoS, etc.) :)
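
The healthchecks.io heartbeat described above could look roughly like the sketch below. It assumes curl is available on the router. The check UUID is a placeholder, and the helper names (`hc_ping_url`, `hc_ping`) are made up for illustration rather than taken from any existing script.

```shell
#!/bin/sh
# Heartbeat sketch for healthchecks.io (hypothetical helper names).
# HC_UUID is a placeholder: substitute the UUID of your own check.
HC_UUID="${HC_UUID:-your-check-uuid}"

# Build the ping URL for a given check UUID.
hc_ping_url() {
    printf 'https://hc-ping.com/%s\n' "$1"
}

# Send one heartbeat: fail on HTTP errors, time out after 10 seconds,
# retry up to 3 times, and discard the response body.
hc_ping() {
    curl -fsS -m 10 --retry 3 -o /dev/null "$(hc_ping_url "$1")"
}
```

Saved as a script that ends with `hc_ping "$HC_UUID"`, this could be scheduled every minute with `cru a`, in the same way as the ChkWAN job. The 5-minute grace period is configured on the healthchecks.io side, not in the script.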
 
When implementing a monitoring script, it is usually prudent to include at least one message to Syslog confirming that the script actually ran at the designated time; otherwise, how could you be sure that the (cron) schedule/monitoring is in place? Some unfortunate posters have reported that their cron jobs have, seemingly mysteriously, gone AWOL.
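
As a complement to that Syslog trace, a small check can confirm that the cron job itself is still registered. This is only a sketch: it assumes `cru l` prints one job per line with the job id wrapped in `#...#`, and the helper names and the `CronCheck` log tag are hypothetical.

```shell
#!/bin/sh
# Sketch: confirm the ChkWAN cron job is still registered (hypothetical helpers).
# Assumes `cru l` prints one job per line containing "#<job id>#".

# Read a crontab listing on stdin; succeed if the named job id is present.
cron_job_present() {
    grep -q "#$1#"
}

# Write the result to syslog so the check itself leaves a trace.
check_chkwan_job() {
    if cru l | cron_job_present ChkWAN; then
        logger -t CronCheck "ChkWAN cron job is registered"
    else
        logger -t CronCheck "ChkWAN cron job is MISSING"
    fi
}
```

Calling `check_chkwan_job` from a separate, less frequent cron entry would give an independent record of whether the schedule is still in place.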

Installing scribe (syslog-ng) from amtm allows 'noise' to be easily eliminated from the log while retaining the ability to track the execution of the monitoring.

e.g. Set up scribe to send ALL ChkWAN.sh messages to '/opt/var/log/chkwan.log':

Create '/opt/etc/syslog-ng.d/ChkWAN':
Code:
#  Filter ChkWAN script messages

destination d_chkwan      { file("/opt/var/log/chkwan.log")        ; };

filter f_chkwan           { program("ChkWAN.sh")                   ; };

log { source(src); filter(f_chkwan)      ; destination(d_chkwan)   ; flags(final); };
Then restart scribe:
Code:
scribe
                            _
                         _ ( )          
       ___    ___  _ __ (_)| |_      __ 
     /',__) /'___)( '__)| || '_`\  /'__`\
     \__, \( (___ | |   | || |_) )(  ___/
     (____/`\____)(_)   (_)(_,__/'`\____)
     syslog-ng and logrotate installation
     v2.4_3 (master)  Coded by cynicastic

=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

     s.    Show scribe status
     rl.   Reload syslog-ng.conf
     lr.   Run logrotate now
     rs.   Restart syslog-ng
     st.   Stop syslog-ng & logrotate cron

     u.    Update scribe
     uf.   Update filters
     su.   scribe utilities
     e.    Exit scribe

     is.   Reinstall scribe
     zs.   Remove scribe

=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

Please select an option: rs

Restarting syslog-ng...
Shutting down syslog-ng...              done.
Starting syslog-ng...              done.

      checking syslog-ng daemon ... alive.
 
Hi Martineau, thank you for your reply. Scribe is the next thing on my list. I'm taking baby steps since I am (or was) a complete noob with scripts, the CLI, and Linux commands in general. It took me a few days to set all this up.


I have no intention of hijacking this thread, so I hope the OP won't mind me sharing my experience with this issue.
I have mentioned two situations: one where my router cannot refresh its WAN IP, and rarer instances where the WAN interface apparently "restarts" itself and manages to refresh the WAN IP.

Here are two recent examples:

AC86U - failed to request a new WAN IP. Without the ChkWAN script, the internet would stay down until I manually toggled the WAN switch in the Asus GUI:
Code:
May  7 01:43:02 (ChkWAN.sh): 11863 v1.17 Monitoring WAN connection using 2 target PING hosts (1.1.1.1 8.8.8.8) (Tries=2)
May  7 01:43:02 (ChkWAN.sh): 11863 cURL 500Byte transfer took: 00:00.22 secs @ 2262 B/sec
May  7 01:44:02 (ChkWAN.sh): 12150 v1.17 Monitoring WAN connection using 2 target PING hosts (1.1.1.1 8.8.8.8) (Tries=2)
May  7 01:44:02 (ChkWAN.sh): 12150 cURL 500Byte transfer took: 00:00.26 secs @ 1908 B/sec

May  7 01:44:12 kernel: eth0 (Int switch port: 3) (Logical Port: 3) Link DOWN.
May  7 01:44:15 kernel: eth0 (Int switch port: 3) (Logical Port: 3) Link UP 1000 mbps full duplex

May  7 01:45:02 (ChkWAN.sh): 12438 v1.17 Monitoring WAN connection using 2 target PING hosts (1.1.1.1 8.8.8.8) (Tries=2)
May  7 01:45:48 (ChkWAN.sh): 12438 ***ERROR cURL 'https://raw.githubusercontent.com/MartineauUK/Chk-WAN/master/Test500B.txt' transfer FAILED RC=6
May  7 01:45:48 (ChkWAN.sh): 12438 v1.17 Monitoring WAN connection cURL data retrieval check FAILED
May  7 01:46:34 (ChkWAN.sh): 12438 ***ERROR cURL 'https://raw.githubusercontent.com/MartineauUK/Chk-WAN/master/Test500B.txt' transfer FAILED RC=6
May  7 01:46:34 (ChkWAN.sh): 12438 v1.17 Monitoring WAN connection cURL data retrieval check FAILED
May  7 01:47:31 (ChkWAN.sh): 12438 ***ERROR cURL 'https://raw.githubusercontent.com/MartineauUK/Chk-WAN/master/Test500B.txt' transfer FAILED RC=6
May  7 01:47:31 (ChkWAN.sh): 12438 v1.17 Monitoring WAN connection cURL data retrieval check FAILED
May  7 01:48:17 (ChkWAN.sh): 12438 ***ERROR cURL 'https://raw.githubusercontent.com/MartineauUK/Chk-WAN/master/Test500B.txt' transfer FAILED RC=6
May  7 01:48:17 (ChkWAN.sh): 12438 v1.17 Monitoring WAN connection cURL data retrieval check FAILED

May  7 01:48:17 (ChkWAN.sh): 12438 Renewing DHCP and restarting WAN (Action=WANONLY)

May  7 01:48:27 rc_service: service 13128:notify_rc restart_wan
May  7 01:48:27 custom_script: Running /jffs/scripts/service-event (args: restart wan)
May  7 01:48:27 miniupnpd[9798]: shutting down MiniUPnPd
May  7 01:48:31 miniupnpd[13244]: HTTP listening on port 45974
May  7 01:48:31 miniupnpd[13244]: Listening for NAT-PMP/PCP traffic on port 5351
May  7 01:48:32 WAN_Connection: ISP's DHCP did not function properly.
May  7 01:48:32 DualWAN: skip single wan wan_led_control - WANRED off
May  7 01:48:32 custom_script: Running /jffs/scripts/firewall-start (args: eth0)
May  7 01:48:32 wan: finish adding multi routes
May  7 01:48:32 YazFi: Firewall restarted - sleeping 30s before running YazFi
May  7 01:48:34 miniupnpd[13244]: shutting down MiniUPnPd
May  7 01:48:34 miniupnpd[13368]: HTTP listening on port 51369
May  7 01:48:34 miniupnpd[13368]: Listening for NAT-PMP/PCP traffic on port 5351

May  7 01:48:37 WAN_Connection: WAN was restored.


AC86U - successfully requested a WAN IP (you can ignore the ChkWAN entries; this is the case where the script is not needed):
Code:
May  7 06:15:02 (ChkWAN.sh): 27609 v1.17 Monitoring WAN connection using 2 target PING hosts (1.1.1.1 8.8.8.8) (Tries=2)
May  7 06:15:02 (ChkWAN.sh): 27609 cURL 500Byte transfer took: 00:00.20 secs @ 2538 B/sec

May  7 06:15:06 kernel: eth0 (Int switch port: 3) (Logical Port: 3) Link DOWN.

May  7 06:15:07 WAN_Connection: ISP's DHCP did not function properly.
May  7 06:15:07 DualWAN: skip single wan wan_led_control - WANRED off
May  7 06:15:09 kernel: eth0 (Int switch port: 3) (Logical Port: 3) Link UP 1000 mbps full duplex
May  7 06:15:11 WAN_Connection: WAN(0) link up.
May  7 06:15:11 rc_service: wanduck 1028:notify_rc restart_wan_if 0
May  7 06:15:11 custom_script: Running /jffs/scripts/service-event (args: restart wan_if)
May  7 06:15:15 custom_script: Running /jffs/scripts/firewall-start (args: eth0)
May  7 06:15:15 YazFi: Firewall restarted - sleeping 30s before running YazFi
May  7 06:15:15 wan: finish adding multi routes
May  7 06:15:15 miniupnpd[13368]: shutting down MiniUPnPd
May  7 06:15:15 miniupnpd[28003]: HTTP listening on port 40707
May  7 06:15:15 miniupnpd[28003]: Listening for NAT-PMP/PCP traffic on port 5351

May  7 06:15:17 WAN_Connection: WAN was restored.


Meanwhile, the modem's log:
Code:
2021-05-07 06:15:09 WAN connection INTERNET_VOICE_TR069_R_UMTS1:IPv4 connected
2021-05-07 06:15:05 WAN connection INTERNET_VOICE_TR069_R_UMTS1:IPv4 disconnected

2021-05-07 01:44:15 WAN connection INTERNET_VOICE_TR069_R_UMTS1:IPv4 connected
2021-05-07 01:44:11 WAN connection INTERNET_VOICE_TR069_R_UMTS1:IPv4 disconnected

My next goal is to optimize ChkWAN to react faster (maybe by setting tries=1) and to get downtime to 2-3 minutes at most.
Fortunately, over these several days I had zero false-positive pings, so I'm also considering using the ping method only: one try, then restart WAN.
 
I have been watching this thread with interest because I experience the same intermittent problem: my router (RT-AX88U) sometimes stays disconnected after the WAN is lost. It's very hard to pin down since it happens at intervals of 1 to 5 days. I have to admit it started when my ISP replaced my media converter with an ONT and also replaced all the network equipment on its side.

In my case it is definitely not a problem with WAN DHCP lease renewal. When the Internet goes down and the router stays disconnected, Internet connectivity is definitely at the ONT. For whatever reason, the router is not able to regain WAN connectivity. A router reboot solves this. An ONT reboot does not!

I think the ONT is at fault here but there also seems to be a bug in the Asus firmware. The ONT behavior might trigger something inside the router or vice versa. I know that some neighbors, with the same ONT, but different routers, have similar & different problems or none at all.

I will have to see how a different ONT model behaves.

I'd be interested if someone has any ideas on how to track down where in the Asus firmware code this issue might be buried. I looked into wanduck.c and wan.c a bit, but nothing obvious pops out.
 
Does your ISP use PPPoE or DHCP for the WAN? I think ASUS has a glitch where, when DHCP fails on the WAN, it doesn't properly try again, even in Continuous mode. It doesn't matter whether the WAN is an ONT or a DOCSIS cable modem. This is where ChkWAN helps.
 
My ISP uses DHCP for the WAN. I couldn't find any problems with the DHCP implementation. I tried all of the different frequency options and ran a packet trace. I documented the results here.

I tried ChkWAN.sh and it is fantastic:) However, rebooting is very disruptive.
 
You don't have to reboot with ChkWAN; I don't. It just restarts the WAN interface.
I still think the DHCP client ASUS chose (udhcpc) has a bug, even with the tweaked options. My long-term solution will be to replace the router with something else and use my AX88U as an access point.
 
Unfortunately, just restarting the WAN interface does not work for me. I don't know why.
Can you elaborate on the udhcpc bug?
I did a tcpdump trace and cannot find any fault at least in my case.
 

We have a different fault then, interesting. I am assuming it is something in udhcpc because the router still responds to external ping requests and the public IP shows against the interface; there is just no outbound traffic.

I wouldn't know what to look for in a tcpdump trace, nor can I tell when this problem is going to occur. Last time was about 3am, and ChkWAN restarted the WAN and service was restored within 5 minutes. I'm now looking at optimising the script's timers.
 
I have/had a similar problem with my AX58U connected to an ISP-supplied DOCSIS 3.0 modem, which is in bridge mode.

During the day, the WAN light on the router turns red, the WAN-down redirection message appears in the browser, and when I log in to the web UI it says the Internet is disconnected. Some websites can still be visited, though, which means the internet connection is alive.

What I discovered after some trial and error is that hitting the Apply button on the WAN page without changing a setting fixes the WAN connection, and everything returns to normal for a few hours. But that's a workaround, not an actual solution.

What I've tried so far, without success:

- Rebooted router and the modem several times
- Monitored WAN connection with "ChkWan wan" command
- Enabled Monitor WAN connection built in feature to see if that helps
- Changed WAN check type from normal to aggressive and then to continuous mode.
- Enabled/disabled Extend TTL value & Spoof LAN TTL
- Nuked the router, then set it up from scratch (reset, upload firmware, reset, reboot, reset, reboot, reboot, setup)

Lastly, after reading several posts on this forum, I've found a solution that solves the problem on my end once and for all.

1. Disable Extend TTL value & Spoof LAN value under WAN.
2. Disable Network Monitoring and Enable WAN down browser redirect notice under Administration
3. Save settings, power off both the router and the modem, unplug power cords, wait around 3 minutes and turn on the modem first and router after.

Result: It's been 48 hours and the WAN has stayed up, as it used to.

I hope this is relevant to this thread and helps people having problems with their WAN connection.
 
I still think the ASUS chosen DHCP client (udhcpc) has a bug, even with the tweaked

This sounds like the broken DHCP WAN client on pfSense. There, the solution is to force a DHCP discovery broadcast before requesting a new lease, together with a patched DHCP client.

In that case there are two separate problems. First, the request to force a DHCP discovery broadcast was not honored by the DHCP client (which required a patched version). Second, the ISP's DHCP server responds to a new lease with an invalid DHCP server address. As a result, after WAN downtime or at IP renewal the DHCP client keeps requesting a renewal from an invalid DHCP server until this process times out (which may take days) and a discovery broadcast is done (which may fail if the DHCP client is broken).

You will need much more verbose logging to see this happening, IF this is also the case on ASUS hardware.

From my logging I conclude this happens with ISPs that have combined their router and bridge mode clients in one DHCP server/template and fail to provide proper templating for bridge mode clients, because the DHCP server included in the renew response is often meant for router clients only.
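
One way to get the more verbose view of the DHCP exchange is a packet capture on the WAN interface. The sketch below only prints the command rather than running it. It assumes tcpdump has been installed on the router and that eth0 is the WAN interface; the helper name is hypothetical.

```shell
#!/bin/sh
# Sketch: print a tcpdump command that captures verbose DHCP traffic on the
# WAN interface (hypothetical helper; tcpdump is assumed to be installed).
# -n avoids name lookups, -v decodes DHCP options, -s 0 keeps full packets.
dhcp_capture_cmd() {
    iface="${1:-eth0}"
    printf "tcpdump -i %s -n -v -s 0 'udp and (port 67 or port 68)'\n" "$iface"
}
```

With `-v`, the decoded DHCP options should include the server identifier, which is the field to compare between the original lease and later renew responses.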
 
Lastly, after reading several posts on this forum, I've found a solution that solves the problem on my end once and for all.

1. Disable Extend TTL value & Spoof LAN value under WAN.
2. Disable Network Monitoring and Enable WAN down browser redirect notice under Administration
3. Save settings, power off both the router and the modem, unplug power cords, wait around 3 minutes and turn on the modem first and router after.
Thanks, will give it a try.
 
That seems to have done the trick @underdose, thank you very much. With the ASUS no longer checking whether the connection is valid, even if the ISP has a 20-second problem and my work VPN drops, I just wait for it to recover and do not need to reboot the ASUS. The ChkWAN script may not be needed, but I will keep it for the useful logging.
 
.....

My next goal is to optimize ChkWAN to react faster (maybe by setting tries=1) and to get downtime to 2-3 minutes at most.
Fortunately, over these several days I had zero false-positive pings, so I'm also considering using the ping method only: one try, then restart WAN.

I've changed my script to:

Code:
*/1 * * * * /jffs/scripts/ChkWAN.sh nowait wan quiet ping=1.1.1.1,8.8.8.8 tries=1 fails=1

After 20 days of testing, I can confirm:
- I had very few false ping failures, and only ever from one of the two addresses. I'm happy with that.
- The script's reaction time is great. As soon as both pings fail, the WAN restart procedure starts. Worst case, the internet could be down for <=1 min before that, since the cron job runs every minute.

Healthcheck is also great!
May 2021: 3 downtimes, 5 minutes total

However, it is not perfect!
I have a weird random issue where the ChkWAN script (cron job) won't start and reports "script is already running... aborting" every minute instead of doing its ping checks. I've had maybe 2-3 such situations. When this happens, ChkWAN is not actually working (it neither checks pings nor resets the WAN).

Solution: restart the router OR remove the cron job and set the same job again.

Next step: find the cause and prevent this from happening OR write a script that reacts to the specific log message and recreates the cron job. This would be fun since I have zero scripting experience :)
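
A rough sketch of the second idea (react to the log message and recreate the cron job) follows. It rests on several assumptions: the log path, the exact message text, the threshold of 3 and the helper names are all illustrative and should be matched to what the router's syslog actually contains.

```shell
#!/bin/sh
# Watchdog sketch (hypothetical): recreate the ChkWAN cron job when the log
# shows repeated "already running" aborts. The log path and message text are
# assumptions; adjust them to your own syslog.
LOG_FILE="${LOG_FILE:-/tmp/syslog.log}"
PATTERN="${PATTERN:-already running}"

# Count matching lines among the last 20 lines of the given log file.
count_recent_aborts() {
    tail -n 20 "$1" | grep -c "$PATTERN"
}

# If at least 3 recent runs aborted, delete the cron job and add it again.
recreate_job_if_stuck() {
    if [ "$(count_recent_aborts "$LOG_FILE")" -ge 3 ]; then
        logger -t ChkWANWatchdog "ChkWAN appears stuck; recreating cron job"
        cru d ChkWAN
        cru a ChkWAN "*/1 * * * * /jffs/scripts/ChkWAN.sh nowait wan quiet ping=1.1.1.1,8.8.8.8 tries=1 fails=1"
    fi
}
```

This only automates the manual workaround (remove the job and set it again). It does not address why the script believes it is already running.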
 
For those with this problem:
I think this may be exactly what I am experiencing.
@xlr, your modem log shows this:
Code:
2021-05-07 06:15:09 WAN connection INTERNET_VOICE_TR069_R_UMTS1:IPv4 connected
2021-05-07 06:15:05 WAN connection INTERNET_VOICE_TR069_R_UMTS1:IPv4 disconnected
And you get corresponding eth0 down and up log entries on your router, and loss of internet connectivity.
My modem log shows this:
Code:
2021-07-26 17:47:20 System Notice WAN connection INTERNET_R_UMTS1:IPv4 disconnected
2021-07-26 17:47:21 System Notice WAN connection INTERNET_R_UMTS1:IPv4 connected
and I likewise see corresponding eth0 down and up log entries on my router, and loss of internet connectivity. I presume you have a Huawei modem? In my case it is a B818-263 LTE modem.
I discovered to my horror yesterday that even pulling out and reconnecting the WAN cable, which takes eth0 down and back up, does not make the Asus router renew its lease (unless the cable stays out for more than 5 seconds or so). The OpenWrt guys have fixed this; I believe it is a drawback of udhcpc (it does not renew upon disconnect). So with the Asus router configured as it is at present, you could pull out your WAN cable, connect it to a different network, and the connection would fail. The short drops caused by our modems therefore have no chance of prompting a renew: they happen too quickly.
Here is my proposed workaround: run the following script, 'maintain_wan_lease':
Code:
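#!/bin/sh
# Nothing here needs bash, so plain /bin/sh (BusyBox ash on the router) is enough.

# Set to 1 once a release has been sent and a renew is still pending.
renew_wan_lease=0

# Watch link-state events on the WAN interface (eth0), one line per event.
ip monitor link dev eth0 | while read -r event; do

        logger "maintain-wan-lease detected eth0 event: $event"

        case $event in

        *'NO-CARRIER'* )
                if [ $renew_wan_lease -eq 0 ]; then
                        logger "maintain-wan-lease detected eth0 state change to: 'NO-CARRIER', so forcing udhcpc to release wan lease."
                        # SIGUSR2 tells udhcpc to release the current lease.
                        killall -SIGUSR2 udhcpc
                        renew_wan_lease=1
                fi
        ;;

        *'LOWER_UP'* )
                if [ $renew_wan_lease -eq 1 ]; then
                        logger "maintain-wan-lease detected eth0 state change from: 'NO-CARRIER' to: 'LOWER_UP', so forcing udhcpc to renew wan lease."
                        # SIGUSR1 tells udhcpc to renew (or obtain) a lease.
                        killall -SIGUSR1 udhcpc
                        renew_wan_lease=0
                fi
        ;;
        esac

done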
Just put this in /jffs/scripts, chmod +x it, then launch it from post-mount.
I believe this deals with the heart of our problem and avoids unnecessary polling.
It forces a 'release' when eth0 goes down and a 'renew' when eth0 comes back up.
You could test in advance whether the script would solve your problem: disable the WAN restart script and, when the problem occurs, issue the "killall -SIGUSR1 udhcpc" command, which should force a renew. Would you be able to test that? If you regain internet connectivity, then this script should fix your problem. Make sense?
I am very interested to see how you get on, since I have been struggling with this problem for some time now.
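
To try the matching logic of the script above without sending any signals to udhcpc, the case statement can be pulled out into a small function and fed sample lines. This is a sketch only: the function name is hypothetical, and real `ip monitor link` lines on the router may differ in detail from the samples used to exercise it.

```shell
#!/bin/sh
# Sketch: classify one line of `ip monitor link` output (hypothetical helper).
# Prints "release" on carrier loss, "renew" on carrier return, "none" otherwise.
classify_link_event() {
    case "$1" in
        *NO-CARRIER*) echo release ;;
        *LOWER_UP*)   echo renew ;;
        *)            echo none ;;
    esac
}
```

For the launch itself, a line such as `/jffs/scripts/maintain_wan_lease &` in /jffs/scripts/post-mount would start the monitor in the background, so that post-mount does not block on the endless `ip monitor` loop.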
 
Hi @Lynx

First of all, let me just say that in my case this issue is gone and it's not happening any more. How did it get solved? I don't know, I did nothing. When did it get "solved"? Exactly on June 22nd (~2 months ago). What was the cause? Probably the ISP; they won't admit it, of course.

I'm using the same ISP at two locations in my city and had the same issues with different equipment. One day I noticed I'd had no internet drops lately, so I went to the logs and checked. Surprise, surprise! The Huawei B818 modem had held its connection for almost a week. It was the same at the second location, where I use a B535: it stopped reconnecting on a daily basis.

I did some tests later and noticed the connection is rock solid; the WAN IP lasts 30-40 days (the modem doesn't reconnect anymore unless I reboot it).
A few weeks ago I stopped the ChkWAN script.

I hope this is it, no more issues.
 
I have the same issue with an ASUS RT-AX92U: the WAN IP goes away every 3-4 days and I must reboot the ASUS router to restore the connection. I would love to try the Chk-WAN script, but I am unsure how to install it. I use a Mac to interface with the router, so I assume I'd use Terminal. But using ASUS firmware and not Merlin, how do I install a script? Is it possible? Is there a primer on this anywhere? Thanks in advance. (My internet provider is Optimum on a 1Gig coax line, cable modem in bridge mode.)
 
@RMerlin, appropriate udhcpc kill commands need to be sent on eth0 down, but at present they are not. The issue can be easily reproduced. Unplug the cable and see!
 
From my logging I conclude this happens with ISPs that have combined their router and bridge mode clients in one DHCP server/template and fail to provide proper templating for bridge mode clients, because the DHCP server included in the renew response is often meant for router clients only.
If this is the case, what can I do as an end user with Optimum? Can I get the script to work without installing Merlin (on base ASUS firmware)? It would be great to resolve this issue.
 
