WANFailover Dual WAN Failover ***v2 Release***

1. I do not have Entware nor any other addons installed; just your script and a couple of scripts self made for DDNS and local backup.
I've a couple of Asus routers running the 386.12 firmware version and both report the same output for the command /usr/sbin/ip -V :
ip utility, iproute2-ss150210

2. Considering that both of my WAN connections are set up with Cloudflare DNS, would these values be OK as Target IPs?
View attachment 53265

3. # /usr/sbin/ip -V
ip utility, iproute2-ss150210

# ls -g /usr/sbin/ip*
Code:
-rwxr-xr-x    1 root        111392 Sep  4 17:10 /usr/sbin/ipset
-rwxr-xr-x    1 root          7534 Sep  4 17:10 /usr/sbin/ipsec
-rwxr-xr-x    1 root        273716 Sep  4 17:09 /usr/sbin/ip
lrwxrwxrwx    1 root            13 Sep  4 17:09 /usr/sbin/ip6tables -> xtables-multi
lrwxrwxrwx    1 root            13 Sep  4 17:09 /usr/sbin/ip6tables-restore -> xtables-multi
lrwxrwxrwx    1 root            13 Sep  4 17:09 /usr/sbin/ip6tables-save -> xtables-multi
lrwxrwxrwx    1 root            13 Sep  4 17:09 /usr/sbin/iptables -> xtables-multi
lrwxrwxrwx    1 root            13 Sep  4 17:09 /usr/sbin/iptables-restore -> xtables-multi
lrwxrwxrwx    1 root            13 Sep  4 17:09 /usr/sbin/iptables-save -> xtables-multi
Hmm interesting, maybe a different set for your model then. Try starting with different Target IPs and collect new debug logs.
 
Here you have the captured log.
Guess we'll need to go deeper than the standard debug level.
What happens when you run this command? It should delete the default route from your main route table, but your logs say it is failing. If you delete the route, you'll need to recreate it or reboot and let it be recreated automatically.
Code:
ip route del default
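Since deleting the default route means it has to be recreated afterwards, one cautious way to run the test is to capture the route first. The following is only a sketch: the function names and the IP_CMD override are hypothetical, and a saved line containing flags that `ip route add` does not accept (such as `linkdown`) would need editing by hand.

```shell
#!/bin/sh
# Sketch: remember the default route before deleting it so it can be put back.
# IP_CMD is a hypothetical override, handy for previewing with a stub command.
IP_CMD="${IP_CMD:-ip}"

save_default_route() {
    # Prints the first default route line, e.g. "default via 192.0.2.1 dev eth0"
    $IP_CMD route show default | head -n 1
}

restore_default_route() {
    # $1 = route line captured earlier by save_default_route
    [ -n "$1" ] || return 1
    # Word splitting is intentional: the saved line is a list of ip arguments.
    $IP_CMD route add $1
}

# Typical use (needs root on the router):
#   saved="$(save_default_route)"
#   ip route del default
#   restore_default_route "$saved"
```

If the restore fails for any reason, a reboot recreates the route as described above.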
 
***v2.0.7 Released***

Release Notes:
v2.0.7 - 09/29/2023
Enhancements:
- Added metric values to IP Routes created for target IPs.
- Added additional debug logging to WAN Switch function.
- Added 386.12 to supported firmware list.
- Minor optimizations to increase performance.
- Added CRLF argument to email.
- Added restart option to Status Console.
- Major performance optimization for NVRAM Check function.
- Parent PID is now displayed on Status Console with Dev Mode enabled.
- Added error message if an invalid run argument is specified.
- Added Failover timeout setting. Default is 30 seconds

Fixes:
- Minor visual bug when WAN Failover kill command is being executed.
- WAN Failover will go to disabled state now if DNS Query or Failback are checked under Dual WAN Settings.
- Fixed issue causing PID File not to be deleted under /var/run/wan-failover.pid
- Failover will now properly timeout when the 30 second timeout timer has been reached.

Install:
- Warnings for DNS Query or Failback being enabled will now alert and log during installation

Deprecated:
- WAN0 Route Table and WAN1 Route Table configuration options have been deprecated and are now pulled directly from the Route Table file.
 
The default route delete command works fine when executed from the command line, no errors.
During the past weekend, I attempted the execution of a manual fail-over and this was the output I got:

Code:
Make a selection:
14
Are you sure you want to switch Primary WAN? ***Enter Y for Yes or N for No***
> Y
wan-failover: Failback - Switching wan0 to Primary WAN
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleted default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Deleting default route via CCC.DDD.223.254 dev vlan3
RTNETLINK answers: No such process
wan-failover: Failback - ***Error*** Unable to delete default route via CCC.DDD.223.254 dev vlan3
wan-failover: Failback - Switched wan0 to Primary WAN
wan-failover: DNS Switch - Removing wan1 DNS1 Server: 1.1.1.1
wan-failover: DNS Switch - Removed wan1 DNS1 Server: 1.1.1.1
wan-failover: DNS Switch - Removing wan1 DNS2 Server: 1.0.0.1
wan-failover: DNS Switch - Removed wan1 DNS2 Server: 1.0.0.1
wan-failover: Service Restart - Restarting dnsmasq service
wan-failover: Service Restart - Restarting firewall service
wan-failover: Service Restart - Restarting leds service
wan-failover: Service Restart - Stopping qos service
wan-failover: Service Restart - Restarting OpenVPN Server 1
wan-failover: Service Restart - Waiting on services to finish restarting
wan-failover: Service Restart - Services have been restarted

So the first execution of the default route deletion seems to work fine, but then it looks like the script keeps attempting to delete the default route several more times. No idea why.

Those results were obtained using the v2.0.7-beta4 version.
I'm going to install the recently published 2.0.7 final version and test again.
If the results are the same I'll try to debug the script execution.
Please let me know if you have any recommendation.

Thanks for your support.
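
The log above shows one successful delete followed by repeated "No such process" errors for the same route. As a point of comparison while debugging, here is a minimal sketch (not the wan-failover code itself) of a delete loop that re-checks the main route table before every attempt and stops as soon as the route is gone. The function name, the retry cap, and the IP_CMD override are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: delete a default route only while it is still listed, with a retry cap.
IP_CMD="${IP_CMD:-ip}"   # hypothetical override, e.g. a stub for dry runs

delete_default_route() {
    # $1 = gateway, $2 = interface, $3 = maximum attempts (default 5)
    gw="$1"; dev="$2"; max="${3:-5}"; i=0
    while [ "$i" -lt "$max" ]; do
        # Stop as soon as the route is no longer in the main table.
        # Note: this is a plain substring match, so "vlan3" also matches "vlan30".
        if ! $IP_CMD route show default | grep -qF "default via $gw dev $dev"; then
            echo "route gone after $i delete(s)"
            return 0
        fi
        $IP_CMD route del default via "$gw" dev "$dev" || echo "delete failed" >&2
        i=$((i + 1))
    done
    echo "gave up after $max attempts" >&2
    return 1
}
```

If a loop like this reports the route gone after one delete while the script keeps retrying, the script's own loop condition is the likelier culprit than the router re-adding the route.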
 
It probably will; I didn't change anything in that part of the code. Reviewing that section now, it checks whether the default route exists with the old gateway and interface and then executes the delete command, which is where the error is occurring. Is it possible your router is re-adding the route as you delete it?
 
Hi ...

If the router is re-adding the default route while the delete attempt is being made, that must be part of the router's normal operation (and/or the wan-failover.sh script's actions), as I'm not manipulating the routing table either directly or indirectly.

Is this the code block that does the default route deletion ?
1696092060912.png


Thanks.
 
It checks that the default route exists before trying to delete it; that's what you see in the if statement, along with a check that the WAN gateway interface is not null.
 
Unfortunately I'm not able to cut the internet service this weekend. I hope to follow up next weekend.
Anyway, I guess that one or more of the conditions controlling the until loop (where the code block shown above resides) might be causing the failure. Tracing the script execution will probably help reveal exactly which one.
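
For the tracing idea, a generic approach is the shell's `-x` option, which prints each command to stderr as it runs. The sketch below wraps that in a helper; the helper name and the script path in the example comment are assumptions, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch: run a script with execution tracing and keep the trace in a log file.
trace_script() {
    # $1 = script to trace, $2 = trace log file; remaining args go to the script
    script="$1"; log="$2"; shift 2
    # -x makes the shell print every command (prefixed with "+ ") to stderr.
    sh -x "$script" "$@" 2>"$log"
}

# Example (path is an assumption, adjust to your install):
#   trace_script /jffs/scripts/wan-failover.sh /tmp/wan-failover-trace.log
```

The trace log can then be searched for the loop's condition checks to see which one keeps evaluating true.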
 
I'll bet on line noise. I can't test line noise, but given what the Telus tech said...

I have updated to your beta and set it to require 6 pings. I'll report back in a few days on whether it has quieted down, or is still doing it. Thanks. :)
Mine is still cutting over 1-2 times daily. I ended up shifting the config to 5 pings with 3 second timeout. It's only ever one line at a time. Usually the DSL, but sometimes the Cable. I knew the cable node was ornery in this area, hence choosing DSL initially, but I think now I might just have two connections with imperfect reliability.
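
For anyone tuning ping count and timeout like this, it can help to measure loss by hand against a target IP. This is a rough sketch, not how wan-failover itself measures loss; the function names are hypothetical, and it assumes a ping that supports `-c` (count) and `-W` (timeout in seconds) and prints a "% packet loss" summary line.

```shell
#!/bin/sh
# Sketch: extract the packet-loss percentage from a ping summary.
packet_loss() {
    # Reads ping output on stdin and prints the integer part of the loss figure.
    sed -n 's/.* \([0-9][0-9]*\)[.0-9]*% packet loss.*/\1/p'
}

wan_is_down() {
    # $1 = target IP, $2 = ping count, $3 = per-ping timeout in seconds
    loss="$(ping -c "$2" -W "$3" "$1" 2>/dev/null | packet_loss)"
    # Treat missing output (for example, no route to the target) as 100% loss.
    [ "${loss:-100}" -ge 100 ]
}

# Example: wan_is_down 1.1.1.1 5 3 && echo "target unreachable"
```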
 
***v2.1.0-beta2 Release***
Enhancements:
- Added WAN0 and WAN1 Web GUI configuration options to create routes for the device portals for each WAN interface.
- Added Reset Default Configuration to Configuration Menu, additionally the command argument resetconfig can be used.
- Enhanced uninstallation prompt for verifying to uninstall.

Fixes:
- Fixed an issue where update would hang if WAN Failover wasn't running.
- Fixed an issue that would allow Load Balance FWMarks and Masks to be non-hexadecimal values in console.
- Added function to verify reverse path filtering is disabled after restarting WAN interfaces and when performing initial WAN Status checks. This is already disabled by the firmware by default but automatically re-enables when an interface is restarted and can cause issues with the target IP rules.
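
To check the reverse path filtering state by hand, the per-interface values can be read from the standard Linux sysctl tree. The sketch below is not the function added in this release; the helper name is hypothetical, and the base directory is a parameter only so the helper can be pointed at a test directory.

```shell
#!/bin/sh
# Sketch: list interfaces whose reverse path filter is not disabled (value != 0).
rp_filter_enabled() {
    # $1 = base directory, normally /proc/sys/net/ipv4/conf
    base="${1:-/proc/sys/net/ipv4/conf}"
    for f in "$base"/*/rp_filter; do
        [ -r "$f" ] || continue
        val="$(cat "$f")"
        iface="$(basename "$(dirname "$f")")"
        # Print only interfaces where filtering is still on.
        [ "$val" = "0" ] || echo "$iface=$val"
    done
}

# To disable it for one interface (needs root; interface name is a placeholder):
#   echo 0 > /proc/sys/net/ipv4/conf/vlan3/rp_filter
```

Running the helper right after restarting a WAN interface shows whether the value has flipped back.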

WAN Web GUI Configuration:
1696861908140.png
 
Updated previous post to reflect release of v2.1.0-beta2.

 
The new beta fixes the issue where, if my wan0 went down and traffic moved to wan1, it would never go back to wan0 once wan0 was up again, because the script still saw it as having 100% packet loss. Previously only a script restart fixed this, but the new beta fixed it completely.
So you are saying it will now fail back to wan0?
 
***v2.1.0 Release***
Enhancements:
- Added WAN0 and WAN1 Web GUI configuration options to create routes for the device portals for each WAN interface.
- Added Reset Default Configuration to Configuration Menu, additionally the command argument resetconfig can be used.
- Enhanced uninstallation prompt for verifying to uninstall.

Fixes:
- Fixed an issue where update would hang if WAN Failover wasn't running.
- Fixed an issue that would allow Load Balance FWMarks and Masks to be non-hexadecimal values in console.
- Added function to verify reverse path filtering is disabled after restarting WAN interfaces and when performing initial WAN Status checks. This is already disabled by the firmware by default but automatically re-enables when an interface is restarted and can cause issues with the target IP rules.
 
Hi... Finally I got some time and window to get back.
Just tested version 2.1.0, which seems close to fulfilling the wan-failover purpose, but it is not there yet in my case.
I removed all traces of the previous installation and rebooted the router before installing 2.1.0, then rebooted again afterwards (I made no changes to the wan-failover config).
The router came up with WAN0 as the primary, and I let it run for 15-20 minutes before switching OFF the ISP router providing the service to WAN0.
At this point WAN1 took the primary role, and again I checked that it remained stable for 15-20 minutes.
Then I switched ON the ISP router providing the service to WAN0, but WAN1 kept the primary role.
After that I entered the wan-failover menu and forced the failover to WAN0, but it didn't work either.

During the whole test I noticed high CPU usage, not only while the services were being restarted but all the time, which degrades the router's performance.
1698840025537.png


So I had to remove the script and restart the router again.
1698839990663.png


Thanks for your work and support, but I guess it's time to invest in a new non-Asus router.
 
I would need debug logs collected to be able to diagnose what’s going on in your case.
 
