Dual WAN Failover ***v2 Release***

Yes, here is the routing table if I put the same DNS on wan0 and wan1:
Destination     Gateway         Genmask         Flags Metric Ref Use Iface
default         192.168.4.1     0.0.0.0         UG    0      0   0   eth0
8.8.4.4         192.168.1.1     255.255.255.255 UGH   1      0   0   eth4
8.8.4.4         192.168.4.1     255.255.255.255 UGH   1      0   0   eth0
8.8.8.8         192.168.1.1     255.255.255.255 UGH   1      0   0   eth4
8.8.8.8         192.168.4.1     255.255.255.255 UGH   1      0   0   eth0
127.0.0.0       *               255.0.0.0       U     0      0   0   lo
192.168.1.0     *               255.255.255.0   U     0      0   0   eth4
192.168.1.1     *               255.255.255.255 UH    0      0   0   eth4
192.168.2.0     *               255.255.255.0   U     0      0   0   br0
192.168.4.0     *               255.255.255.0   U     0      0   0   eth0

But this causes a problem, because the DNS servers will be reachable over only one route.
If we disconnect wan1, we lose DNS access.

I'm testing with Google DNS on wan0 and a different DNS provider on wan1.
In this case, the router can reach DNS after failing over to wan1 (once the WAN Failover script has changed the DNS config).
But VPNMON still detects an issue on the WAN, even though there is no problem with the WAN.
Yes, as I previously stated, make your DNS Servers different for each WAN Interface. Otherwise you'll get those duplicate routes in there to the same target that could cause issues.
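
For anyone who wants to check their own table for this, here is a minimal Python sketch. It assumes the routing table is available as plain text in the same column layout as the output posted above; the function name `find_duplicate_host_routes` is made up for illustration and is not part of the WAN Failover script.

```python
from collections import defaultdict

# Sample input: the routing table from the post above, one route per line.
ROUTE_OUTPUT = """\
Destination Gateway Genmask Flags Metric Ref Use Iface
default 192.168.4.1 0.0.0.0 UG 0 0 0 eth0
8.8.4.4 192.168.1.1 255.255.255.255 UGH 1 0 0 eth4
8.8.4.4 192.168.4.1 255.255.255.255 UGH 1 0 0 eth0
8.8.8.8 192.168.1.1 255.255.255.255 UGH 1 0 0 eth4
8.8.8.8 192.168.4.1 255.255.255.255 UGH 1 0 0 eth0
127.0.0.0 * 255.0.0.0 U 0 0 0 lo
192.168.1.0 * 255.255.255.0 U 0 0 0 eth4
192.168.1.1 * 255.255.255.255 UH 0 0 0 eth4
192.168.2.0 * 255.255.255.0 U 0 0 0 br0
192.168.4.0 * 255.255.255.0 U 0 0 0 eth0
"""


def find_duplicate_host_routes(route_output):
    """Return {destination: [(gateway, iface), ...]} for host routes
    (flags contain both G and H) that appear more than once."""
    routes = defaultdict(list)
    for line in route_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 8:
            continue  # ignore blank or malformed lines
        dest, gateway, _genmask, flags, _metric, _ref, _use, iface = fields
        if "G" in flags and "H" in flags:
            routes[dest].append((gateway, iface))
    return {dest: hops for dest, hops in routes.items() if len(hops) > 1}


if __name__ == "__main__":
    for dest, hops in find_duplicate_host_routes(ROUTE_OUTPUT).items():
        print(dest, "is routed via", hops)
```

With the table above, both 8.8.4.4 and 8.8.8.8 are reported, each with one gateway on eth4 and one on eth0. Using different DNS servers per WAN would leave this result empty.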
 
I set up AMTM for the first time while using this failover script, and on checking WAN Failover Status within amtm I am getting Status: Unresponsive. The other settings all appear to be OK. Does that mean I have some sort of fault in my settings?
 

Attachment: 2023-05-14_225640.jpg
I installed the script, did some tests, and all looked good - but now it's giving me two issues.

1) By dumb luck, a day after installing, my main ISP has been having issues (actually, it's a power thing, but that's irrelevant). For hours now my primary has been down, and randomly it'll switch back to the primary even though there is no fiber in the media converter - def no pings happening. When I look at the status, it even shows 100% ping failure yet, it shows the connection being used. Not even sure what to look for in the logs, since, well, the ping failed...why switch? I guess I have to turn on dev logging, but looking for obvious things I should check.

2) When on the secondary connection the first two CPU cores run very high (per the Merlin UI). Oddly, however, when I run top, I don't see anything out of the ordinary, def no spikes from the script.

I tried pulling power for a full reboot, but the problems persist.
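
To make the expectation in issue 1 concrete, here is a rough Python sketch of the behavior being described: a WAN showing 100% packet loss should not be picked as the active connection. This is not how the WAN Failover script is implemented; `packet_loss`, `pick_active_wan`, and the 50% threshold are hypothetical names and values chosen only for illustration.

```python
import re

# Matches the "N% packet loss" part of a ping summary line.
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def packet_loss(ping_summary):
    """Extract the packet loss percentage from a ping summary line.
    If no percentage is found, treat the target as fully unreachable."""
    match = LOSS_RE.search(ping_summary)
    return float(match.group(1)) if match else 100.0


def pick_active_wan(current, loss_by_wan, max_loss=50.0):
    """Stay on the current WAN while it is usable; otherwise move to the
    usable WAN with the lowest loss. max_loss is an assumed threshold."""
    usable = [wan for wan, loss in loss_by_wan.items() if loss <= max_loss]
    if current in usable:
        return current
    if usable:
        return min(usable, key=loss_by_wan.get)
    return current  # nothing is usable, so there is nowhere better to go


if __name__ == "__main__":
    primary = packet_loss("3 packets transmitted, 0 packets received, 100% packet loss")
    secondary = packet_loss("3 packets transmitted, 3 packets received, 0% packet loss")
    print(pick_active_wan("wan1", {"wan0": primary, "wan1": secondary}))
```

Under these assumptions, the secondary stays active for as long as the primary reports 100% loss, which is the behavior I expected to see.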
 
I set up AMTM for the first time while using this failover script, and on checking WAN Failover Status within amtm I am getting Status: Unresponsive. The other settings all appear to be OK. Does that mean I have some sort of fault in my settings?
Looks like the script locked up trying to get values from NVRAM, possibly. Turn on NVRAM Checks and reload.
 
Thanks but not sure what NVRAM is and how to turn its checks on.

I have had to "Restart WAN Failover" from amtm and after a couple of minutes of "waiting for wan failover to restart from Cron job", it seems to be working now. Not sure if I would have to go through this process every time router is restarted.

Edit: Spoke too early, as it went back to "unresponsive" very shortly after showing that it was monitoring.
 
I have managed to enable NVRAM check and restarted script but with exactly the same problem: Status first shows failover monitoring for a couple of minutes before changing again to unresponsive. I have tried uninstalling and reinstalling to no avail. Not sure what the problem is!
 
Go ahead and turn on debug logging and collect logs so I can take a look, thanks
 
Discovered another (smaller) bug: the "keep the configuration file" prompt during uninstallation of the script behaves the wrong way round. If you answer NO to keeping the file, the file survives, and if you answer YES, the file actually gets deleted.
 
Ok I’ll look into that.
 
Go ahead and turn on debug logging and collect logs so I can take a look, thanks
Right, so the script status is now correct at "Failover Monitoring" and no longer showing as unresponsive but only after I have changed my router's native dual WAN settings as per the screenshot attached:

The problem now is that WAN1 status within the script has changed from "Connected" to "Stopped" (and the secondary WAN status in the web GUI from "hot-standby" to "cold-standby"). Not sure if this is how things are meant to be but if this is the case, then I am good for now.
 

Attachment: 2023-05-15_231526.jpg
Those settings should be disabled, it’s in the readme and will warn you during install if those are enabled.
 
@monorailmedic yes please retrieve debug logs so I can look into your issues, thank you.
 
***v2.0.4-beta1 has been released***

v2.0.4-beta1:
Enhancements:
- Added 3rd DNS Server from Automatic Settings to be factored into WAN Failover events.
- The checkiprules function will now be checked at the beginning of WAN Status checks to ensure NAT rules are created if necessary prior to performing packet loss checks.

Fixes:
- Corrected an issue during uninstall where the logic for retaining or deleting the configuration file was reversed.
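
For illustration only, the intended prompt behavior can be sketched in Python as below. This is not the script's actual code; `wants_to_keep` and `remove_config_unless_kept` are hypothetical names, and the accepted answers ("y", "yes") are an assumption.

```python
import os
import tempfile


def wants_to_keep(answer):
    """Interpret the answer to a 'keep the configuration file?' prompt."""
    return answer.strip().lower() in ("y", "yes")


def remove_config_unless_kept(config_path, answer):
    """Delete config_path only when the user did NOT ask to keep it.
    Returns True if the file still exists afterwards."""
    if not wants_to_keep(answer) and os.path.exists(config_path):
        os.remove(config_path)
    return os.path.exists(config_path)


if __name__ == "__main__":
    # Use a throwaway file so the example never touches a real config.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    print("kept after YES:", remove_config_unless_kept(path, "YES"))
    print("kept after NO:", remove_config_unless_kept(path, "NO"))
```

The reported bug corresponds to the condition being inverted, that is, deleting the file when the answer is yes.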
 
I would love to try the beta update but it is just not updating for some reason. Any idea where I am going wrong here?
 

Attachments: 1.jpg, 2.jpg
You have to restart the existing instance of the status page / menu for it to use the updated version.
 
Thank you very much for the rapid reply, but the status page was only started afresh after the update had supposedly completed (I have put the screenshots in the right order). Is there anything else I can try? Do I need to restart the router (not preferable unless absolutely necessary)?
 
Loaded fresh from a clean terminal or from the SSH UI Menu? Needs to be a clean load.
 
Thanks again and sorry to be a pest but tried a clean terminal and also a fresh script install directly from SSH and again via amtm to no avail. I am stuck with 2.0.3 for some odd reason.
 
Does this line showing up during a fresh install mean anything as it looks a bit unusual following a complete uninstall?

"wan-event script already exists..."
 