Possible improvements to dual WAN fail-over function

Nerre

Senior Member
I will probably not need it now that I have fiber, but I thought it might be an idea to start a topic about it to see how people would like it to behave.

I have tried it with a 3G modem as secondary (fail-over) WAN connection and noted a few disappointing issues (and I tried to browse the source to see how things worked, but I'm not a programmer so I might have missed some points).


The primary issue is that fail-over seems to only switch WAN when the currently used WAN goes down. But I think most users would want to use the primary WAN whenever it is up, and only use the secondary when primary is down (because the secondary probably is slower, more expensive and/or might have traffic limits).

This means that the primary WAN connection would have to be regularly "probed" to find out when it goes up again. And this connects to the next issue.
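To make the "prefer primary whenever it is up" behaviour concrete, here is a minimal Python sketch of the decision logic. It is not how the Asus firmware works. The names (`FailoverState`, `update`) and the thresholds are assumptions made up for illustration. It only switches after several consecutive probe results, so a single lost probe does not cause flapping.

```python
from dataclasses import dataclass


@dataclass
class FailoverState:
    """Hypothetical state holder, not taken from the firmware."""
    active: str = "primary"   # which WAN is currently carrying traffic
    ok_streak: int = 0        # consecutive successful primary probes
    fail_streak: int = 0      # consecutive failed primary probes


def update(state, primary_ok, fail_after=3, recover_after=3):
    """Feed one primary-WAN probe result and return the WAN to use."""
    if primary_ok:
        state.ok_streak += 1
        state.fail_streak = 0
    else:
        state.fail_streak += 1
        state.ok_streak = 0

    if state.active == "primary" and state.fail_streak >= fail_after:
        state.active = "secondary"   # fail over
    elif state.active == "secondary" and state.ok_streak >= recover_after:
        state.active = "primary"     # fail back once primary is stable
    return state.active
```

The key point is that the primary keeps being probed while the secondary is active, which is what makes automatic fail-back possible.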


A second issue is that the router seems to detect a WAN outage from the Ethernet link status alone. Often the Ethernet link is up but the connection has failed somewhere on the way to the internet (the provider's router is down, or something like that). So there would need to be a way to configure some kind of active probe (for example pinging an external host, or fetching a page with wget or similar) so that the router assesses the internet connection rather than just the WAN Ethernet link.
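As a rough idea of what such an active probe could look like, here is a Python sketch. It uses a plain TCP connect instead of ping (ICMP usually needs raw sockets), which is a substitution on my part. The function names and the `needed` threshold are assumptions. Note that binding to a source address does not by itself force traffic out through a particular WAN; that would still need matching routing rules.

```python
import socket


def tcp_probe(host, port=53, timeout=2.0, source_ip=None):
    """Return True if a TCP connection to host:port succeeds in time."""
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection((host, port), timeout=timeout,
                                      source_address=source):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False


def wan_is_up(targets, probe=tcp_probe, needed=1):
    """Treat the WAN as up if at least `needed` targets answer."""
    successes = sum(1 for host in targets if probe(host))
    return successes >= needed
```

Probing more than one external host avoids declaring the WAN dead just because a single target is having a bad day.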


A third issue is that QoS would probably have to be set up with different max rates for the secondary WAN. If the secondary WAN has traffic limits, some traffic might have to be blocked too (for example, if your 3G plan is limited to 5 GB a month, you wouldn't want your BitTorrent client using up that allowance).



I was thinking about the best ways to solve these issues.

The first one probably does need some patching, but I think a clever patch (for example, one that supports semaphore files in addition to the current logic) could make the other issues solvable with scripts.

For example, the code could be written so that when a file called wan_down is created, the router switches to the secondary WAN. When the wan_down semaphore disappears, the router switches back to primary. (There might be some race conditions that have to be handled.)

That way "power users" could craft their own probing scripts. When the script detects that the primary WAN is down, it creates the semaphore and keeps probing. When it detects that the primary WAN is up again, it just removes the semaphore.
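The script side of the semaphore idea could be sketched like this in Python. This assumes the firmware were patched to watch a wan_down file, which it currently is not. The function names and the location of the file are hypothetical, and `probe` is any callable returning True when the primary WAN works.

```python
import time
from pathlib import Path


def sync_semaphore(path, primary_up):
    """Create the semaphore when primary is down, remove it when up.

    Returns True if the semaphore exists afterwards.
    """
    path = Path(path)
    if primary_up:
        path.unlink(missing_ok=True)   # needs Python 3.8+
    else:
        path.touch(exist_ok=True)
    return path.exists()


def watch(path, probe, interval=30.0, rounds=None):
    """Probe repeatedly; rounds=None means run forever."""
    done = 0
    while rounds is None or done < rounds:
        sync_semaphore(path, probe())
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Creating and removing the file are both safe to repeat, so the script does not need to remember what state it left behind after a restart.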

The QoS configuration as well as changed iptables rules could probably be changed using scripts like the Merlin jffs features. A few if-statements in the scripts could be used to select different parts depending on what WAN interface is used (not sure if it currently is possible from a script to detect which interface is used, but I guess that it would be easy to solve using semaphore files too).
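The "few if-statements" part could look something like the Python sketch below. It assumes the wan_down semaphore doubles as the indicator of which WAN is active. The profile keys and all the numbers are made-up placeholders, not real settings from the firmware.

```python
from pathlib import Path

# Hypothetical per-WAN settings; the values are placeholders only.
PROFILES = {
    "primary": {"max_down_kbit": 100000, "max_up_kbit": 100000,
                "block_p2p": False},
    "secondary": {"max_down_kbit": 2000, "max_up_kbit": 500,
                  "block_p2p": True},
}


def active_wan(semaphore):
    """Secondary is in use while the semaphore file exists."""
    return "secondary" if Path(semaphore).exists() else "primary"


def profile_for(semaphore, profiles=PROFILES):
    """Pick the QoS/firewall profile matching the active WAN."""
    return profiles[active_wan(semaphore)]
```

A real script would then translate the chosen profile into the actual QoS and iptables commands.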
 
Dual WAN is an experimental and unfinished feature that Asus are still developing. Until they finalize it, it would be pointless for me to make any code change to it, as they might very well rewrite part of it in any future update.
 
But my suggested patch would probably be less than 10 lines of code, the rest would be handled by scripts.
 
It's not that simple. The Dual WAN code is spread all over the firmware. Taking one interface down and bringing another one up isn't straightforward (you have to take into account all the different WAN interfaces supported), and you also have to make sure not to run into any conflict with the existing code that takes care of checking the link states - your method suggests overriding them.

If someone feels like doing it and is able to provide a patch, I'd be happy to review the patch and possibly merge it in. But developing this amounts to at least a couple of hours of work (not including tests), not just a 30 mins patch.

Personally, I'd rather wait for Asus to finish this code. The failover recovery is only one thing that does not work yet - the webui itself is also broken and would require a few additional hours to complete and to debug.
 
This topic was not intended as a request; it was intended to discuss what could be done. So don't feel any pressure to do any work at all. People interested in the changes can always check out the source and make their own build :)

The discussion could also be input for Asus development.
 
No problem. Just giving my side of things on this.

And the more people post about Dual WAN, the more Asus might feel it's worth spending more efforts on finalizing and polishing it :) I think a properly working Dual WAN could be a killer feature that would sell routers on its own.
 
Yes, I know a lot of people who have so many problems with their ADSL that they are thinking about switching completely to 3G/4G (I was one of them until two days ago; our ADSL went down for 30-90 seconds every five to ten minutes when it was raining...). A working dual WAN fail-over would probably be worth a lot to them.

I was trying to find the place in the source where the switching takes place but couldn't find it this time; maybe that part has been changed in the last release. What I remember was a couple of if-statements around a piece of code half a screen long.

It was something like (just pseudo code here): if (status == disconnected && dual_wan == failover) switch to other WAN

Everything was done by calling other functions, so nothing in that part differed depending on whether the second WAN was USB-connected or LAN1.

But I'm not even sure in what file I found it, maybe it wasn't wan.c.
 
How dual wan is done in dd-wrt

Sorry if this is a bit basic, but here is the approach taken for dual-wan in DD-WRT:

http://www.dd-wrt.com/wiki/index.php/Dual_WAN_with_failover

It seems that one of the easiest ways to test for a failed WAN connection would be to query that WAN link's DNS servers. I've seen it implemented on a heartbeat approach, trying a DNS query to each WAN every minute, or alternatively when a DNS query fails, the alternative WAN's DNSes are queried and the one that responds is regarded as live. To switch back to primary, one could continue to do every DNS query to both WAN links every time an external DNS query is done, and when primary responds, it's regarded as live.
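To illustrate the DNS heartbeat idea, here is a small Python sketch that hand-builds a standard DNS query for an A record and checks whether a given server answers at all. The function names and the default query name are my own assumptions, and any reply echoing the query ID is counted as "alive" regardless of its content. Making sure the query actually leaves through the intended WAN is left to the routing setup.

```python
import socket
import struct


def build_dns_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for an A record."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Name as length-prefixed labels, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)   # QTYPE=A, QCLASS=IN
    return header + question


def dns_server_alive(server_ip, name="example.com", timeout=2.0):
    """Return True if the server sends back a reply with our query ID."""
    query = build_dns_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            return False   # timeout or network error
    return len(reply) >= 2 and reply[:2] == query[:2]
```

Calling `dns_server_alive` once a minute for each WAN's DNS servers would give the heartbeat variant described above.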
 
Setup of the dual WAN feature, in my case with a 3G "dongle", was pretty straightforward with the latest 3.0.0.4.374.720 build. Failover to the secondary WAN once the primary goes down seems to work quite well.
The three major shortcomings I see are:
  1. The secondary WAN needs a separate set of QoS settings (as discussed above).
  2. Fail-back to the primary WAN, once it is up again, should be automatic (also discussed before). Alternatively, at least add a "fail over" button in the GUI, instead of a reset being the only way to revert to the primary WAN.
  3. There should be some alert mechanism to report that the WAN has failed over (the best I can think of for home usage is an e-mail alert; for business purposes an SNMP message would do it).
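For the e-mail alert in item 3, a script-based version could be as simple as the Python sketch below. It assumes a reachable SMTP relay that accepts unauthenticated mail. The function names, addresses and host are placeholders of my own.

```python
import smtplib
from email.message import EmailMessage


def build_alert(active_wan, sender, recipient):
    """Compose a short notification about the WAN switch."""
    msg = EmailMessage()
    msg["Subject"] = f"WAN fail-over: now using {active_wan} WAN"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"The router switched to the {active_wan} WAN connection."
    )
    return msg


def send_alert(msg, smtp_host, smtp_port=25):
    """Send the message through the given SMTP relay (no auth, no TLS)."""
    with smtplib.SMTP(smtp_host, smtp_port, timeout=10) as smtp:
        smtp.send_message(msg)
```

One thing to keep in mind: the alert has to go out over whichever WAN is still working, so it should be sent after the switch, not before.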
 
Fully agreed to all items... Are there any improvements so far?
 
