Dual WAN option: problems and solutions

Is the Dual WAN option important for you?

  • YES!

    Votes: 1 50.0%
  • Maybe some day

    Votes: 1 50.0%
  • No

    Votes: 0 0.0%
  • What is Dual WAN option?

    Votes: 0 0.0%

  • Total voters
    2
  • Poll closed.

Ernst Lopes Cardozo

New Around Here
This is a long post in which I address the dual WAN option problems and suggest some solutions.

First, let us define what we expect from dual WAN:
  • If our primary WAN link is much faster and/or cheaper than the secondary, we want all traffic using the primary unless the primary does not work;
  • If the links have similar speed and there is no financial penalty for using either of them, we want to divide the traffic over the two links in such a way that performance is maximized. If there is a marked difference in the delay of the two links (e.g. ADSL vs. satellite), we want to direct delay-sensitive applications (VoIP, Skype, gaming) to ADSL and send email and file transfers over the sat link.
  • Whenever one of the links deteriorates, we want the router to respond as quickly as possible, but when that link improves again, we want the router to make sure that the improvement is not short-lived, to prevent constant switching.
These simple wishes already show that the dual WAN option requires a lot more than just the ability to operate two WAN links. It becomes even more complex when we realize that:
  • WAN links can fail in multiple ways:
  • The physical link goes down: the router software gets an immediate signal that the Ethernet or USB port lost its connection. This failure is unambiguous and immediate. Restoration is almost as quick and clear.
  • The link remains up, but the device at the other end (cable modem, USB dongle, etc.) signals that it lost its connection. Hard and fast as well.
  • There is no error indication, but there is no traffic. Is the remote server slow or down? Is there a temporary Internet routing problem? The only way to determine whether there is a problem is to send periodic probes (pings) and look for responses. One missed ping is ‘normal’, two missed pings in a row are suspicious, and three probably mean the link is down.
  • To make things even worse, some links / paths experience intermittent failure. As an example, the backhaul radio link of my WiFi ISP runs through a hole in a stand of Italian poplars. Whenever the wind comes from the North, one of the trees moves in and out of the beam, resulting in many interruptions of a fraction of a second to several seconds. On a bad day, only email comes through, surfing takes forever and Skype / VoIP is impossible. I then want to switch to my mobile provider, but of course my monthly budget is limited. The disturbances on my regular link come in bursts: it can be OK for half an hour, then all hell breaks loose for 5 minutes (or 2 hours), and then it comes back again. Impossible to control by hand, almost impossible to do right automatically.
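The rule of thumb above (one missed ping is normal, two in a row are suspicious, three probably mean the link is down) can be sketched as a small classifier. This is only an illustration in Python, not ASUSWRT code; the function name and the three labels are made up for the example.

```python
def classify_link(ping_results):
    """Classify a link from its most recent ping results.

    ping_results: list of booleans, oldest first; True means a reply arrived.
    Returns "ok", "suspicious" or "down" based on the trailing run of misses.
    """
    missed_in_a_row = 0
    for replied in reversed(ping_results):
        if replied:
            break
        missed_in_a_row += 1

    if missed_in_a_row >= 3:
        return "down"
    if missed_in_a_row == 2:
        return "suspicious"
    return "ok"  # zero or one missed ping is treated as normal


print(classify_link([True, True, False]))    # ok
print(classify_link([True, False, False]))   # suspicious
print(classify_link([False, False, False]))  # down
```

A rule this simple reacts quickly, but on a link that fails in bursts it would flip state constantly, which is exactly the difficulty described in the last bullet.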
There is still more to ponder:
  • If we configure some types of traffic to use one particular link, what do we do when that link goes down?
  • If one link is down and the remaining link starts to miss pings, should we declare it down too? Maybe, since there is no alternative, we should hang on to it for better or worse.
  • If one link misses pings in short bursts and the other has a similar problem, which one do we use?
  • Switching from one link to the other works for some applications, but not for all. The remote host sees our traffic from one external IP-address suddenly stop and packets start emerging from a different external IP address, probably from a different ISP. Our local client does not know that its traffic has been switched at all, so it has to recover intelligently from a severed connection or the application will have to be restarted. This underlines that the router should switch only when necessary.
  • How long should we wait for a ping response? If the target is nearby and the link is fast, 100ms should be enough. But on a sat link anything under 1 second will trigger false alarms – unless the sat modem fakes a ping response, in which case the whole thing is useless.
  • What if we keep receiving ping responses on a link, but they come with increasing delay? If we are load balancing, should we then decrease the fraction of traffic sent to this link?
  • Talking about load balancing, it is good to realize that we can only direct the outgoing traffic. How much traffic will come back after we start to send a TCP session over a particular link is completely undetermined; it can be a few kB or several GB. Therefore, to balance the load on a link, we have to monitor the incoming traffic as well.
  • Further complications arise if one link is symmetrical (same upload and download speeds) and the other is not. We would want to send an upload stream (e.g. cloud backup) over the link with the highest upload speed (measured or indicated by the administrator), but the router has no way to know which TCP SYN packet is the start of an upload session.
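The point about monitoring incoming traffic can be made concrete with a small sketch: when a new session starts, pick the link whose measured load is lowest relative to its capacity, counting download as well as upload. Everything here (the Link class, its field names, the example speeds) is hypothetical and only illustrates the idea; it says nothing about how ASUSWRT actually balances traffic.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    down_capacity_kbps: float    # download speed entered by the administrator
    up_capacity_kbps: float      # upload speed entered by the administrator
    down_rate_kbps: float = 0.0  # measured incoming traffic
    up_rate_kbps: float = 0.0    # measured outgoing traffic

    def utilization(self) -> float:
        """Return the load of the busier direction as a fraction of capacity."""
        return max(self.down_rate_kbps / self.down_capacity_kbps,
                   self.up_rate_kbps / self.up_capacity_kbps)


def pick_link_for_new_session(links):
    """Choose the link with the most headroom in its busiest direction."""
    return min(links, key=lambda link: link.utilization())


# An asymmetrical link that is nearly saturated by downloads...
primary = Link("primary", down_capacity_kbps=8000, up_capacity_kbps=1000,
               down_rate_kbps=7200, up_rate_kbps=100)
# ...and a symmetrical link that is moderately loaded in both directions.
secondary = Link("secondary", down_capacity_kbps=10000, up_capacity_kbps=10000,
                 down_rate_kbps=3000, up_rate_kbps=4000)

print(pick_link_for_new_session([primary, secondary]).name)  # secondary
```

Looking only at outgoing traffic, the primary link would appear almost idle (10% of its upload capacity); it is the incoming traffic that shows it is nearly full.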

No wonder and no shame that the ASUSWRT software fails in quite a few situations to “do the right thing”. Yet, I think improvement is possible. For instance:

  1. In fail-over mode, we get the option to ping a remote host in order to determine if a link actually provides a live path to the internet. When selecting balance mode, this option disappears and consequently, if a path fails (but not the physical link), traffic is sent into a black hole. Solution: enable the watchdog in balancing mode.
  2. For the watchdog setting, I would rather see the ‘ping interval’ plus a ‘number of successive pings missed’ to declare a path down as parameters. In addition, I would like a second parameter for the minimum number of successive ping responses required before declaring the link up again. Normally, to avoid thrashing, this number should be higher than the number that triggered the link to go down.
  3. Allowing a parameter to set ping timeout (for each path) would help to optimize the failure detection time for low delay paths without compromising high delay paths.
  4. It would help to have buttons to turn WAN links off and on manually; sometimes the algorithm just cannot get it right because it lacks information, and the buttons would be very useful in a debugging situation. And please, without rebooting or restarting services: just toggle the interface at the lowest level possible.
  5. Merge the load balancing and fail over modes: If in the ‘Load balancing configuration’ one link were set to zero, that link would only be used if the other one is down, effectively functioning as fail over. Please change the name of this parameter to something more descriptive like “distribution ratio” or “link priority ratio”. If a link/path goes down, load balancing should of course act as fail over anyway, so there is no real difference between the two modes.
  6. Routing based not only on client and server IP addresses, but on port numbers as well. If my cloud backup server uses a particular port number, I could direct that traffic over the link with the highest upload rate or the lowest financial cost. Similarly, I want to be able to avoid sending VoIP traffic from a multi-purpose device (smartphone, PC) over a sat link. There are of course many more use cases for port-based routing.
  7. If a link is down and the second goes down as well, and both are down because of excessive ping failures, can we do something intelligent to maintain connectivity? If we collect ping statistics, can we determine which link is best and direct all traffic there?
  8. If our ping statistics suggest that a link is overloaded, should we look at the QoS parameters and drop the traffic with the lowest priority in order to protect the high priority sessions?
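
Suggestions 2 and 5 above can be sketched together. The following Python is an illustration under assumptions, not ASUSWRT code: the class PathWatchdog, the parameters down_after and up_after, and the function pick_link are hypothetical names. The watchdog declares a path down after down_after successive missed pings and up again only after up_after successive replies. pick_link then spreads new sessions by ratio over the links that are up, and a link with ratio zero is used only when no other link is available.

```python
class PathWatchdog:
    """Tracks one WAN path and applies hysteresis to up/down decisions."""

    def __init__(self, down_after=3, up_after=10):
        self.down_after = down_after  # successive misses before "down"
        self.up_after = up_after      # successive replies before "up" again
        self.is_up = True
        self._streak = 0              # results in a row that contradict is_up

    def record_ping(self, replied):
        """Feed one ping result; return the (possibly changed) path state."""
        contradicts_state = (not replied) if self.is_up else replied
        self._streak = self._streak + 1 if contradicts_state else 0
        threshold = self.down_after if self.is_up else self.up_after
        if self._streak >= threshold:
            self.is_up = not self.is_up
            self._streak = 0
        return self.is_up


def pick_link(ratios, is_up, r):
    """Pick a link name for a new session.

    ratios: dict of link name -> distribution ratio (0 means standby only)
    is_up:  dict of link name -> bool, e.g. taken from PathWatchdog.is_up
    r:      number in [0, 1), e.g. random.random(); passed in so the
            function stays deterministic and easy to test
    Returns None when every link is down.
    """
    candidates = [name for name in ratios if is_up[name]]
    if not candidates:
        return None
    active = [name for name in candidates if ratios[name] > 0]
    if not active:
        return candidates[0]  # only standby links are left: fail over
    point = r * sum(ratios[name] for name in active)
    for name in active:
        point -= ratios[name]
        if point < 0:
            return name
    return active[-1]  # guard against floating-point rounding


watchdog = PathWatchdog(down_after=3, up_after=5)
for replied in [False, False, False, True, True, True, True]:
    watchdog.record_ping(replied)
print(watchdog.is_up)  # False: four replies are not yet enough to come back

ratios = {"wan0": 1, "wan1": 0}  # wan1 is a pure fail-over link
print(pick_link(ratios, {"wan0": True, "wan1": True}, 0.99))   # wan0
print(pick_link(ratios, {"wan0": False, "wan1": True}, 0.99))  # wan1
```

Making up_after larger than down_after is what keeps a link that fails in bursts from being switched in and out constantly.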
 
I don't do any work on the Dual WAN code, so any issue/request would have to be taken to Asus, not me.
 
