Layer 4 VPN load balancing for increased bandwidth

doktor-x · Feb 4, 2022

Hi all,

tl;dr: is it possible to combine two VPN tunnels to increase my download speeds with AsusWRT Merlin?

The situation:
I have an RPI4 set up with NZBGet, which connects to the internet via my RT-AX88U. The connection is routed through an OpenVPN client using VPN Director. NZBGet opens multiple TCP connections to Usenet, each downloading a different file. With this setup I'm averaging about 15 MB/s. When I disable the VPN Director rule I average about 25 MB/s with everything else unchanged. To me it seems that the bottleneck is the maximum speed of the VPN tunnel, as it varies quite a bit (10 MB/s to 20 MB/s).

What I'd like to do:
To mitigate this bottleneck I would like to set up a second OpenVPN client in AsusWRT, connected to another server of my VPN provider, next to the existing one, and load balance the TCP connections round robin over both VPN tunnels based on the source port of each connection.
For visualization:
[attached diagram]

I think this is currently not a supported option in AsusWRT Merlin, so would it be possible to use another RPI as a proxy which load balances the streams and sends them out over different network interfaces, so that I can use VPN Director to send the streams over different VPN servers based on source IP? This would look like this:
[attached diagram]

So,
1. Does this make any sense?
2. Is this possible?
3. Am I missing something obvious why this would not increase my download speed?
4. If this is possible using a proxy server, do you have any suggestions for a software stack I could use to implement the TCP load balancing?

Thanks in advance for any help on this topic!

Tech Junky · Feb 4, 2022

That's a creative idea, but I think the issue you'll run into, assuming you're using OVPN, is the lack of CPU power to make this setup worthwhile.

Even going full bore on a custom 12700K router, the most I could get on a 1 Gbps line w/ OVPN topped out at ~600 Mbps DL. With WireGuard, though, I can get closer to wire speed @ 1200 Mbps+ w/ a whole lot less CPU utilization.

An RPI won't get that fast on WG, but it would be a lot faster than OVPN.

doktor-x · Feb 4, 2022

I don't think CPU power is going to be a bottleneck in this setup, as all the heavy lifting for OpenVPN should be done by the RT-AX88U, and tests show that it is capable of 200 Mb/s with OpenVPN. (I'm also wondering if these tunnels are single-threaded, as I always see only one core going over 70% when downloading through the VPN tunnel while the other 3 cores are almost idle.) And a stable 200 Mb/s is all I want, as this would already be a 66% speed increase.

My VPN provider currently does not support WG, and even if it did, I'm not sure it would be enough to saturate my 400 Mb/s line, as I think the load on a single server is too high to give 400 Mb/s solely to my connection.

Do you have any idea how this could be implemented? I'm not really finding anything on load balancing/routing based on source port to multiple output interfaces for arbitrary source ports and destination IPs.

Tech Junky · Feb 4, 2022

Well, WG will spawn additional kernel workers (kworkers) to saturate the line.

Just a snapshot while DL'ing Ubuntu through tor. This hit 400 Mbps in a few seconds w/ a PC being the router, w/ no bottlenecks due to CPU; far left is the CPU utilization per kworker instance.
[attached screenshot]

IIRC a RPI4 could potentially hit line speed -

https://www.reddit.com/r/WireGuard/comments/eeafds

Using this breaks away from the drastic speed reduction of OVPN and opens you to full speed use of your ISP WAN connection.

doktor-x · Feb 4, 2022

Ah ok, good to know, but currently WG is unfortunately not an option.

I just found out that you can create iptables rules based on source port (ranges); I'm quite new to tinkering this deeply with networks. My first guess was to define one rule for the first half of the client's ephemeral ports and route them through tunnel 1, and another for the other half through tunnel 2. Unfortunately, source ports are not chosen randomly but incrementally, so I will now try to define a rule for every port, where every other rule routes to a different tunnel.

Tech Junky · Feb 4, 2022

Source is usually random. DST, though, can be worked / finessed into a path.

Getting some load balancing out of it, though, might be a bit trickier to do.

Load balancing OpenVPN connections via IPVS (Linux Virtual Server) (techtots.blogspot.com)

This looks interesting: towards the end it appears to use LACP (bond mode 4) to load balance the 2 connections.

Looking at the next one, they're marking anything destined for ports 1194 and 1195 so it goes out via the 2 gateways, thus doubling the speed.

OpenVPN and high availability – Javier Miqueleiz

iptables -t mangle -A OUTPUT -d ${IPV4_IO_PUB} -p udp --dport 1194 -j MARK --set-mark 1000
iptables -t mangle -A OUTPUT -d ${IPV4_IO_PUB} -p udp --dport 1195 -j MARK --set-mark 1001

So, after creating the 2 virtual "tun" interfaces, using LACP (mode 4) appears to work for load balancing. The only issue here is designating which IPs to send through the VPN, or creating a different subnet for the hosts that always need to hit the VPN, or just tunneling everything. The additional benefit is that you can create as many "tun" interfaces as you want and add them to the network/interface configuration with ifenslave.

Code:

auto enp3s0 enp8s0 enp9s0 enp11s0 enp12s0 wlp4s0
allow-hotplug enp3s0 enp8s0 enp9s0 enp11s0 enp12s0 wlp4s0

auto bo0
iface bo0 inet dhcp
        bond-mode 4
        bond-miimon 100
        bond-lacp-rate 1
        bond-slaves enp11s0 enp12s0 enp3s0

Here 8/9/11/12 are on my 4-port 5GE card and I split them into LAN/WAN using br0 / bo0.

Since I'm using Nord for my VPN from CLI I don't have to bond tun interfaces as it just creates an additional interface "nordlynx" and I add some rules / options in iptables to enable it to work automatically.

Code:

*nat
-A POSTROUTING -o nordlynx -j MASQUERADE

Code:

0.0.0.0/1 via 10.5.0.2 dev nordlynx
128.0.0.0/1 via 10.5.0.2 dev nordlynx

Code:

nordlynx: flags=209<UP,POINTOPOINT,RUNNING,NOARP>  mtu 1420
        inet 10.5.0.2  netmask 255.255.255.255  destination 10.5.0.2
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 1000  (UNSPEC)
        RX packets 9539986  bytes 13542216936 (13.5 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2955292  bytes 459055484 (459.0 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

doktor-x · Feb 5, 2022

I also thought source was random, but I checked the open ports when downloading and they were incremental (even weirder, the ports are always even). Maybe when the same process uses multiple outgoing ports they are incremental and not random. I didn't find anything on changing this behaviour.

I really appreciate your effort, but I only understand half of the blog post, at best.

I have now created two new routing tables, one for each tunnel, which are selected by a MARK set via iptables. I have a rule for every ephemeral port used by my client (ports 32768 to 60999), and every other rule sets a different mark, so one half of my connections gets routed through one tunnel and the other half through the other. The major drawback of this approach is that I need 28232 rules, which is way too many. As my client uses 100 parallel connections, the idea is to use rules with port ranges of 50 ports each; this cuts the number of rules down to ca. 565 but should still have about the same effect. I'll also try to drastically cut down the ephemeral port range of the client, which would cut it down even further. When I'm done I'll post the necessary commands here in case somebody else wants to try this.

Furthermore, if the outgoing ports were random, this could be done with 2 rules, each covering one half of the ephemeral port range.

Edit:
Proof that it is working:
[attached screenshot]
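
The rule-count arithmetic in this post can be checked with a few lines of shell. This is only a sketch: `port_count` and `range_count` are hypothetical helper names, and the note about `/proc/sys/net/ipv4/ip_local_port_range` assumes the client is a Linux box (such as the RPI4), where that file holds the first and last ephemeral port.

```shell
#!/bin/sh
# Hypothetical helpers for illustration, not part of any firmware or tool.

# port_count FIRST LAST: number of ports in the inclusive range FIRST..LAST
port_count() {
    echo $(( $2 - $1 + 1 ))
}

# range_count FIRST LAST WIDTH: number of WIDTH-port ranges needed to
# cover FIRST..LAST (the last range may be shorter than WIDTH)
range_count() {
    echo $(( ($2 - $1 + $3) / $3 ))
}

# Range quoted in the post. On a Linux client the live values can be read
# with: cat /proc/sys/net/ipv4/ip_local_port_range
FIRST=32768
LAST=60999

echo "ports:          $(port_count "$FIRST" "$LAST")"
echo "50-port ranges: $(range_count "$FIRST" "$LAST" 50)"
```

With the range from the post this reports 28232 ports and 565 ranges of 50 ports, so shrinking the client's ephemeral range shrinks the rule set proportionally.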

doktor-x · Feb 5, 2022

So after tinkering more I was able to cut it down to 84 rules, but with this setup I was getting slightly less throughput than with one VPN server, while seeing higher load on the router. I think this might work better if the chosen ephemeral ports were random, so that 2 rules would suffice, or if you can drastically cut down the ephemeral port range (which I can't do with my current setup).

Because of that I will wait for my VPN provider to support WG, but it was still fun and I learnt a lot.

For anyone interested in what the nat-start script would look like with this setup, here you go:

Code:

# create routing rules that map each fwmark to its own routing table
ip rule add fwmark 2 table 2
ip rule add fwmark 3 table 3
# add a default gateway to each table
# the inner command looks up the tunnel's default gateway in the routing table that
# gets created when you add the VPN server via the UI; I used the first and second VPN client
# and assume the tables follow the scheme 11[1-5] for VPN client 1-5
ip route add default via $(ip route show table 111 | grep default | grep -o "[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*") table 2
ip route add default via $(ip route show table 112 | grep default | grep -o "[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*") table 3
ip route flush cache
# overwrite the source IP with the VPN client IP so the request finds its destination
# the inner command gets the client IP from the corresponding interface; I assume it follows the scheme tun1[1-5]
iptables -t nat -A POSTROUTING -o tun11 -j SNAT --to-source $(ip addr show tun11 | grep -o "inet [0-9]*\.[0-9]*\.[0-9]*\.[0-9]*" | grep -o "[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*")
iptables -t nat -A POSTROUTING -o tun12 -j SNAT --to-source $(ip addr show tun12 | grep -o "inet [0-9]*\.[0-9]*\.[0-9]*\.[0-9]*" | grep -o "[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*")

# add a rule which MARKs multiple port ranges with one of the 2 marks
# replace 192.168.178.40 with the IP of the client you want to route through both tunnels
iptables -t mangle -A PREROUTING -p tcp -s 192.168.178.40 -m multiport --sports 32768:32817,32968:33017,33168:33217,33368:33417,33568:33617,33768:33817,33968:34017 -j MARK --set-mark 2
# if the MARK rule matched, RETURN in the next rule so the remaining rules don't get evaluated
iptables -t mangle -A PREROUTING -p tcp -s 192.168.178.40 -m multiport --sports 32768:32817,32968:33017,33168:33217,33368:33417,33568:33617,33768:33817,33968:34017 -j RETURN
# do the same with different port ranges and the other mark
iptables -t mangle -A PREROUTING -p tcp -s 192.168.178.40 -m multiport --sports 32868:32917,33068:33117,33268:33317,33468:33517,33668:33717,33868:33917,34068:34117 -j MARK --set-mark 3
iptables -t mangle -A PREROUTING -p tcp -s 192.168.178.40 -m multiport --sports 32868:32917,33068:33117,33268:33317,33468:33517,33668:33717,33868:33917,34068:34117 -j RETURN

# continue the scheme for all desired ports
# (I suggest you write yourself a script if you have many entries, but be aware that
# -m multiport supports a maximum of 7 port ranges per rule)
iptables -t mangle -A PREROUTING -p tcp -s 192.168.178.40 -m multiport --sports 34168:34217,34368:34417,34568:34617,34768:34817,34968:35017,35168:35217,35368:35417 -j MARK --set-mark 2
iptables -t mangle -A PREROUTING -p tcp -s 192.168.178.40 -m multiport --sports 34168:34217,34368:34417,34568:34617,34768:34817,34968:35017,35168:35217,35368:35417 -j RETURN
iptables -t mangle -A PREROUTING -p tcp -s 192.168.178.40 -m multiport --sports 34268:34317,34468:34517,34668:34717,34868:34917,35068:35117,35268:35317,35468:35517 -j MARK --set-mark 3
iptables -t mangle -A PREROUTING -p tcp -s 192.168.178.40 -m multiport --sports 34268:34317,34468:34517,34668:34717,34868:34917,35068:35117,35268:35317,35468:35517 -j RETURN
# ...

# finally, mark all remaining traffic (other source ports, UDP) so it goes through one of the tunnels
iptables -t mangle -A PREROUTING -s 192.168.178.40 -j MARK --set-mark 2
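
The "write yourself a script" suggestion could look roughly like the sketch below. It only prints the `iptables` commands rather than running them. `gen_rules` and `emit_pair` are hypothetical helper names, and the marks 2 and 3 and the client IP are taken from the script above. One assumption to flag: the ranges in the script above sit 100 ports apart, so ports such as 32818-32867 match no range and end up on mark 2 through the final catch-all rule. The sketch assumes contiguous, strictly alternating ranges were intended.

```shell
#!/bin/sh
# Sketch: print (not apply) alternating MARK/RETURN rules for two tunnels.
# gen_rules and emit_pair are hypothetical helper names.

CLIENT_IP="${CLIENT_IP:-192.168.178.40}"   # client address from the post
PER_RULE=7                                  # port ranges per multiport rule

# emit_pair MARK PORTLIST: one MARK rule plus its matching RETURN rule
emit_pair() {
    m="-t mangle -A PREROUTING -p tcp -s $CLIENT_IP -m multiport --sports $2"
    echo "iptables $m -j MARK --set-mark $1"
    echo "iptables $m -j RETURN"
}

# gen_rules FIRST LAST WIDTH: split FIRST..LAST into contiguous WIDTH-port
# ranges and hand them alternately to mark 2 and mark 3
gen_rules() {
    start=$1; last=$2; width=$3
    list2=""; list3=""; n2=0; n3=0; turn=2
    while [ "$start" -le "$last" ]; do
        end=$(( start + width - 1 ))
        if [ "$end" -gt "$last" ]; then end=$last; fi
        if [ "$turn" -eq 2 ]; then
            list2="${list2:+$list2,}$start:$end"; n2=$(( n2 + 1 )); turn=3
        else
            list3="${list3:+$list3,}$start:$end"; n3=$(( n3 + 1 )); turn=2
        fi
        # flush a rule pair as soon as a list holds PER_RULE ranges
        if [ "$n2" -eq "$PER_RULE" ]; then emit_pair 2 "$list2"; list2=""; n2=0; fi
        if [ "$n3" -eq "$PER_RULE" ]; then emit_pair 3 "$list3"; list3=""; n3=0; fi
        start=$(( end + 1 ))
    done
    # flush whatever is left over
    if [ -n "$list2" ]; then emit_pair 2 "$list2"; fi
    if [ -n "$list3" ]; then emit_pair 3 "$list3"; fi
}

# Small demo: four 50-port ranges, two per tunnel
gen_rules 32768 32967 50
```

Calling `gen_rules 32768 60999 50` instead covers the whole ephemeral range and prints 82 MARK/RETURN pairs (164 lines). The output can be reviewed first and then piped into `sh` from the nat-start script, followed by the catch-all rule for the remaining traffic.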

Tech Junky · Feb 5, 2022

It's slower because you overcomplicated it with the marking, and every packet has to run through all of those rules for what's basically all of your traffic.

If you google "whats my ip" you'll get the VPN IP as a result and not your actual IP, since every application uses a dynamic port.

I would condense this down to 2-3 lines, forcing all traffic into the tunnel w/ the MASQUERADE in the nat table, and then do the load balancing with the mode / weight command(s).

Turning IPTables into a TCP load balancer for fun and profit (scalingo.com)
In this technical deep dive into iptables, the Linux network security configuration utility, we’ll see why and how to build a sophisticated TCP router and load balancer suitable to handle IoT applications traffic.

Turning IPTables into a TCP load balancer (ywjheart.wordpress.com)

Net-ISP-Balance: Load-balance your Internet connection across two or more ISPs for improved bandwidth and reliability (lstein.github.io)

Go to the end of that last one and there's "...Load balance across two OpenVPN tunnels?"

Dive into the scripts and there might be something interesting to modify.

As with everything, there are 50 different ways to do things, but there are more efficient ways and then there are more granular ways. With FW processing it's about finding a happy medium and placing priority rules at the top, so packets don't have to go through each one down to the bottom before finding a hit.
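
A minimal sketch of the "mode" style of balancing from the linked articles, adapted to the fwmark routing tables from doktor-x's script: instead of matching port ranges, the `statistic` match picks every second new TCP connection, and `CONNMARK` keeps the rest of that connection on the same tunnel. Several things here are assumptions and untested on the RT-AX88U: that the firmware's iptables ships the `statistic`, `conntrack`, `mark` and `CONNMARK` extensions, that fwmarks 2 and 3 are already tied to the two tunnel tables, and that no other firmware feature uses the same marks. `balance_rules` is a hypothetical helper name, and it only prints the commands.

```shell
#!/bin/sh
# Sketch: print (not apply) rules that alternate NEW TCP connections
# between fwmark 2 and fwmark 3. balance_rules is a hypothetical name.

# balance_rules CLIENT_IP
balance_rules() {
    base="iptables -t mangle -A PREROUTING -s $1 -p tcp"
    # 1. copy the connection's saved mark (if any) onto the packet
    echo "$base -j CONNMARK --restore-mark"
    # 2. packets of already-assigned connections need no further rules
    echo "$base -m mark ! --mark 0 -j RETURN"
    # 3. every second new connection is assigned mark 2 (first tunnel)
    echo "$base -m conntrack --ctstate NEW -m statistic --mode nth --every 2 --packet 0 -j MARK --set-mark 2"
    # 4. new connections that are still unmarked get mark 3 (second tunnel)
    echo "$base -m conntrack --ctstate NEW -m mark --mark 0 -j MARK --set-mark 3"
    # 5. store the packet mark on the connection for later packets
    echo "$base -j CONNMARK --save-mark"
}

balance_rules 192.168.178.40
```

This is five rules regardless of the ephemeral port range, and established connections leave the chain at the second rule. Non-TCP traffic from the client is not covered and would still need a catch-all mark like the one in the script above.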

Tech Junky · Feb 5, 2022

Something just occurred to me to keep it simple.

Use the bonding in /etc/network/interfaces, but for the tunnels:

Code:

auto tun1 tun2
allow-hotplug tun1 tun2

auto bo0
iface bo0 inet dhcp
        bond-mode 4
        bond-miimon 100
        bond-lacp-rate 1
        bond-slaves tun1 tun2

# iptables rule under *nat
-A POSTROUTING -o bo0 -j MASQUERADE

This would route all traffic / load balance the OVPN connections w/o all of the marking / rate / weight / etc.

sfx2000 · Feb 5, 2022

doktor-x said:
tl;dr is it possible to combine two VPN tunnels to increase my download speeds with AsusWRT Merlin?

Run the VPN off the Pi4, not the router, then you only need the single link (the router is the bottleneck here)

Better yet...

I would recommend getting a Linux PC and running your NZBGet app on it. That will be faster and more efficient than any of these other approaches.

Shop around: one can find one of the 1L-class micro desktops with a decent amount of CPU/RAM/storage for $150-$200 USD on eBay, etc. A Core i3 or i5 is simply going to be much faster than any Pi board, perhaps close to line speed even with OVPN. WG is nicer, but not all providers support it yet.

Sometimes the best solution is the most obvious one...

Tech Junky · Feb 5, 2022

sfx2000 said:
getting a linux PC

I usually go this route as well, but this seemed interesting enough to work through and see the results.

You can get SFF PCs on Amazon NEW all day long for $150 and then build into it and replace the router completely.
