OpenVPN performance of the RT-AC86U


RMerlin

Asuswrt-Merlin dev
I mean that the crypto engine [bcmspu (crypto driver) + bcmpdc (parallel encryption)] seems to only do parallel encryption.
The thing that bothered me in my test results was the fact that the CPU usage was so high when using the crypto engine. Part of the point behind such engines isn't just to accelerate the crypto handling, but also to free up the CPU for other tasks while it handles the crypto. The fact the second core was loaded that much then was puzzling. Could be that BCM's engine implementation sucks. Maybe Asus is also suspecting something there as they asked me to include CPU usage results when I submitted benchmark results to them.
 

sfx2000

Part of the Furniture
Besides, the point of this thread is to provide VPN performance datapoints specific to the RT-AC86U, not to start advertising the RT-AC86U as the end-it-all IPSEC solution for businesses.
It's not a business class device, nor positioned as one -- Asus has other offerings there...
 

sfx2000

Part of the Furniture
EdgeRouter-X can do 377Mbit/s from my tests..details HERE. MikroTik's hEX can do close to 500Mbit/s. As always, MikroTik excels at system optimization over Ubiquiti. Both routers use a cheap SoC that is much less powerful than the SoC in the RT-AC86U.

I'm not promoting either brand, just providing two more yardsticks for comparing VPN performance..
My SG-2440 does pretty well there as well - but how is this relevant to the thread topic?

It's a bit more spendy than the RT-AC86U, and not as user friendly, but in the hands of someone that has some deep experience in networking, it's a good tool to use...

The RT-AC86U numbers are pretty good considering everything...
 

sfx2000

Part of the Furniture
The thing that bothered me in my test results was the fact that the CPU usage was so high when using the crypto engine. Part of the point behind such engines isn't just to accelerate the crypto handling, but also to free up the CPU for other tasks while it handles the crypto. The fact the second core was loaded that much then was puzzling. Could be that BCM's engine implementation sucks. Maybe Asus is also suspecting something there as they asked me to include CPU usage results when I submitted benchmark results to them.
If they're using Cryptodev - that explains a bit, as one has to jump from kernel to userland and back and forth - l2tp/ipsec is in the kernel, but if we're doing cryptodev, then it needs to jump to userland and back, which can keep one of the cores busy with context switches...

Then it's core scheduling from there... Asus has done a decent job there considering the constraints of AsusWRT and the HND, and porting the rest of the legacy code on to HND.

One of the more interesting things in Linux at 10GbE and above is moving networking out of kernel space and keeping it in userland, hence userland drivers for NICs, where one has more flexibility with things like DPDK, VPP, crypto, flow offload, etc... but that's well beyond the scope of a consumer Router/AP/BHR...
 

sfx2000

Part of the Furniture
Regarding OpenVPN, it should have been retired a while back. See the speed difference in real-world usage in the above link: OpenVPN vs IPsec vs Shadowsocks.

I could understand your personal bond with OpenVPN since you first brought it to Asus FW a few years ago. You defended OpenVPN until recently, now that you have IPsec to "sell" to users on newer Asus routers. lol
I'm not the biggest fan of OpenVPN - but I acknowledge their intents - their strength is portability...

Performance might not be the best, but generally it is secure, and clients can be found on most platforms - heck, even ChromeOS supports it, LOL, but that's ok...
 

RMerlin

Asuswrt-Merlin dev
If they're using Cryptodev - that explains a bit,
They're not, it's directly accessing the kernel API.

I did a few experiments with cryptodev + OpenSSL/OpenVPN back then, performance was pretty bad.

I configured a test setup to test the RT-AX88U. Something's not right tho, my initial tests are only giving me 80-90 Mbps of throughput (and CPU usage is very low, indicating most likely that something in the test setup is bottlenecking performance).

EDIT: looks like something is off with my laptop's USB adapter (despite getting good numbers accessing a LAN share), because I'm getting much better performance by having the laptop (the iperf server endpoint) connected over Wifi. First test is pretty good:

Code:
E:\Share>iperf -c  192.168.50.12 -M 1400 -N -t 30
------------------------------------------------------------
Client connecting to 192.168.50.12, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[320] local 10.10.10.1 port 10221 connected with 192.168.50.12 port 5001
[ ID] Interval       Transfer     Bandwidth
[320]  0.0-30.0 sec  1.40 GBytes    402 Mbits/sec
Need to do some more testing.
 

kvic

Part of the Furniture
It might not be the fastest, but it has other strong points in its favor, making it still very much relevant today. And above everything else: it works just fine. Why change just for the sake of changing?
I couldn't agree with your view on this. I stopped using OpenVPN two years or so ago - ending the short fanfare of OpenVPN that got started with RT-AC56U. Currently I have IPsec VPN backed up by Shadowsocks.

Heck! Recently I heard UK still has 7000+ ppl watching black & white TV..

My SG-2440 does pretty well there as well - but how is this relevant to the thread topic?
It's not, you shouldn't attempt to bring it up.

The examples of mine that you quoted, though, make a clear point for people who care to spend a little time thinking - the cheap SoC inside the ER-X/hEX, a MediaTek MT7621A, runs at 880MHz and outperforms the RT-AC86U on IPSec VPN.
 

RMerlin

Asuswrt-Merlin dev
I completed a series of tests on the RT-AX88U and IPSec. I tested the following three scenarios:

- Software only
- Software with parallel processing (i.e. multicore)
- With the Broadcom HW engine

Software only (single core):
Code:
E:\Share>iperf -c 192.168.50.12 -N -M 1400 -t 20
------------------------------------------------------------
Client connecting to 192.168.50.12, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[320] local 10.10.10.1 port 1499 connected with 192.168.50.12 port 5001
[ ID] Interval       Transfer     Bandwidth
[320]  0.0-20.0 sec    362 MBytes    152 Mbits/sec


Mem: 432868K used, 471600K free, 0K shrd, 3888K buff, 36552K cached
CPU:  0.1% usr 19.8% sys  0.2% nic 72.8% idle  0.0% io  0.0% irq  6.8% sirq
Load average: 3.02 3.21 2.40 3/188 15525
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
  230    2 admin    RW       0  0.0   3 25.1 [bcmsw_rx]
 1170    1 admin    S N   9068  1.0   1  1.0 httpds -s -i br0 -p 8443
 2608    1 admin    S N  21208  2.3   2  0.3 aaews --sdk_log_dir=/tmp

Software only, multi-core (through pcrypt):
Code:
E:\Share>iperf -c 192.168.50.12 -N -M 1400 -t 20
------------------------------------------------------------
Client connecting to 192.168.50.12, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[336] local 10.10.10.1 port 14508 connected with 192.168.50.12 port 5001
[ ID] Interval       Transfer     Bandwidth
[336]  0.0-20.0 sec    848 MBytes    356 Mbits/sec


Mem: 433344K used, 471124K free, 0K shrd, 3888K buff, 36404K cached
CPU:  0.2% usr 47.3% sys  0.4% nic 34.2% idle  0.0% io  0.0% irq 17.8% sirq
Load average: 3.70 3.34 2.20 5/191 12347
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
   15    2 admin    RW       0  0.0   1 13.7 [kworker/1:0]
  230    2 admin    SW       0  0.0   0  9.2 [bcmsw_rx]
   68    2 admin    SW       0  0.0   1  4.6 [kworker/1:1]
12249    2 admin    RW       0  0.0   1  4.1 [kworker/1:3]
 8251    2 admin    SW       0  0.0   0  3.9 [kworker/0:2]
 8252    2 admin    SW       0  0.0   2  3.5 [kworker/2:2]
   30    2 admin    SW       0  0.0   0  3.3 [kworker/0:1]
 8250    2 admin    SW       0  0.0   3  3.1 [kworker/3:0]
10289    2 admin    SW       0  0.0   2  3.0 [kworker/2:3]
    4    2 admin    SW       0  0.0   0  3.0 [kworker/0:0]
   14    2 admin    SW       0  0.0   1  2.9 [ksoftirqd/1]
   74    2 admin    RW       0  0.0   2  2.9 [kworker/2:1]
 1077    2 admin    SW       0  0.0   3  2.9 [kworker/3:2]
   73    2 admin    SW       0  0.0   3  2.8 [kworker/3:1]
 1170    1 admin    S N   9068  1.0   3  1.1 httpds -s -i br0 -p 8443
 2608    1 admin    S N  21208  2.3   3  0.5 aaews --sdk_log_dir=/tmp


With bcmspu (Broadcom's crypto engine):
Code:
E:\Share>iperf -c 192.168.50.12 -N -M 1400 -t 20
------------------------------------------------------------
Client connecting to 192.168.50.12, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[316] local 10.10.10.1 port 14909 connected with 192.168.50.12 port 5001
[ ID] Interval       Transfer     Bandwidth
[316]  0.0-20.0 sec    924 MBytes    387 Mbits/sec


Mem: 434776K used, 469692K free, 0K shrd, 3888K buff, 36492K cached
CPU:  0.2% usr 20.7% sys  0.3% nic 59.7% idle  0.0% io  0.0% irq 18.9% sirq
Load average: 3.32 3.37 2.32 4/191 13694
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
  240    2 admin    RW       0  0.0   3 23.9 [pdc_rx]
  230    2 admin    RW       0  0.0   0 13.4 [bcmsw_rx]
 1170    1 admin    S N   9068  1.0   1  1.6 httpds -s -i br0 -p 8443
 2608    1 admin    R N  21208  2.3   1  0.4 aaews --sdk_log_dir=/tmp
The engine version is slightly faster than the multicore version, but also sports a lower general CPU usage (59% idle versus 34% idle).

Unsure why the multicore version doesn't fully load all four cores however - it might do so if more than one client were connected at once.
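To put the three runs side by side, here is a rough sketch that divides each measured throughput by the non-idle CPU percentage from the matching top header. The "Mbit/s per busy %" metric is only an illustrative yardstick of my own, not something from the tests themselves, and a single top snapshot per run is a coarse sample.

```python
# Figures copied from the three RT-AX88U IPSec runs above.
# "idle" is the idle percentage from the top header line of each run.
runs = [
    ("software, single core", 152, 72.8),
    ("software, pcrypt", 356, 34.2),
    ("bcmspu engine", 387, 59.7),
]


def mbps_per_busy_percent(mbps, idle_pct):
    """Throughput delivered per percentage point of non-idle CPU."""
    busy_pct = 100.0 - idle_pct
    return mbps / busy_pct


# Illustrative efficiency figure for each run, rounded to two decimals.
results = {
    name: round(mbps_per_busy_percent(mbps, idle), 2)
    for name, mbps, idle in runs
}

for name, mbps, idle in runs:
    print(f"{name:<22} {mbps:>4} Mbit/s  busy {100 - idle:5.1f}%  "
          f"{results[name]:.2f} Mbit/s per busy %")
```

By this measure the engine run moves noticeably more traffic per unit of CPU than either software run, which matches the idle figures quoted above.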
 

ludnell

New Around Here
Hello!

I just recently set up my 86U for OVPN encryption. I've been trying speed test sites offering 10GB files for download, but would like a more accurate result.

I'm really curious to see how well it is doing and also try some different servers offered by my VPN provider.

I have used iperf before, but only for local wireless testing, never encryption. Do I need a second PC somewhere else and test through the internet, or can I test my router's encryption capacity using local traffic as well?
 

sfx2000

Part of the Furniture
Regarding OpenVPN, it should have been retired a while back. See the speed difference in real-world usage in the above link: OpenVPN vs IPsec vs Shadowsocks.
Been tinkering with a "gateway on a stick" as a science project - Allwinner H3 can do 50Mbps on AES-128-GCM, and as such, the NanoPi NEO's 100Mbit ethernet interface is more than enough, and this is a unit half the size of a business card and runs cool and clean with Armbian and Arch...

(all core, as the Allwinner CryptoBlock is a bit 'hugged up' and not that fast anyways)

The NanoPi NEO is an $8USD board (FOB Shenzhen, CN)...

Code:
[email protected]:~$ openvpn --genkey --secret /tmp/secret
[email protected]:~$ sudo time openvpn --test-crypto --secret /tmp/secret --verb 0 --tun-mtu 20000 --cipher aes-128-gcm
Fri Dec  7 17:43:35 2018 disabling NCP mode (--ncp-disable) because not in P2MP client or server mode
61.48user 0.01system 1:01.50elapsed 99%CPU (0avgtext+0avgdata 4016maxresident)k
0inputs+0outputs (0major+235minor)pagefaults 0swaps
Watching the board during that little run...

Code:
Time        CPU    load %cpu %sys %usr %nice %io %irq   CPU  C.St.
17:43:27:  240MHz  0.09   1%   1%   0%   0%   0%   0% 31.0°C  0/7
17:43:32:  648MHz  0.08   1%   1%   0%   0%   0%   0% 31.5°C  0/7
17:43:37: 1104MHz  0.08  12%   1%  11%   0%   0%   0% 33.4°C  0/7
17:43:42: 1104MHz  0.15  25%   0%  25%   0%   0%   0% 36.1°C  0/7
17:43:47:  816MHz  0.22  25%   0%  25%   0%   0%   0% 36.1°C  0/7
17:43:52: 1104MHz  0.28  25%   0%  25%   0%   0%   0% 36.8°C  0/7
17:43:58:  648MHz  0.34  25%   0%  25%   0%   0%   0% 37.0°C  0/7
17:44:03: 1104MHz  0.39  25%   0%  25%   0%   0%   0% 36.4°C  0/7
17:44:08: 1104MHz  0.44  25%   0%  24%   0%   0%   0% 36.7°C  0/7
17:44:13:  648MHz  0.49  25%   0%  25%   0%   0%   0% 37.0°C  0/7
17:44:18:  816MHz  0.53  25%   0%  25%   0%   0%   0% 36.4°C  0/7
17:44:23:  816MHz  0.57  25%   0%  25%   0%   0%   0% 36.7°C  0/7
17:44:28: 1104MHz  0.60  25%   0%  24%   0%   0%   0% 37.0°C  0/7
17:44:33: 1104MHz  0.63  25%   0%  25%   0%   0%   0% 36.8°C  0/7
17:44:39:  240MHz  0.66  15%   0%  14%   0%   0%   0% 35.3°C  0/7
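As a quick sanity check on those readings, here is a small sketch that turns the total %cpu column into a single-core equivalent. It assumes the monitor reports CPU usage averaged over all four cores of the quad-core H3; the sample rows are copied from the output above, and the helper name is just for illustration.

```python
# A few rows copied from the monitor output above.
samples = """\
17:43:42: 1104MHz  0.15  25%   0%  25%   0%   0%   0% 36.1°C  0/7
17:43:47:  816MHz  0.22  25%   0%  25%   0%   0%   0% 36.1°C  0/7
17:43:52: 1104MHz  0.28  25%   0%  25%   0%   0%   0% 36.8°C  0/7
"""

CORES = 4  # the Allwinner H3 is a quad-core Cortex-A7


def total_cpu_percent(line):
    """Return the %cpu column (4th whitespace-separated field) as a float."""
    return float(line.split()[3].rstrip("%"))


# Assumption: %cpu is averaged across all cores, so multiply by the core count
# to get the load expressed as a percentage of one core.
per_core_equiv = [total_cpu_percent(row) * CORES for row in samples.splitlines()]
print(per_core_equiv)
```

Under that assumption, a steady 25% is one core running flat out, which lines up with the 99%CPU that time reported for the single openvpn process.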
I've got a NEO2 enroute from FriendlyARM, should be here on 12/8 if tracking is accurate - the NEO2 is basically the same as NEO, just changing out the SoC from Cortex-A7 to Cortex-A53, which enables ARMv8/aarch64...

I think what would be more interesting is data from a chip like the Rockchip Renegade - my gut tells me that it'll match pretty much any BHR board support platform...
 

sfx2000

Part of the Furniture
OpenVPN's greatest strength is its flexibility. It can handle a large variety of scenarios, can use user-configurable ciphers based on one's needs, and can be run on any tcp/udp port chosen by the user, and it's less prone to firewall-related issues than VPN technologies relying on protocols other than TCP or UDP. Its code passed two independent audits 1-2 years ago, and it's very actively developed.

I'm not "selling" IPSEC, in fact I still prefer OpenVPN over IPSEC when using an Asus router. All I did was provide benchmark results since someone actually asked me for those a few days ago.

Nobody's saying OpenVPN is the perfect solution. But just because a newer fad introduces faster protocols does not mean it's suddenly obsolete and everyone should rush toward new, unproven and poorly supported technologies. It might not be the fastest, but it has other strong points in its favor, making it still very much relevant today. And above everything else: it works just fine. Why change just for the sake of changing?
OpenVPN performs well enough - it's brute force, and perhaps less efficient than other VPN tech...

Portability does come at a cost - but OpenVPN almost runs on everything as both client and server on any OS.

L2TP/IPSec has its advantages - it's one less hop into hyperspace (e.g. to and fro from userland to kernel to userland and back to kernel - so fewer switches there)

I'm looking into a couple of alternates...

1) Wireguard - most promising actually, as this has the official stamp of approval from Linus Torvalds himself...
2) ZeroTier - a bit different, it's more of an SDWAN approach, but performance looks really good, and might be one of the better site-to-site approaches...
 

kvic

Part of the Furniture
I think what would be more interesting is data from a chip like the Rockchip Renegade - my gut tells me that it'll match pretty much any BHR board support platform...
I don't have OpenVPN installed on the box. Here are the OpenSSL numbers:

Code:
[[email protected] ~]# openssl speed -evp aes-128-gcm 
OpenSSL 1.1.1  11 Sep 2018
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-gcm      70331.53k   207303.62k   406902.12k   539375.39k   596860.29k   596179.67k
 

sfx2000

Part of the Furniture
I don't have OpenVPN installed on the box. Here are the OpenSSL numbers:
AllWinner H5 - keep in mind that it's DDR3 with a narrow interface...

(FriendlyARM Nano Pi NEO2) - Armbian numbers here, and they do underclock the board to 1.1GHz as a default with Armbian 5.65

Code:
OpenSSL 1.1.0g, built on 2 Nov 2017
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc     102999.40k   304368.43k   581475.50k   775797.76k   859652.10k   864600.06k
aes-256-cbc      95027.48k   247282.65k   410361.51k   500125.01k   534126.59k   535582.04k
aes-128-gcm      66409.36k   178810.37k   330285.65k   423227.39k   458244.10k   458877.61k
If one has a 64-bit ARM with the Crypto Extensions - one really doesn't need a dedicated crypto accelerator block...
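For anyone wanting those figures in network units, here is a rough conversion sketch. openssl speed prints its results in thousands of bytes per second, so the aes-128-gcm rows from the two outputs above can be converted to Mbit/s and compared directly. Treat the comparison loosely: the two boxes ran different OpenSSL versions, and these are raw single-thread cipher rates, not VPN throughput.

```python
# aes-128-gcm rows copied from the two `openssl speed` outputs above.
# OpenSSL reports these figures in thousands of bytes per second.
BLOCK_SIZES = [16, 64, 256, 1024, 8192, 16384]
kvic_box = [70331.53, 207303.62, 406902.12, 539375.39, 596860.29, 596179.67]
neo2_h5 = [66409.36, 178810.37, 330285.65, 423227.39, 458244.10, 458877.61]


def kbytes_to_mbits(k_per_sec):
    """Convert openssl speed's '1000s of bytes per second' to Mbit/s."""
    return k_per_sec * 1000 * 8 / 1e6


for size, a, b in zip(BLOCK_SIZES, kvic_box, neo2_h5):
    print(f"{size:>6} bytes: {kbytes_to_mbits(a):8.0f} Mbit/s "
          f"vs {kbytes_to_mbits(b):8.0f} Mbit/s  (ratio {a / b:.2f})")
```

At the larger block sizes both chips are well into the multi-Gbit/s range for the cipher alone, so the cipher itself is unlikely to be the limiting factor for a VPN on either box.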

H5 - Potential performance...

Code:
[email protected]:~$ time openvpn --test-crypto --secret /tmp/secret --verb 0 --tun-mtu 20000 --cipher aes-128-gcm
Mon Dec 10 02:18:26 2018 disabling NCP mode (--ncp-disable) because not in P2MP client or server mode
real 0m25.277s
user 0m25.251s
sys 0m0.025s
3200/25.277... That suggests 120Mb/Sec give or take a few bits on OVPN - the AllWinner 1GbE MAC is good for at least 700Mb/Sec one way - so full duplex, at least 1.5Gb/Sec...
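The same back-of-the-envelope estimate can be applied to both boards. This sketch simply reuses the 3200 constant from the line above as a rule of thumb for --test-crypto runs with --tun-mtu 20000; the function name is made up for illustration, and the result is a crypto-only ceiling, not a measured tunnel speed.

```python
# Rule-of-thumb constant taken from the post above (assumed to apply only to
# `openvpn --test-crypto ... --tun-mtu 20000` runs).
RULE_OF_THUMB_MBIT = 3200


def estimate_mbps(elapsed_seconds):
    """Rough OpenVPN crypto ceiling in Mbit/s from a --test-crypto run time."""
    return RULE_OF_THUMB_MBIT / elapsed_seconds


h3 = estimate_mbps(61.50)   # NanoPi NEO (H3): 1:01.50 elapsed in the earlier run
h5 = estimate_mbps(25.277)  # NanoPi NEO2 (H5): real 0m25.277s in this run

print(f"H3: {h3:.1f} Mbit/s")
print(f"H5: {h5:.1f} Mbit/s")
print(f"H5 is {h5 / h3:.1f}x the H3")
```

That works out to a little over 50 Mbit/s for the H3 and somewhere in the mid 120s for the H5, in line with the rough figures quoted for each board.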

As a GW on a stick, aka One Armed Bandit, this little box might be a good approach for many...

ipSec/Wireguard is going to be better than OpenVPN...

Not bad for a $20USD board...

IMG_1832.jpg
 

Xentrk

Part of the Furniture
Hello guys, good morning.

First of all, thank you Merlin for such a good job... you're amazing. I installed NordVPN on my AC86U and everything is great, except for my Nvidia Shield, which doesn't work under the VPN, so I took it off. All my other devices are working fine, even Netflix, just perfect. Do you guys have any idea? Under the VPN the Nvidia Shield doesn't even load Netflix's thumbnails, but Netflix works great on my Samsung TV under the VPN (for example). I'm talking about Netflix, but everything on the Nvidia Shield is off under the VPN. Maybe the Android TV app could work; I'm not home right now to test it.

Thank you guys, greetings from Brazil.
Antonio
I found a similar issue with the Nvidia.

https://www.snbforums.com/threads/nvidia-android-box-wants-to-use-wi-fi-to-determine-location.46007/

But Netflix works. Try turning location services off.
 

Ditch

Occasional Visitor
Got this router today, coming from the ac68u. I switched because I need more power with VPN. I love Asus HW and Merlin FW but, damn, these things are so ugly. Why doesn't Asus design something more professional looking? Something that I can pile?
 

RMerlin

Asuswrt-Merlin dev
Something that I can pile?
They shouldn't be piled up, for heat dissipation reasons...

I don't see anything wrong with the RT-AC86U design, it's very similar in design to the majority of mainstream routers.
 

Ditch

Occasional Visitor
They shouldn't be piled up, for heat dissipation reasons...

I don't see anything wrong with the RT-AC86U design, it's very similar in design to the majority of mainstream routers.
Talking about design, I do prefer Ubiquiti.
 

pusb87

Regular Contributor
Got this router today, coming from the ac68u. I switched because I need more power with VPN. I love Asus HW and Merlin FW but, damn, these things are so ugly. Why doesn't Asus design something more professional looking? Something that I can pile?
What's this got to do with OpenVPN performance?
 
