
Wireguard Session Manager (4th) thread


A bit late, but on the RT-AX86U Pro with v388.2_beta1 (@ZebMcKayhan / @RMerlin), when I try to manually modify the 'blog' files
Code:
ll /proc/blog/

-r--r--r--    1 admin    root             0 Mar 22 20:55 skip_wireguard_network
-r--r--r--    1 admin    root             0 Mar 22 18:12 skip_wireguard_port
I get errors

e.g.
Code:
echo "1234 either" >> /proc/blog/skip_wireguard_port
echo: write error: Invalid argument

echo "172.16.1.1/32" >> /proc/blog/skip_wireguard_network
echo: write error: Invalid argument

The correct syntax is:

<operation> <subnet>

i.e.:

Code:
admin@RT-AX86U_Pro-E930:/tmp/home/root#  echo "add 172.16.1.1/32" >> /proc/blog/skip_wireguard_network
admin@RT-AX86U_Pro-E930:/tmp/home/root# cat /proc/blog/skip_wireguard_network
192.168.50.8/32
172.16.1.1/32
admin@RT-AX86U_Pro-E930:/tmp/home/root#  echo "del 172.16.1.1/32" >> /proc/blog/skip_wireguard_network
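For anyone scripting this, here is a minimal sketch of a wrapper around the `<operation> <subnet>` syntax shown above. The function name `blog_net` and the `BLOG_NET` override are made up for illustration; only the file path and the `add`/`del` operations come from this thread. The format of the port file is not shown here, so the sketch leaves it alone.

```shell
# Hypothetical helper, not part of the firmware or wg_manager.
# BLOG_NET defaults to the path shown in this thread; override it for a dry run.
BLOG_NET="${BLOG_NET:-/proc/blog/skip_wireguard_network}"

blog_net() {
    op="$1"
    subnet="$2"
    # Only the two operations demonstrated in the thread are accepted.
    case "$op" in
        add|del) ;;
        *) echo "usage: blog_net add|del <subnet>" >&2; return 2 ;;
    esac
    # Firmware without the bypass support will not have this file.
    [ -e "$BLOG_NET" ] || { echo "$BLOG_NET not available" >&2; return 1; }
    echo "$op $subnet" >> "$BLOG_NET"
}

# Example (on a supported router): blog_net add 172.16.1.1/32
```

Checking for the file first avoids a confusing write error on models or firmware builds that do not expose it.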
 
Thanks!

Guess the big questions are:
- does it work / is implemented for ipv6 as well?
- is it only source/local address or could it be destinations as well (like 9.9.9.9)?
 
It's only for local LAN addresses.
 
Many thanks.
 
Great that you picked this up and started trying!!

FYI...

I've uploaded experimental wg_manager v4.19b5 to the dev branch on GitHub.

So when starting a 'client' Peer in policy mode, it will populate the 'blog' files (if they are available within the firmware e.g. v388.2b1? and the model is supported) and will intelligently remove the entries if deemed appropriate.

i.e. it is perfectly acceptable to have multiple concurrent 'client' peers that use the same destination port (e.g. TorGuard uses port 1443) but with differing geo-specific endpoint IP addresses. So, as per @RMerlin's advice, if I terminate, say, my TorGuard New York WireGuard 'client' peer and there is another ACTIVE 'client' peer that also uses (TorGuard) port 1443, then the port will not be removed from '/proc/blog/skip_wireguard_port'.
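The shared-port rule above can be sketched as a simple membership check. This is only an illustration of the idea, not wg_manager's actual code; the function name `port_still_needed` is hypothetical and 51820 is just an example second port.

```shell
# Hypothetical sketch of the "keep the port while another peer uses it" rule.
# $1             = port of the 'client' peer being stopped
# remaining args = ports of the 'client' peers that are still ACTIVE
port_still_needed() {
    port="$1"
    shift
    for p in "$@"; do
        [ "$p" = "$port" ] && return 0
    done
    return 1
}

# Stopping a peer on 1443 while another active peer also uses 1443:
if port_still_needed 1443 1443 51820; then
    echo "keep 1443"
else
    echo "remove 1443"
fi
```

Only when no remaining active peer uses the port would the entry be removed from the skip list.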

NOTE: Currently only IPv4 'client' peers are fully implemented, and since I only have 40/10 ISP WAN service I wouldn't be able to tell if the non-VPN LAN clients are being throttled back to the WireGuard max 350Mbps.

The ?/about commands will now display both the current state of the Flow Cache (fc) and if the 'blog' files are 'available' and/or actually populated.

Code:
            wgm ?

            Router RT-AX86U_Pro Firmware (v388.2_beta1)

            [✔] Entware Architecture arch=aarch64


            v4.19b5 WireGuard® Session Manager (Change Log: https://github.com/MartineauUK/wireguard/commits/dev/wg_manager.sh)
            MD5=a51247569ebeeb99e988f1865a067114 /jffs/addons/wireguard/wg_manager.sh

                v4.19.2 (wg_client)
                v4.17.1 (wg_server)

            [✔] WireGuard® Kernel module/User Space Tools included in Firmware (1.0.20210124)


            [✔] WebUI Addon Enabled

            [✔] DNSmasq is listening on ALL WireGuard® interfaces 'wg*'

            [✔] firewall-start is monitoring WireGuard® Firewall rules

            [✖] WAN KILL-Switch is DISABLED (use 'vx' command for info)
            [✖] UDP monitor is DISABLED

            [✔] Flow Cache is ENABLED (WireGuard© VPN Bypass ENABLED for 3 LAN entities)

            <snip>
 
I've uploaded experimental wg_manager v4.19b5 to the dev branch on GitHub.
Cool, great job! I cannot test it on my AC86U, and from previous discussions I assume this is not implemented in firmware for the AX88U either, but hopefully it is for the rest of the AX routers. Hope you get some great feedback!
 
Definitely experience errors. I only started looking through the syslog because of all web connections (across all connected devices) becoming very sluggish and complaints from the family along the lines of 'what's wrong with the internet'. Symptoms do not appear to start until the mcast errors have been running for a couple of hours or more. Will not have any time to test for a few days, but when I do I will take proper notes on environment (connected devices, etc) and symptoms.
Despite having a clean build and testing both 'client only' and 'server only' I am really no further forward.

About the only things I have observed are
  • Router kernel: potentially unexpected fatal signal 11 errors are related to server use and typically (but not consistently) arise when a remote device connects. The connection itself appears to be fine.
  • If I try to add a wireguard client connection after getting this error, wgm shows the link as being setup, but the routed device does not (still shows local IP, VPN website shows as not connected).
  • Router kernel: [ERROR mcast] bcm_mcast_blog_process,819: blog allocation failure errors are related to client use (connecting to VPN provider)
  • Once this error occurs I can only clear it by a reboot of the router.
  • Either error can appear within a couple of hours or not occur for days
  • Other than noted above, there does not appear to be any correlation with which LAN devices are connected.
So while others have indicated that WGM works fine with the RT-AX88U, I can only comment that for me it does not, but there is no consistency in when or if the errors will arise.

For now, given that I only have 80/20 speeds, so WireGuard does not impact real-world performance, I will revert to fc=off. Should @Martineau get the bypass to work with WGM, and if RMerlin has the time to backport this to the RT-AX88U, then I am happy to do some more testing.
 
Did you get to try without your ipsets? If I remember correctly, you use them to track changing IPv6 addresses and route them over VPN. Perhaps try to make temporary static rules instead?

How about completely blocking IPv6 over VPN to see if that changes anything?
 
In respect of ipsets, I did not try that, though I would think that would only impact the client setup. I could try taking the ipset out of the equation (presumably I will need to amend the .conf and up/down scripts accordingly, as well as setting a static IPv6 address on the chosen client, otherwise it will be rotated out in due course). Alternatively I could (a) disable IPv6 on the client and/or (b) disable IPv6 on the router. However, as noted above, the appearance of the mcast_blog_process error is completely unpredictable. I have had fc=on and both client and server VPNs up for the last 48 hours without any issue, so a null result might be because I had isolated the root cause or just down to chance.

I am in the process of loading 388.2 beta 2 and have also updated WGM (though it should not make any difference). I am going to leave fc=on for now and see what happens. If I get a recurrence I will start with disabling IPv6 on the router (easiest to do) and take it from there.
 
If I get a recurrence I will start with disabling IPv6 on the router (easiest to do) and take it from there.
As far as I know, the AC86U that I run and the AX88U that you run are very similar in this regard. I run a client only, with self-assigned IPv6, and use ipset to bypass to WAN (fwmark 0x8000) only. And I don't get these errors. But perhaps that could be the key point? I'm only marking data for WAN, which has no issues with fc. I suspect you mark packets for the WG VPN. Just a thought.

But yes, this would only affect client as no ipset marks are setup to use with server...
 
Sorry, I can read the words, but most of them are going over my head.

I understand that a fwmark is a tag that instructs iptables to treat tagged packets in a certain way. So when you say 'I run a client only with self assigned ipv6 and uses ipset to bypass to wan', do you mean that
  • you are not running a WireGuard server?
  • that you are routing all internet traffic via the WireGuard client or only specific LAN devices via Wireguard?
  • in respect of 'self assigned ipv6' do you mean using DHCP-PD from the scope provided by your ISP or something else?
  • in respect of 'ipset to bypass the wan' do you mean that any tagged packet (in your case 0x8000) goes via the VPN interface rather than the WAN (eth0)?
  • in respect of 'you mark package for wg vpn' that I have fwmarks for both wg21 and wg11, as noted below or something different.
How would I see what fwmarks are being employed and understand what they do, e.g.,

if I run peer wg11 (from wgm) I see
Code:
IPSet     Enable  Peer  FWMark  DST/SRC
wg11-mac  Y       wg11  0x1000  src
and from ip6tables -nvL PREROUTING -t mangle
Code:
 pkts bytes target     prot opt in     out     source               destination
 5900 1846K MARK       all      wg11   *       ::/0                 ::/0                 /* WireGuard 'client' */ MARK xset 0x1/0x7
    1   116 MARK       all      wg21   *       ::/0                 ::/0                 /* WireGuard 'server' */ MARK xset 0x1/0x7
 7143  879K MARK       all      *      *       ::/0                 ::/0                 match-set wg11-mac src /* WireGuard 'client' */ MARK or 0x1000
ip6tables -nvL FORWARD -t mangle
Code:
 pkts bytes target     prot opt in     out     source               destination
 5008  609K MARK       all      *      wg11    ::/0                 ::/0                 /* WireGuard 'client' */ MARK xset 0x1/0x7
   82  6616 TCPMSS     tcp      wg11   *       ::/0                 ::/0                 tcpflags: 0x06/0x02 /* WireGuard 'client' */ TCPMSS clamp to PMTU
  780 62400 TCPMSS     tcp      *      wg11    ::/0                 ::/0                 tcpflags: 0x06/0x02 /* WireGuard 'client' */ TCPMSS clamp to PMTU
    0     0 MARK       all      *      wg21    ::/0                 ::/0                 /* WireGuard 'server' */ MARK xset 0x1/0x7
    0     0 TCPMSS     tcp      wg21   *       ::/0                 ::/0                 tcpflags: 0x06/0x02 /* WireGuard 'server' */ TCPMSS clamp to PMTU
    0     0 TCPMSS     tcp      *      wg21    ::/0                 ::/0                 tcpflags: 0x06/0x02 /* WireGuard 'server' */ TCPMSS clamp to PMTU
  444 38848 DNSFILTERF  udp      br+    *       ::/0                 ::/0                 udp dpt:53
    0     0 DNSFILTERF  tcp      br+    *       ::/0                 ::/0                 tcp dpt:53
Is there somewhere else I should be checking?
Is it relevant - what does it mean that your rules are 0x8000 and mine are 0x1000 and 0x1/0x7?

Thanks Archiel

Note: Just over 2 hours after installing 388.2beta1, a repeat of 'Router kernel: potentially unexpected fatal signal 11', even though no clients were connected to the server. Have rebooted and will wait.
 
you are not running a WireGuard server?
Nope. No point, since I'm behind IPv4 CGNAT only. I have a server set up, but I can only use it within my LAN to experiment.

that you are routing all internet traffic via the WireGuard client or only specific LAN devices via Wireguard?
No, but I'm only routing entire /24 or /64 networks to either the VPN client or WAN.

in respect of 'self assigned ipv6' do you mean using DHCP-PD from the scope provided by your ISP or something else?
No, I don't get IPv6 from my ISP as it does not support it. I assigned my LAN a modified ULA and use NAT6 over VPN; since I got IPv6 over WireGuard, I wanted to experiment.

in respect of 'ipset to bypass the wan' do you mean that any tagged packet (in your case 0x8000) goes via the VPN interface rather than the WAN (eth0)?
I mean my entire network goes over VPN except for NETFLIX and MYIP, which are ipsets that mark domains whose traffic gets sent directly to WAN. Meaning the only packets that are marked are for WAN.

in respect of 'you mark package for wg vpn' that I have fwmarks for both wg21 and wg11, as noted below or something different.
I'm marking packets to bypass the VPN and go to WAN, as the ordinary policy rules put all data onto the VPN. You have the opposite and are marking packets that should be sent to the VPN. If the packet mark interferes with the marks used to bypass fc it may be of importance, but I don't know.
 
How would I see what fwmarks are being employed and understand what they do, e.g.,
The marks should correspond to a routing rule, in much the same way as ordinary policy rules, i.e.
Code:
admin@RT-AC86U-D7D8:/tmp/home/root# ip rule
0:      from all lookup local
9900:   from 192.168.1.1/24 fwmark 0x8000 lookup main
9910:   from all to 192.168.1.1/16 lookup main
9911:   from 192.168.1.1/24 lookup 121
9921:   from 192.168.6.0/24 lookup 122
9991:   from all fwmark 0x1000/0x1000 lookup 121
32766:  from all lookup main
32767:  from all lookup default
admin@RT-AC86U-D7D8:/tmp/home/root# ip -6 rule
0:      from all lookup local
9900:   from aaff:a37f:fa75:1::/64 fwmark 0x8000 lookup main
9910:   from all to aaff:a37f:fa75:1::1/48 lookup main
9911:   from aaff:a37f:fa75:1::1/64 lookup 121
9921:   from aaff:a37f:fa75:6::1/64 lookup 122
9991:   from all fwmark 0x1000/0x1000 lookup 121
32766:  from all lookup main

Here we can see that 0x8000 points to main routing table which sends data to wan. Table 121 would be vpn1 and table 122 would be vpn2.
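To see at a glance which marks map to which routing table, the `ip rule` output can be filtered. A minimal sketch follows; the function name `list_fwmark_rules` is made up for illustration, and the sample input is a subset of the `ip rule` listing above. On a router you would pipe the real output in, e.g. `ip rule | list_fwmark_rules` or `ip -6 rule | list_fwmark_rules`.

```shell
# Hypothetical helper: print "<mark> -> table <table>" for each fwmark rule on stdin.
list_fwmark_rules() {
    awk '{
        mark = ""; table = ""
        for (i = 1; i < NF; i++) {
            if ($i == "fwmark") mark = $(i + 1)   # value (or value/mask) after "fwmark"
            if ($i == "lookup") table = $(i + 1)  # routing table after "lookup"
        }
        if (mark != "") print mark " -> table " table
    }'
}

# Sample input copied from the 'ip rule' output above.
list_fwmark_rules <<'EOF'
0:      from all lookup local
9900:   from 192.168.1.1/24 fwmark 0x8000 lookup main
9911:   from 192.168.1.1/24 lookup 121
9991:   from all fwmark 0x1000/0x1000 lookup 121
EOF
```

Rules without a `fwmark` selector are skipped, so only the mark-based policy rules are listed.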
 
Thanks for the clarification on your setup, though I am still confused as to what I can check in respect of my ip rules / fwmarks
As noted before I can see from ip(6)tables -nvL PREROUTING -t mangle
Code:
Chain PREROUTING (policy ACCEPT 1459K packets, 1307M bytes)
 pkts bytes target     prot opt in     out     source               destination
 109K   75M MARK       all      wg11   *       ::/0                 ::/0                 /* WireGuard 'client' */ MARK xset 0x1/0x7
   10  1160 MARK       all      wg21   *       ::/0                 ::/0                 /* WireGuard 'server' */ MARK xset 0x1/0x7
 127K  114M MARK       all      *      *       ::/0                 ::/0                 match-set wg11-mac src /* WireGuard 'client' */ MARK or 0x1000
and ip (-6) rule
Code:
admin@Router:/tmp/home/root# ip rule
0:      from all lookup local
9810:   from all fwmark 0xd2 lookup 210
9911:   from 192.168.3.1 lookup 121
9981:   from 10.50.1.2 lookup 121
9991:   from all fwmark 0x1000/0x1000 lookup 121
32766:  from all lookup main
32767:  from all lookup default
admin@Router:/tmp/home/root# ip -6 rule
0:      from all lookup local
9810:   from all fwmark 0xd2 lookup 210
9911:   from fd36:7ef1:2add:aa88:100::1 lookup 121
9981:   from aa36:7ef1:2add:aa88:100::2 lookup 121
9991:   from all fwmark 0x1000/0x1000 lookup 121
32766:  from all lookup main
Where
9911 are the IPv4/IPv6 router IP aliases, used for routing DNS queries via unbound and wg11 and
9981 are the wg21 IP addresses for my phone when using passthrough to wg11

I am guessing that lookup 210 relates to the wg21 routing table, but where should I see fwmark 0xd2 in ip(6)tables?

nb. I did have an mcast error kick off after the router had been up for around 8 hours, however I am discounting it for now because (1) I had disabled adding the br0 IPv6 address onto eth0, in case it was no longer needed for DDNS on IPv6, and (2) DDNS then stopped working at 2:00 am after the clocks went back, so there was a whole bunch of other stuff not working as expected. I have reverted to adding the br0 IPv6 address onto eth0.
 
I am still confused as to what I can check in respect of my ip rules / fwmarks
As noted before I can see from ip(6)tables -nvL PREROUTING -t mangle
Your top 2 rules add the fwmark 0x1/0x7 for wg11 and wg21 respectively in an XOR fashion (honestly, I understand how XOR works but I cannot wrap my head around how the firewall does this). The last line sets the bit 0x1000 (OR) and it should not touch the 0x1/0x7 bits at all.

The fc bypass rules work in your system, as otherwise you would not be able to run for hours/days. But there seems to be something very specific that generates your errors. Perhaps it is related to unbound and how you handle it via the link-local address, or perhaps the packet marks, or perhaps something completely different...

As you are doing, strip out piece by piece to find out what is causing this.
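For reference, the mark operations in those listings can be worked through with plain shell arithmetic. As documented for the iptables MARK target, `xset value/mask` clears the bits in the mask and then XORs in the value, `or` does a binary OR, and an `ip rule` selector `fwmark value/mask` matches when `(mark & mask) == value`. The function names below are made up for illustration, and the starting marks are hypothetical.

```shell
# MARK xset value/mask: clear the mask bits, then XOR in the value.
xset_mark() { printf '0x%x\n' $(( ($1 & ~$3) ^ $2 )); }    # args: mark value mask

# MARK or bits: binary OR into the existing mark.
or_mark() { printf '0x%x\n' $(( $1 | $2 )); }              # args: mark bits

# ip rule "fwmark value/mask": match when (mark & mask) == value.
rule_matches() { [ $(( $1 & $3 )) -eq $(( $2 )) ]; }       # args: mark value mask

m=$(xset_mark 0x0 0x1 0x7)     # unmarked packet, xset 0x1/0x7
echo "after xset: $m"
m=$(or_mark "$m" 0x1000)       # then "MARK or 0x1000"
echo "after or:   $m"

if rule_matches "$m" 0x1000 0x1000; then
    echo "matches 'fwmark 0x1000/0x1000' (lookup 121)"
fi
```

Because the `xset` mask is 0x7, it only ever rewrites the lowest three bits, while the OR only sets bit 0x1000, so the two marks occupy different bits and do not overwrite each other.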
 
In order to test where the problem might lie, I did the following
  1. Disabled IPv6 and rebooted - the mcast_blog error eventually returned
  2. Disabled WAN proxies, removed WGM redirects to WAN proxies, removed WAN proxies from Unbound and rebooted - the mcast_blog error eventually returned
  3. Stopped using entware_kernel_module for Wireguard - reverted to default and rebooted - the mcast_blog error eventually returned
  4. Stopped using Cake and rebooted - the mcast_blog error eventually returned
  5. Uninstalled Unbound and rebooted - the mcast_blog error eventually returned
  6. Removed the MAC ipset and set IP rules - the mcast_blog error eventually returned
After change 2 above I started noticing that the recurrence of the mcast_blog error happened whenever traffic routed through the client increased substantially.
The client device is a Hyper-V VM (Dynamic Memory, 4 Virtual Processors, unrestricted bandwidth, hosted on a Samsung 870 SSD, running ubuntu 22.10). The host is a Ryzen 9 5900X with 64GB RAM. Neither Task Manager nor System Monitor indicate any high loads on host or VM. Most of the traffic is Transmission. The ISP connection is a notional 80/20 and the errors typically occur once the download speed exceeds 40Mb/s.

As previously noted, with Flow Cache disabled (fc=off), there are no issues. In its stripped-down state the only router scripts are diversion, [pixelserv-tls -disabled], skynet, scribe, uiscribe, scMerlin, ntpMerlin [chrony] and WireGuard Manager.

I don't know what else to test, but it does seem to me that on this router (RT-AX88U Hardware A1.1) disabling Flow Cache is necessary, at least until RMerlin has some time to backport the workaround for the AX86U etc.
 
I don't know what else to test, but it does seem to me that on this router (RT-AX88U Hardware A1.1)
It most certainly appears so. You tested a lot! But when going through the list of tests, did you reset everything between each test, or did you leave the previous test as it was?

I can't think of anything else to test, except maybe testing with everything removed (if you have not already), in case more than one thing generates the error.

After change 2 above I started noticing that the recurrence of the mcast_blog error happened whenever traffic routed through the client increased substantially
What do you mean by this? Are these local clients and hosts or how are these included in the Wireguard network?
 
The tests were sequential, so after I disabled or removed something, it stayed off/removed. So currently IPv6 is not in use, Unbound is removed, wan proxy IPs are removed, etc. and will not be there until I start putting everything back on.

I use the WG client for all traffic for a few devices on the network. Some occasionally and one (almost) all the time. This latter device is the Hyper-V VM running Ubuntu, and Transmission runs on this, i.e. wg11 has a rule pointing at this device. It is high levels of traffic through this client device that are associated with the errors arising.
 
This latter device is the Hyper-V VM running Ubuntu, and Transmission runs on this, i.e. wg11 has a rule pointing at this device. It is high levels of traffic through this client device that are associated with the errors arising.
So, could it be Transmission that generates some packet that your router does not like, perhaps? Could you test by stopping Transmission for a couple of days and see if your problem goes away?
 
As a core reason for using a VPN on the clients is the BitTorrent traffic, that might be a pyrrhic result. I could configure Deluge or qBittorrent instead, but I would assume that the packet types would be the same.
 
