Issue with Policy rules blocking all clients every 24 hours (using 384.19)


cplay

Regular Contributor
Hello Everyone,

I would love some feedback on quite an irritating issue. I am sure many of you will know why this is happening and how to fix it!

I have two OpenVPN clients set up on my router, in client slots 1 and 3.

Astrill in client 1 and Torguard in Client 3.

Astrill is used by only 3 IP addresses (192.168.1.7/8/9); Torguard is used by all addresses from 192.168.1.10 onward.

They are both set up with a kill switch and policy rules (strict).

The issue is this:

Every 24 hours my ISP changes my IP. The new IP renews on the router perfectly, BUT Astrill (OpenVPN client 1) fails to connect and logs this error:

Nov 1 15:39:05 kernel: eth0 (Int switch port: 3) (Logical Port: 3) Link DOWN.
Nov 1 15:39:08 kernel: eth0 (Int switch port: 3) (Logical Port: 3) Link UP 1000 mbps full duplex
Nov 1 15:40:24 ovpn-client1[28358]: 192 variation(s) on previous 20 message(s) suppressed by --mute
Nov 1 15:40:24 ovpn-client1[28358]: TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
Nov 1 15:40:24 ovpn-client1[28358]: TLS Error: TLS handshake failed

Not only does it fail to connect, but it ALSO blocks internet access for ALL addresses in the 192.168.1.xxx range, INCLUDING all of the IP addresses in the Torguard policy rules section. The Torguard VPN client reconnects perfectly and has no TLS handshake issue, BUT its devices have no internet access, because client 1 is blocking the whole 192.168.1.xxx range even though I have specified that Astrill (client 1) should be used by only 3 devices, not the entire range.

So, to summarise, I have two issues:

1. Astrill VPN TLS Handshake failing every 24 hours when the ISP changes IP.

This causes issue number two:

2. Client 1 (Astrill VPN) then blocks internet access to ALL clients on the 192.168.1.xxx range instead of just 192.168.1.7/8/9.

The only way to fix this is to turn off client 1, disable auto start of client 1 and then restart the router.


I have attached my settings and policy rules for both clients to the thread, in case I am just configuring something wrong.

The custom config for each client is below:


Client 1:
setenv FORWARD_COMPATIBLE 1
setenv UV_SERVERID 429
mssfix 1418
link-mtu 1418
ns-cert-type server
tls-version-min 1.2 or-highest
push-peer-info
explicit-exit-notify
mute 20
mute-replay-warnings
max-routes 1000
block-outside-dns
fast-io

Client 3:
remote-cert-tls server
resolv-retry infinite
tls-version-min 1.2
tun-mtu-extra 32
fast-io

ASUS Wireless Router RT-AC86U - OpenVPN Client Settings 2020-11-02 11-26-50.png
ASUS Wireless Router RT-AC86U - OpenVPN Client Settings 2020-11-02 11-29-23.png
 

cplay

Regular Contributor
I believe this might be in the wrong section of the forum.

I believe it might be better placed in “VPN”.
 

eibgrad

Very Senior Member
Admittedly, I'm not familiar w/ Astrill. In fact, from what little I do know, I was under the impression you need to install their applet on the router. And if that's the case, that obviously raises the issue of how *Astrill* handles its own kill switch vs. Merlin. But it seems you've configured it like most other OpenVPN providers (PIA, PureVPN, ExpressVPN, etc.).

As far as Merlin, I tested OpenVPN client #1 w/ ExpressVPN, Routing Policy (Strict), and bound the same source IPs to the VPN. When the VPN was connected, I could see the following relevant data structures.

Code:
[email protected]:/tmp/home/root# ip rule show
0:      from all lookup local
10101:  from 192.168.1.7 lookup ovpnc1
10102:  from 192.168.1.8 lookup ovpnc1
10103:  from 192.168.1.9 lookup ovpnc1
32766:  from all lookup main
32767:  from all lookup default

[email protected]:/tmp/home/root# ip route show table ovpnc1
10.134.0.81 dev tun11  proto kernel  scope link  src 10.134.0.82
192.168.1.0/24 dev br0  proto kernel  scope link  src 192.168.1.1
default via 10.134.0.81 dev tun11
All very normal and expected.

I then manually killed the OpenVPN connection (killall openvpn) and got the following output from the same commands.

Code:
[email protected]:/tmp/home/root# ip rule show
0:      from all lookup local
10101:  from 192.168.1.7 lookup ovpnc1
10102:  from 192.168.1.8 lookup ovpnc1
10103:  from 192.168.1.9 lookup ovpnc1
32766:  from all lookup main
32767:  from all lookup default

[email protected]:/tmp/home/root# ip route show table ovpnc1
192.168.1.0/24 dev br0  proto kernel  scope link  src 192.168.1.1
prohibit default
Those source IPs are still pointing to the same alternate routing table (ovpnc1), but are denied access to a default gateway. But I see nothing that would prevent any *other* source IPs in the 192.168.1.0/24 network from having normal access to the WAN.

So the first thing I would recommend doing is running *only* the Astrill OpenVPN client and doing what I did above. See if you get different results. Because it's going to be difficult to determine the cause of the problem unless someone else is also configured exactly as you, including Astrill. But perhaps if you know what to look for in the underlying data structures, you can discover the problem for yourself.
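
To make the lookup order in the `ip rule show` output above concrete, here is a minimal Python sketch. It is only an illustration, not anything the firmware runs: the function names `parse_rules` and `first_policy_table` are hypothetical, and the matching is deliberately simplified (it only understands plain `from <source> lookup <table>` rules and skips the `local` table, which only handles router-local and broadcast destinations). The sample text is copied from the output above.

```python
import ipaddress
import re

# `ip rule show` output copied from the post above.
RULES = """\
0:      from all lookup local
10101:  from 192.168.1.7 lookup ovpnc1
10102:  from 192.168.1.8 lookup ovpnc1
10103:  from 192.168.1.9 lookup ovpnc1
32766:  from all lookup main
32767:  from all lookup default
"""

# Only matches simple "from X lookup T" rules; anything with extra
# selectors (for example "to ...") is ignored by this sketch.
RULE_RE = re.compile(r"^(\d+):\s+from (\S+) lookup (\S+)$")


def parse_rules(text):
    """Return (priority, source_network_or_None, table) tuples, lowest priority number first."""
    rules = []
    for line in text.splitlines():
        match = RULE_RE.match(line.strip())
        if not match:
            continue
        prio, src, table = match.groups()
        net = None if src == "all" else ipaddress.ip_network(src, strict=False)
        rules.append((int(prio), net, table))
    return sorted(rules, key=lambda rule: rule[0])


def first_policy_table(rules, src_ip):
    """Name the first non-local table whose 'from' selector matches src_ip.

    Simplification: the kernel moves on to the next rule when a table has no
    matching route. A table holding 'prohibit default' does match, so the
    lookup stops there and the packet is rejected.
    """
    addr = ipaddress.ip_address(src_ip)
    for _prio, net, table in rules:
        if table == "local":
            continue
        if net is None or addr in net:
            return table
    return None


rules = parse_rules(RULES)
for ip in ("192.168.1.7", "192.168.1.20"):
    print(ip, "->", first_policy_table(rules, ip))
```

With these rules, only 192.168.1.7/8/9 are steered into `ovpnc1`; every other source falls through to `main`, which is why the `prohibit default` entry should affect only those three addresses.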
 

cplay

Regular Contributor
Thank you for taking the time to replicate my issue.

Astrill does not need the applet to function, and I have been using the OpenVPN config for nearly a year. I only use them because they are the fastest in my testing, but their support is the worst (quite literally).

Is there any chance you could redo your test, but this time with no IPs listed in the policy rules and everything else kept the same?

Also, I run both clients at the same time; did you replicate that as well?

Once again, thank you for taking the time to try and replicate this issue.
 

cplay

Regular Contributor
So, I re-ran your test, except with both of my clients running (1 and 3), using exactly the same commands as you.

Here are the findings with both client 1 and client 3 running, not just client 1:

Code:
[email protected]:/tmp/home/root# ip rule show
0:    from all lookup local
10101:    from 192.168.1.7 lookup ovpnc1
10102:    from 192.168.1.8 lookup ovpnc1
10103:    from 192.168.1.9 lookup ovpnc1
10401:    from all to 192.168.8.1 lookup main
10501:    from 192.168.1.10/31 lookup ovpnc3
10502:    from 192.168.1.12/30 lookup ovpnc3
10503:    from 192.168.1.16/28 lookup ovpnc3
10504:    from 192.168.1.32/27 lookup ovpnc3
10505:    from 192.168.1.64/26 lookup ovpnc3
10506:    from 192.168.1.128/25 lookup ovpnc3
32766:    from all lookup main
32767:    from all lookup default

[email protected]:/tmp/home/root# ip route show table ovpnc1
default via 198.18.0.1 dev tun11
192.168.1.0/24 dev br0  proto kernel  scope link  src 192.168.1.1
198.18.0.0/21 dev tun11  proto kernel  scope link  src 198.18.0.226

[email protected]:/tmp/home/root# ip route show table ovpnc3
default via 10.22.0.86 dev tun13
10.22.0.86 dev tun13  proto kernel  scope link  src 10.22.0.85
192.168.1.0/24 dev br0  proto kernel  scope link  src 192.168.1.1
I then ran the same killall openvpn command you did and re-ran your commands.

These were the findings:

Code:
[email protected]:/tmp/home/root# ip rule show
0:    from all lookup local
10101:    from 192.168.1.7 lookup ovpnc1
10102:    from 192.168.1.8 lookup ovpnc1
10103:    from 192.168.1.9 lookup ovpnc1
10401:    from all to 192.168.8.1 lookup main
10501:    from 192.168.1.10/31 lookup ovpnc3
10502:    from 192.168.1.12/30 lookup ovpnc3
10503:    from 192.168.1.16/28 lookup ovpnc3
10504:    from 192.168.1.32/27 lookup ovpnc3
10505:    from 192.168.1.64/26 lookup ovpnc3
10506:    from 192.168.1.128/25 lookup ovpnc3
32766:    from all lookup main
32767:    from all lookup default

[email protected]:/tmp/home/root# ip route show table ovpnc1
prohibit default
192.168.1.0/24 dev br0  proto kernel  scope link  src 192.168.1.1

[email protected]:/tmp/home/root# ip route show table ovpnc3
prohibit default
192.168.1.0/24 dev br0  proto kernel  scope link  src 192.168.1.1
After running these commands and testing, it appears the kill switch is working fine.

The devices that should be blocked are blocked, and the ones at 192.168.1.6 and below (the IPs not listed in either VPN client) do NOT have internet blocked.
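
As a sanity check on the source selectors in the `ip rule show` output above, here is a small Python sketch using only the standard `ipaddress` module. It assumes the Torguard range is meant to be 192.168.1.10 through 192.168.1.255; the helper names `addresses` and `minimal_blocks` are made up for this example. It checks that the six `ovpnc3` blocks are exactly the minimal CIDR cover of that range, that the two clients do not overlap, and which LAN addresses are bound to neither client.

```python
import ipaddress

# Source selectors copied from the `ip rule show` output above.
OVPNC1_SOURCES = ["192.168.1.7", "192.168.1.8", "192.168.1.9"]
OVPNC3_SOURCES = [
    "192.168.1.10/31",
    "192.168.1.12/30",
    "192.168.1.16/28",
    "192.168.1.32/27",
    "192.168.1.64/26",
    "192.168.1.128/25",
]


def addresses(selectors):
    """Expand a list of IPs/CIDR blocks into a set of individual addresses."""
    out = set()
    for selector in selectors:
        # Iterating a network yields every address in it (a bare IP is a /32).
        out.update(ipaddress.ip_network(selector))
    return out


def minimal_blocks(first, last):
    """Smallest list of CIDR blocks covering first..last inclusive."""
    blocks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(block) for block in blocks]


lan = set(ipaddress.ip_network("192.168.1.0/24"))
client1 = addresses(OVPNC1_SOURCES)
client3 = addresses(OVPNC3_SOURCES)

overlap = client1 & client3
unbound = sorted(lan - client1 - client3)

print("ovpnc3 blocks match .10-.255:",
      minimal_blocks("192.168.1.10", "192.168.1.255") == OVPNC3_SOURCES)
print("addresses claimed by both clients:", sorted(overlap))
print("addresses bound to neither client:", [str(a) for a in unbound])
```

If the three checks come out as expected, the policy rules themselves partition the LAN cleanly, which would point the investigation at what changes when the TLS failure occurs rather than at the rule list.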

This, however, is not what happens when the TLS handshake issue arises.

So,

1. Can you check the ip rules and routing tables above to see if they are set up correctly (I think they are)?

2. I will re-run the ip rule and ip route commands AFTER this TLS handshake issue happens again (it happens at 15:39 every day) and report back with the output, to see if there are any differences.
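
For that before/after comparison, a rough Python sketch like the one below could help spot differences between two saved command outputs. The function name `changed_lines` is hypothetical, and the sample data is simply the `ovpnc1` table from this post (connected, then after the manual kill); it says nothing about what the table looks like after the TLS handshake failure, which still has to be captured.

```python
def changed_lines(before, after):
    """Return (removed, added) lines between two command outputs.

    Whitespace is normalised so that column alignment differences
    are not reported as changes.
    """
    def normalise(text):
        return {" ".join(line.split()) for line in text.splitlines() if line.strip()}

    old, new = normalise(before), normalise(after)
    return sorted(old - new), sorted(new - old)


# `ip route show table ovpnc1` while connected (copied from this post).
CONNECTED = """\
default via 198.18.0.1 dev tun11
192.168.1.0/24 dev br0  proto kernel  scope link  src 192.168.1.1
198.18.0.0/21 dev tun11  proto kernel  scope link  src 198.18.0.226
"""

# The same command after the manual kill (copied from this post).
AFTER_KILL = """\
prohibit default
192.168.1.0/24 dev br0  proto kernel  scope link  src 192.168.1.1
"""

removed, added = changed_lines(CONNECTED, AFTER_KILL)
for line in removed:
    print("-", line)
for line in added:
    print("+", line)
```

Running the same comparison on the `ip rule show` output and on both routing tables, captured once in the healthy state and once right after the 15:39 failure, should make any unexpected rule or route stand out.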
 
