Asus RT-AX88U: VPN Director not connected right after reboot

amigo74

New Around Here
Hello,

I have installed the latest Asuswrt-Merlin on an Asus RT-AX88U router.
I have set up a VPN client connection using the OpenVPN protocol.
When I reboot the router or power-cycle it, VPN Director shows the VPN as connected, but it is not actually connected and I have no internet connection.
After every reboot I have to disable and re-enable the VPN connection in VPN Director; then it works again.
Is there a way to make VPN Director bring up the VPN connection automatically and reliably after a reboot?


Thanks
 

seraphim

New Around Here
I have the same problem except mine occurs after a period of uptime (no reboot required).

Same symptom of OpenVPN client status showing connected but it's not connected. A simple stopping and restarting of the client restores the connection ... until the next timeout.
 

eibgrad

Part of the Furniture
If I had to guess …

One of the annoyances of using OpenVPN w/ commercial providers is that they sometimes *force* an AUTH FAIL error during the connection attempt. Or it may even occur asynchronously w/ an active connection. What this error means is that the server has determined the username/password is invalid. But in this particular case, where it's KNOWN to be valid, and was even working up until that point, it's the OpenVPN provider that's forcing the AUTH FAIL, presumably to prevent access to certain servers, or perhaps kick off users once the server becomes overloaded. IOW, they're using it to *manage* their servers rather than for the purposes it was intended (an actual invalid username/password).

I see this a lot more w/ the lower-tier OpenVPN providers (FastestVPN, KeepSolid, etc.) rather than more reputable ones like ExpressVPN (just one of the reasons they cost a bit more, but sometimes worth it). But even so, *any* OpenVPN provider is capable of pulling this "trick" to manage their servers. And the problem for the end-user is that an AUTH FAIL error is considered fatal; it kills the OpenVPN client process completely! And until YOU manually restart it (or reboot), it will remain OFF. In many cases, the GUI won't even be aware this has happened. It just assumes the OpenVPN client will keep retrying the connection. But that's only a valid assumption for NON fatal errors.
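To check whether a forced authentication failure is what killed the client, one option is to search the router's log. The sketch below assumes the system log lives at /tmp/syslog.log (common on Merlin, but verify on your own router) and relies on OpenVPN logging a line containing AUTH_FAILED when the server rejects the credentials. The helper name count_auth_failures is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that mention AUTH_FAILED.
# Assumption: the log file is /tmp/syslog.log unless a path is given.
count_auth_failures() {
    log=${1:-/tmp/syslog.log}
    # A missing or unreadable log counts as zero failures.
    [ -r "$log" ] || { echo 0; return 0; }
    # grep -c prints 0 but exits non-zero when nothing matches; mask that.
    grep -c 'AUTH_FAILED' "$log" || true
}

# Usage on the router:
#   count_auth_failures
```

A non-zero count right before the client went silent would be consistent with the AUTH FAIL scenario described above, though it does not prove the provider forced it.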

Bottom line, I strongly recommend that *everyone* use a watchdog script to monitor for the loss of the OpenVPN client process and restart it as necessary. I don't normally use Merlin for my own OpenVPN clients, but more commonly FreshTomato and/or DD-WRT. And in both those cases, I've written scripts for these purposes.


As you can see, these aren't particularly complicated. They just monitor the process table for the OpenVPN client on a periodic basis, and if not found, restart it. As simple as it is, it's *very* effective. So much so, I wish @RMerlin would incorporate a watchdog for each OpenVPN client as part of the firmware, since this is such a common problem (even if not everyone is aware of it) and easily fixable.
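As a rough illustration of the approach described above (this is not one of the scripts referred to in this post), a process-table watchdog can be sketched in a few lines. Two things here are assumptions: that Merlin runs OpenVPN client N as a process named vpnclientN, and that "service restart_vpnclientN" restarts it. The function names are hypothetical.

```shell
#!/bin/sh
# Minimal watchdog sketch, NOT the script discussed in this thread.

# Assumption: OpenVPN client N shows up in the process table as "vpnclientN".
is_running() {
    pidof "$1" >/dev/null 2>&1
}

# Assumption: this service call restarts OpenVPN client N on Merlin.
restart_client() {
    service "restart_vpnclient$1"
}

# One pass over the given client numbers: restart any whose process is gone.
check_clients() {
    for i in "$@"; do
        if ! is_running "vpnclient$i"; then
            echo "watchdog: OpenVPN client $i not running, restarting" >&2
            restart_client "$i"
        fi
    done
}

# Usage (e.g. from services-start, run in the background):
#   while sleep 60; do check_clients 1; done &
```

Only the clients you actually use should be passed to check_clients; otherwise a deliberately disabled client would be started on every pass.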

There may be some watchdog scripts floating around this forum I'm unaware of. But just in case there aren't, what I may do is put together a script for Merlin based on the ones I've already written for DD-WRT and FreshTomato.
 

eibgrad

Part of the Furniture
FYI. I quickly threw together the following OpenVPN client watchdog script for Merlin based on the others. I only did a quick test, so it could still require some fine tuning. But I would at least try it and see if it helps.


Make sure JFFS and JFFS scripts are enabled under Administration->System. Then using the shell (ssh), copy/paste the following command. It will automatically create and install the necessary files. Then reboot.

Code:
curl -kLs bit.ly/merlin-installer|tr -d '\r'|sh -s wyKu0pww

Note, if it finds an existing services-start script, it will NOT overwrite it. You'll instead have to add the code manually to the existing services-start script.
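For the manual case, one possible way to add a line to an existing services-start without duplicating it is sketched below. The helper name add_hook and the watchdog path in the usage comment are hypothetical; /jffs/scripts/services-start is the standard Merlin user-script location, but the exact line to add depends on what the installer would have written.

```shell
#!/bin/sh
# add_hook FILE LINE
# Create FILE with a shebang if it does not exist, append LINE only if it
# is not already present, and make sure FILE is executable.
add_hook() {
    file=$1
    line=$2
    [ -f "$file" ] || printf '#!/bin/sh\n' > "$file"
    # -x: match the whole line, -F: treat LINE as a fixed string.
    grep -qxF "$line" "$file" || printf '%s\n' "$line" >> "$file"
    chmod +x "$file"
}

# Usage on the router (the watchdog path is a placeholder):
#   add_hook /jffs/scripts/services-start '/jffs/scripts/my-watchdog.sh &'
```

Running it twice leaves the file unchanged the second time, so it is safe to re-run after a firmware update or reinstall.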
 

eibgrad

Part of the Furniture
Something else to consider here as well.

Even though the script will attempt to restart the OpenVPN client process, realize that it may take several restarts before the connection succeeds. Or in the worst case, it may never succeed (or at least not for a very long time until the server is available again).

For these reasons, it's critical for the greatest reliability that you specify more than one server (i.e., remote directive) for the given OpenVPN client. I'm NOT talking about multiple OpenVPN clients. I'm talking about any single OpenVPN client being configured to try more than one server. Most users only specify the one server in the Server Address and Port fields. That gets converted into a remote directive in the underlying OpenVPN config file.

Code:
remote us-new-york-2-ca-version-2.expressnetw.com 1195

But it's perfectly valid to specify *multiple* remote directives (i.e., multiple servers) in that config file; you just need to do that via the custom config field. And if you add the remote-random directive as well, the OpenVPN client will randomly pick among the available servers when attempting a connection. So if you have at least 3 servers specified, your chances of getting a successful connection established will increase significantly.

Code:
server-poll-timeout 10
remote-random
#remote us-new-york-2-ca-version-2.expressnetw.com 1195
remote usa-atlanta-ca-version-2.expressnetw.com 1195
remote usa-chicago-ca-version-2.expressnetw.com 1195
remote usa-dallas-2-ca-version-2.expressnetw.com 1195
remote usa-dallas-ca-version-2.expressnetw.com 1195

The above is from my own custom config field. The one remote directive commented out is the one I specified in the Server Address and Port fields (I only did it that way for clarity's sake). The server-poll-timeout directive limits how long before the connection attempt will be aborted and another server is randomly chosen for the next attempt.
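If you maintain a long server list, the custom config text can be generated rather than typed by hand. This is only a convenience sketch: make_remotes is a hypothetical helper, and the hostnames in the usage comment are taken from the example above.

```shell
#!/bin/sh
# make_remotes PORT HOST...
# Print custom-config lines: a poll timeout, remote-random, and one
# "remote" directive per host, all using the same port.
make_remotes() {
    port=$1
    shift
    echo "server-poll-timeout 10"
    echo "remote-random"
    for host in "$@"; do
        echo "remote $host $port"
    done
}

# Usage: paste the output into the custom config field.
#   make_remotes 1195 usa-atlanta-ca-version-2.expressnetw.com \
#                     usa-chicago-ca-version-2.expressnetw.com
```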

NOTE: This technique only works when all the servers require the same certs and keys! Fortunately, that's pretty common these days among commercial OpenVPN providers, but it's NOT unheard of for some to require *different* certs and keys. IMO, those providers should be avoided because it makes solving this kind of problem much more difficult.

Again, if you leave yourself the one and only server specified on the Server Address and Port fields, you're asking for trouble (imo). Esp. w/ these lower-tier OpenVPN providers forcing AUTH FAIL errors.
 

eibgrad

Part of the Furniture
FYI. I just updated the watchdog script (v2.0.0) to include the option to perform ping checks across the tunnel, and NOT just check for failed VPN connections.


I've personally never had a problem w/ unresponsive tunnels, esp. when using the OpenVPN keepalive directive, which seems to work reasonably well. But for completeness, and because I know some ppl still like to include that type of check, I added it (but you can disable if you like).
 

octopus

Part of the Furniture
Sorry, but pastebin is PW protected.

Error, this is a private paste or is pending moderation. If this paste belongs to you, please login to Pastebin to view it.
 

eibgrad

Part of the Furniture
Yeah, sorry about that. It's not something done by me (it's definitely a Public paste). PasteBin suddenly locked my scripts for some reason. Maybe too many updates over a short period. I'm gonna let it sit for a while and see if things return to normal.
 

DTS

Regular Contributor
I am having some trouble with the watchdog script. Should I ask here or start a new thread?

The issue is:

I have the Internet killswitch script installed and right now Internet is not working. I would expect the watchdog to restart my VPN services.

What I see in the logs is like this:

Code:
Dec  1 16:36:53 services-start[1798]: + ping -qc1 -w3 -I eth0 8.8.8.8
Dec  1 16:36:56 services-start[1798]: + sleep 10
Dec  1 16:37:06 services-start[1798]: + ping -qc1 -w3 -I eth0 8.8.8.8
Dec  1 16:37:09 services-start[1798]: + sleep 10
Dec  1 16:37:19 services-start[1798]: + ping -qc1 -w3 -I eth0 8.8.8.8
Dec  1 16:37:22 services-start[1798]: + sleep 10
Dec  1 16:37:32 services-start[1798]: + ping -qc1 -w3 -I eth0 8.8.8.8
Dec  1 16:37:35 services-start[1798]: + sleep 10

I only started using watchdog yesterday, so I am still learning about it. But the logs seem to indicate that watchdog thinks ping is succeeding. However, the Internet is down and ping cannot succeed. Here's what I get when I try to run the same ping command:

Bash:
[email protected]:/tmp/home/root# ping -qc1 -w3 -I eth0 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes

--- 8.8.8.8 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss

Also, ping fails in the Asus Merlin GUI with 100% packet loss too.

Any idea what the problem is? Thank you.

Edit. I am using your watchdog "version: 2.0.1, 18-oct-2021, by eibgrad"
 

eibgrad

Part of the Furniture
One of the first things the script does is wait for the internet to become available over the WAN before it will start checking the OpenVPN clients for problems. IOW, if the internet isn't even reachable when the script starts, there's no point going any further. And that's what that series of pings is doing. It's checking every 10 seconds for the internet to become available. And if in fact the internet is NOT available, it will continue checking, indefinitely. So that's normal, expected behavior.

All that said, I suppose it's possible that particular code could be problematic if you have "Yes (all)" specified for routing policy, and the OpenVPN client gets connected *before* the script itself gets past that checking. That check would then fail permanently since the router itself is now bound to the VPN, NOT the WAN. I don't know if that's the problem you're now encountering. In my own testing, that sequence of events never happened. But at least in theory, it's possible. And I might need to update the script to correct it (probably by simply removing the -I option and its argument (the WAN's network interface) from the ping).
 

DTS

Regular Contributor
Checking independently of the router, my Internet is up. I attempted various troubleshooting steps on the router, but it would not get out of whatever state was preventing it from connecting. So I rebooted and now it is working. But I really want to understand the issue in more detail.

BTW, I am using your suggestion of multiple remotes in my ovpn file. I use ExpressVPN, so my remotes look very similar to yours. (If you are willing to send me your entire ovpn file, that would be helpful. I don't think it includes anything sensitive since I already have the ExpressVPN certs.)

I am more than willing to help you with any troubleshooting you would like to do on the watchdog script or the killswitch. I have two different Asus Merlin routers, so I can easily use one for testing if needed. (I can also share my ovpn files if you wish.)
 

eibgrad

Part of the Furniture
As I said, given that one scenario where it's possible the script could get stuck on that ping over the WAN, I may change the script to NOT reference the WAN's network interface specifically. Just so long as the internet is accessible from *any* network interface, the script should continue past that ping check. Again, it's a very specific configuration where that *might* happen. And I'm NOT convinced that's what happened in your case. I was merely speculating where and how things might go wrong, as I do w/ any script I write.

As far as ExpressVPN, I only access USA servers, and my own remote directives reflect that.

Code:
remote us-new-york-2-ca-version-2.expressnetw.com 1195
remote usa-atlanta-ca-version-2.expressnetw.com 1195
remote usa-chicago-ca-version-2.expressnetw.com 1195
remote usa-dallas-2-ca-version-2.expressnetw.com 1195
remote usa-dallas-ca-version-2.expressnetw.com 1195
remote usa-denver-ca-version-2.expressnetw.com 1195
remote usa-losangeles-1-ca-version-2.expressnetw.com 1195
remote usa-losangeles-3-ca-version-2.expressnetw.com 1195
remote usa-losangeles-ca-version-2.expressnetw.com 1195
remote usa-losangeles5-ca-version-2.expressnetw.com 1195
remote usa-miami-2-ca-version-2.expressnetw.com 1195
remote usa-miami-ca-version-2.expressnetw.com 1195
remote usa-newjersey-1-ca-version-2.expressnetw.com 1195
remote usa-newjersey-3-ca-version-2.expressnetw.com 1195
remote usa-newyork-ca-version-2.expressnetw.com 1195
remote usa-saltlakecity-ca-version-2.expressnetw.com 1195
remote usa-sanfrancisco-ca-version-2.expressnetw.com 1195
remote usa-seattle-ca-version-2.expressnetw.com 1195
remote usa-tampa-1-ca-version-2.expressnetw.com 1195
remote usa-washingtondc-ca-version-2.expressnetw.com 1195

Note, I stopped using ExpressVPN when my subscription expired at the end of October. I didn't like the fact they were acquired by Kape Technologies. So these are accurate as of Oct 31, 2021.
 

DTS

Regular Contributor
@eibgrad
Is it possible to look for either an openvpn pid file or a tun interface up before running

Bash:
while ! ping -qc1 -w3 -I $WAN_IF $PING_HOST &>/dev/null; do sleep 10; done

If there is not any tun interface up, then that code should perform as expected, right? But if either a tun interface or an openvpn pid exists when that block of code is reached, then it seems it could get stuck there.

Edit: another possible check might be to look for "Initialization Sequence Completed" in the logs prior to entering that while loop. Or maybe some combination of all three checks? I don't know enough about Asus Merlin firmware to know how to check those things in the script, but I do know enough bash/Linux to help troubleshoot.
 

DTS

Regular Contributor
In regard to ExpressVPN. I was more interested in the other parts of your ExpressVPN ovpn config file. I have a list of remotes, but I wanted to check if you were using any config settings for tun-mtu, fragment, mssfix, sndbuf, rcvbuf, fast-io, etc.

Which VPN provider do you use now?
 

eibgrad

Part of the Furniture
In regard to ExpressVPN. I was more interested in the other parts of your ExpressVPN ovpn config file. I have a list of remotes, but I wanted to check if you were using any config settings for tun-mtu, fragment, mssfix, sndbuf, rcvbuf, fast-io, etc.

In general, I do NOT use anything beyond the basics provided in the GUI *unless* I know specifically they are required. Most VPN providers add numerous unnecessary directives, sometimes to your own detriment (e.g., reneg-sec 0). In the case of ExpressVPN, the only additional directive I found necessary was 'fragment 1300'.

Which VPN provider do you use now?

From time to time, I grab one of those cheap lifetime VPN deals, if only so I can use them for OpenVPN testing (KeepSolid, FastestVPN, etc.). But occasionally they become useful if I'm transitioning from one VPN provider to another. As of the moment, I'm using KeepSolid (aka, VPNUnlimited), but that's NOT my long term goal (bandwidth tops out around 70Mbps). I'm seriously considering Mullvad at the moment, esp. since you can remain completely anonymous w/ them on a cash-only basis. But so far, I'm still searching.
 

DTS

Regular Contributor
I have heard good things about Mullvad as well. I am considering them too.

In regard to my comment about running a check before the while loop, can you offer any suggestions for how to check for these?

1. any tun interface up
2. any openvpn pid present
3. the text "Initialization Sequence Completed" existing in the logs since prior boot

I want to test checking one or more of those prior to entering that while loop (in the watchdog script). I'm not sure if 3 is a good idea, but I can do some testing with all of them.

In linux, I might start openvpn with the --writepid command line arg. Do we have that in Asus Merlin?
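For reference, the three checks listed above could be sketched roughly as follows. Everything router-specific is an assumption to verify on your own unit: that client tunnels are named tun11 through tun15, that client processes are named vpnclient1 through vpnclient5, and that the system log is /tmp/syslog.log. The function names are hypothetical. Using pidof avoids needing a pid file at all.

```shell
#!/bin/sh
# Check 1: does any client tunnel interface exist?
# Assumption: client tunnels are named tun11..tun15. Note this tests that
# the interface is present, not that traffic actually flows through it.
tun_present() {
    netdir=${1:-/sys/class/net}
    for d in "$netdir"/tun1*; do
        [ -e "$d" ] && return 0
    done
    return 1
}

# Check 2: is any OpenVPN client process alive?
# Assumption: client N runs as a process named "vpnclientN".
vpn_pid_present() {
    for i in 1 2 3 4 5; do
        pidof "vpnclient$i" >/dev/null 2>&1 && return 0
    done
    return 1
}

# Check 3: has OpenVPN logged a completed initialization?
# Assumption: the log is /tmp/syslog.log unless a path is given.
init_logged() {
    grep -q 'Initialization Sequence Completed' "${1:-/tmp/syslog.log}" 2>/dev/null
}

# Usage:
#   if tun_present || vpn_pid_present; then echo "VPN client looks started"; fi
```

Check 3 is the weakest of the three, since an old log entry says nothing about whether the tunnel is still working now.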
 

eibgrad

Part of the Furniture
Remember, the problem I'm trying to solve w/ that initial ping check is to make sure the internet is available before I proceed any further. I don't want to find myself in a situation where the script kicks in and attempts to restart one of the VPNs simply because the system hasn't been fully initialized as yet. Minimally, that means I should be able to ping a public IP.

In the examples you list, you're placing the cart before the horse, at least when it comes to my reasoning for the ping check. You might as well eliminate the initial ping check, since once past it, you're effectively testing for a working OpenVPN client anyway.

Again, I recommend you still ping, but simply eliminate the reference to the network interface.

Code:
while ! ping -qc1 -w3 $PING_HOST &>/dev/null; do sleep 10; done
 

DTS

Regular Contributor
Here's my logic:

1. only if there is no tun1(any) interface (and/or no openvpn pid) then check ping on wan interface. Otherwise, skip this check.
2. proceed with your next checks: ping tun1${i}
3. service restart_vpnclient${i} if #2 fails
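The three steps above can be sketched as a single function. This is an illustration of the proposed logic only, not a patch to the actual watchdog script. PING_HOST and WAN_IF mirror the variable names from the loop quoted earlier; the default of eth0 comes from the logged ping commands, and the tun1N naming and helper names are assumptions.

```shell
#!/bin/sh
PING_HOST=${PING_HOST:-8.8.8.8}
WAN_IF=${WAN_IF:-eth0}   # assumption: WAN interface, as seen in the logs

# Ping PING_HOST once through the given interface; succeed if it replies.
ping_via() {
    ping -qc1 -w3 -I "$1" "$PING_HOST" >/dev/null 2>&1
}

# True if the named network interface exists.
tun_exists() {
    [ -e "/sys/class/net/$1" ]
}

# Restart OpenVPN client N (service name as used in step 3).
restart_client() {
    service "restart_vpnclient$1"
}

check_client() {
    i=$1
    # Step 1: only wait on the WAN ping when no tunnel exists yet.
    if ! tun_exists "tun1$i"; then
        until ping_via "$WAN_IF"; do sleep 10; done
    fi
    # Steps 2 and 3: ping through the tunnel, restart the client on failure.
    if ! ping_via "tun1$i"; then
        restart_client "$i"
    fi
}

# Usage:
#   check_client 1
```

Skipping the WAN ping when a tunnel already exists avoids the case where the router's own traffic is bound to the VPN and a WAN-bound ping can never succeed.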
 

eibgrad

Part of the Furniture
To me, it's six of one, half dozen of another. Either way, the inference being drawn is whether internet access exists. I say, check if you can ping a public IP. You say, assume you have it if the OpenVPN client has been initialized (i.e., the network interface exists and/or a PID has been established, which doesn't really guarantee internet access exists). Use whichever method suits you.
 
