Robust DNS Settings

Kanji-San

Regular Contributor
I am having issues with Google DNS currently. The router claims to be disconnected from the Internet.
connmon, using 8.8.8.8, shows 0 for the last hour.

My settings are:
WAN DNS Settings:
DNS Server 1: 8.8.8.8
DNS Server 2: 1.1.1.1

DNS Privacy Protocol: DNS-over-TLS (DoT)
DNS-over-TLS Profile: Strict

DNS-over-TLS Server List:
9.9.9.9
8.8.8.8

Ping does not reach 8.8.8.8. However, if I replace 8.8.8.8 with 9.9.9.9 as DNS Server 1 and with 1.1.1.1 in the DNS-over-TLS Server List, everything works again.
My questions:
  • Why isn't a backup DNS server (1.1.1.1 or 9.9.9.9) used automatically when 8.8.8.8 does not respond?
  • What settings do I need to make this configuration more robust?
 

dave14305

Part of the Furniture
Add secondary servers to the DoT list for added resilience with the same provider (2 Google, 2 Quad9).

Does DNS stop working, or is it just that ping stops working? Is Network Monitoring enabled in the System page?
 

Kanji-San

Regular Contributor
DNS stops working because 8.8.8.8 does not reply. There is no failover to the other DNS server (9.9.9.9) in the list. Network Monitoring is enabled, as is connmon, which clearly shows that 8.8.8.8 is not reachable.

What I would like to do is configure dnsmasq so that it automatically falls back to the other DNS servers in the list if the first one or the next one does not work. Since I am no dnsmasq wizard, I was hoping for some ideas.
 

dave14305

Part of the Furniture
With DoT enabled, dnsmasq is only going to forward to Stubby (127.0.1.1:53), so there is no resilience at the dnsmasq level. Stubby will do round-robin selection of the configured DoT servers. If one fails, Stubby is configured to avoid it for a backoff period (tls_backoff_time) of up to 15 minutes (900 seconds).
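
To see how this looks on a given router, the generated configs can be inspected from an SSH session. This is a minimal sketch, assuming the Asuswrt-Merlin paths /etc/dnsmasq.conf and /etc/stubby/stubby.yml (neither is stated in this thread, so check your firmware) and the Stubby option names round_robin_upstreams, tls_backoff_time and tls_connection_retries. The helper name show_dns_chain is made up for illustration.

```shell
# Hypothetical helper: print the lines that decide where DNS queries go.
# Default paths are assumptions for Asuswrt-Merlin; pass others as $1 and $2.
show_dns_chain() {
    dnsmasq_conf="${1:-/etc/dnsmasq.conf}"
    stubby_conf="${2:-/etc/stubby/stubby.yml}"

    # With DoT enabled, dnsmasq is expected to list only Stubby as its upstream.
    grep '^server=' "$dnsmasq_conf"

    # Stubby's rotation and backoff settings, plus the configured upstreams.
    grep -E 'round_robin_upstreams|tls_backoff_time|tls_connection_retries|address_data' "$stubby_conf"
}

# Usage on the router:
#   show_dns_chain
```

If the only server= line points at 127.0.1.1, any failover has to come from Stubby's upstream list, not from dnsmasq.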
 

Kanji-San

Regular Contributor
Thanks, Dave. This is very interesting. Is there a stubby log file?
 

dave14305

Part of the Furniture
Not for normal operation. But you can enable a log file for debugging (written to /tmp/stubby.log, so it can fill up RAM if left on too long).
Bash:
nvram set stubby_debug=1
service restart_stubby
tail -f /tmp/stubby.log
To disable:
Bash:
nvram unset stubby_debug
service restart_stubby
Once you're done with the log, delete it with rm /tmp/stubby.log
 

Kanji-San

Regular Contributor
Stubby seems to back off for increasingly longer intervals. Here is a snippet from the log
(3 DNS servers configured: 1.1.1.1, 9.9.9.9, and 8.8.8.8):

Code:
[19:19:26.018545] STUBBY: 8.8.8.8                                  : Conn closed: TLS - *Failure*
[19:19:26.018803] STUBBY: 1.1.1.1                                  : Conn opened: TLS - Strict Profile
[19:19:26.018835] STUBBY: 8.8.8.8                                  : Conn closed: TLS - Resps=     0, Timeouts  =     0, Curr_auth =   None, Keepalive(ms)=     0
[19:19:26.018854] STUBBY: 8.8.8.8                                  : Upstream   : TLS - Resps=     0, Timeouts  =     0, Best_auth =   None
[19:19:26.018874] STUBBY: 8.8.8.8                                  : Upstream   : TLS - Conns=     0, Conn_fails=     2, Conn_shuts=      0, Backoffs     =     5
[19:19:26.018898] STUBBY: 8.8.8.8                                  : Upstream   : !Backing off TLS on this upstream    - Will retry again in 64s at Mon Dec 21 19:20:30 2020
[19:19:26.049595] STUBBY: 1.1.1.1                                  : Verify passed : TLS
[19:19:50.670559] STUBBY: 1.1.1.1                                  : Conn closed: TLS - Resps=     4, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)=  9000
[19:19:50.670616] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Resps=   140, Timeouts  =     0, Best_auth =Success
[19:19:50.670635] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Conns=    12, Conn_fails=     0, Conn_shuts=      1, Backoffs     =     0
[19:19:54.854189] STUBBY: 1.1.1.1                                  : Conn opened: TLS - Strict Profile
[19:19:54.885002] STUBBY: 1.1.1.1                                  : Verify passed : TLS
[19:19:54.917489] STUBBY: 9.9.9.9                                  : Conn closed: TLS - Resps=     9, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)=  9000
[19:19:54.917525] STUBBY: 9.9.9.9                                  : Upstream   : TLS - Resps=   252, Timeouts  =     0, Best_auth =Success
[19:19:54.917547] STUBBY: 9.9.9.9                                  : Upstream   : TLS - Conns=    10, Conn_fails=     0, Conn_shuts=      1, Backoffs     =     0
[19:19:54.917950] STUBBY: 9.9.9.9                                  : Conn opened: TLS - Strict Profile
[19:19:54.941825] STUBBY: 9.9.9.9                                  : Verify passed : TLS
[19:20:04.192530] STUBBY: 1.1.1.1                                  : Conn closed: TLS - Resps=     2, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)=  9000
[19:20:04.192576] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Resps=   142, Timeouts  =     0, Best_auth =Success
[19:20:04.192596] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Conns=    13, Conn_fails=     0, Conn_shuts=      1, Backoffs     =     0
[19:20:07.490571] STUBBY: 1.1.1.1                                  : Conn opened: TLS - Strict Profile
[19:20:07.520844] STUBBY: 1.1.1.1                                  : Verify passed : TLS
[19:20:16.592520] STUBBY: 1.1.1.1                                  : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)=  9000
[19:20:16.592571] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Resps=   143, Timeouts  =     0, Best_auth =Success
[19:20:16.592591] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Conns=    14, Conn_fails=     0, Conn_shuts=      1, Backoffs     =     0
[19:20:23.406695] STUBBY: 1.1.1.1                                  : Conn opened: TLS - Strict Profile
[19:20:23.441628] STUBBY: 1.1.1.1                                  : Verify passed : TLS
[19:20:25.452526] STUBBY: 9.9.9.9                                  : Conn closed: TLS - Resps=     6, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)=  9000
[19:20:25.452563] STUBBY: 9.9.9.9                                  : Upstream   : TLS - Resps=   258, Timeouts  =     0, Best_auth =Success
[19:20:25.452583] STUBBY: 9.9.9.9                                  : Upstream   : TLS - Conns=    11, Conn_fails=     0, Conn_shuts=      1, Backoffs     =     0
[19:20:30.192481] STUBBY: 9.9.9.9                                  : Conn opened: TLS - Strict Profile
[19:20:30.222020] STUBBY: 9.9.9.9                                  : Verify passed : TLS
[19:20:32.461535] STUBBY: 1.1.1.1                                  : Conn closed: TLS - Resps=     1, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)=  9000
[19:20:32.461582] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Resps=   144, Timeouts  =     0, Best_auth =Success
[19:20:32.461603] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Conns=    15, Conn_fails=     0, Conn_shuts=      1, Backoffs     =     0
[19:20:32.670744] STUBBY: 1.1.1.1                                  : Conn opened: TLS - Strict Profile
[19:20:32.700585] STUBBY: 1.1.1.1                                  : Verify passed : TLS
[19:20:37.966464] STUBBY: 8.8.8.8                                  : Conn opened: TLS - Strict Profile
[19:20:40.369517] STUBBY: 8.8.8.8                                  : Conn closed: TLS - *Failure*
[19:20:40.369571] STUBBY: 8.8.8.8                                  : Conn closed: TLS - Resps=     0, Timeouts  =     0, Curr_auth =   None, Keepalive(ms)=     0
[19:20:40.369605] STUBBY: 8.8.8.8                                  : Upstream   : TLS - Resps=     0, Timeouts  =     0, Best_auth =   None
[19:20:40.369625] STUBBY: 8.8.8.8                                  : Upstream   : TLS - Conns=     0, Conn_fails=     1, Conn_shuts=      0, Backoffs     =     6
[19:20:41.709212] STUBBY: 8.8.8.8                                  : Conn opened: TLS - Strict Profile
[19:20:44.105449] STUBBY: 8.8.8.8                                  : Conn closed: TLS - *Failure*
[19:20:44.105487] STUBBY: 8.8.8.8                                  : Conn closed: TLS - *Failure*
[19:20:44.105509] STUBBY: 8.8.8.8                                  : Conn closed: TLS - Resps=     0, Timeouts  =     0, Curr_auth =   None, Keepalive(ms)=     0
[19:20:44.105528] STUBBY: 8.8.8.8                                  : Upstream   : TLS - Resps=     0, Timeouts  =     0, Best_auth =   None
[19:20:44.105546] STUBBY: 8.8.8.8                                  : Upstream   : TLS - Conns=     0, Conn_fails=     2, Conn_shuts=      0, Backoffs     =     6
[19:20:44.105570] STUBBY: 8.8.8.8                                  : Upstream   : !Backing off TLS on this upstream    - Will retry again in 128s at Mon Dec 21 19:22:52 2020
[19:21:02.658531] STUBBY: 1.1.1.1                                  : Conn closed: TLS - Resps=     6, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)=  9000
[19:21:02.658577] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Resps=   150, Timeouts  =     0, Best_auth =Success
[19:21:02.658597] STUBBY: 1.1.1.1                                  : Upstream   : TLS - Conns=    16, Conn_fails=     0, Conn_shuts=      1, Backoffs     =     0
[19:21:03.252486] STUBBY: 9.9.9.9                                  : Conn closed: TLS - Resps=    11, Timeouts  =     0, Curr_auth =Success, Keepalive(ms)=  9000
[19:21:03.252521] STUBBY: 9.9.9.9                                  : Upstream   : TLS - Resps=   269, Timeouts  =     0, Best_auth =Success
[19:21:03.252541] STUBBY: 9.9.9.9                                  : Upstream   : TLS - Conns=    12, Conn_fails=     0, Conn_shuts=      1, Backoffs     =     0

So, was the problem that 8.8.8.8 was configured as DNS Server 1 in WAN Settings? Is this independent from stubby?
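
For anyone reading a log like this, the backoff lines are the ones that matter. Here is a small sketch that pulls out just the upstream and its retry interval, assuming the line format shown above and the /tmp/stubby.log path from earlier in the thread. The function name backoff_summary is only illustrative.

```shell
# Print "<upstream> <retry interval>" for each backoff line in a Stubby debug log.
# Relies on the "Will retry again in <N>s" wording seen in the log above.
backoff_summary() {
    awk '/Backing off/ {
        for (i = 1; i < NF; i++)
            if ($i == "in") { print $3, $(i + 1); break }
    }' "${1:-/tmp/stubby.log}"
}

# Usage on the router:
#   backoff_summary
```

Run against the snippet above, this would list 8.8.8.8 twice, first with 64s and then with 128s.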
 

dave14305

Part of the Furniture
They are independent, but a failure of 8.8.8.8 would impact both services (router's resolv.conf and stubby).

Do you have "Enable WAN down browser redirect notice" enabled or disabled? If the router thinks the WAN is down because it cannot ping 8.8.8.8 or resolve DNS against 8.8.8.8, then it might redirect traffic and DNS to the "WAN down" internal page. Maybe a screenshot of the Network Monitoring area of the System page will answer more of my questions before I ask them. :)

Personally, I wouldn't ping the same IP that is being used in the DNS lookup tests (if this is how yours is set up). Diversity in Internet endpoints is important if you're using the results to determine whether the Internet is working or not.
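
As a rough illustration of that idea, here is a sketch of a check that only declares the Internet down when every one of several unrelated endpoints fails. This is not how the firmware's Network Monitoring is implemented. wan_is_up and ping_once are hypothetical helper names, and the targets in the usage line are just the three resolvers mentioned in this thread.

```shell
# Succeeds if at least one target answers the given probe command.
# $1 is the probe (a command or function taking one target); the rest are targets.
wan_is_up() {
    probe="$1"
    shift
    for target in "$@"; do
        if "$probe" "$target" >/dev/null 2>&1; then
            return 0
        fi
    done
    return 1
}

# One ICMP echo request with a 2 second timeout.
ping_once() {
    ping -c 1 -W 2 "$1"
}

# Usage on the router:
#   wan_is_up ping_once 8.8.8.8 1.1.1.1 9.9.9.9 && echo "WAN up" || echo "WAN down"
```

With a check like this, one unreachable provider does not flip the result as long as another endpoint still answers.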
 

Kanji-San

Regular Contributor
Really appreciate your help, Dave :)

WAN down browser redirect is enabled:
[attached screenshot: 1608580686451.png]

These are my current, working DNS settings:
[attached screenshot: 1608580728712.png]

As soon as I set DNS Server 1 back to 8.8.8.8, the router thinks the Internet is down.
 

john9527

Part of the Furniture
Very unusual that google dns would be completely down....
Are you running any addons? Maybe a bad entry crept in to a blocking file?
 

Kanji-San

Regular Contributor
I don't think this is a Google DNS problem. I think my ISP has a problem reaching Google DNS. I checked with neighbors and they also cannot reach Google DNS. However, if I connect via VPN or use my cellular connection, Google DNS works fine.

I called my ISP, Gigamonster, and unfortunately their customer support was not helpful at all. I asked them to open a ticket with their networking team.

The only addons I run are FlexQoS and connmon.
 

Kanji-San

Regular Contributor
I have now disabled "Enable WAN down browser redirect notice" and set DNS Server 1 to 8.8.8.8 again. The router still claims not to be connected to the Internet.

Clients don't see the Asus WAN down notice anymore, and DNS seems to work again, although it feels sluggish. Stubby backs off in doubling (exponential) intervals: 2s, 4s, 8s, 16s, ... Maybe the sluggishness comes from the initial phase of 2s and 4s, when the failing server is retried most often.
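
The doubling can be pictured with a few lines of shell. This is only an illustration of the pattern, not Stubby's actual code: it assumes the interval starts at 2s as observed above, doubles after every failed round, and is capped at the 900 seconds Dave mentioned. The function name backoff_schedule is made up.

```shell
# Print a doubling backoff schedule in seconds, starting at 2 and capped at $1.
# The start value and the default cap of 900 are assumptions taken from this thread.
backoff_schedule() {
    cap="${1:-900}"
    interval=2
    while [ "$interval" -lt "$cap" ]; do
        echo "$interval"
        interval=$((interval * 2))
    done
    # Once doubling would reach or exceed the cap, the cap itself is used.
    echo "$cap"
}

# Usage:
#   backoff_schedule 900
```

Under these assumptions the dead server is retried several times within the first minute and only later settles into long pauses.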
 

Kanji-San

Regular Contributor
After 7 hours the connection to Google DNS was restored, I assume by my ISP:
(connmon tracked this nicely)

[attached screenshot: Google DNS Ping.PNG]

Thanks, Dave, for all your help :)
And I learned today that it's better to leave the WAN down browser redirect notice disabled, because Stubby does its job of rotating between all the specified DNS servers. That way clients can still get DNS resolution.
 
