ntpMerlin AGH is breaking ntpmerlin

I followed chongnt's suggestion and replaced AdGuardHome.sh with the version from 1.55, making NO other changes (running AdGuardHome binary version v0.108.0-b.24). After just 30 minutes or so, the 'Can't synchronise: no majority' log messages have gone and the DRIFT values are changing again.
AdGuardHome.sh would appear to be the root cause of the ntpMerlin problem when running with AdGuardHome.
Once again, thanks to chongnt for spotting this connection and providing the details on how to get round the issue. I had been trying everything to fix this, but until this thread I had never linked AGH with the problem.
Yeah, there is nothing adguardhome.sh does dynamically that would impact ntpMerlin, except that adguardhome.sh does not wait for the clock to sync before allowing AdGuardHome to start. So it is important that you come up with a way for your NTP server hostnames to be resolved over plain-text DNS, as I have previously mentioned. I recommend going to the current version and trying my method mentioned above: add an upstream exclusion for each NTP server hostname, because the AdGuardHome script does not wait for the clock to sync before starting the DNS service.


Example:

Screenshot_20221224_131004.jpg



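Also, encrypted DNS services are known not to work well for resolving NTP server hostnames, since validating their TLS certificates requires a clock that is already roughly correct.

Maybe when @chongnt gets time he can test this suggestion and let you know how it works out.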
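
For anyone wanting to try the upstream exclusion, here is a minimal sketch of generating the rules, assuming the `[/hostname/]server` format that AdGuardHome accepts for per-domain upstreams. The function name is made up for illustration, and the resolver 1.1.1.1 and the hostnames are only placeholders; substitute your own plain-DNS server and the NTP servers you actually configured.

```shell
#!/bin/sh
# Print one AdGuardHome upstream rule per NTP hostname so that those
# names are resolved by a plain (unencrypted) DNS server.
# ntp_upstream_rules is a hypothetical helper, not part of any script
# discussed in this thread.
ntp_upstream_rules() {
    plain_dns="$1"   # plain-DNS resolver the rules should point at
    shift
    for host in "$@"; do
        printf '[/%s/]%s\n' "$host" "$plain_dns"
    done
}

# Placeholder values; replace with your own resolver and NTP hostnames.
ntp_upstream_rules 1.1.1.1 pool.ntp.org time.apple.com
```

The printed lines would then be pasted into AdGuardHome's upstream DNS servers box, alongside whatever encrypted upstreams are already there.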
 
I had already tried that approach (ensuring the upstream DNS for the NTP servers was defined as plain DNS) before making the .sh mod, but it did not fix the problem for me.
 
Did you try selecting "no" to using AdGuardHome as a local cache during the install process?

Code:
Do you want to run AdGuardHome as a local caching DNS service which includes router traffic?

Basically, the most current .sh script doesn't wait for NTP to sync before allowing itself to take over as the master of DNS. If the router's local service traffic is sent to AdGuardHome before NTP has had a chance to sync, therein lies the problem. The only "true" solution is to keep your router traffic from going to AdGuardHome: all of your network client traffic will still use AdGuardHome, while all of the router's own traffic will bypass it and use the default WAN DNS (or ISP DNS).
 
I tried adding the NTP servers to the upstream servers, but the result is still the same. I have not tried selecting "no" for this yet; perhaps I will try it later. I have the mount bind in my post-mount script anyway, but it was not run because AGH ran it first.
The thing is, NTP can sync during startup; I have an NTP check in my nat-start and firewall-start scripts. The problem comes afterwards: we see a lot of chronyd "Selected source" messages, and in between there are "Can't synchronise: no majority" messages. There are also messages saying the system clock is wrong by over 0.5 seconds. Is this a sign that the system time is too far off NTP time? Does it mean the system can get NTP time, since otherwise it would not know about this?
I will try updating to 1.5.8 and selecting "no" for this option, letting the post-mount script handle the use of the local caching DNS server as the system resolver.
 
I did a quick test with 1.5.8: once I mount bind manually, the issue starts. After unmounting, everything is back to normal.

Code:
mount -o bind /rom/etc/resolv.conf /tmp/resolv.conf

It appears this is not limited to AGH startup. Even after AGH has started, it is still happening.


I updated to 1.5.8 and set
Do you want to run AdGuardHome as a local caching DNS service which includes router traffic?
to no. I also stopped the mount bind in post-mount. Yet the symptoms still happen.
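
To confirm whether the bind mount is actually in place before and after a test like this, something along the lines of the sketch below may help. It only checks whether /tmp/resolv.conf shows up as a mount point in /proc/mounts; the function name is made up for illustration, and it assumes the usual /proc/mounts layout where the second field is the mount point.

```shell
#!/bin/sh
# Report whether a path is currently a mount target, as it would be
# after the bind mount quoted above. is_mount_target is a hypothetical
# helper; it reads a table in /proc/mounts format (field 2 = mount point).
is_mount_target() {
    target="$1"
    mounts_file="${2:-/proc/mounts}"
    # Exit status 0 only if some line has the target as its mount point.
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$mounts_file"
}

if is_mount_target /tmp/resolv.conf; then
    echo "/tmp/resolv.conf is a mount target; 'umount /tmp/resolv.conf' reverts it"
else
    echo "/tmp/resolv.conf is not a mount target"
fi
```

Running it once after the mount command and once after unmounting should show the two states flip, which makes it easier to tie the chrony messages to the mount itself.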
 
Yeah, I don't know. Since my script does nothing dynamically to ntpMerlin, it has to be something related to AdGuardHome itself. Chrony shouldn't be falling out of sync unless somewhere there is a break in the actual DNS connection. For all we know, it might be related to Unbound and DNSSEC, since you guys run Unbound as well (I know @chongnt does, but I don't know if the OP does).

If there is a DNSSEC setting in AdGuardHome, I'd advise turning it off, since Unbound already does DNSSEC. See if that changes anything.

I am just surprised ntpMerlin isn't catching itself and restarting chrony once the "failure" happens.
 

Yep, I went over the script again looking for any clue as to what could cause the chrony failure, but had no luck. One option is to restart ntpMerlin once AdGuardHome starts, which can be done by adding an additional post cmd to the AdGuardHome S99 script. I am just not familiar enough with ntpMerlin to know what the best restart command is. Feel free to share anything you know.
 
Recently I installed AGH to block ads for my network (ax88u). Since installing AGH, ntpMerlin has been outputting weird errors in the system log. ntpMerlin works fine without AGH installed. Both are running stock settings except for the custom NTP and DNS servers I chose. If anyone has encountered this issue or knows of a fix, any insight would be appreciated.

Dec 23 16:39:05 chronyd[907808]: Can't synchronise: no majority
Dec 23 16:39:05 chronyd[907808]: Selected source 17.253.16.253 (time.apple.com)
Dec 23 16:39:33 chronyd[907808]: Selected source 17.253.4.125 (time.apple.com)
Dec 23 16:40:09 chronyd[907808]: Can't synchronise: no majority
Dec 23 16:40:10 chronyd[907808]: Selected source 17.253.16.253 (time.apple.com)
Dec 23 16:40:38 chronyd[907808]: Selected source 17.253.4.125 (time.apple.com)
Dec 23 16:41:14 chronyd[907808]: Can't synchronise: no majority
Dec 23 16:41:16 chronyd[907808]: Selected source 17.253.4.125 (time.apple.com)
Dec 23 16:42:19 chronyd[907808]: Can't synchronise: no majority
Dec 23 16:42:20 chronyd[907808]: Selected source 17.253.4.253 (time.apple.com)
Dec 23 16:42:48 chronyd[907808]: Selected source 17.253.4.125 (time.apple.com)
Dec 23 16:43:23 chronyd[907808]: Can't synchronise: no majority
Dec 23 16:43:25 chronyd[907808]: Selected source 17.253.4.253 (time.apple.com)
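
To get a rough idea of how often this happens, a sketch like the one below counts the two message types seen in the excerpt above. It reads log lines on stdin; the function name is hypothetical, and the location of the syslog file on your router is something you would need to check yourself.

```shell
#!/bin/sh
# Summarise chronyd lines from a syslog stream on stdin: how often
# synchronisation was lost and how often a source was (re)selected.
# chrony_log_summary is a hypothetical helper name.
chrony_log_summary() {
    # "Can.t" matches the apostrophe without needing shell quoting tricks.
    awk '
        /chronyd\[[0-9]+\]: Can.t synchronise: no majority/ { lost++ }
        /chronyd\[[0-9]+\]: Selected source /               { selected++ }
        END { printf "no_majority=%d selected_source=%d\n", lost, selected }
    '
}

# Sample taken from the first three log lines posted above.
chrony_log_summary <<'EOF'
Dec 23 16:39:05 chronyd[907808]: Can't synchronise: no majority
Dec 23 16:39:05 chronyd[907808]: Selected source 17.253.16.253 (time.apple.com)
Dec 23 16:39:33 chronyd[907808]: Selected source 17.253.4.125 (time.apple.com)
EOF
```

Comparing the counts with and without AGH installed over the same length of time would give a more concrete picture than eyeballing the log.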
This "no majority" message is actually a known issue with chrony. It is not limited to asuswrt-merlin routers.

Here are a couple of links discussing it.



I have seen this happen over the years using chrony as well ... adding a third
source (or more) is the solution I used.

Lonnie

This user recommends using three or more NTP servers with chrony.


Here is another response

Not surprising. How can chrony decide which is the right one, and which source
has gone crazy. That is why it is always advised to use an odd number of
sources. Three allows one to go crazy and for chrony to know it is the one to
ignore. Five allows 2 to go crazy. Of course this also implies that the
sources must be independent. If two of the sources are both tied together in
what they report, (eg both use one third source for their time) then of course
chrony cannot detect that they have both gone crazy. It will assume they are
good and the third one is the bad one. So, use three, not two, independent
sources for time.
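
The arithmetic behind that advice can be sketched as follows. This is only the simple majority-vote reasoning from the quoted reply (a majority must remain good), not chrony's actual source selection algorithm, and the function name is made up for illustration.

```shell
#!/bin/sh
# Number of bad sources that can be outvoted among n independent
# sources under a simple majority vote: floor((n - 1) / 2).
# tolerated_bad_sources is a hypothetical helper name.
tolerated_bad_sources() {
    n="$1"
    echo $(( (n - 1) / 2 ))
}

for n in 2 3 4 5; do
    echo "$n sources: tolerates $(tolerated_bad_sources "$n") bad"
done
```

With two sources nothing can be outvoted, three tolerate one bad source and five tolerate two, which matches the quote. A fourth source adds no extra tolerance over three, which is why odd counts are suggested.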
 
Just a follow-up post. I have tried most of the possible solutions posted in this thread, and nothing fixed the problem. The only solution I haven't tried is running an older version of AGH; I'm not sure I like the idea of running outdated software. I uninstalled AGH again to double-check my first theory, and sure enough chrony resumed normal function. Reinstalling AGH (see picture) immediately caused the synchronization error. I think for now I will run AGH without ntpMerlin. I want to thank everyone for their help.
 

Attachment: Screenshot_20221225_123938.png
Try adjusting AdGuardHome's rate limiting under DNS settings. Switch it temporarily to disabled to see if that fixes the issue.


1672119929694.png
 
After reviewing this for a bit longer and researching with OpenWrt as well,

I believe the upstream exclusion solution was correct:

The [/pool.ntp.org/]1.1.1.1 example appears to be working.
 
Also, when chronyd cannot determine which server to use, it is sometimes best to add "trust" to at least one source, so that it does not keep flagging a falseticker when there is no actual issue.
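
As a rough illustration of what that could look like, the sketch below prints chrony `server` lines with the `trust` option on the first one only. The function name and the two example.net hostnames are placeholders, and where ntpMerlin keeps its chrony configuration is not covered here, so treat this as a starting point rather than a drop-in change.

```shell
#!/bin/sh
# Print chrony "server" directives for the given hostnames, marking
# only the first one as trusted. chrony_server_lines is a hypothetical
# helper; "iburst" and "trust" are standard chrony server options.
chrony_server_lines() {
    first=1
    for host in "$@"; do
        if [ "$first" -eq 1 ]; then
            printf 'server %s iburst trust\n' "$host"
            first=0
        else
            printf 'server %s iburst\n' "$host"
        fi
    done
}

# time.apple.com is from the log above; the other names are placeholders.
chrony_server_lines time.apple.com ntp1.example.net ntp2.example.net
```

Only mark a source as trusted if you are confident in it, since chrony will then prefer it over sources that disagree with it.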
 
