What's new

AdGuardHome AdGuard Home fails to start after reboot

  • SNBForums Code of Conduct

    SNBForums is a community for everyone, no matter what their level of experience.

    Please be tolerant and patient of others, especially newcomers. We are all here to share and learn!

    The rules are simple: Be patient, be nice, be helpful or be gone!

Markfree

Regular Contributor
Router: RT-AX86U - 386.5.2
AdGuard Home: 1.5.2 - build Release

Sometimes, when I restart the router, AdGuard simply doesn't start. About every 3 router reboots, 1 AdGuard doesn't start.
When that happens, I manually start the service and it just starts with no issues. (/opt/etc/init.d/S99AdGuardHome start)

I've checked the router's log but could not find any indication of what could have happened.
When AdGuard fails to start, there are no messages about it in the log.

Is this happening to someone else?
 
AGH has it's log in /opt/var/log/AdGuardHome.log. Perhaps can check if there is any abnormal entry when the issue happen.
 
Router: RT-AX86U - 386.5.2
AdGuard Home: 1.5.2 - build Release

Sometimes, when I restart the router, AdGuard simply doesn't start. About every 3 router reboots, 1 AdGuard doesn't start.
When that happens, I manually start the service and it just starts with no issues. (/opt/etc/init.d/S99AdGuardHome start)

I've checked the router's log but could not find any indication of what could have happened.
When AdGuard fails to start, there are no messages about it in the log.

Is this happening to someone else?
What does your router clock look like ? how long does it take for your routers time to get set? to me this sounds like it isn't actually adguardhome failing to start, but more like the router NTP is not syncing in time for Entware services to start properly. What do your WAN settings look like on your WAN page, specifically wan dns 1 and wan dns 2.
 
As @SomeWhereOverTheRainBow stated, there's no AdGuard log in /var/log.
Code:
# ls -lah /opt/var/log/
drwxr-xr-x    3 diasdm   root        4.0K May  7 00:05 .
drwxr-xr-x    8 diasdm   root        4.0K May 14 04:00 ..
drwxr-xr-x    2 diasdm   root        4.0K Apr  2 13:22 chrony
-rw-------    1 diasdm   root       12.2K May 14 00:05 logrotate.log
-rw-------    1 diasdm   root        2.3M May 14 10:49 messages
-rw-------    1 diasdm   root        4.0M May  7 00:01 messages-20220507
-rw-------    1 diasdm   root       75.7K May 14 10:01 syslog-ng.log
-rw-------    1 diasdm   root      242.7K May 14 10:39 wlceventd.log

My WAN DNS is configured like this:
1652539502185.png


The last time AdGuard failed to start was yesterday at 4 am. The router rebooted and the clock was changed.
Code:
May 13 04:00:01 [hostname] zabbix_agentd[3394]: Got signal [signal:15(SIGTERM),sender_pid:10031,sender_uid:0,reason:0]. Exiting ...
May 13 04:00:01 [hostname] zabbix_agentd[3394]: Zabbix Agent stopped. Zabbix 5.0.18 (revision c3f6aa65fe).
May  5 02:05:07 syslogd started: BusyBox v1.25.1
May  5 02:05:07 kernel: klogd started: BusyBox v1.25.1 (2022-03-25 10:23:25 EDT)

After a few seconds, it resynced the clock but did not start AdGuard.
Code:
May  5 02:05:18 WAN_Connection: WAN was restored.
May  5 02:05:18 dnsmasq[1558]: read /etc/ hosts - 11 addresses
May  5 02:05:18 dnsmasq[1558]: read /jffs/addons/YazDHCP.d/.hostnames - 17 addresses
May  5 02:05:18 dnsmasq-dhcp[1558]: read /jffs/addons/YazDHCP.d/.staticlist
May  5 02:05:18 dnsmasq-dhcp[1558]: read /jffs/addons/YazDHCP.d/.optionslist
May  5 02:05:18 dnsmasq[1558]: using nameserver 2606:4700:4700::1001#53
May  5 02:05:18 dnsmasq[1558]: using nameserver 2001:4860:4860::8844#53
May  5 02:05:18 dnsmasq[1558]: using nameserver 1.0.0.1#53
May  5 02:05:18 dnsmasq[1558]: using nameserver 8.8.4.4#53
May  5 02:05:18 dnsmasq[1558]: using only locally-known addresses for domain[ddns_hostname]
May 13 04:01:02 ntpd: Initial clock set
May 13 04:01:02 dnsmasq[1558]: now checking DNSSEC signature timestamps
May 13 04:01:02 dnsmasq[1558]: read /etc/ hosts - 11 addresses
May 13 04:01:02 rc_service: ntpd_synced 3061:notify_rc restart_diskmon

Today it didn't fail, just started.
 
May 13 04:01:02 rc_service: ntpd_synced 3061:notify_rc restart_diskmon
A good way for any script to be informed of NTP being synced is to hook into service-event-end for $1=restart $2=diskmon. It can be better than waiting and looping for something that might never happen.
 
As @SomeWhereOverTheRainBow stated, there's no AdGuard log in /var/log.
Code:
# ls -lah /opt/var/log/
drwxr-xr-x    3 diasdm   root        4.0K May  7 00:05 .
drwxr-xr-x    8 diasdm   root        4.0K May 14 04:00 ..
drwxr-xr-x    2 diasdm   root        4.0K Apr  2 13:22 chrony
-rw-------    1 diasdm   root       12.2K May 14 00:05 logrotate.log
-rw-------    1 diasdm   root        2.3M May 14 10:49 messages
-rw-------    1 diasdm   root        4.0M May  7 00:01 messages-20220507
-rw-------    1 diasdm   root       75.7K May 14 10:01 syslog-ng.log
-rw-------    1 diasdm   root      242.7K May 14 10:39 wlceventd.log

My WAN DNS is configured like this:
View attachment 41259

The last time AdGuard failed to start was yesterday at 4 am. The router rebooted and the clock was changed.
Code:
May 13 04:00:01 [hostname] zabbix_agentd[3394]: Got signal [signal:15(SIGTERM),sender_pid:10031,sender_uid:0,reason:0]. Exiting ...
May 13 04:00:01 [hostname] zabbix_agentd[3394]: Zabbix Agent stopped. Zabbix 5.0.18 (revision c3f6aa65fe).
May  5 02:05:07 syslogd started: BusyBox v1.25.1
May  5 02:05:07 kernel: klogd started: BusyBox v1.25.1 (2022-03-25 10:23:25 EDT)

After a few seconds, it resynced the clock but did not start AdGuard.
Code:
May  5 02:05:18 WAN_Connection: WAN was restored.
May  5 02:05:18 dnsmasq[1558]: read /etc/ hosts - 11 addresses
May  5 02:05:18 dnsmasq[1558]: read /jffs/addons/YazDHCP.d/.hostnames - 17 addresses
May  5 02:05:18 dnsmasq-dhcp[1558]: read /jffs/addons/YazDHCP.d/.staticlist
May  5 02:05:18 dnsmasq-dhcp[1558]: read /jffs/addons/YazDHCP.d/.optionslist
May  5 02:05:18 dnsmasq[1558]: using nameserver 2606:4700:4700::1001#53
May  5 02:05:18 dnsmasq[1558]: using nameserver 2001:4860:4860::8844#53
May  5 02:05:18 dnsmasq[1558]: using nameserver 1.0.0.1#53
May  5 02:05:18 dnsmasq[1558]: using nameserver 8.8.4.4#53
May  5 02:05:18 dnsmasq[1558]: using only locally-known addresses for domain[ddns_hostname]
May 13 04:01:02 ntpd: Initial clock set
May 13 04:01:02 dnsmasq[1558]: now checking DNSSEC signature timestamps
May 13 04:01:02 dnsmasq[1558]: read /etc/ hosts - 11 addresses
May 13 04:01:02 rc_service: ntpd_synced 3061:notify_rc restart_diskmon

Today it didn't fail, just started.
do you have an actual estimate of how long it too your clock to sync? maybe more logs to go with it as well? are you sure your router is actually completing the reboot process when this incident occurs? could something actually be hanging the reboot process preventing it from completing the reboot? I don't see any logs of anything else starting except YazDHCP.d. What about any other entware packages? where are your logs for your USB device mounting? AdGuardHome requires your USB to mount since it relies on ENTWARE( i.e. it relies on the USB storage entware is stored on to mount, other wise the manager that starts it just hangs 10 until it mounts) to start, which I don't see happening when these failures to start occur.


My diagnosis is, your AdGuardHome appears not to be starting because your USB disk never mounts during the times the incidents appear to occur. It could mean you have some corruption somewhere on the disk itself preventing the mount on subsequent reboots, or you have some process hanging your routers reboot that does not allow it to complete the full reboot. AdGuardHome does not actually control or hinder the USB mounting process. If your disk continues to fail to mount during the reboot process, then you will still face this delimma of AdGuardHome not starting. Unfortunately AdGuardHome not starting is a symptom of what is happening, not the actual problem.
 
Last edited:
A good way for any script to be informed of NTP being synced is to hook into service-event-end for $1=restart $2=diskmon. It can be better than waiting and looping for something that might never happen.
I may look into this practice more!. However, I think what actually is happening with the OP in this instance is the USB is failing to mount on some of the reboots. A possible indication of a disk (or entware) corruption. One of the factors for AdGuardHome to actually start is that the USB that houses ENTWARE must be mounted. The clock not sync'ing doesn't hinder the mount process unless the OP is using the pre-mount disk check script, which I see no logs indicating such. I have yet to see any indications of the disk actually mounting either. We would have at least seen an indication that the monitor process has started since we know the clock clearly syncs.


For me to see what is happening in the OP's situation, I need as much logs as possible. Not just the first 10 seconds after the reboot. I need to see what happens for atleast 10 minutes. We need to know if the disk ever was going to mount. If it was, at somepoint AdGuardHome ( or atleast its manager) would have shown some signs of a VERBOSE life where I could see if it was the actual AdGuardHome process itself not starting.
 
Last edited:
Thanks for the insights.

After every reboot, the USB drive is mounted and everything else seems to start correctly.
Besides AdGuard, I see no other issue.
The most odd thing is that that happens occasionally.

If you wouldn't mind looking at the log, I've shared it through Mega due to the file size.
It goes from May 7 to today. During this period, AdGuard didn't start at 05/08 and 05/13.
 
This attached shows the section relevant to entware mounting, seems it does mount, but even after mount all of jacks processes seem to still be waiting for ntp even though the clock is set already.

 
From what it appears in the logs I am seeing AdGuardHome start on reboots. I am not sure the cause of your occasional behavior where it doesn't, but from what i can tell in the logs on each reboot I see where adguardhome is started. However since @dave14305 is also taking a look at it, maybe he might see an anomaly that I don't. Strange that it didnt happen on the 8th and 13th though. I followed the logs and there is no indication of why.
 
Last edited:
@Markfree maybe something is bugging out specifically with your adguardhome itself. I am not quite sure because I don't see any errors in the log you submitted, how ever maybe taking a deep dive at your adguardhome settings may help pinpoint the issue. without giving any personally identifiable information, try to explain your setup. like what upstreams are you using. how do you have it running.
 
@Markfree I just pushed a patch that hardened some of the auxillary file logic as a temporary solution. Let me know if it improves your reboot conditions of AdGuardHome over the next few days. I am currently out and about for work right now away from my workstation, but if the patch doesn't do the trick I may have to look into alternative methods as the one pointed out by @dave14305 . Unfortunately those rewrites wont be in the works till sometime mid next week at best. That should be long enough for you to test the current solution I have improvised.
 
maybe he might see an anomaly that I don't.
The PRECMD to kill dnsmasq isn’t being run on the “bad“ days. That’s all that seems unusual to me. So why would that be? If dnsmasq doesn’t die, how can AdGuard start?
 
The PRECMD to kill dnsmasq isn’t being run on the “bad“ days. That’s all that seems unusual to me. So why would that be? If dnsmasq doesn’t die, how can AdGuard start?
good question, here is your answer.

AdGuardHome.sh has a section that runs in a continous loop and checks for AdGuardHome not starting. On the first sign of dnsmasq not being killed, we would simply catch adguardhome on the next loop. The real question is why is the script not getting to the point where our continous loop starts. This loop is top teir to before the whole kill dnsmasq occurs. Intentionally, the loop at somepoint would kill dnsmasq. On the "bad days" we see no indication of any AdGuardHome start loops running.
 
Last edited:
The PRECMD to kill dnsmasq isn’t being run on the “bad“ days. That’s all that seems unusual to me. So why would that be? If dnsmasq doesn’t die, how can AdGuard start?
On the good days adguardhome is running, we see the manager script is all HAR HAR and catching adguardhome when it is stumbling drunk and restarting it.
 
@Markfree I just pushed a patch that hardened some of the auxillary file logic as a temporary solution. Let me know if it improves your reboot conditions of AdGuardHome over the next few days. I am currently out and about for work right now away from my workstation, but if the patch doesn't do the trick I may have to look into alternative methods as the one pointed out by @dave14305 . Unfortunately those rewrites wont be in the works till sometime mid next week at best. That should be long enough for you to test the current solution I have improvised.
Thanks for the quick replys.
I've updated AdGuard with your latest patch.
Now, lets wait and see if it fails to start again.

Maybe there's some misconfiguration on my part, but I'm not seeing it. I've shared AdGuardHome.yaml so you can have a look.
 

Latest threads

Sign Up For SNBForums Daily Digest

Get an update of what's new every day delivered to your mailbox. Sign up here!
Top