ntpMerlin LAN clients can't sync

Same problem with chrony. There's something going on on the router. Let me try unsetting the TZ thing.
 
After I rebooted without the TZ stuff it's working. The problem is that now dnscrypt-proxy has the wrong time. I'm not sure how to resolve a conflict between two applications that want the timezone configured in two different ways.
 
Nope. It's still doing something bizarre. I've never seen ntpd behave this way on other systems. If the output of ntpq is accurate, it's not even choosing a preferred server, even after several hours:

Code:
ntpq> peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
132.163.97.1    .NIST.           1 u   36   64  377   35.425  +60.003   0.002
129.6.15.30     .NIST.           1 u  51y   64  377   67.681  +49.925   0.000
63.145.169.3    .TRUE.           1 u  51y   64  372  156.975  -28.045   0.000
66.220.9.122    .CDMA.           1 u  51y   64  357   11.875  +50.753   0.000

Screenshot_2021-06-01 ntpMerlin.png
 
Does the firmware's built-in ntp client "fight" with ntpd, or does ntpmerlin disable it somehow?
 
Learning more about debugging ntp as I go.

Code:
ntpq> rv 37824
associd=37824 status=901a conf, reach, sel_reject, 1 event, sys_peer,
srcadr=time-a-wwv.nist.gov, srcport=123, dstadr=192.168.1.1,
dstport=123, leap=00, stratum=1, precision=-29, rootdelay=0.244,
rootdisp=0.488, refid=NIST,
reftime=e4617b80.00000000  Tue, Jun  1 2021 20:37:36.000,
rec=83aa7e80.aae2c105  Wed, Dec 31 1969 16:00:00.667, reach=377,
unreach=0, hmode=3, pmode=4, hpoll=7, ppoll=13, headway=0,
flash=400 peer_dist, keyid=0, offset=-187.893, delay=35.568,
dispersion=15937.500, jitter=0.000, xleave=0.098,
filtdelay=  1622605 1622604 1622604 1622604 1622604 1622604 1622604 1622604,
filtoffset= +811302 +811302 +811302 +811302 +811302 +811302 +811302 +811302,
filtdisp=   16000.0 16000.0 16000.0 16000.0 16000.0 16000.0 16000.0 16000.0
Take a look at the received time ("rec=") for that server. The other servers all look similar. It looks like ntpd doesn't know what time it is; all it has is the start of the Unix epoch (1970-01-01 00:00 UTC, displayed here in local time).
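
To check that reading, the hex timestamps that ntpq prints can be decoded by hand. Here is a minimal Python sketch, assuming the usual NTP format of 32-bit seconds since 1900-01-01 plus a 32-bit binary fraction. The function names are mine, not part of any ntp tool.

```python
from datetime import datetime, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2208988800


def ntp_hex_to_unix(stamp: str) -> float:
    """Convert an ntpq-style 'ssssssss.ffffffff' hex timestamp to Unix seconds."""
    sec_hex, frac_hex = stamp.split(".")
    return int(sec_hex, 16) - NTP_UNIX_OFFSET + int(frac_hex, 16) / 2**32


def ntp_hex_to_datetime(stamp: str) -> datetime:
    """Same conversion, returned as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ntp_hex_to_unix(stamp), tz=timezone.utc)


# The two values from the rv output above.
for name, stamp in [("reftime", "e4617b80.00000000"), ("rec", "83aa7e80.aae2c105")]:
    print(name, ntp_hex_to_datetime(stamp).isoformat())
```

reftime decodes to 2021-06-02 03:37:36 UTC, which matches the local time ntpq printed. rec decodes to a fraction of a second past 1970-01-01 00:00:00 UTC, which is what ntpq shows as Dec 31 1969 16:00:00 in local time.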

The router knows what time it is, though.

Code:
admin@RT-AC88U-B1E8:/# date
Tue Jun  1 20:51:38 DST 2021
 
I am completely flummoxed. Here are several ntpq peer commands performed just minutes apart. Notice how one minute it says a server was last reached a few seconds ago, then the next minute it says it hasn't heard from that server in 51 years (the beginning of the Unix epoch). I don't understand.
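
The "51y" itself is easy to reproduce if the "when" column is simply the current time minus the time the last packet was received, which is my assumption about how ntpq computes it. A small Python sketch with a helper name of my own:

```python
SECONDS_PER_YEAR = 365.25 * 86400


def age_in_years(now_unix: float, last_rx_unix: float) -> int:
    """Whole years elapsed between the last received packet and now."""
    return int((now_unix - last_rx_unix) // SECONDS_PER_YEAR)


# 2021-06-02 03:37:36 UTC, i.e. the reftime=e4617b80 value from the rv output.
now = 1622605056

print(age_in_years(now, 0))         # last packet "received" at the Unix epoch: 51
print(age_in_years(now, now - 16))  # last packet received 16 seconds ago: 0
```

So a server flips to "51y" whenever its last receive time is recorded as zero, and back to a few seconds when a packet gets a sane receive time.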

Code:
ntpq> peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 45.15.168.98 (b 193.204.114.233  2 u  51y   64    7    0.000   +0.000   0.000
 eterna.binary.n 216.218.192.202  2 u  51y   64    7    0.000   +0.000   0.000
 ntp.wdc1.us.lea 130.133.1.10     3 u   16   64    7   62.275  -1529.3  11.554
 t2.time.bf1.yah 129.6.15.28      2 u   15   64    7   70.414  -1526.5  11.118
ntpq> peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 45.15.168.98 (b 193.204.114.233  2 u  51y   64   16    0.000   +0.000   0.000
 eterna.binary.n 216.218.192.202  2 u   19   64   17   58.219  -1540.8   0.002
 ntp.wdc1.us.lea 130.133.1.10     3 u  51y   64   17  1622608  +811304 8113041
 t2.time.bf1.yah 129.6.15.28      2 u  51y   64   17  1622608  +811304 8113041
ntpq> peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 45.15.168.98 (b 129.70.132.32    3 u  51y   64  376  1622609  +811304 33593.1
 eterna.binary.n 128.138.140.44   2 u  51y   64  377  1622609  +811304 32128.0
 ntp.wdc1.us.lea 130.133.1.10     3 u   14   64  377   62.320  -1665.5 8113045
 t2.time.bf1.yah 129.6.15.28      2 u  51y   64  377  1622609  +811304 33483.1
 
I must admit I've never seen this either. Perhaps a factory reset is worth a shot to clear out anything strange in the firmware?
 
Are you able to post the output of ntpq -p on your router? Now that it's been running overnight it seems to be getting the clock under control, but still has all the '51y' weirdness in the 'when' column. I'm starting to wonder if the problem isn't the ntpq binary itself.
 
Does the firmware's built-in ntp client "fight" with ntpd, or does ntpmerlin disable it somehow?
The firmware NTP client is disabled by ntpMerlin. But if that did not happen, it could explain your situation. You can check what is running:
Code:
# ps | grep -E 'chrony|ntp'
 
Nope. Dang.

Code:
admin@RT-AC88U-B1E8:/tmp/mnt/entware/entware/etc#  ps | grep -E 'chrony|ntp'
30308 admin     1448 S    {timeserverd} /bin/sh /opt/bin/timeserverd S77ntpd
30325 admin     7540 S    ntpd -c /jffs/addons/ntpmerlin.d/ntp.conf -g
 
I think the root of the issue lies in finding out what's going on with the way ntpq alternately understands the system time and then doesn't. This really feels like some kind of hardware issue (or perhaps a binary compiled with some option that's not supported on my system). I wonder how feasible it is to build ntpd and ntpq on the RT-AC88U.

Are these binaries compressed with UPX by any chance? There are some outstanding bugs with ARMv7 binaries compressed by UPX.

Worst of all, LAN Windows machines still refuse to sync with the router.
 
This is interesting:

Code:
admin@RT-AC88U-B1E8:/tmp/mnt/entware/entware/sbin# ntpdc -l
localhost.localdomain: timed out, nothing received
***Request timed out
 
Would you consider using chronyd and trying it out? With chrony I can use the command "chronyc clients" to check the status.
 
I tried chrony and still could not get any LAN clients to sync. I did not try to debug chrony because I'm more familiar with ntp's tools and it's having the same issue, which suggests some kind of system problem.

Code:
admin@RT-AC88U-B1E8:/tmp/mnt/entware/entware# ntpq -c rv
associd=0 status=c028 leap_alarm, sync_unspec, 2 events, no_sys_peer,
version="ntpd 4.2.8p15@1.3728-o Sun Apr 18 13:11:49 UTC 2021 (1)",
processor="armv7l", system="Linux/2.6.36.4brcmarm", leap=11, stratum=16,
precision=-19, rootdelay=0.000, rootdisp=0.000, refid=.,
reftime=83aa7e80.b8b2420a  Wed, Dec 31 1969 16:00:00.721,
clock=e4622599.3bdfa4c6  Wed, Jun  2 2021  8:43:21.233, peer=0, tc=9,
mintc=3, offset=-2.315905, frequency=+4.421, sys_jitter=0.904328,
clk_jitter=2.971, clk_wander=1.526

The system clock is correct, but look at that reftime: it is the Unix epoch again.
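
The leap=11 and stratum=16 values in that output mean ntpd still considers itself unsynchronized, and that is probably why the LAN clients refuse it: a server in that state marks its replies as unsynchronized, and clients are expected to discard them. Below is a rough Python sketch of the kind of check a client might apply to the first bytes of a reply. The header layout is standard NTP; the helper names and the exact acceptance rule are my own assumptions, not taken from Windows or ntpd.

```python
import struct


def parse_ntp_header(packet: bytes) -> dict:
    """Decode the first four bytes of an NTP packet."""
    if len(packet) < 48:
        raise ValueError("NTP packets are at least 48 bytes long")
    flags, stratum, poll, precision = struct.unpack("!BBbb", packet[:4])
    return {
        "leap": flags >> 6,            # 3 means "clock unsynchronized"
        "version": (flags >> 3) & 0x7,
        "mode": flags & 0x7,           # 3 = client, 4 = server
        "stratum": stratum,            # 0 on the wire means unspecified
        "poll": poll,
        "precision": precision,
    }


def looks_synchronized(header: dict) -> bool:
    """Assumed client rule: reject the leap alarm and any stratum outside 1..15."""
    return header["leap"] != 3 and 1 <= header["stratum"] <= 15


def make_packet(leap, version, mode, stratum, poll, precision) -> bytes:
    """Build a zero-filled 48-byte packet with only the header fields set (for illustration)."""
    first = (leap << 6) | (version << 3) | mode
    return struct.pack("!BBbb", first, stratum, poll, precision) + bytes(44)


unsynced = parse_ntp_header(make_packet(3, 4, 4, 0, 6, -19))
healthy = parse_ntp_header(make_packet(0, 4, 4, 3, 6, -25))
print(looks_synchronized(unsynced), looks_synchronized(healthy))
```

If that is what is happening here, the LAN clients are behaving correctly and the real problem is that ntpd on the router never selects a system peer.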

I'm prepping for a full reset. It's going to be a major production.

I've got backups of /jffs, /opt, the cfg file, and an nvram dump just in case I need to quickly get back to this known state. They're also handy to consult when manually reconfiguring.
 
Well, it's not ntpmerlin specifically. I installed ntp directly from opkg and saw the same issue.
 
A lot of the packets going out to NTP servers carry a wrong receive timestamp:

Code:
 15:57:35.790753 IP (tos 0xb8, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 76)
    x.x.x.x.ntp > time.cloudflare.com.ntp: [bad udp cksum 0xbf88 -> 0xeee5!] NTPv4, length 48
        Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 6 (64s), precision -19
        Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
          Reference Timestamp:  0.000000000
          Originator Timestamp: 3831663388.864090386 (2021/06/02 15:56:28)
          Receive Timestamp:    2208988800.149314613 (1969/12/31 16:00:00)
          Transmit Timestamp:   3831663455.790690755 (2021/06/02 15:57:35)
            Originator - Receive Timestamp:  -1622674588.714775772
            Originator - Transmit Timestamp: +66.926600369

Where does the receive timestamp come from? That seems to be where the issue lies. It's almost like the server is confusing outgoing packets with incoming packets.

Packets returning from the server seem OK.

Code:
-7:00:00.219937 IP (tos 0x0, ttl 59, id 17102, offset 0, flags [DF], proto UDP (17), length 76)
    time.cloudflare.com.ntp > x.x.x.x.ntp: [udp sum ok] NTPv4, length 48
        Server, Leap indicator:  (0), Stratum 3 (secondary reference), poll 6 (64s), precision -25
        Root Delay: 0.030319, Root dispersion: 0.001129, Reference-ID: 10.12.2.186
          Reference Timestamp:  3831663244.101998406 (2021/06/02 15:54:04)
          Originator Timestamp: 3831663455.790690755 (2021/06/02 15:57:35)
          Receive Timestamp:    3831663455.863878000 (2021/06/02 15:57:35)
          Transmit Timestamp:   3831663455.863907319 (2021/06/02 15:57:35)
            Originator - Receive Timestamp:  +0.073187244
            Originator - Transmit Timestamp: +0.073216564

EDIT - I just noticed the bad UDP checksum in the outgoing packet and the weird time (-7:00:00) on the server's reply.
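
On the question of where the receive timestamp comes from: as far as I understand the protocol, a client copies into that field its own record of when the previous reply from that server arrived. A value at the Unix epoch would then mean the router is stamping incoming packets with time zero, even though the system clock itself is right. On Linux that arrival time normally comes from the kernel (for example via the SO_TIMESTAMP socket option), which would fit a platform problem rather than an ntpMerlin one, but that part is a guess about this build. The Python sketch below plugs the timestamps from the two captures into the standard NTP offset and delay formulas, once with a plausible arrival time that I made up for illustration and once with the epoch value from the outgoing capture. Function and variable names are mine.

```python
def offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Standard NTP on-wire calculation.

    t1 = client transmit, t2 = server receive,
    t3 = server transmit, t4 = client receive (arrival time of the reply).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Timestamps taken from the captures (NTP seconds since 1900).
T1 = 3831663455.790690755   # Transmit Timestamp of the outgoing request
T2 = 3831663455.863878000   # Receive Timestamp in the server's reply
T3 = 3831663455.863907319   # Transmit Timestamp in the server's reply

T4_PLAUSIBLE = T1 + 0.150        # hypothetical: reply arrives 150 ms after the request
T4_EPOCH = 2208988800.149314613  # the bogus Receive Timestamp (1969/12/31 16:00:00)

for label, t4 in [("plausible", T4_PLAUSIBLE), ("epoch", T4_EPOCH)]:
    offset, delay = offset_and_delay(T1, T2, T3, t4)
    print(f"{label:9s} offset={offset:+.3f} s  delay={delay:+.3f} s")
```

With the epoch arrival time, the delay comes out at roughly minus the current Unix time and the offset at half that magnitude. That is the same 2:1 ratio as the filtdelay and filtoffset values ntpq reported earlier (1622604 against +811302), so the huge numbers in ntpq look like a direct consequence of the zero arrival time rather than a separate fault.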
 