ntpMerlin LAN clients can't sync

I rejiggered everything having to do with timezones on my system. No dice. Waiting until the family is asleep to see if I have time for a reset.

These two ntpq queries are taken less than 5 seconds apart:

Code:
admin@RT-AC88U-B1E8:/tmp/mnt/entware/entware/sbin# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
x.ns.gin.ntt.ne 249.224.99.213   2 u  51y   64    3    0.000   +0.000   0.000
ntp.seby.io     17.253.66.253    2 u   48   64    3  153.602   +7.114   0.002
ntp1.ds.network 92.21.53.217     2 u   48   64    3  239.395  +62.005 8113392
pve03.as24220.n 220.158.215.20   3 u   49   64    3  165.027  +13.243   0.002
admin@RT-AC88U-B1E8:/tmp/mnt/entware/entware/sbin# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
x.ns.gin.ntt.ne 249.224.99.213   2 u  51y   64    7    0.000   +0.000   0.000
ntp.seby.io     17.253.66.253    2 u  51y   64    7  1622678  +811339   0.002
ntp1.ds.network 92.21.53.217     2 u  51y   64    7  1622678  +811339 8113393
pve03.as24220.n 220.158.215.20   3 u  51y   64    7  1622678  +811339   0.002
 
This has to be an issue with the entware ntp binaries or dependencies. I need to find some other ntpd binaries for ARMv7.
 
It may be an issue with entware and timezones. The ntp.log entries are in UTC time, not local time. I see this in other entware apps until I explicitly inform them of the entware timezone info.
 
Seeing these weird packets with tcpdump. It's only some packets.

Code:
01:33:34.323109 IP (tos 0xb8, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 76)
    x.x.x.x.ntp > ntp5.mum-in.hosts.301-moved.de.ntp: [bad udp cksum 0xe94b -> 0x17dc!] NTPv4, length 48
        Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 6 (64s), precision -19
        Root Delay: 0.000000, Root dispersion: 0.004135, Reference-ID: (unspec)
          Reference Timestamp:  0.000000000
          Originator Timestamp: 3831672747.447959177 (2021/06/03 01:32:27)
          Receive Timestamp:    2208988800.675553138 (1970/01/01 00:00:00)
          Transmit Timestamp:   3831672814.323003910 (2021/06/03 01:33:34)
            Originator - Receive Timestamp:  -1622683946.772406039
            Originator - Transmit Timestamp: +66.875044732

Then there are good packets too:

Code:
01:33:28.323112 IP (tos 0xb8, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 76)
    x.x.x.x.ntp > ntp1.torix.ca.ntp: [bad udp cksum 0x20d0 -> 0x0150!] NTPv4, length 48
        Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 6 (64s), precision -19
        Root Delay: 0.000000, Root dispersion: 0.004043, Reference-ID: (unspec)
          Reference Timestamp:  0.000000000
          Originator Timestamp: 3831672740.365237482 (2021/06/03 01:32:20)
          Receive Timestamp:    3831672740.398145381 (2021/06/03 01:32:20)
          Transmit Timestamp:   3831672808.323005522 (2021/06/03 01:33:28)
            Originator - Receive Timestamp:  +0.032907898
            Originator - Transmit Timestamp: +67.957768039

The mix of good and bad packets solidifies my suspicion that this is an ntpd problem. I would expect an environment or system issue to cause consistently bad packets.
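For what it's worth, the odd Receive Timestamp in the bad packet is not random garbage. NTP counts seconds from 1900 while Unix counts from 1970, and the two epochs are 2208988800 seconds apart, so 2208988800.675553138 is simply Unix time 0.675 seconds. A minimal Python sketch of the conversion (the helper name `ntp_to_utc` is made up for illustration):

```python
from datetime import datetime, timedelta, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2208988800
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ntp_to_utc(ntp_seconds: float) -> datetime:
    """Convert an NTP-era timestamp (seconds since 1900) to a UTC datetime."""
    # Adding a timedelta also handles values that land before 1970.
    return UNIX_EPOCH + timedelta(seconds=ntp_seconds - NTP_UNIX_DELTA)


# Values copied from the bad packet above.
bad_receive = ntp_to_utc(2208988800.675553138)
good_originator = ntp_to_utc(3831672747.447959177)

print(bad_receive.strftime("%Y/%m/%d %H:%M:%S"))      # 1970/01/01 00:00:00
print(good_originator.strftime("%Y/%m/%d %H:%M:%S"))  # 2021/06/03 01:32:27
```

In other words, whatever supplied that receive time reported a clock reading of roughly zero rather than a scrambled value.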
 
So I monitored one host:

Code:
admin@RT-AC88U-B1E8:/tmp/mnt/entware/entware/var/spool/ntp# tcpdump -i eth0 host ntp5.mum-in.hosts.301-moved.de -vvv
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
01:48:11.323217 IP (tos 0xb8, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 76)
    x.x.x.x.ntp > ntp5.mum-in.hosts.301-moved.de.ntp: [bad udp cksum 0xe94b -> 0x6037!] NTPv4, length 48
        Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 6 (64s), precision -19
        Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
          Reference Timestamp:  0.000000000
          Originator Timestamp: 3831673624.451940738 (2021/06/03 01:47:04)
          Receive Timestamp:    3831673624.565429405 (2021/06/03 01:47:04)
          Transmit Timestamp:   3831673691.323147763 (2021/06/03 01:48:11)
            Originator - Receive Timestamp:  +0.113488666
            Originator - Transmit Timestamp: +66.871207024
00:00:00.008176 IP (tos 0x0, ttl 52, id 63049, offset 0, flags [DF], proto UDP (17), length 76)
    ntp5.mum-in.hosts.301-moved.de.ntp > x.x.x.x.ntp: [udp sum ok] NTPv4, length 48
        Server, Leap indicator:  (0), Stratum 2 (secondary reference), poll 6 (64s), precision -23
        Root Delay: 0.022689, Root dispersion: 0.045547, Reference-ID: 14.139.60.107
          Reference Timestamp:  3831671747.264881498 (2021/06/03 01:15:47)
          Originator Timestamp: 3831673691.323147763 (2021/06/03 01:48:11)
          Receive Timestamp:    3831673691.453204151 (2021/06/03 01:48:11)
          Transmit Timestamp:   3831673691.453252667 (2021/06/03 01:48:11)
            Originator - Receive Timestamp:  +0.130056387
            Originator - Transmit Timestamp: +0.130104904
01:49:20.323109 IP (tos 0xb8, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 76)
    x.x.x.x.ntp > ntp5.mum-in.hosts.301-moved.de.ntp: [bad udp cksum 0xe94b -> 0xf5d9!] NTPv4, length 48
        Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 6 (64s), precision -19
        Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
          Reference Timestamp:  0.000000000
          Originator Timestamp: 3831673691.453252667 (2021/06/03 01:48:11)
          Receive Timestamp:    2208988800.008178023 (1970/01/01 00:00:00)
          Transmit Timestamp:   3831673760.323004341 (2021/06/03 01:49:20)
            Originator - Receive Timestamp:  -1622684891.445074643
            Originator - Transmit Timestamp: +68.869751673
00:00:00.443431 IP (tos 0x0, ttl 52, id 7962, offset 0, flags [DF], proto UDP (17), length 76)
    ntp5.mum-in.hosts.301-moved.de.ntp > x.x.x.x.ntp: [udp sum ok] NTPv4, length 48
        Server, Leap indicator:  (0), Stratum 2 (secondary reference), poll 6 (64s), precision -23
        Root Delay: 0.022689, Root dispersion: 0.046585, Reference-ID: 14.139.60.107
          Reference Timestamp:  3831671747.264881498 (2021/06/03 01:15:47)
          Originator Timestamp: 3831673760.323004341 (2021/06/03 01:49:20)
          Receive Timestamp:    3831673760.452384601 (2021/06/03 01:49:20)
          Transmit Timestamp:   3831673760.452425126 (2021/06/03 01:49:20)
            Originator - Receive Timestamp:  +0.129380260
            Originator - Transmit Timestamp: +0.129420784
^C
4 packets captured
25 packets received by filter
0 packets dropped by kernel

In the first exchange everything's normal, and then on the next poll, about a minute later, my router is sending that whacked receive timestamp again.
 
This has to be an issue with the entware ntp binaries or dependencies. I need to find some other ntpd binaries for ARMv7.
What is the ldd output?
Code:
ldd $(which ntpd)
I have no idea what could be wrong, but maybe there’s an Entware library mismatch.
 
Code:
admin@RT-AC88U-B1E8:/tmp/mnt/entware/entware/var/spool/ntp# ldd $(which ntpd)
        libcap.so.2 => /opt/lib/libcap.so.2 (0x2aafa000)
        libm.so.6 => /opt/lib/libm.so.6 (0x2abb9000)
        libcrypto.so.1.1 => /opt/lib/libcrypto.so.1.1 (0x2ac6e000)
        libssp.so.0 => /opt/lib/libssp.so.0 (0x2ab0f000)
        libgcc_s.so.1 => /opt/lib/libgcc_s.so.1 (0x2ab21000)
        libpthread.so.0 => /opt/lib/libpthread.so.0 (0x2ae6a000)
        libc.so.6 => /opt/lib/libc.so.6 (0x2ae93000)
        /opt/lib/ld-linux.so.3 (0x2ab86000)
        libdl.so.2 => /opt/lib/libdl.so.2 (0x2ab62000)
 
Code:
lrwxrwxrwx    1 admin    root            10 Jun  1 10:44 ld-linux.so.3 -> ld-2.23.so*

I don't know if that's relevant.
 
I wonder if there's a relatively easy way to put a router OS into a VM so I can see if this happens in that case as well?
 
The receive timestamp comes from the kernel. It might be a bug that causes some packets to have a corrupted timestamp. If you can rebuild ntpd, try removing the following line from ntpd/ntp_io.c:

ts = nts; /* network time stamp */

This should force ntpd to use its own (normally less accurate) timestamp.
 
I haven't been able to sync to an ntp server in pretty much a year, on my RT-AC3200. I have yet to figure it out. If you do stumble across a fix, let me know and I'll try it too.

Code:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-lithium.constan 192.5.41.40      2 u   35   64    7   82.535   -0.513   2.984
+ns3.switch.ca   206.108.0.131    2 u   37   64    7   68.476  +14.865   1.821
*70.35.196.28    132.163.96.2     2 u   35   64    7   74.021   +0.424   2.360
+64.ip-54-39-23. 207.197.87.124   4 u   39   64    7   76.309   +0.160   2.546

Nothing on my network can reach an NTP server, which is really frustrating for devices like Raspberry Pis, which refuse to boot properly unless NTP works (or there is no internet connection at all). Non-functioning NTP plus working internet bones them. :p
 
I don't know why it didn't dawn on me to check this earlier, but the built-in NTP works fine. LAN clients can sync to it when I tick "Enable local NTP server." There's just something going on with the Entware NTP.
 
I notice other people seeing this behavior (see ntpq output):


 
I did file a bug.

Chronyc doesn't show the same error as ntpq -p, but the Windows 10 LAN client still wouldn't sync with chronyd, reporting "the peer is unreachable." Since sync works with the built-in Busybox NTP, it's not a firewall issue on Windows or the router.

After some experimentation I discovered that the 'local stratum 10' directive makes Windows refuse to sync. With that figured out, everything appears to be working with chrony.
 
Well, dang it. It worked for a bit. Now Windows 10 is back to "the peer is unreachable," but chrony appears to be working fine.

EDIT - It's definitely a Windows issue at this point.
 
Sorry for being all over the place, but I finally discovered there is definitely some bug going on with the RT-AC88U. I don't know if it's with Entware or the firmware, though. I happened to be monitoring when the Windows 10 LAN client refused to sync. Here's what happened:

Code:
admin@RT-AC88U-B1E8:/tmp/mnt/entware/entware/sbin# tcpdump -i br0 port 123 and host 192.168.1.2 -vvv
tcpdump: listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes
00:00:00.258740 IP (tos 0x0, ttl 128, id 8912, offset 0, flags [none], proto UDP (17), length 76)
    192.168.1.2.ntp > RT-AC88U-B1E8.ntp: [udp sum ok] NTPv3, length 48
        Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 10 (1024s), precision -23
        Root Delay: 0.054321, Root dispersion: 8.819213, Reference-ID: (unspec)
          Reference Timestamp:  3832326708.630163699 (2021/06/10 15:11:48)
          Originator Timestamp: 0.000000000
          Receive Timestamp:    0.000000000
          Transmit Timestamp:   3832327170.083164599 (2021/06/10 15:19:30)
            Originator - Receive Timestamp:  0.000000000
            Originator - Transmit Timestamp: 3832327170.083164599 (2021/06/10 15:19:30)
15:19:30.026957 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 76)
    RT-AC88U-B1E8.ntp > 192.168.1.2.ntp: [bad udp cksum 0x4bb1 -> 0x1b7c!] NTPv3, length 48
        Server, Leap indicator:  (0), Stratum 3 (secondary reference), poll 10 (1024s), precision -22
        Root Delay: 0.051513, Root dispersion: 0.002288, Reference-ID: voipmonitor.wci.com
          Reference Timestamp:  3832326613.105295846 (2021/06/10 15:10:13)
          Originator Timestamp: 3832327170.083164599 (2021/06/10 15:19:30)
          Receive Timestamp:    2208986309.145514714 (1969/12/31 23:18:29)
          Transmit Timestamp:   3832327170.025900169 (2021/06/10 15:19:30)
            Originator - Receive Timestamp:  -1623340860.937649885
            Originator - Transmit Timestamp: -0.057264430

There's that whacked receive timestamp, but it's coming in the reply from chrony to the client. It's only intermittent as a few moments earlier we have:

Code:
admin@RT-AC88U-B1E8:/tmp/mnt/entware/entware/sbin# tcpdump -i br0 port 123 and host 192.168.1.2 -vvv
tcpdump: listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:53:57.545020 IP (tos 0x0, ttl 128, id 2138, offset 0, flags [none], proto UDP (17), length 76)
    192.168.1.2.ntp > RT-AC88U-B1E8.ntp: [udp sum ok] NTPv3, length 48
        Client, Leap indicator: clock unsynchronized (192), Stratum 0 (unspecified), poll 10 (1024s), precision -23
        Root Delay: 0.052947, Root dispersion: 8.819580, Reference-ID: (unspec)
          Reference Timestamp:  3832325564.706629899 (2021/06/10 14:52:44)
          Originator Timestamp: 0.000000000
          Receive Timestamp:    0.000000000
          Transmit Timestamp:   3832325637.596630699 (2021/06/10 14:53:57)
            Originator - Receive Timestamp:  0.000000000
            Originator - Transmit Timestamp: 3832325637.596630699 (2021/06/10 14:53:57)
14:53:57.546417 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 76)
    RT-AC88U-B1E8.ntp > 192.168.1.2.ntp: [bad udp cksum 0x4bb1 -> 0xf411!] NTPv3, length 48
        Server, Leap indicator:  (0), Stratum 3 (secondary reference), poll 10 (1024s), precision -22
        Root Delay: 0.050292, Root dispersion: 0.002975, Reference-ID: voipmonitor.wci.com
          Reference Timestamp:  3832325570.000007444 (2021/06/10 14:52:50)
          Originator Timestamp: 3832325637.596630699 (2021/06/10 14:53:57)
          Receive Timestamp:    3832325637.543329805 (2021/06/10 14:53:57)
          Transmit Timestamp:   3832325637.544665102 (2021/06/10 14:53:57)
            Originator - Receive Timestamp:  -0.053300894
            Originator - Transmit Timestamp: -0.051965597

I don't even know how to find out where the bug lies so I can report it to the right place. Clearly it's packets originating from the RT-AC88U that have the problem, whether as client requests or as server responses. Since it affects both chrony and ntpd, I doubt it's a bug in those products. I suspect it's a kernel problem, but I don't know how to go about testing that hypothesis.

I wonder whether, if I built a little console program to fetch the network timestamp on the RT-AC88U over and over again, I would see intermittent 1969 dates.
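A rough sketch of what such a test program could look like, assuming Python 3 is available on the router. It asks the kernel to stamp each received UDP datagram (the same kind of kernel receive timestamp mentioned earlier in the thread) and flags any stamp that disagrees wildly with the system clock. The names `kernel_rx_time`, `looks_bogus` and `watch`, the port 12345, and the one-hour tolerance are all made up for illustration. The fallback value 35 for `SO_TIMESTAMPNS` is the Linux asm-generic constant and is an assumption, as is the native `long`-based `struct timespec` layout, which fits older kernels.

```python
import socket
import struct
import time

# Assumption: Linux asm-generic value; Python may not export this constant.
SO_TIMESTAMPNS = getattr(socket, "SO_TIMESTAMPNS", 35)
# Assumption: native struct timespec made of two longs (tv_sec, tv_nsec).
TIMESPEC = struct.Struct("ll")


def kernel_rx_time(ancdata):
    """Return the kernel receive time (Unix seconds) from recvmsg() ancillary data."""
    for level, ctype, data in ancdata:
        if (level == socket.SOL_SOCKET and ctype == SO_TIMESTAMPNS
                and len(data) >= TIMESPEC.size):
            sec, nsec = TIMESPEC.unpack(data[:TIMESPEC.size])
            return sec + nsec / 1e9
    return None  # kernel attached no timestamp


def looks_bogus(rx_time, now, tolerance=3600.0):
    """True when the kernel stamp is more than `tolerance` seconds from `now`."""
    return abs(now - rx_time) > tolerance


def watch(port=12345):
    """Listen on a hypothetical UDP port and report suspicious kernel timestamps."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, SO_TIMESTAMPNS, 1)
    sock.bind(("0.0.0.0", port))
    while True:
        _data, ancdata, _flags, addr = sock.recvmsg(
            2048, socket.CMSG_SPACE(TIMESPEC.size))
        rx = kernel_rx_time(ancdata)
        now = time.time()
        if rx is None:
            print("no kernel timestamp from", addr)
        elif looks_bogus(rx, now):
            print("BOGUS kernel timestamp", rx, "system clock", now, "from", addr)
```

Calling `watch()` on the router and sending it a UDP datagram every second or so from a LAN client would exercise it. If the kernel stamps are the culprit, it should occasionally print values near zero. If it never does, the problem more likely sits elsewhere.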
 
Seems like something is not right in the router. Your client's transmit timestamp matches the router's originator timestamp, as it should, but the router's receive timestamp is way off when it is not working properly.

Reference Timestamp: Time when the system clock was last set or corrected
Origin Timestamp: Time at the client when the request departed for the server
Receive Timestamp: Time at the server when the request arrived from the client
Transmit Timestamp: Time at the server when the response left for the client
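To see why one bad timestamp is enough to wreck a sync, here is a small sketch of the standard NTP offset and delay calculation, fed with the values from the ntp5.mum-in exchange captured earlier (router as client). It assumes, as that capture suggests, that the Receive Timestamp in the router's next request echoes the time it recorded for the server's previous reply. The function name `offset_and_delay` is made up for illustration.

```python
def offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire math; all arguments in seconds on the same scale.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive of the reply.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# NTP-era seconds copied from the tcpdump output above.
t1 = 3831673691.323147763      # router transmit (01:48:11)
t2 = 3831673691.453204151      # server receive
t3 = 3831673691.453252667      # server transmit
t4_bad = 2208988800.008178023  # router's recorded receive time (1970/01/01)

offset, delay = offset_and_delay(t1, t2, t3, t4_bad)
print(f"offset = {offset:.3f} s, delay = {delay:.3f} s")
```

With the 1970 receive time the offset comes out at roughly 8.1e8 seconds, about half of the 51-year gap, and the delay goes hugely negative. That is consistent with the enormous offsets and the "51y" in the ntpq output at the top of the thread.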
 
What's strange is that it's always the receive timestamp. It's never one of the other timestamps. If I knew the code, that would probably be a big clue as to where the problem lies.
 
I finally noticed something associated with the bad receive timestamps: the tcpdump timestamps are jacked up too. See the above examples. The bad exchange starts like this:

Code:
00:00:00.258740 IP (tos 0x0, ttl 128, id 8912, offset 0, flags [none], proto UDP (17), length 76)

The good exchange starts like this:

Code:
14:53:57.545020 IP (tos 0x0, ttl 128, id 2138, offset 0, flags [none], proto UDP (17), length 76)

Whenever the tcpdump timestamp is 00:00:00.xxxxxx, the receive timestamps are messed up. This is borne out by logs of dozens of NTP transactions. So it appears that whatever this bug is affects all Entware applications, because even tcpdump can't get the right time. Now I have to test the built-in NTP to see if it ever has this issue.
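For anyone who wants to check the same correlation in their own logs, here is a minimal sketch that scans saved `tcpdump -vvv` text and flags both symptoms: capture times of 00:00:00.xxxxxx and Receive Timestamps that decode to around 1970. The regular expressions are written against the output format shown in this thread, and the function names and the one-day threshold are made up for illustration.

```python
import re

NTP_UNIX_DELTA = 2208988800  # seconds between 1900-01-01 and 1970-01-01
PACKET_RE = re.compile(r"^(\d\d:\d\d:\d\d\.\d+) IP ")
RECEIVE_RE = re.compile(r"^\s*Receive Timestamp:\s+(\d+\.\d+)")


def scan(dump_text):
    """Return one dict per packet: its capture time and NTP Receive Timestamp."""
    packets, current = [], None
    for line in dump_text.splitlines():
        header = PACKET_RE.match(line)
        if header:
            current = {"captured": header.group(1), "receive": None}
            packets.append(current)
            continue
        field = RECEIVE_RE.match(line)
        if field and current is not None:
            current["receive"] = float(field.group(1))
    return packets


def bogus_capture(packet):
    """tcpdump itself stamped the packet at 00:00:00.xxxxxx."""
    return packet["captured"].startswith("00:00:00.")


def bogus_receive(packet):
    """Receive Timestamp is set but decodes to before 1970-01-02."""
    rx = packet["receive"]
    return rx is not None and 0 < rx < NTP_UNIX_DELTA + 86400


# Two packets abbreviated from the bad chrony exchange above.
SAMPLE = """\
00:00:00.258740 IP (tos 0x0, ttl 128, id 8912, offset 0, flags [none], proto UDP (17), length 76)
          Receive Timestamp:    0.000000000
15:19:30.026957 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 76)
          Receive Timestamp:    2208986309.145514714 (1969/12/31 23:18:29)
"""

packets = scan(SAMPLE)
for p in packets:
    print(p["captured"], "capture bogus:", bogus_capture(p),
          "receive bogus:", bogus_receive(p))
```

On the sample, the request is flagged for its capture time and the reply for its Receive Timestamp, which is the pairing described above.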
 
