arp problem between wifi devices on RT-AC88u


sebounet93

Occasional Visitor
Hello,

Sorry if this subject has already been discussed, but I did not find an answer to my question. I have an RT-AC88U, and I have serious communication problems between wifi devices.
It seems to be a problem with the propagation of ARP requests between wifi devices via the AC88U.

Communications between Ethernet and wifi devices always work very well. Communications between devices on the router's Ethernet ports also work very well, as do communications between the 88U and the wifi devices.
On the other hand, communications between wifi devices (via the 88U) only work from time to time.

Each time it blocks, we notice that on the problematic device the ARP entry for the remote device is not known. The ARP who-has request does not arrive at the second wifi device. Then, after a few seconds or minutes (it varies), the request is forwarded, the destination device receives the who-has and answers it, and the ARP reply finally arrives at the source wifi device...

To summarize:
ac88u to Ethernet device = OK
ac88u to Wifi = OK
Ethernet to Ethernet (via ac88u) = OK
Wifi to Ethernet (via ac88u) = OK
Wifi to Wifi (via 88u) = Fail.

At the ARP level, it gives us this:

admin@RT-AC88U-CC80:/tmp/home/root# arp -a | grep 21
? (192.168.1.21) at 5c:87:30:87:de:a8 [ether] on br0
admin@RT-AC88U-CC80:/tmp/home/root# arp -a | grep 56
q3s (192.168.1.56) at 20:32:33:32:46:a3 [ether] on br0
admin@RT-AC88U-CC80:/tmp/home/root# date
Mon Nov 7 15:44:36 MET 2022
admin@RT-AC88U-CC80:/tmp/home/root#

The RT knows the ARP entries for both .56 and .21. The router has no problem pinging these two IPs.
At the same time, I launch a ping from .21 to .56: no response. I capture on the router at the same time, which gives me this:

Note: focus only on the traffic between 192.168.1.21 (source of the ICMP request) and 192.168.1.56 (destination of the ICMP request, source of the ICMP reply).

root@RT-AC88U-CC80:/# tcpdump -i br0 -n arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br0, link-type EN10MB (Ethernet), capture size 262144 bytes


(...)
15:43:53.809343 ARP, Reply 192.168.1.1 is-at 3c:7c:3f:60:cc:80, length 28
15:43:54.734288 ARP, Request who-has 192.168.1.77 tell 192.168.1.77, length 28
01:00:00.898095 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46 <= why this timestamp ?
15:43:57.108258 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46
15:43:59.108250 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46
15:44:01.108147 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46
15:44:06.343590 ARP, Reply 192.168.1.1 is-at 3c:7c:3f:60:cc:80, length 28
01:00:00.099345 ARP, Request who-has 192.168.1.1 tell 192.168.1.96, length 46 <= why this timestamp ?
15:44:07.898838 ARP, Reply 192.168.1.1 is-at 3c:7c:3f:60:cc:80, length 28
01:00:00.119095 ARP, Request who-has 192.168.1.84 tell 192.168.1.88, length 46 <= why this timestamp ?
15:44:08.476215 ARP, Reply 192.168.1.84 is-at dc:f5:05:f0:0a:2a, length 46
15:44:08.764462 ARP, Request who-has 192.168.1.88 tell 192.168.1.1, length 28
01:00:00.146782 ARP, Reply 192.168.1.88 is-at 00:e0:4c:a5:df:0b, length 46 <= why this timestamp ?
15:44:15.349283 ARP, Request who-has 192.168.1.28 tell 192.168.1.28, length 28
15:44:16.464464 ARP, Request who-has 192.168.1.41 tell 192.168.1.1, length 28
01:00:00.474226 ARP, Reply 192.168.1.41 is-at dc:a6:32:0c:a0:74, length 46 <= why this timestamp ?
15:44:21.933612 ARP, Request who-has 192.168.1.101 tell 192.168.1.77, length 28
01:00:00.831260 ARP, Reply 192.168.1.101 is-at 00:e0:4c:a5:df:0b, length 46 <= why this timestamp ?
01:00:00.988578 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46
15:44:23.722006 ARP, Request who-has 192.168.1.51 tell 192.168.1.51, length 28
01:00:00.301232 ARP, Request who-has 192.168.1.1 tell 192.168.1.43, length 46 <= why this timestamp ?
15:44:23.807782 ARP, Reply 192.168.1.1 is-at 3c:7c:3f:60:cc:80, length 28
15:44:24.394553 ARP, Request who-has 192.168.1.56 tell 192.168.1.1, length 28
15:44:24.396576 ARP, Reply 192.168.1.56 is-at 20:32:33:32:46:a3, length 28
01:00:00.367627 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46 <= why this timestamp ?
01:00:00.367239 ARP, Request who-has 192.168.1.1 (3c:7c:3f:60:cc:80) tell 192.168.1.44, length 46 <= why this timestamp ?
15:44:25.872808 ARP, Reply 192.168.1.1 is-at 3c:7c:3f:60:cc:80, length 28
15:44:27.117939 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46
15:44:28.164463 ARP, Request who-has 192.168.1.96 tell 192.168.1.1, length 28
15:44:28.164763 ARP, Reply 192.168.1.96 is-at 78:a5:dd:26:df:7e, length 46
01:00:00.825869 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46 <= why this timestamp ?
(...)
15:44:55.218309 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46
15:44:55.456213 ARP, Request who-has 192.168.1.1 tell 192.168.1.70, length 46
15:44:55.456280 ARP, Reply 192.168.1.1 is-at 3c:7c:3f:60:cc:80, length 28
15:44:56.057136 ARP, Request who-has 192.168.1.101 tell 192.168.1.51, length 28
15:44:56.057354 ARP, Reply 192.168.1.101 is-at 00:e0:4c:a5:df:0b, length 46
15:44:56.804844 ARP, Reply 192.168.1.56 is-at 20:32:33:32:46:a3, length 28 <== after that, ping is ok between the 2 wifi devices.. no more arp-who from 192.168.1.21.
15:44:59.262719 ARP, Reply 192.168.1.56 is-at 20:32:33:32:46:a3, length 28
15:44:59.570133 ARP, Reply 192.168.1.56 is-at 20:32:33:32:46:a3, length 28
15:44:59.876987 ARP, Reply 192.168.1.56 is-at 20:32:33:32:46:a3, length 28
(...)

This is just an example, but I randomly see this behavior with other machines when both are on wifi. Sometimes the remote device starts answering immediately, sometimes it takes a few minutes.
Each time, the ARP entry is not known between the wifi devices.
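
To put a number on these delays, the saved tcpdump text can be post-processed. Below is a minimal Python sketch, assuming the capture was taken as above (`tcpdump -i br0 -n arp`, without `-e` or `-v`). The names `arp_resolution_delay`, `parse_time` and `bogus_prefix` are made up for this illustration. Two assumptions are baked in: lines stamped `01:00:00.x` are treated as unusable for timing (they look like a zero epoch value displayed in MET, but that is only a guess), and a reply is attributed to whichever host most recently asked for that IP, because this output format does not show who the reply was sent to.

```python
import re
from datetime import datetime

# Patterns for tcpdump's one-line ARP output (no -e, no -v).
REQUEST = re.compile(
    r"^(\d\d:\d\d:\d\d\.\d+) ARP, Request who-has ([\d.]+)"
    r"(?: \([0-9a-f:]+\))? tell ([\d.]+),"
)
REPLY = re.compile(r"^(\d\d:\d\d:\d\d\.\d+) ARP, Reply ([\d.]+) is-at ([0-9a-f:]+),")


def parse_time(stamp):
    # Time of day only: a capture that crosses midnight would need the date too.
    return datetime.strptime(stamp, "%H:%M:%S.%f")


def arp_resolution_delay(lines, asker, target, bogus_prefix="01:00:00."):
    """Return (requests_seen, seconds from first timed request to the reply).

    The delay is None when no reply attributable to `asker` shows up.
    """
    last_requester = {}  # target IP -> IP that most recently asked for it
    first_request = None
    requests = 0
    for raw in lines:
        line = raw.strip()
        m = REQUEST.match(line)
        if m:
            stamp, wanted, sender = m.groups()
            last_requester[wanted] = sender
            if wanted == target and sender == asker:
                requests += 1
                # Assumption: 01:00:00.x stamps are not real capture times.
                if first_request is None and not stamp.startswith(bogus_prefix):
                    first_request = parse_time(stamp)
            continue
        m = REPLY.match(line)
        if m:
            stamp, owner, _mac = m.groups()
            if stamp.startswith(bogus_prefix):
                continue
            # Heuristic: credit the reply to the most recent requester.
            if (owner == target and first_request is not None
                    and last_requester.get(owner) == asker):
                delay = parse_time(stamp) - first_request
                return requests, delay.total_seconds()
    return requests, None


# A few lines copied from the capture above.
SAMPLE = """
01:00:00.898095 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46
15:43:57.108258 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46
15:43:59.108250 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46
15:44:24.394553 ARP, Request who-has 192.168.1.56 tell 192.168.1.1, length 28
15:44:24.396576 ARP, Reply 192.168.1.56 is-at 20:32:33:32:46:a3, length 28
15:44:27.117939 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46
15:44:55.218309 ARP, Request who-has 192.168.1.56 tell 192.168.1.21, length 46
15:44:56.804844 ARP, Reply 192.168.1.56 is-at 20:32:33:32:46:a3, length 28
"""

count, seconds = arp_resolution_delay(SAMPLE.splitlines(), "192.168.1.21", "192.168.1.56")
print(count, seconds)
```

On these sample lines, the reply at 15:44:24 is skipped because the router (192.168.1.1) was the last host to ask for .56, so the delay is measured up to the reply at 15:44:56. Running tcpdump with `-e` would show the link-level destination of each reply and make the attribution heuristic unnecessary, but the regular expressions would then need adjusting.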

Do you know of a bug at this level?

Thank you in advance for your help. Very nice day to all.

Seb

PS: sorry for my English
 
This sounds like the problem I experienced on my RT-AC5300 where I cannot ping wifi devices after upgrading to any version past 386.3_2. The problem doesn't show up immediately. It shows up after a few days. I have to reboot the router to resolve the issue. Then it shows up again a few or several days later. The only resolution to the issue I have so far is to downgrade to 386.3_2. The problem started almost a year ago with the January release of Merlin.
 
Thanks for the feedback!
How is the downgrade from 386.7_2 to 386.3_2 in your opinion?
The only issue I had with the downgrade was messed up entries in the DHCP reservation list (if you have any). It was easy to fix by just updating each entry. Other than that, no issues, except that the arp/ping problem was fixed.
 
Cool, I will try this tonight.
I have 31 DHCP entries for fixed addresses. I hope they won't all get corrupted :(
 
Hi,

The downgrade to 386.3_2 was successful. I just had to reset the hostnames of my 31 DHCP entries (the IPs were still correctly assigned, but all the names had been lost).
Concerning the ARP problem, I will see over the next few days how it goes.
It is often the same devices that have trouble (like IP cams, air conditioning modules...)

BR
 
Did this solve your issue?
 
Hello there,

I was witnessing the exact same behaviour as the OP on a pair of RT-AC86U connected over AiMesh on Merlin firmware 386.11 as well as the Asus stock firmware before that. Much like Frank Munroe in #4, the problem shows up over time and is largely mitigated by reboot. Switching from Asus stock to Merlin 386.11 made it better, or maybe it was a placebo effect from the reboots involved and I got distracted by having more toys in the Merlin firmware to play with.

Regardless, under either of those newer firmwares, arp discovery would fail for... 10 seconds to five minutes or so. The same was true of IPv6 neighbor discovery. Devices that kept their arp cache current and/or kept their ipv6 neighbor "reachable" would stay fine.
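
For anyone wanting to watch this from the router side, the neighbor table states can be tallied over time. Here is a rough Python sketch, assuming `ip neigh show` is available on the device and prints the usual iproute2 format with the state as the last word on each line (the BusyBox variant may differ). The function name `group_neighbors_by_state` is hypothetical, and the sample text is illustrative, not output taken from any router in this thread.

```python
from collections import defaultdict

# Neighbor states as printed by iproute2's `ip neigh show`.
STATES = {
    "PERMANENT", "NOARP", "REACHABLE", "STALE", "NONE",
    "INCOMPLETE", "DELAY", "PROBE", "FAILED",
}


def group_neighbors_by_state(text):
    """Map each neighbor state to the list of addresses currently in it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        state = fields[-1]
        if state not in STATES:
            continue  # unexpected line format, skip rather than guess
        groups[state].append(fields[0])
    return dict(groups)


# Illustrative input only (made-up states and addresses).
SAMPLE_NEIGH = """\
192.168.1.21 dev br0 lladdr 5c:87:30:87:de:a8 REACHABLE
192.168.1.56 dev br0 lladdr 20:32:33:32:46:a3 STALE
192.168.1.99 dev br0  INCOMPLETE
fe80::1234 dev br0 lladdr 20:32:33:32:46:a3 router STALE
192.168.1.77 dev br0  FAILED
"""

for state, addresses in group_neighbors_by_state(SAMPLE_NEIGH).items():
    print(state, addresses)
```

Entries stuck in INCOMPLETE or FAILED during a bad spell would be consistent with discovery failing, but note this only shows the router's own neighbor table. It says nothing directly about whether one wifi client's broadcast reached another wifi client.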

This looks like some strictly layer2 access point level shenanigans. I'm giving the 386.3 downgrade a try, but sadly I'm probably going to have to wait a few days to see if the fix stays fixed, or I have to zigbee plug reboot these things once every few days, or I take them into the back garden and smash them to bits.

Does anyone else have something definitive to latch on to as a root cause that's observable from the router? I'm kinda getting the idea that Asus introduced "terrible magic" as part of whatever the AiMesh process is doing, and that it is causing this horribleness.
 
There was an issue that Asus introduced with networkmap repeatedly flushing the arp table. That may or may not be related to your problem.


I believe Asus has now fixed this problem but I don't know whether that change has made it into Merlin yet.

EDIT: No, it looks like 386.11 is too old to have picked up that fix. Try again when 386.12 is released.
 
Try turning off adaptive QoS. I had a similar issue with a router where I was setting priority tags on my LAN interfaces, with my Asus AP in access-point-only mode. It can kill the ARP broadcasts that devices use to find each other.
 
I've had adaptive QoS off the entire time. One of the RT-AC86Us is the router, the other is just an AP in AiMesh. I'm not using QoS in any way on the network.

The downgrade to 386.3_2 does not seem to have had any measurable effect. After an initial day or two of everything being more or less fine, wireless-to-wireless clients still see about 10 seconds to five minutes of discovery failure before they can reach each other.

Anything else I could try?
 
