
IoT Devices falling off nodes (only) always requires reboot to fix

Sorry but this statement is not correct. I have such a Switch in my system and I have recorded this a number of times.
With VLAN Access mode enabled, lots of device drops. Disabled, none. I am unable to remove the Switch remotely to conduct that test.
Did I miss something? I thought there were a couple of reports where this was occurring without any TP-Link switch involvement at all.
 
I don't use YazDHCP or YazFi since moving to BE and having GNP support. The clients on the MB are low bandwidth; they just don't support wireless, so this lets them connect since they are not near a router they could connect to directly. Since I have the ability to run a cable through the garage space a couple of rooms over to where the need is, this will take away one variable regardless; it's more a matter of when I can do so, as it's not the easiest task and requires some extra hands.
 
Yes, @penguin22 does not have one, so your statement is correct as far as his system is concerned. But I do have one installed, so for mine it’s not. I take your point, though, that if it’s happening to folks without a switch, then it’s not the switch. Unfortunately there are just not enough people testing it. I’m not sure whether it’s the cause of issues or not, as I cannot remove it to test.
 
Ok, that's what I thought. I do know there are some historical reports about a high number of bad Rx packets showing in the logs on these types of switches, but that was addressed in a 2019 post I saw from an engineer at TP-Link, as follows:

"Issue? I have a high RxBadPkt count on my network which is causing delays in some parts of my network, noticeable when i'm trying to open my online banking app, which just seems to load forever and times out eventually and when trying to stream a news video through an app (nu.nl), which will load eventually.

The high RxBadPkt count is on port 2 (EAP245) and port 9 (Trunk port to ER-LITE). Currently (after a reset due to some config changes) the statistics are :

Port 2 TxGoodPkt 56279 TxBadPkt 0 RxGoodPkt 35465 RxBadPkt 30455
Port 9 TxGoodPkt 37162 TxBadPkt 0 RxGoodPkt 64527 RxBadPkt 30449

all remaining ports have no bad packets on either the receiving or transmitting end and the ER-LITE statistics display no bad packets either.



Response:

When TL-SG1016PE receives the tagged packets of 64 bytes, the packets will be consider as RxBadPkt and RxGoodPkt together. So you will see the RxBadPkt.


But we don't need to worry about this RxBadPkt. The switch will still forward the tagged packets of 64 bytes. It will not block your network."

I can confirm I also see a significant number of bad Rx packets in my switch logs, but on both switches these are on the uplink ports only (i.e. the ones I'm connecting to the nodes), and there are zero bad Tx packets showing in the logs. I did some reading up on this and my understanding is that VLAN-tagged packets should be 68 bytes, and since the 802.1Q tag is 4 bytes I'm assuming any untagged traffic coming into that port is being logged as "Bad" but forwarded anyway as it should be, which is what that engineer was probably referring to.
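
For reference, here is a rough sketch of the frame-size arithmetic behind the 64 and 68 byte figures. It assumes standard Ethernet framing (14-byte header, 4-byte FCS, 64-byte minimum frame) plus the 4-byte 802.1Q tag; the function names are made up for illustration and say nothing about how the switch chipset actually counts packets. It shows that a tagged frame can legitimately be 64 bytes when the sender pads after tagging, and becomes 68 bytes when a tag is inserted into an already padded minimum-size frame, which lines up with the engineer's note about 64-byte tagged packets being counted as bad but still forwarded.

```python
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
DOT1Q_TAG = 4     # 802.1Q tag (TPID + TCI)
FCS = 4           # frame check sequence
MIN_FRAME = 64    # minimum Ethernet frame size, header through FCS


def frame_size(payload_len: int, tagged: bool) -> int:
    """Frame size (header through FCS) when the sender pads up to the 64-byte minimum."""
    raw = ETH_HEADER + (DOT1Q_TAG if tagged else 0) + payload_len + FCS
    return max(raw, MIN_FRAME)


def size_after_tag_insert(untagged_size: int) -> int:
    """Frame size after a tag is inserted into an existing frame without removing padding."""
    return untagged_size + DOT1Q_TAG


print(frame_size(1, tagged=False))   # tiny payload, untagged, padded by the sender
print(frame_size(1, tagged=True))    # tiny payload, tagged, padded by the sender
print(size_after_tag_insert(64))     # minimum untagged frame that gets tagged afterwards
```

The first two calls both come out at the 64-byte minimum and the third at 68, so small control or keepalive traffic on a tagged uplink can easily land on the 64-byte case.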

I'm not noticing any slowdown or disconnecting devices on my network, which currently has 27 different devices connected via WiFi to VLAN ID 53. I am using YazDHCP but, prior to that being updated to support GNP, I was assigning static IPs to all the clients right in GNP, and there doesn't seem to be any difference before or after that change. I can't help test anything with regard to wired VLAN on a node LAN port, but one thing I could do is flip one of the ports on my main router to VLAN tag a wired device and connect something to it to see if that produces any adverse results; if the TP-Link switches are wreaking havoc on Asus LAN ports being used for VLAN handling, you'd think it would occur whether the port is on the main router or a node. If that would be helpful, let me know.
 
I just went and re-read all of the posts in this thread from @penguin22 and here's something from one of them that caught my eye:

"Day 1 Morning; clearly have some instability, even with no wired VLAN configured on nodes, though I do have wired Access VLANs for several devices on the BE88U router as needed."

I'm curious what would happen on his setup if he temporarily disabled those Access VLANs and flipped those ports to "All (Default)" to see if the instability goes away (did you try that, @penguin22?).
 
I did not try that, as I do require those to be wired Access VLANs; they are IoT devices I don't want on my main network. Even the upcoming plan, with the TP-Link switch taking the place of the AX1800S, would be an Access VLAN off the BE88U.

While this could be something, it doesn't make sense to me since I don't have IoT devices, including those wired to the BE88U, losing connection. This includes no loss of connectivity from the BE82U and BE58 Go. If ASUS does have an issue outstanding, it shouldn't require customers to turn off features that are supposedly supported.
 
I can actually test this theory by doing what I suggested in a recent post: flipping one of my RT-AX88U Pro ports to Access and hooking a wired device to it to see if it causes any instability and, if so, where. The complication here is that people are reporting that some version of a configuration involving GNP, AiMesh, and wired VLANs is causing devices to behave erratically, but (and maybe I missed it) I haven't seen any posts saying to which node these wireless devices were connected while exhibiting instability and if those devices were limited to nodes where LAN ports had been enabled for wired VLAN tagging. I've got IoT devices connected wirelessly to the main router and both of my AiMesh nodes, so if I do this test I'm curious whether I see any instability and, if so, on which node or nodes it manifests.

Edit: I started the test. I hooked a wired device to LAN port 4 on my RT-AX88U Pro and it's configured for "Access" and the dropdown is set to my VLAN which is for my IoT devices (53). The device I wired in has full connectivity to the Internet and is being assigned a DHCP IP out of the 192.168.53.x block which is correct. Switching the LAN port setting causes a router reboot and when that happens there's always a couple of devices that don't reconnect cleanly. In this case, it was my WiFi doorbell camera and one of my WiFi security cameras. Power cycling both of them (like usual) fixed the problem and they're both back online and working as expected. I can leave this setup as is for as long as needed to test, but can someone give me a general idea of how long it normally takes for you to start seeing instability?
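
One way to put a timestamp on when instability starts is to poll the IoT devices from a wired machine and log every up/down transition. The sketch below is only that, a sketch: it assumes a dropped device stops answering ping, it assumes the Linux `ping` flags `-c 1 -W 1` (other platforms differ), and `ping_once`, `poll` and `watch` are illustrative names rather than anything from ASUS or Merlin tooling.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host: str) -> bool:
    """Return True if a single ICMP echo gets a reply (Linux ping flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def poll(hosts, state, probe=ping_once, now=datetime.now):
    """Probe each host once; return (timestamp, host, is_up) for every state change.

    The first poll reports every host, since no previous state is known yet.
    """
    changes = []
    for host in hosts:
        is_up = probe(host)
        if state.get(host) != is_up:
            state[host] = is_up
            changes.append((now(), host, is_up))
    return changes


def watch(hosts, interval_s=60):
    """Loop forever, printing a line whenever a host goes up or down."""
    state = {}
    while True:
        for stamp, host, is_up in poll(hosts, state):
            print(f"{stamp:%Y-%m-%d %H:%M:%S} {host} {'UP' if is_up else 'DOWN'}")
        time.sleep(interval_s)
```

Calling something like `watch(["192.168.53.10", "192.168.53.11"])` (hypothetical addresses in the IoT VLAN) and leaving it running overnight would give the first DOWN time per device, which can then be compared across nodes.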
 
I haven't seen any posts saying to which node these wireless devices were connected while exhibiting instability and if those devices were limited to nodes where LAN ports had been enabled for wired VLAN tagging.
On mine, two RT-AX86U Pro nodes (only) had VLAN (Access) enabled, both on Ethernet Port 1. The wired devices were simple wired ESP32s for use as BT Proxies in HA.

Instability was observed affecting wireless devices throughout the mesh (dropping off, getting stuck, often appearing online but not accessible via their own WebGUI or ping). The rest of the mesh is a GT-AX6000 plus a 3rd node (RT-AX58U), all with wired backhaul.

Difficult to pin timescale down but for me probably after one overnight if not earlier.

Thanks for testing !
 
Ok, thanks. This test I'm doing has no effect at all on my overall network configuration (the device I've wired in for the test isn't critical), so I can leave it configured the way it is now for as long as necessary.
 
I also see a significant number of bad Rx packets in my switch logs but in the case of both switches these are on the uplink ports only
Not to depart too much from the issue at hand but yep, too true, had me scratching my head for a long while as I kept thinking this cannot be right. RxBadPkt was running about 10% of the RxGoodPkt figures. This accepted solution made sense. It’s a chipset level issue that no amount of FW updates will fix (apart from hiding it completely).
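
For anyone wanting to put a number on their own ports, a quick sketch that works out RxBadPkt as a fraction of RxGoodPkt. It assumes the counters are pasted as text in the same layout as the TP-Link post quoted earlier in the thread (the two sample lines are taken from that post); `rx_bad_ratio` is just an illustrative name, not part of any switch tooling.

```python
import re

# Sample counters copied from the quoted TP-Link forum post.
STATS = """\
Port 2 TxGoodPkt 56279 TxBadPkt 0 RxGoodPkt 35465 RxBadPkt 30455
Port 9 TxGoodPkt 37162 TxBadPkt 0 RxGoodPkt 64527 RxBadPkt 30449
"""

LINE_RE = re.compile(
    r"Port (\d+) TxGoodPkt (\d+) TxBadPkt (\d+) RxGoodPkt (\d+) RxBadPkt (\d+)"
)


def rx_bad_ratio(text: str) -> dict:
    """Map port number to RxBadPkt / RxGoodPkt for every counter line found."""
    ratios = {}
    for match in LINE_RE.finditer(text):
        port, _tx_good, _tx_bad, rx_good, rx_bad = map(int, match.groups())
        ratios[port] = rx_bad / rx_good if rx_good else float("nan")
    return ratios


for port, ratio in rx_bad_ratio(STATS).items():
    print(f"port {port}: RxBadPkt is {ratio:.0%} of RxGoodPkt")
```

On the quoted figures this works out to roughly 86% on port 2 and 47% on port 9, so the ratio clearly varies a lot with the traffic mix on the port.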

IMG_3155.jpeg



one thing I could do is flip one of the ports on my main router to VLAN tag a wired device and connect something to it to see if that produces any adverse results
Thanks Seth, appreciate you running that test, it’ll hopefully help steer us in some direction just a wee bit more.
 
Will be watching with great interest but TBH there is nothing in the Improvements or Bug Fixes lists that leaps out at me ...
I didn't see it explicitly called out; however, I do believe that several could be a factor, most notably (with help in research using AI):

1. High relevance

Many IoT devices:

  • Implement 802.11k partially
  • Do not support or actively reject 802.11v
  • Behave badly when APs try to steer them anyway
In AiMesh environments, this often manifests as:

  • Device associates fine initially
  • Roaming or BSS transition info gets exchanged
  • Device never properly reassociates after a sleep / power‑save cycle
  • Appears “connected” but stops passing traffic until reconnect
Govee devices are known to:

  • Be extremely conservative with roaming
  • Prefer “sticky” AP behavior
  • Misbehave when k/v expectations aren’t aligned
This change suggests ASUS refined how the controller behaves when k is present but v is not, which is exactly the IoT danger zone in mesh setups.

✅ This is one of the strongest signals that your issue could improve.


2. Moderate relevance (indirect)

Why this matters in a mesh + VLAN setup:

  • AiMesh backhaul often runs over the 2.5G port on the main router
  • VLAN tagging (GNP) increases buffer and driver complexity
  • Port instability can cause brief control‑plane interruptions without a full link drop
What IoT devices experience:

  • DHCP lease still valid
  • Wi‑Fi association intact
  • Traffic silently blackholed after a transient backhaul hiccup
These are classic “it works for a day, then dies” symptoms.

✅ Not IoT‑specific, but very relevant if your mesh backhaul or uplink uses 2.5G.


3. Moderate relevance

This is vague, but in ASUS firmware language this usually touches:

  • NAT state handling
  • Conntrack cleanup
  • Background watchdog behavior
IoT devices are:

  • Long‑idle
  • NAT‑dependent
  • Often rely on keepalives that are barely frequent enough
If conntrack or session cleanup was too aggressive—especially across VLANs—devices can appear “online” locally but unreachable by cloud services.

✅ Possible improvement, especially if Govee devices stop responding before Wi‑Fi disconnects.
 
I didn't see it explicitly called out, however do believe that several could be a factor, most notably (with help in research using AI):
I'm not sure I trust that AI result ... or maybe you didn't ask it the right question and didn't scold it when it gave you the wrong answer :-).

Whilst there may be documented improvements in wireless connectivity in isolation, nothing in the list suggests that wireless instability occurring as a result of VLAN propagation to Ethernet ports via Access/Trunk mode has been addressed, IMO.

The ASUS reference to enhanced stability for 2.5G LAN/WAN port operations is for me a non-starter as it is limited to the 2.5G LAN/WAN ports and not any Ethernet port on Main or Nodes.

And what do you know, I did that all without AI ... just I.
 
