RT-AX3000 crashing every few days in bcmsw_rx (Tried hw/sw reset/rebuilds and even new replacement router)

The Marantz may have violated some signaling or voltage requirement; that is certainly possible. I don't know which chipset(s) Marantz uses, but they are surely not making their own. Out of curiosity, had you used your Marantz device with any other router without crashing it before the RT-AX58U?

No, I never did. It never used to happen on my AC-86U (I'm not 100% sure, but pretty confident). It started suddenly; maybe the Marantz had a firmware update, who knows.
 
Seeing the same crashes here on my RT-AX3000 w/386.3_2. Happens once every 1-3 days. Don't use guest network at all.

Interestingly, I see a TON of "wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set " errors right before the kernel panic. Not sure that this happens every time, but it's happened for at least 3 of the crash logs I've looked at.

I've attached the crashlog, which shows all sorts of errors. Going to downgrade to 384.

May 5 01:05:02 crashlog: <6>CFG80211-ERROR) wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set
May 5 01:05:02 crashlog: <6>CFG80211-ERROR) wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set
May 5 01:05:02 crashlog: <6>CFG80211-ERROR) wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set
May 5 01:05:02 crashlog: <6>CFG80211-ERROR) wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set
May 5 01:05:02 crashlog: <6>CFG80211-ERROR) wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set
May 5 01:05:02 crashlog: <6>CFG80211-ERROR) wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set
May 5 01:05:02 crashlog: <6>CFG80211-ERROR) wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set
May 5 01:05:02 crashlog: <6>CFG80211-ERROR) wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set
May 5 01:05:02 crashlog: <6>CFG80211-ERROR) wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set
May 5 01:05:02 crashlog: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000198
May 5 01:05:02 crashlog: <1>pgd = c0014000
May 5 01:05:02 crashlog: <1>[00000198] *pgd=00000000
May 5 01:05:02 crashlog: <0>Internal error: Oops: 17 [#1] PREEMPT SMP ARM
May 5 01:05:02 crashlog: <4>CPU: 0 PID: 239 Comm: bcmsw_rx Tainted: P O 4.1.52 #1
May 5 01:05:02 crashlog: <4>Hardware name: Generic DT based system
May 5 01:05:02 crashlog: <4>task: d7752400 ti: d77c6000 task.ti: d77c6000
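For anyone comparing several crash logs, here is a minimal Python sketch that counts the WLC_SCB_AUTHORIZE errors logged before the kernel fault and pulls out the faulting task name. The function name `summarize_crashlog` and the shortened `SAMPLE` text are illustrative assumptions; the sample lines are copied from the excerpt above, and a real run would read the saved crashlog file instead.

```python
import re

# A shortened sample built from lines in the excerpt above (assumption: a real
# run would read the full saved crashlog file instead of this string).
SAMPLE = """\
May 5 01:05:02 crashlog: <6>CFG80211-ERROR) wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set
May 5 01:05:02 crashlog: <6>CFG80211-ERROR) wl_cfg80211_change_station : WLC_SCB_AUTHORIZE sta_flags_mask not set
May 5 01:05:02 crashlog: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000198
May 5 01:05:02 crashlog: <0>Internal error: Oops: 17 [#1] PREEMPT SMP ARM
May 5 01:05:02 crashlog: <4>CPU: 0 PID: 239 Comm: bcmsw_rx Tainted: P O 4.1.52 #1
"""

ERROR_MARKER = "WLC_SCB_AUTHORIZE sta_flags_mask not set"
FAULT_MARKER = "Unable to handle kernel"
COMM_RE = re.compile(r"Comm:\s+(\S+)")


def summarize_crashlog(text):
    """Count marker errors seen before the first kernel fault and find the task name."""
    errors_before_fault = 0
    fault_seen = False
    task = None
    for line in text.splitlines():
        if FAULT_MARKER in line:
            fault_seen = True
        elif not fault_seen and ERROR_MARKER in line:
            errors_before_fault += 1
        # The "Comm:" field on the CPU line names the task that was running.
        match = COMM_RE.search(line)
        if fault_seen and task is None and match:
            task = match.group(1)
    return {
        "errors_before_fault": errors_before_fault,
        "fault_seen": fault_seen,
        "task": task,
    }


if __name__ == "__main__":
    print(summarize_crashlog(SAMPLE))
```

Running it over each saved log would show whether the error burst really precedes every bcmsw_rx crash or only some of them.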
 

Attachments

  • AsusCrash.txt
    60.5 KB
Yup. Still waiting for fixes. Earlier in the thread, the version this problem started with was isolated; it relates to newer, unstable Broadcom drivers (binaries that are not part of the open source project). I went back to 384.19, since 386 is not stable at this point and no new binaries have been integrated or workarounds attempted in 386. I ran for 32 days on 384.19 until I had a weird issue that required a reboot. Once a meaningful attempt at a fix or update is made, I will give 386 another try.
 

Attachments

  • uptime.png
    29.4 KB
@seanwo I will upgrade and let you know. That said, I've had a single crash on 384 in the same bcmsw_rx module, after the router ran out of memory. People having this issue should run top to check how much free memory they have, then press M to see whether conn_diag is using a lot of memory. One crash in 15 days is much more reliable than 386.3 was (but I'm still not happy with that reliability).

There appears to be a memory leak in conn_diag (currently using 98MB of RAM after 3 days). I see conn_diag was disabled by Merlin, but it doesn't look like that change got merged into any branch?
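As an alternative to reading top by eye, the resident memory of a process can be read from the `VmRSS` line of `/proc/<pid>/status` on Linux. The sketch below parses that field from text copied off the router; the function name `parse_vmrss_kb` and the sample status text are made up for illustration and are not real output from this router.

```python
import re

# Illustrative status text only (assumption): the layout follows Linux
# /proc/<pid>/status, but these values were not captured from a real router.
SAMPLE_STATUS = """\
Name:\tconn_diag
State:\tS (sleeping)
VmSize:\t  120000 kB
VmRSS:\t  100352 kB
"""

VMRSS_RE = re.compile(r"^VmRSS:\s+(\d+)\s+kB", re.MULTILINE)


def parse_vmrss_kb(status_text):
    """Return resident set size in kB, or None if the field is absent."""
    match = VMRSS_RE.search(status_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    rss_kb = parse_vmrss_kb(SAMPLE_STATUS)
    print(f"conn_diag RSS: {rss_kb / 1024:.1f} MB")
```

Saving the status text at intervals and parsing it this way gives comparable numbers from one check to the next.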


Upgrading to 386.4 now and will test.
 
I have upgraded this morning to 386.4_0 and am testing. On 384.19_0, memory usage of conn_diag had grown by 6MB in the last 16 hours, confirming that there was a memory leak in that old firmware, at least with my hardware.

After upgrade, conn_diag is using 13MB RAM total.
 
8 hours later, conn_diag is still using 13MB RAM. So the memory leak present in 384.19_0 is fixed in 386.4_0 (and likely some earlier firmware). I have not had a crash, but 8 hours is not enough time to confirm stability.
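The two measurements above can be turned into a growth rate for comparison. This is a rough Python sketch using only the figures quoted in the posts (6MB over 16 hours on 384.19_0, no change over 8 hours on 386.4_0); the function name `growth_mb_per_hour` is hypothetical, and a linear rate is an assumption, since the leak may not have grown steadily.

```python
def growth_mb_per_hour(delta_mb, hours):
    """Average memory growth in MB per hour over an observation window."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return delta_mb / hours


if __name__ == "__main__":
    # 384.19_0: conn_diag grew by 6 MB in 16 hours.
    old_rate = growth_mb_per_hour(6, 16)
    # 386.4_0: 13 MB at upgrade, still 13 MB 8 hours later.
    new_rate = growth_mb_per_hour(13 - 13, 8)
    print(f"384.19_0: {old_rate} MB/h ({old_rate * 24} MB/day)")
    print(f"386.4_0:  {new_rate} MB/h")
```

At the 384.19_0 rate, that works out to roughly 9MB per day, while the 386.4_0 window shows no growth at all.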

I also had some errors in dmesg: "nf_conntrack: expectation table full".
I adjusted the TCP connections limit (under Tools -> Other Settings) to 49152 (previously 300000) and set ct_hashsize=16384 via nvram set.

I then restarted the service with service restart_conntrack. I wanted to note this, as I made these changes in addition to the firmware upgrade. I don't expect it has anything to do with the crashing, but it is a difference.
 
RT-AX1000 crashed also with 386.4 after almost two days of stable run, but that is a different problem...
 
conn_diag memory leak is gone. Still using 13MB after a week.

Crashes might also be fixed... 1 week later and no crashes. Also no random Wi-Fi interruptions on 5 GHz, which used to happen occasionally (about once an hour for ~5 seconds).

Will need more time to test, but this is looking like it might finally make the router work well...
 
No more unexpected crashes or events. Pretty confident things are good with the 386.4 firmware. Thank you Merlin for all your great work!

Now configuring dual lan and installing APs, so I'll be disabling some of the wireless features. This may invalidate my testing or possibly uncover new bugs, but if you're having this issue, I strongly recommend you upgrade to 386.4_0.
 
FYI: After upgrading, I've been having this issue. I don't know what's causing it, and it could be that I configured dual wan, or it could be a problem with the firmware.
 
386.4 was the first firmware for the RT-AX3000 where I encountered issues with performance. I reverted to 386.3_2.
 
