Halp! Getting router bootloop after latest firmware upgrade.

AppleBag

Regular Contributor
hi all :)

My GT-AX6000 was running the latest 3004 version of Merlin, and tonight I flashed it to the latest 3006. I added about 3 isolated guest IoT VLANs, and suddenly I'm stuck in a bootloop: the router only runs for about 1 or 2 minutes after booting before it crashes and boots again. Reading Linux device logs isn't really my strong suit and I can't seem to glean the problem. Can anyone please help me out?

Here's my log just before a crash: clicky
 
Why do you have YazFi on 3006 firmware?

Code:
May 13 01:23:05 YazFi: YazFi v4.4.4 starting up


Seems to be shutting down shortly after mounting the USB and starting scripts.
1. Remove YazFi and try again
2. If it continues, remove the USB completely, reboot, and try again.
 
Thank you, I read the included docs and saw no mention of removing YazFi or any others first. 🤷🏻

I uninstalled it and it was still freezing and rebooting, and then also removed the USB and rebooted, but unfortunately it's still doing the same. Is there anything else insidious spotted in the log I linked?

I'm not home for a few hours, but I can post the most recent logs again from after following your instructions above to see if it helps?
 
Keep the USB out, and post complete logs of the bootloop; hopefully we will be able to see more.
 
See the following link for a list of 3006 supported and unsupported addons. YazFi is no longer being actively developed and is not supported under 3006 firmware.
https://www.snbforums.com/threads/c...ons-with-gt-be98-pro.90657/page-7#post-947006
Several of the Guest Network Pro features are similar to those found in YazFi.

You may have to do a hard factory reset if things are still wonky after uninstalling YazFi following the update from 3004.388.x firmware to 3006.102.x firmware. If you don't want to do a hard factory reset, then use SSH to access the JFFS directory on the router and see if there are any leftover YazFi files after the YazFi uninstall. If there are, remove them and reboot the router.
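
If you have copied the JFFS contents off the router to a PC, the check for leftovers could be sketched like this. This is only a rough illustration: the function name `find_leftovers` and the idea of searching for names containing "yazfi" are my assumptions, not something the addon documents, and the router itself does not ship with Python.

```python
import os


def find_leftovers(root, needle="yazfi"):
    """Return sorted paths under root whose file or directory name
    contains needle (case-insensitive). The needle is an assumption."""
    needle = needle.lower()
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check directory names as well as file names at this level.
        for name in dirnames + filenames:
            if needle in name.lower():
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Calling `find_leftovers("jffs_copy")` on a local copy (the folder name is hypothetical) returns the matching paths; anything it lists is a candidate to inspect on the router before deleting.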
 
Does this ubi error appear frequently?
Code:
May 13 01:23:07 kernel: bcm63xx_nand ff801800.nand: program failed at 6540600
May 13 01:23:07 kernel: ubi0 error: ubi_io_write: error -5 while writing 2048 bytes to PEB 794:0, written 0 bytes
May 13 01:23:07 kernel: CPU: 1 PID: 118 Comm: ubi_bgt0d Tainted: P           O      4.19.183
May 13 01:23:07 kernel: Hardware name: GTAX6000_50991 (DT)
May 13 01:23:07 kernel: Call trace:
May 13 01:23:07 kernel:  dump_backtrace+0x0/0x150
May 13 01:23:07 kernel:  show_stack+0x14/0x20
May 13 01:23:07 kernel:  dump_stack+0x94/0xc4
May 13 01:23:07 kernel:  ubi_io_write+0x574/0x690
May 13 01:23:07 kernel:  ubi_io_write_ec_hdr+0xc4/0x110
May 13 01:23:07 kernel:  sync_erase.isra.0+0x11c/0x1f0
May 13 01:23:07 kernel:  __erase_worker+0x5c/0x4a0
May 13 01:23:07 kernel:  erase_worker+0x18/0x80
May 13 01:23:07 kernel:  do_work+0x98/0x120
May 13 01:23:07 kernel:  ubi_thread+0x108/0x190
May 13 01:23:07 kernel:  kthread+0x118/0x150
May 13 01:23:07 kernel:  ret_from_fork+0x10/0x24
May 13 01:23:07 kernel: ubi0: dumping 2048 bytes of data from PEB 794, offset 0
May 13 01:23:07 kernel: ubi0 error: __erase_worker: WAR: failed to erase PEB 794, retry count 1
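
To answer the "how frequently" question from a saved copy of the log, something like the following could tally the ubi0 error lines per eraseblock. It is a minimal sketch: the regular expression is inferred only from the two error lines quoted above, and `count_ubi_errors` is a made-up helper name.

```python
import re
from collections import Counter

# Matches "ubi0 error: ... PEB <number>" as seen in the quoted kernel lines.
PEB_RE = re.compile(r"ubi0 error: .*?PEB (\d+)")


def count_ubi_errors(lines):
    """Count ubi0 error lines per physical eraseblock (PEB) number."""
    counts = Counter()
    for line in lines:
        match = PEB_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Two of the lines quoted above, used as sample input.
SAMPLE = [
    "May 13 01:23:07 kernel: ubi0 error: ubi_io_write: error -5 while "
    "writing 2048 bytes to PEB 794:0, written 0 bytes",
    "May 13 01:23:07 kernel: ubi0 error: __erase_worker: WAR: failed to "
    "erase PEB 794, retry count 1",
]

for peb, hits in sorted(count_ubi_errors(SAMPLE).items()):
    print(f"PEB {peb}: {hits} error line(s)")
```

Feeding it the full log (for example `count_ubi_errors(open(path))`) would show whether the errors stay on one block or spread across many.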
 
Sorry for the wait guys, just got home, and the way this thing is currently behaving, it takes like 20 mins just to be able to load a page in the router. Constant freezes and stuff. I'm even having a hard time SSHing into the router due to long freezes.

I have been running it overnight w/out the USB in, and just now finally got a log I could copy via the WebUI general logs tab, though it seems to only have copied a few minutes for some reason(?) Here's the log.
 
Hi Dave 👋:)

I don't see any ubi errors in the recent log I just linked, but, and I could be completely wrong here, isn't the ubi related to the USB? The USB wasn't plugged in while the above log was collected, so I won't know whether I get a lot of those until I try running with the USB in again. I guess I should tackle that after sorting out the problem with the USB out?
 
Disable the logging of packets on the Firewall page. It’s polluting the system log and probably overworking the router.

Set the “Disable Asusnat tunnel” option to Yes on Administration / Tweaks page. That aae/mastiff process seems to keep dying anyway, based on the log.

ubi is related to the internal storage of the router, not the USB.
 
OK, so I was only using the firewall for the "block DDoS" option (not sure if it's required for AiMesh or not though?), so just for now I completely disabled it in the UI. As an aside, I'm also running Skynet, so do I need them both running at the same time? Or is Skynet enough? But I ALSO temporarily disabled Skynet for now via amtm.

And I disabled the Asusnat tunnel (again, no idea what that really does, and not sure whether AiMesh uses it?)

I've disabled all guest networks for now, as well.

BUT I'm still having the same behavior. Not sure if it's still doing full reboots or not, but the WebUI is achingly slow; so much so that it usually times out before finishing loading a page, if it loads it at all. As for trying to use any network devices, such as my Ring cameras and such, they will work for about 10 seconds, and then stop working for much longer, maybe 15-20 minutes, in a repeating fashion.

Even trying to SSH into the router often hangs and/or disconnects (screenshot shows it hung for about 15 secs before menu finished loading, then disconnected):

1747391536912.png


I think I got better logs this time though, and in one of them I happened to notice an error screaming at me about a couple things:

NTP Failed To Start After 5 Minutes - Please Fix Immediately!

Private WAN IP Detected 192.168.100.20 - Please Put Your Modem In Bridge Mode / Disable CG-NAT

I have no idea whether that WAN error has always been there on the old firmware, but I didn't actually change anything between the last firmware and this firmware and it was working fine on the last firmware.
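
For anyone wondering why 192.168.100.20 triggers that warning: it falls in a private address range. A small sketch with Python's standard `ipaddress` module shows the distinction; the `classify_wan_ip` helper and its three labels are just my own illustration, not how the addon performs its check.

```python
import ipaddress

# RFC 6598 shared address space, commonly used for carrier-grade NAT.
CGNAT_NET = ipaddress.ip_network("100.64.0.0/10")


def classify_wan_ip(addr):
    """Label an address as 'cgnat', 'private', or 'public' (labels are illustrative)."""
    ip = ipaddress.ip_address(addr)
    if ip in CGNAT_NET:
        return "cgnat"
    if ip.is_private:
        return "private"
    return "public"


print(classify_wan_ip("192.168.100.20"))  # the WAN IP from the warning
```

A "private" or "cgnat" result on the WAN side means some device upstream of the router is doing NAT, which is what the warning is complaining about.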


Also, I see this in there; is this something?

kernel: br0: received packet on wds0.0.1 with own address as source address (addr:04:42:1a:59:70:08, vlan:0)

Please see the logs. This has turned into a mini-nightmare. 😅

messages.log skynet-0.log
 
The Skynet log is aged and irrelevant based on the dates within it. Your use of Scribe seems to create gaps in the messages.log file when the router is actually booting up, so it’s very hard to see what’s happening. It certainly looks like frequent reboots since the date resets often to Dec 31.

Factory reset the router, install no addons or USB stick, and see if it stabilizes for a few days.
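
If you want to put a number on those date resets, a rough sketch like this counts how often the log falls back to the unsynced default date. It assumes the default is "Dec 31" as seen in your log, and `count_clock_resets` is a hypothetical helper name; a log that genuinely spans New Year's Eve would confuse it.

```python
def count_clock_resets(lines, default_date="Dec 31"):
    """Count how many times the log enters a run of lines stamped with
    the router's unsynced default date (assumed to be 'Dec 31')."""
    resets = 0
    previous_was_default = False
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Syslog pads single-digit days, so rebuild "Mon D" from the fields.
        is_default = " ".join(fields[:2]) == default_date
        if is_default and not previous_was_default:
            resets += 1
        previous_was_default = is_default
    return resets
```

Each counted run is most likely one boot, so the count divided by the hours the log covers gives a rough reboot rate.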
 
Just an update: it turns out that one of the routers in the mesh, an RT-AC68U which had not caused an issue on 3004, started causing a loop on 3006. The line:

kernel: br0: received packet on wds0.0.1 with own address as source address (addr:04:42:1a:59:70:08, vlan:0)

is what tipped me off. I unplugged all LAN cables from the main router, GT-AX6000, left the USB in, and 1 by 1 started adding each 1 back until I noticed the behavior. Currently everything original is back in place as was except for that 1 lan cable to the mentioned router, and everything seems ok going on about 24+ hours now.
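
In case it helps anyone else hunting the same thing, here is a rough sketch of how the log could be scanned for that message and grouped by the interface it arrived on. The pattern is based only on the single line quoted above, and `loop_hits_by_interface` is a name I made up.

```python
import re
from collections import Counter

# Pattern derived from the kernel line quoted above.
LOOP_RE = re.compile(
    r"received packet on (\S+) with own address as source address "
    r"\(addr:([0-9a-fA-F:]{17}), vlan:(\d+)\)"
)


def loop_hits_by_interface(lines):
    """Count 'own address as source address' messages per (interface, MAC)."""
    counts = Counter()
    for line in lines:
        match = LOOP_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2).lower())] += 1
    return counts
```

Running it over messages.log with the cable plugged in versus unplugged should show whether the messages stop when the loop is broken.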

And to be even more specific, the 68U is even still working fine, but it's connected to the mesh via Wi-Fi rather than Ethernet; so somehow it appears it's just something to do with its Ethernet config that's causing the calamity.

Now I'm just not exactly sure how to resolve that issue. Technically I could just shrug my shoulders and be happy it all seems to mostly be working and leave it as is, or try to resolve the loop caused by the ethernet cable to that router being connected to the main router, which I'd rather do, since as we all know, hard wired is generally more stable.

Biggest problem is when I try to plug that cable in, trying to access the main router, whether by ssh, or web ui, is an exercise in sheer determination and the ability to fend off insanity. Between the constant looooooong freezes, reboots, etc. trying to change any setting (which I'm not sure what to do) is painful.

Any ideas anyone?
 
Try a different port on the main router, and on the node if possible.
Try a new / known-good cable.

Does the issue follow either the port or the cable?
 
