AC68u fast blinking power light from time to time (nvram issue)

bibikalka

Regular Contributor
I have this flaky AC68u which from time to time totally forgets its NVRAM. When that happens, the power light usually blinks rapidly.

I can usually restore the router by holding the WPS button while powering on, though it takes multiple attempts. It comes back up fully factory reset, but it forgets again soon.

I am using it as an AiMesh node, and saved its config via ssh. It's also wired to one of my computers. So whenever it comes back up blank, I can re-load the config on the very first screen, and it becomes the AiMesh node again (the head node picks it back up). Unfortunately, it takes several attempts (~5) to make the re-loaded config stick. It'll be good for a couple of weeks, then it forgets again.

The ROM appears to be OK: I can upload a new version easily, with no issues.

Is there anything I could do to force the router to move the NVRAM area to a more stable flash region? I feel this has something to do with a flaky flash chip, and the NVRAM is hitting bad blocks.
 
This is a strong indication that this node of the mesh will soon be going to the great electronics recycler in the sky.
(I wonder if using it as a Media Bridge rather than AiMesh node will lessen the load on NVRAM?)
 
@bibikalka, what firmware do you have running on your router? What version was installed when you saved your config file that you ssh install/configure? Have you tried adding the AiMesh node properly, via the GUI of the main router, after you have fully reset the errant node?

Have you tried following the process below (and not using ssh to configure the router afterward) to see if you can truly and fully reset it?

Nuclear Reset https://www.snbforums.com/threads/major-issues-w-rt-ac86u.56342/page-4#post-495710
 
It's running Merlin 386.12_0, and that's the same version the config was saved from. I was tired of moving it back and forth to the main router to set up AiMesh, so I saved the config via ssh. At least now I don't have to physically move anything! I can restore the AiMesh node from the same physical location, though it still takes several attempts.

I don't see that the resets in the list above would do anything, since I am already using the WPS button a bunch of times to revive it. What I need is to force the device to somehow move the NVRAM area to a more stable flash memory location.
 
What the link I provided above does is reset the router (in as many ways as possible, along with formatting the JFFS partition) and re-flash the same firmware (that you want to use) as many times as you are willing to do. This has helped more than a few routers become stable enough to be used reliably again.


 
My AC1900 has a couple of times stopped allowing clients to join the 5 GHz network (existing clients are fine, but if I leave and come back my phone can't reconnect, etc.). 2.4 GHz may have the issue too, but I don't have any 2.4 GHz devices on which I would have noticed. I didn't look at the light, but I don't think it was blinking fast. The first time, a power cycle fixed it. The second time, the power cycle reset the router back to factory defaults. So obviously a bad cell or flash chip.

I did a hard factory reset, a soft factory reset, and several "format NVRAM at next boot" passes (once before enabling scripts, once after). The hope is that the controller will mark the bad cell(s) and be OK, but obviously I'll be keeping an eye out for a good deal on a replacement this Black Friday. I also updated to 386.12 as part of the process. We'll see.

It may be something you want to try. Flash memory should be able to mark bad cells and skip them, just like a hard drive with bad sectors. But once cells start to die, it is often a snowball effect and they just keep going, so it is usually a sign that the device is on its way out.
 
I thought the option was "format jffs at next boot" - not "format nvram ...".

Mine does work for stretches of time - and then it loses it. If it's stable - ROM updates seem to work fine.

Overall, I do have client connection issues after a long uptime when my main router needs a reboot. But! A normal router would just reboot, and that's it. This flaky one - can lose it ...
 
Sounds like it has nothing to do with the flash chip. Check your WPS button: if it is faulty and registers as constantly pressed, the router will clear the NVRAM every time you power it on.
 
Yeah I meant jffs, typo.
 
Interesting thought!

I notice sometimes it sort of loses functionality while running, and when I reboot - it starts blinking. The prior reboot was good, and it even worked for a bit. Alright, I think I've exhausted my patience with it - will replace!
 
Nothing wrong with trying to wring every last drop of usefulness out of stuff. Sometimes we get carried away, though.

Does that firmware use mtd and have the "mtdinfo" and "lsmtd" executables? "mtdinfo -M /dev/mtd[pertinent#]" will give a listing of bad erase blocks. In that scheme "cat /proc/mtd" and "lsmtd" commands ought to help you determine where your nvram is being mapped.

Apologies if that paragraph is a "Mr. Obvious" sort of thing.
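
To make the "where is nvram mapped" step concrete, here is a minimal sketch that looks up a partition by name in /proc/mtd. It assumes the usual Linux layout of that file (a header line, then rows like `mtdN: <size> <erasesize> "<name>"`) and a partition name without spaces. The function name `find_mtd_by_name` and its optional second argument are made up for illustration, not part of any firmware.

```shell
#!/bin/sh
# find_mtd_by_name NAME [TABLE]
# Prints the mtd device (e.g. "mtd1") whose quoted name in TABLE equals NAME.
# TABLE defaults to /proc/mtd; the optional argument is a hypothetical
# convenience so the lookup can be tried against a saved copy of the table.
find_mtd_by_name() {
    awk -v want="$1" '
        # Skip the header line; field 4 holds the name in double quotes.
        NR > 1 && $4 == "\"" want "\"" {
            sub(/:$/, "", $1)   # strip the trailing colon from "mtdN:"
            print $1
        }
    ' "${2:-/proc/mtd}"
}
```

On the router this would be called as `find_mtd_by_name nvram`, and the result can then be passed to `mtdinfo -M /dev/<result>` if that tool is present.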
 
This actually sounds interesting! I will try it, and see where it goes.
 
If it's pertinent, you can go ahead and check each "partition" for bad blocks to get an idea of the chip's general health.
 
Here's a script I call "badblocks" if you can use it:
Bash:
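#!/bin/sh
# badblocks: report bad erase blocks for one MTD partition.

case $1 in
    [0-9]* )
        echo
        # Header line of /proc/mtd plus the row for the requested partition
        sed -n '1p; /^mtd'"$1"':/p' /proc/mtd
        echo
        echo "mtdinfo -M /dev/mtd${1} says:"
        # Print every line that mentions BAD and count the occurrences
        mtdinfo -M "/dev/mtd${1}" | awk -F 'BAD' '
            BEGIN { bads = 0 }
            /BAD/ { bads = bads + (NF - 1); print }
            END   { printf "( %d bad blocks )\n", bads }'
        echo
        echo "/sys/class/mtd/mtd${1}/bad_blocks says:"
        cat "/sys/class/mtd/mtd${1}/bad_blocks"
        echo
        ;;

    * ) cat <<EOF

usage: $0 <mtd number>

EOF
        ;;
esac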
 
OK, I tried it, but it does not report bad blocks on the AC68U. It seems that piece of information is missing.
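
Since the script's last step reads /sys/class/mtd/mtdN/bad_blocks, and not every kernel exposes that attribute, a small guard can tell "no bad blocks" apart from "no counter available". This is only a sketch; the function name `report_bad_blocks` is hypothetical.

```shell
#!/bin/sh
# report_bad_blocks DIR
# DIR is an MTD sysfs directory such as /sys/class/mtd/mtd1.
# Prints the counter if the kernel exposes it, otherwise says so
# instead of failing with a "No such file" error from cat.
report_bad_blocks() {
    if [ -r "$1/bad_blocks" ]; then
        echo "bad_blocks: $(cat "$1/bad_blocks")"
    else
        echo "bad_blocks: not exposed by this kernel"
    fi
}
```

For example, `report_bad_blocks /sys/class/mtd/mtd1` would print one of the two messages depending on what the firmware's kernel provides.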

I was wondering if I could somehow just write zeros to mtd1 a bunch of times, and have it deal with badblocks.

Code:
admin@RT-AC68U-FF68:/jffs/scripts# ls -lta /sys/class/mtd/mtd1
drwxr-xr-x    2 admin    root             0 Dec 31  1969 .
drwxr-xr-x   14 admin    root             0 Dec 31  1969 ..
lrwxrwxrwx    1 admin    root             0 Dec 31  1969 block:mtdblock1 -> ../../../block/mtdblock1
-r--r--r--    1 admin    root          4096 Dec 31  1969 dev
-r--r--r--    1 admin    root          4096 Dec 31  1969 erasesize
-r--r--r--    1 admin    root          4096 Dec 31  1969 flags
-r--r--r--    1 admin    root          4096 Dec 31  1969 name
-r--r--r--    1 admin    root          4096 Dec 31  1969 numeraseregions
-r--r--r--    1 admin    root          4096 Dec 31  1969 oobsize
-r--r--r--    1 admin    root          4096 Dec 31  1969 size
-r--r--r--    1 admin    root          4096 Dec 31  1969 subpagesize
lrwxrwxrwx    1 admin    root             0 Dec 31  1969 subsystem -> ../../mtd
-r--r--r--    1 admin    root          4096 Dec 31  1969 type
-rw-r--r--    1 admin    root          4096 Dec 31  1969 uevent
-r--r--r--    1 admin    root          4096 Dec 31  1969 writesize
admin@RT-AC68U-FF68:/jffs/scripts# cat /sys/class/mtd/mtd1/*
cat: read error: Is a directory
90:2
131072
0x400
nvram
0
0
1572864
2048
cat: read error: Is a directory
nand
MAJOR=90
MINOR=2
DEVNAME=mtd1
DEVTYPE=mtd
2048
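
The numbers in that output can be turned into the partition's geometry. The sketch below assumes the values map to the sysfs files in the order the shell expanded `*` (so erasesize is 131072, size is 1572864, and writesize is 2048); the function name `mtd_geometry` is made up for illustration.

```shell
#!/bin/sh
# mtd_geometry DIR
# DIR is an MTD sysfs directory such as /sys/class/mtd/mtd1.
# Derives how many erase blocks the partition has and how many
# write pages fit in one erase block, using the size, erasesize
# and writesize attributes (all in bytes).
mtd_geometry() {
    size=$(cat "$1/size")
    erasesize=$(cat "$1/erasesize")
    writesize=$(cat "$1/writesize")
    echo "erase blocks: $((size / erasesize))"
    echo "pages per erase block: $((erasesize / writesize))"
}
```

With the values shown above, that works out to 1572864 / 131072 = 12 erase blocks of 64 pages each, so even one or two bad blocks would be a noticeable fraction of the nvram partition.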

I decided to hammer the nvram partition (I ran the commands below several times):
Code:
md5sum /dev/mtd1
mtd-erase2 nvram
dd if=/dev/urandom of=/dev/mtd1 bs=131072
md5sum /dev/mtd1
mtd-erase2 nvram
nvram commit
strings /dev/mtd1

If the problem is indeed in the nvram flash area, this may help. Will monitor what it does!

P.S. Also cleared jffs - just to be sure:
Code:
nvram set jffs2_format=1
nvram commit
reboot

P.P.S. Actually, the following command hammers the flash like there is no tomorrow. The 512-byte block size is far smaller than what the flash wants to write in one chunk, so this took a few minutes to finish, whereas bs=131072 finishes in a couple of seconds:
Code:
dd if=/dev/urandom of=/dev/mtd1 bs=512
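
A rough way to see the difference between the two block sizes is to count how many writes dd has to issue to cover the partition. This is a back-of-the-envelope sketch that assumes the 1572864-byte partition size reported by sysfs and full-sized writes; `dd_write_count` is a hypothetical helper name, and the actual slowdown also depends on how the driver handles writes smaller than a flash page.

```shell
#!/bin/sh
# dd_write_count PARTITION_BYTES BLOCK_BYTES
# Prints ceil(PARTITION_BYTES / BLOCK_BYTES), i.e. roughly how many
# write calls dd needs to fill the partition at the given bs.
dd_write_count() {
    echo $(( ($1 + $2 - 1) / $2 ))
}
```

Here `dd_write_count 1572864 131072` gives 12 writes, one per erase block, while `dd_write_count 1572864 512` gives 3072 writes, each only a quarter of the 2048-byte write size reported by sysfs.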
 
Short-term report: so far the node has survived a few reboots just fine, and seems to be working OK (knock on wood).

Hopefully, the flash cleansing above killed off all the flaky cells, so that only the good ones are now in circulation. We'll see how it goes.
 
How do you know there were flash cells that were killed? Can you get a report with reallocated sectors or something like that?
 
No report that I could get. I am not sure that eMMC can reveal this level of detail.

It's more of a cause-and-effect deduction. I did a bunch of writes to the NVRAM partition, and now things appear more stable. The hope was that the eMMC controller would notice the flaky flash cells during the repeated writes, and retire them.
 
