
USB drive failure...or something worse?



This is one of the reasons they fail. They were expensive in the past. I have some >10 years old and still good. Same goes for newer SSD drives. SLC drives are expensive. MLC/TLC/QLC are cheap. Everything is made cheap today and it doesn't last. Plus more false advertising involved.
 
Indeed - and as I mentioned - these drives are so low cost (just picked up a 128GB SanDisk for 14 bucks over at Best Buy) that if one fails, just try to erase it and drop it in the e-waste bin...

It's just not worth the trouble to try to recondition a thumb drive that has failed - if replacement is $14, what is your time worth?

Not to mention that once they start to fail, any revival is temporary. The real question is what your DATA is worth (maybe not a big concern here, but in many cases it is).
 
Thanks, I had never been aware of this issue.

Currently I show:
Code:
size: 62133 bytes (3403 left)

I ran the command to see my largest variables:

Code:
1345 custom_clientlist
931 nc_setting_conf
761 sshd_authkeys
549 rc_support
375 wl1_chansps
362 vts_rulelist
...

It appears my model (RT-AC68) doesn't have anything in the /jffs/nvram directory, so I am unable to move those top 2 offenders there. Additionally, none of the cleanup options seems to be really recommended: while you can clear variables after a reboot to free up NVRAM, a warning from RMerlin stated this could cause a problem if you then add too much back in and reboot. I could probably move the vts_rulelist out of the GUI and just create iptables rules, but for 362 bytes it doesn't seem worth it.

In any event, while the number is a bit high, there is still enough left that it doesn't look like I am running into that issue. I still have about 3.4 KB (3403 bytes) left, which is more than twice the size of my current largest variable. But I'm glad you brought this to my attention and will keep my eye on it.

I wonder if the upgrade from 386_3.2 to the latest version will exacerbate the issue. In any event, I currently don't see a reason to update my FW at all as I don't seem to have any issues. Of course I will have to eventually, but every year this configuration works for me is one year less I need to pay for an upgrade and another year (or even just 6 months) of allowing prices to fall, etc.
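
As a rough illustration of the size check earlier in this post, here is a minimal Python sketch that ranks variables by size and compares the free space against the largest one. It assumes the dump is plain `name=value` text with one variable per line (values containing newlines would be miscounted). `rank_nvram_variables` and `headroom_ratio` are hypothetical helper names, and the sample dump is made up, not taken from a real router.

```python
def rank_nvram_variables(nvram_dump: str, top: int = 5) -> list[tuple[int, str]]:
    """Return (size_in_bytes, name) pairs for the largest variables.

    Assumption: each variable is a single "name=value" line. Lines without
    an "=" (such as a trailing "size: ..." summary) are skipped.
    """
    sizes = []
    for line in nvram_dump.splitlines():
        name, sep, _value = line.partition("=")
        if not sep or not name:
            continue
        # Size is counted as the whole "name=value" entry in bytes.
        sizes.append((len(line.encode("utf-8")), name))
    sizes.sort(reverse=True)
    return sizes[:top]


def headroom_ratio(bytes_left: int, largest_variable_bytes: int) -> float:
    """How many copies of the largest variable would still fit."""
    return bytes_left / largest_variable_bytes


# Made-up sample dump, for illustration only.
sample_dump = "\n".join([
    "custom_clientlist=" + "x" * 40,
    "rc_support=" + "y" * 20,
    "wan_proto=dhcp",
    "size: 62133 bytes (3403 left)",
])

for size, name in rank_nvram_variables(sample_dump, top=3):
    print(size, name)

# Figures quoted in the post: 3403 bytes left, largest variable 1345 bytes.
print(round(headroom_ratio(3403, 1345), 2))
```

With the figures from the post the ratio comes out a little over 2.5, which matches the "more than twice" estimate.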

If your NVRAM is that full do NOT upgrade to 386.9 or 386.10. In order to free up NVRAM you will need to create a custom client list file and direct dnsmasq to that file, that should help quite a bit. Then you can consider 386.10. But I have same/similar router and I'm sticking with 386.7_2 for now.
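
A minimal sketch of what such a file could look like, assuming it is meant to hold static DHCP reservations. dnsmasq can read one host per line from a file named by its `dhcp-hostsfile` option, with each line in the same format as the right-hand side of `dhcp-host`. The helper name, the sample clients, and the file path below are hypothetical, and how the directive gets added on a given firmware is not covered here.

```python
def to_dhcp_hosts_lines(clients: list[tuple[str, str, str]]) -> str:
    """Render (mac, ip, hostname) tuples as dnsmasq dhcp-hostsfile lines."""
    lines = []
    for mac, ip, hostname in clients:
        # One reservation per line: MAC,IP,hostname
        lines.append(f"{mac.lower()},{ip},{hostname}")
    return "\n".join(lines) + "\n"


# Made-up clients, for illustration only.
clients = [
    ("AA:BB:CC:00:11:22", "192.168.1.10", "nas"),
    ("AA:BB:CC:00:11:33", "192.168.1.11", "printer"),
]

hosts_file_text = to_dhcp_hosts_lines(clients)
print(hosts_file_text, end="")

# dnsmasq would then be pointed at the file with a line such as
# (hypothetical path):
#   dhcp-hostsfile=/jffs/configs/dhcp-hosts.list
```

Whether this frees a meaningful amount of NVRAM depends on which variable the reservations currently live in, so treat it as a starting point rather than a confirmed fix.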
 
Not to mention that once they start to fail, any revival is temporary. The real question is what your DATA is worth (maybe not a big concern here, but in many cases it is).
* or your TIME!
 
I've run all sorts of tools on the drive, including those that look for bad sectors. I've also run some Linux ones on it through WSL. There is no indication that the drive is bad, or anything that would let you distinguish it from any other drive you just unwrapped.

While I agree that if you can validate that a drive failed, even once, it is a good idea to replace that drive, it is just as bad an idea to assume that a failure (especially one presenting the way I described) is the fault of the drive. There is a good probability (strictly speaking, odds) that the drive is at fault, but based on the symptoms the problem could just as likely be the USB port or bus on the router.

I think blaming the drive is extremely poor troubleshooting in this case, given what the logs show (both on the router at fail time and the lack of any error logs when testing). It isn't that I am trying to save $15 (technically I only need a 4GB drive, so really it's probably less than $10) or that I don't want to replace the drive. It's that I don't want to blame a drive when the evidence isn't clear that the drive is at fault.

Until I can determine to a reasonable degree of certainty that the router itself isn't damaged, discussing what to replace the drive with is putting the cart before the horse.

Not saying you shouldn't rule things out; that is good due diligence. Just saying that if you run 2 other drives on the router and they work fine, it is highly unlikely to be the USB port or bus. I also would not assume that the original drive is now "OK" just because it works fine again: it may simply have had its bad cells marked as dead and stopped accessing them, with more bad cells imminent.
 
Yes, that is a key point, and I have some additional logging going on in the hope of catching it.

At this point I'm not actually sure I'm going to wait through the weekend testing this 2.0 drive, I think I'm going to put my other 3.0 in tomorrow and try to kill 2 birds with one stone. If I get a failure on that, I'll either try it in 2.0 mode or put the 2.0 drive back in, but I think testing with another 3.0 drive first will save time if it doesn't exhibit failure within a several day time frame.

I missed the fact that it was 2.0 vs 3.0. Earlier revisions of the AC68 do have known interference issues with USB 3. They added extra shielding at some point to resolve it; I don't recall which revision. Typically this results in issues with 2.4GHz wifi, but there is no reason that it couldn't go the other way. Who knows, maybe your router just happened to start using a channel that is interfering now (or that drive is sensitive to that particular frequency). Most likely the drive can't actually exceed USB 2 speeds anyway, so you could change the USB 3 port to USB 2 mode (an option they added specifically due to that issue) and see if it still fails. Of course, a clean run in that mode doesn't necessarily mean the interference was the issue; maybe your USB is dying and just can't handle 3.0 anymore. I guess the only way to be sure would be to disable the 2.4GHz radio completely and see if it still fails while running at 3.0.
 
This is one of the reasons they fail. They were expensive in the past. I have some >10 years old and still good. Same goes for newer SSD drives. SLC drives are expensive. MLC/TLC/QLC are cheap. Everything is made cheap today and it doesn't last. Plus more false advertising involved.

I have a Kingston 1GB drive that is about 20 years old. Still the fastest thumb drive I have and still going strong. I keep it around mostly because some devices, when you do a BIOS or firmware update, do not like FAT32 or exFAT, or drives over a certain size, but it got very heavy use in its earlier days.

From what I recall the price was $100.
 
If your NVRAM is that full do NOT upgrade to 386.9 or 386.10. In order to free up NVRAM you will need to create a custom client list file and direct dnsmasq to that file, that should help quite a bit. Then you can consider 386.10. But I have same/similar router and I'm sticking with 386.7_2 for now.
I've been looking around for this. Apparently some of the models moved some NVRAM info to /jffs/nvram, but I don't see anything in that folder on mine. I saw people mentioning you could move the DHCP client list, but I don't think this is the same as the custom_clientlist, which holds the friendly names of all the MAC/IP pairs you see when you click "client list" in the GUI.

If there is a way to move this out of NVRAM, I definitely want to do that...can you provide a link?
 
technically I only need a 4GB drive, so really it's probably less than $10

Technically you should use a bigger drive - there will be fewer erase/write cycles per flash block, extending the overall life of the drive.

Just saying - 32's are almost free, and the delta in the price difference between 32 and 128 these days is almost silly - 4 times the capacity for perhaps 25 percent more cost...
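
A back-of-the-envelope sketch of that argument in Python, assuming ideal wear levelling (writes spread evenly over every block) and no write amplification, neither of which is guaranteed on a cheap stick. The daily write volume and the rated cycle count are made-up illustrative numbers, not specifications of any real drive.

```python
GB = 10**9


def erase_cycles_per_block_per_day(daily_write_bytes: float, capacity_bytes: float) -> float:
    """Average erase cycles each block sees per day under ideal wear levelling."""
    return daily_write_bytes / capacity_bytes


def days_until_worn(daily_write_bytes: float, capacity_bytes: float, rated_cycles: float) -> float:
    """Days until the average block reaches its rated erase-cycle count."""
    return rated_cycles / erase_cycles_per_block_per_day(daily_write_bytes, capacity_bytes)


# Illustrative assumptions: 1 GB written per day, 1000 rated cycles per block.
small = days_until_worn(1 * GB, 4 * GB, 1000)
large = days_until_worn(1 * GB, 128 * GB, 1000)

print(small, large, large / small)
```

Under these assumptions the lifetime scales linearly with capacity, so the 128GB drive outlasts the 4GB one by a factor of 32. Without wear levelling the same blocks take most of the writes and the extra capacity helps far less.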
 
Technically you should use a bigger drive - there will be fewer erase/write cycles per flash block

I'm not sure cheap USB sticks have any wear levelling though.
 
I've been looking around for this. Apparently some of the models moved some NVRAM info to /jffs/nvram, but I don't see anything in that folder on mine. I saw people mentioning you could move the DHCP client list, but I don't think this is the same as the custom_clientlist, which holds the friendly names of all the MAC/IP pairs you see when you click "client list" in the GUI.

If there is a way to move this out of NVRAM, I definitely want to do that...can you provide a link?

Yeah, I was thinking of the static DHCP list. I'm not sure, honestly; someone else may need to chime in as to whether that is possible. That seems like a lot of entries for custom client names. Are they all in use? Maybe time for a cleanup?

Honestly I'd stay away from anything after 386.7_2 for the AC68 variants anyway, at least until things settle down and maybe some NVRAM stuff gets fixed. It seems there are a lot of unnecessary variables in there that keep getting recreated on startup.
 
If you do it a few times quickly with drives like the SanDisk Ultra Flair, it may not survive 10 cycles. The drive heats up to >60C and dies. When I was playing with AdGuard Home, the system it was running on was Ubuntu Server installed on a 32GB Ultra Flair. The USB drive died after about 4 days of uptime with 0 swap in use (x86 board with 8GB RAM). A few OS updates and reboots - gone. This is what you get for $10. For a few dollars more, a 64/128GB drive won't last any longer.
 
I know what the theory says, but I also know a $0.02 cheaper component will be used in the next version, and another $0.02 cheaper one in the version after. This is how we got to $10 drives with an actual cost around $2. Using defective chips is a fact. Replacing chips with cheaper ones is a fact too. Most chips don't even have any markings and are manufactured by some hard-to-pronounce company in China, following all the standards for sure. Some brand-name companies like Kingston got caught sending one drive for review and selling another, cheaper and slower, under the same product number. Did I mention false advertising above?
 
I know what the theory says, but I also know a $0.02 cheaper component will be used in the next version, and another $0.02 cheaper one in the version after.

You seem to have an answer for pretty much everything - have you ever considered not commenting on something?
 
I do have an industrial automation device in production and know some practical things that differ from theory. Moved it out of China for some reasons. I have also purchased fake SanDisk drives, as well as fake-capacity drives, from Amazon here in Canada. Wikipedia doesn't say much about it, but it's a real thing.
 