AC88U Crashing At Random Times


danjackson

New Around Here
Hey all,

I've got my AC88U with the latest merlin firmware on it. Over the past couple of weeks, I've noticed random crashes and I can't put my finger on the issue.

Here are the symptoms:

- Web Interface is inaccessible (Cannot connect to 192.168.0.1 over HTTP)
- Gateway functionality is offline (Can't connect to WAN - Cannot connect to any range outside of 192.168.0.0/24)
- Hardware switch still works normally
- WiFi Access Point still works normally (WiFi devices can access any other devices on the network)

Details about my router that may be non standard:

- DHCP Disabled
- DNS uses TLS to Cloudflare
- Runs VPN client for a site to site connection from home to colo servers
- IPv6 Enabled
- The 2.4 GHz radio is broken (not sure how this happened; it started acting weird, then one day just broke, and Linux no longer detects the 2.4 GHz radio as a device at all. I'm pretty sure it's a hardware issue. This isn't a big deal, as I don't use 2.4 GHz at all)
- I have a laptop cooling pad on the bottom which is externally powered by the USB on another device
- Connects to WAN using PPPoE
- Uses the AIProtection IPS

I've been struggling with this for weeks, so I enabled the internal debug log and, over the last few days, set up Zabbix on another server to monitor the router over SNMP.

Today, it crashed again. Zabbix reported a spike in CPU load average which I had not seen before. I also got triggers that the RAM went over 90% usage. The load average jumps from around 1 to around 6 and possibly higher and then the device fails. I've attached a screenshot of this. The debug log reports nothing out of the ordinary:

https://pastebin.com/jwz44VNr

I've also attached a screenshot of the RAM usage over time. It appears that RAM usage keeps building until the system crashes. Could this be an OOM? I don't have any custom scripts running on this; it should be a standard setup other than SNMP, which I only set up very recently.
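To put a number on the "RAM keeps building" pattern, here is a minimal sketch that fits a straight line to memory samples exported from the monitoring server and estimates how long until usage would reach 100%. It assumes the samples are available as (timestamp in seconds, used percent) pairs; the function names `linear_trend` and `seconds_until_full` are made up for illustration, and it is meant to run on the monitoring host, not on the router.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, RAM used in percent)


def linear_trend(samples: Sequence[Sample]) -> Tuple[float, float]:
    """Return (slope, intercept) of a least-squares line through the samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must not all be equal")
    cov = sum((t - mean_t) * (u - mean_u) for t, u in samples)
    slope = cov / var_t
    return slope, mean_u - slope * mean_t


def seconds_until_full(samples: Sequence[Sample],
                       limit_percent: float = 100.0) -> Optional[float]:
    """Estimate seconds from the last sample until the trend hits the limit.

    Returns None when usage is flat or falling, i.e. no leak-like growth.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    t_full = (limit_percent - intercept) / slope
    return max(0.0, t_full - samples[-1][0])


if __name__ == "__main__":
    # Hypothetical readings, one per hour; not taken from the screenshots.
    readings = [(0, 40.0), (3600, 50.0), (7200, 60.0)]
    print(seconds_until_full(readings))
```

A steady positive slope over many hours would be consistent with a leak, while a flat trend with a sudden jump would point more towards a single runaway event.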

Does anyone have any ideas? Can anyone help me? :)

CPUUsage-AC88U.PNG
RamUsage-AC88U.PNG

TrafficUsage-AC88U.PNG
 

L&LD

Part of the Furniture
Does it crash when you disable SNMP?

What is the 'latest RMerlin firmware' in your case? 384.17 or 384.18 Alpha 1?

When was the last time a full reset to factory defaults applied to the router?
 

danjackson

New Around Here
I don't think I was clear here: I enabled SNMP to better diagnose the crashes. It's a recent addition, and the router was already crashing before I enabled it.

I thought I was on the latest firmware because the update checker didn't return anything, but after your reply I've noticed that I'm a couple of versions out of date! I'll try an update first. I'm on version 384.14_2, which I don't think has been current since January :( I think I might also do a complete reset on firmware 384.17 to see if it fixes the issue. I was hoping it wouldn't come to that, but it shouldn't take too long, I suppose :)

Thanks for the help!:D
 

danjackson

New Around Here
Hey again,

So I cleared the NVRAM (factory reset via the GUI), updated it to 384.17, cleared the NVRAM again (for good measure) and erased the jffs partition.

The only non standard setup on there now: SNMP, OpenVPN client for my site to site, and a RoboCFG script from here which only runs once on boot.

https://www.snbforums.com/threads/wan-port-vlan-trunking.58921/#post-518140

Everything seemed to be going fine, but the monitoring shows that on Saturday night at around 23:00 the RAM usage started building again. Today it crashed when the RAM reached 100% usage.

I'm leaning towards the 2.4 GHz hardware issue being the root cause. Checking my browser history, I noticed that around the Saturday when this RAM issue started I was trying to access the Advanced_Wireless_Content.asp page, which doesn't load properly at all due to issues with the 2.4 GHz device. I suspect that requests to that page cause a memory leak of some sort, as the page doesn't load and the httpd process seems to restart (the web interface becomes inaccessible for about 30 seconds, then comes back online). The issue with the 2.4 GHz device is that the OS can't seem to communicate properly with the WiFi driver:

Test on 5 GHz:
[email protected]:/jffs# wl -i eth2 status
SSID: "Redacted"
Mode: Managed RSSI: 0 dBm SNR: 0 dB noise: -88 dBm Channel: 52/80
BSSID: 2C:4D:54:21:8F:74 Capability: ESS
Supported Rates: [ 6(b) 9 12(b) 18 24(b) 36 48 54 ]
VHT Capable:
Chanspec: 5GHz channel 58 80MHz (0xe03a)
Primary channel: 52
HT Capabilities:
Supported HT MCS : 0-31
Supported VHT MCS:
NSS1 Tx: 0-11 Rx: 0-11
NSS2 Tx: 0-11 Rx: 0-11
NSS3 Tx: 0-11 Rx: 0-11
NSS4 Tx: 0-11 Rx: 0-11

Test on 2.4 GHz ???

[email protected]:/jffs# wl -i eth1 status
wl: wl driver adapter not found

I'm guessing that eth1 is the 2.4 GHz radio because I know it has issues (I can't connect anything to it) and because, apart from dpsta (which is down), it's the only network adapter without a MAC address:

[email protected]:/jffs# ip a
7: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 2c:4d:54:21:8f:70 brd ff:ff:ff:ff:ff:ff
inet 169.254.33.239/16 brd 169.254.255.255 scope global eth0
8: dpsta: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
inet6 fe80::200:ff:fe00:0/64 scope link tentative
valid_lft forever preferred_lft forever
9: eth1: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
inet6 fe80::200:ff:fe00:0/64 scope link
valid_lft forever preferred_lft forever
10: eth2: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 2c:4d:54:21:8f:74 brd ff:ff:ff:ff:ff:ff
inet6 fe80::2e4d:54ff:fe21:8f74/64 scope link
valid_lft forever preferred_lft forever
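The "which adapter has no MAC address" check above can be automated. Here is a rough sketch that parses saved `ip a` output and lists interfaces whose link/ether address is all zeros. The function name `zero_mac_interfaces` is hypothetical, and the parsing only assumes the line layout visible in the paste above. Note that dpsta also shows an all-zero address in that paste but is not flagged UP, so the sketch filters on the UP flag by default.

```python
import re
from typing import List

# Matches header lines such as "9: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 ..."
IFACE_RE = re.compile(r"^\d+:\s+([^:@\s]+)[^:]*:\s+<([^>]*)>")
MAC_RE = re.compile(r"link/ether\s+([0-9a-fA-F:]{17})")
ZERO_MAC = "00:00:00:00:00:00"


def zero_mac_interfaces(ip_output: str, only_up: bool = True) -> List[str]:
    """Return names of interfaces whose link/ether address is all zeros."""
    found: List[str] = []
    current = None
    flags: set = set()
    for raw in ip_output.splitlines():
        line = raw.strip()
        header = IFACE_RE.match(line)
        if header:
            # Remember which interface the following detail lines belong to.
            current = header.group(1)
            flags = set(header.group(2).split(","))
            continue
        mac = MAC_RE.search(line)
        if mac and current is not None and mac.group(1) == ZERO_MAC:
            if not only_up or "UP" in flags:
                found.append(current)
    return found


if __name__ == "__main__":
    sample = """\
8: dpsta: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
9: eth1: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
10: eth2: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 2c:4d:54:21:8f:74 brd ff:ff:ff:ff:ff:ff
"""
    print(zero_mac_interfaces(sample))  # ['eth1']
```

An all-zero address on an interface that is UP only shows that the driver never reported a hardware address; it does not by itself prove a hardware fault.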

I'm pretty sure it's a hardware issue, but could it be related to the memory issue?
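One way to test the httpd leak suspicion would be to compare per-process memory at two points in time. The sketch below assumes two text dumps of the concatenated `/proc/<pid>/status` files, captured some hours apart and copied to another machine, and reports which process names grew in resident memory. The capture method and the function names `parse_status_dump` and `rss_growth` are assumptions for illustration. It relies only on the standard `Name:` and `VmRSS:` fields of the Linux status file.

```python
from typing import Dict, List, Tuple


def parse_status_dump(dump: str) -> Dict[str, int]:
    """Map process name to total VmRSS in kB.

    Processes sharing a name are summed. Kernel threads have no VmRSS
    line, so they are skipped.
    """
    totals: Dict[str, int] = {}
    name = None
    for line in dump.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        if key == "Name":
            name = value.strip()
        elif key == "VmRSS" and name is not None:
            # The value looks like "    1234 kB"; keep only the number.
            totals[name] = totals.get(name, 0) + int(value.split()[0])
    return totals


def rss_growth(before: str, after: str) -> List[Tuple[str, int]]:
    """Return (name, growth in kB) for processes that grew, largest first."""
    old = parse_status_dump(before)
    new = parse_status_dump(after)
    grown = [(n, kb - old.get(n, 0)) for n, kb in new.items()]
    grown = [(n, delta) for n, delta in grown if delta > 0]
    return sorted(grown, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical dumps; the numbers are invented for the example.
    first = "Name:\thttpd\nVmRSS:\t    2000 kB\nName:\tdnsmasq\nVmRSS:\t     900 kB\n"
    later = "Name:\thttpd\nVmRSS:\t    5000 kB\nName:\tdnsmasq\nVmRSS:\t     900 kB\n"
    for proc, delta in rss_growth(first, later):
        print(f"{proc}: +{delta} kB")
```

If httpd does not show up at the top of that list while total RAM still climbs, the growth is probably elsewhere (for example in kernel or driver memory, which per-process RSS does not capture).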

Any thoughts?
 
