/sbin/preinit Memory Leak in 380 series FW

kvic

Part of the Furniture
Asus ARM routers running 380 series firmware (stock, Merlin or variants) show a memory leak in the /sbin/preinit process. You can inspect and monitor it in htop (installable from Entware-ng) or from the command line:

$ cat /proc/1/status|grep VmRSS

VmRSS: 11944 kB

This process normally occupies 1500 - 2000 kB of RAM. On my RT-AC56U, after 7 days of uptime, it has grown to over 10 MB.

The leak is quite slow, but it is definitely visible on a daily basis.

Anyone else see the leakage on ARM routers with 380 series firmware?
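
For anyone who wants to watch the trend rather than take one-off readings, here is a minimal sketch that pulls the number out of the status file and prints a timestamped sample. The function names vmrss_kb and log_vmrss are made up for illustration, and the sketch assumes awk and date are available in the router's shell.

```shell
#!/bin/sh
# Print the VmRSS value (in kB) found in a /proc/<pid>/status style file.
# $1: path to the status file (defaults to PID 1, i.e. /sbin/preinit)
vmrss_kb() {
    awk '/^VmRSS:/ { print $2 }' "${1:-/proc/1/status}"
}

# Print one timestamped sample: "<date> <time> <VmRSS in kB>".
# $1: optional status file path, passed through to vmrss_kb
log_vmrss() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(vmrss_kb "$1")"
}
```

Something like "while true; do log_vmrss; sleep 3600; done >> /tmp/preinit_rss.log" would then give an hourly log (the log path is only an example).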
 
Have you tried going old school with valgrind?

Thanks for the suggestion. Looks like a useful tool for user space programs.

/sbin/preinit ... the program name is a misnomer IMO. It has process Id = 1. The parent of all user space programs...

Seems Asus makes funky use of this daemon.

Also can't restart the program without reboot unfortunately.
 
preinit = rc. It's that one single daemon that does virtually everything in the router. Good luck tracking down any leak in that monster...
 

The leak probably isn't in init - as the mother of all processes...

It's going to be one of the children, so chase down the PID tree and look for the new code if one is debugging the 380 releases... it's likely there...
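
As a rough sketch of what "chasing down the PID tree" could look like from the shell, the snippet below lists the direct children of a given parent PID together with their resident memory, read from /proc. The function name children_rss and its arguments are hypothetical, and kernel threads (which have no VmRSS line) are reported as 0.

```shell
#!/bin/sh
# List direct children of a parent PID as "<pid> <name> <VmRSS in kB>".
# $1: parent PID (default 1)
# $2: proc root (default /proc; overridable so the function can be tested)
children_rss() {
    parent=${1:-1}
    root=${2:-/proc}
    for f in "$root"/[0-9]*/status; do
        # Skip entries that vanished or are unreadable.
        [ -r "$f" ] || continue
        awk -v want="$parent" '
            /^Name:/  { name = $2 }
            /^Pid:/   { pid = $2 }
            /^PPid:/  { ppid = $2 }
            /^VmRSS:/ { rss = $2 }
            END {
                if (ppid == want)
                    printf "%s %s %s\n", pid, name, (rss == "" ? 0 : rss)
            }
        ' "$f" 2>/dev/null
    done
}
```

Running "children_rss 1" a few hours apart and comparing the numbers would show whether any child of PID 1 is growing, as opposed to PID 1 itself.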
 
Before digging, I would like to have an independent confirmation of the memory leak...

My kernel is also heavily patched but I shall be doing a good job there (*cough* *cough*)

Perhaps nobody is running a 380 series firmware long enough to notice the leak?
 
Hi, I definitely do see this leak (about 1% of available memory per day) with my RT-AC3200 and Merlin 380.58. The leak can be observed with official 378 versions as well. In any case, it allows for 20-30 days without requiring a reboot.
 

Thanks for your feedback.

I had been running 378.55 for close to half a year. The longest non-stop run was more than 20 days. I hadn't seen a memory leak in /sbin/preinit. Which official 378 firmware has this issue? I would be interested in checking its GPL code.

Just a friendly reminder.

This thread is about a specific memory leak that can be verified to exist or not by using the command line that I provided in post #1.
 
Hi, well, I ran and saw this problem with official 378.9459 and 378.9529 firmwares before migrating to merlin (current situation) 380.58. I have not tested with earlier versions.
 

Thanks. That helps. I will look at 378.9529 GPL.

I found a way to accelerate the memory leak. Are you willing to SSH into your router and run a quick test on 380.58?

Here are the steps

Step 1: run the following line

$ cat /proc/1/status|grep VmRSS

Write down the VmRSS value reported.

Step 2: paste line by line the following two lines:

$ i=0
$ while [ $i -lt 2000 ]; do killall hotplug2; let i++; done

Step 3: run the below line

$ cat /proc/1/status|grep VmRSS

Write down the VmRSS value reported.

Can you or someone report the VmRSS seen in Step 1 and 3?

FWIW, Step 2 will attempt to kill off the process hotplug2 2000 times. No other adverse effect but memory leak in /sbin/preinit will increase by an observable amount. A reboot will release the leaked memory.
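
The three steps can also be wrapped into one function, roughly as sketched here. The names vmrss_kb, rss_delta and run_leak_test are illustrative only, and the loop counter uses POSIX arithmetic instead of let; the behaviour is otherwise meant to match the manual steps.

```shell
#!/bin/sh
# Read VmRSS (kB) from a status file; defaults to PID 1 (/sbin/preinit).
vmrss_kb() {
    awk '/^VmRSS:/ { print $2 }' "${1:-/proc/1/status}"
}

# Growth in kB between two readings: rss_delta <before> <after>
rss_delta() {
    echo $(( $2 - $1 ))
}

# Steps 1-3 in one go: measure, kill hotplug2 repeatedly, measure again.
# $1: number of iterations (default 2000, as in Step 2)
run_leak_test() {
    loops=${1:-2000}
    before=$(vmrss_kb)
    i=0
    while [ "$i" -lt "$loops" ]; do
        killall hotplug2 2>/dev/null
        i=$((i + 1))
    done
    after=$(vmrss_kb)
    echo "VmRSS before: ${before} kB, after: ${after} kB, growth: $(rss_delta "$before" "$after") kB"
}
```

On the router you would paste the functions and then run "run_leak_test 2000"; nothing is killed until that call is made.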
 
The leak probably isn't in init - as the mother of all processes...

It's going to be one of the children, so chase down the PID tree and look for the new code if one is debugging the 380 releases... it's likely there...
Indeed.
 
I may have found the culprit. Need to find time to test. @sfx2000, your guess is not right this time round..
 
https://github.com/RMerl/asuswrt-merlin/blame/master/release/src/router/rc/init.c#L6677

In rc/init.c, Asus changed to use sigwaitinfo() in 380.x instead of sigwait() in 378.x. Apparently in an attempt to debug some mysterious issues known to them.

Reverting commit bf42308 [Merged with 380_858 (AC88U)] in this function, so that it uses sigwait() as in 378.x, solved the problem for me. @RMerlin, you may want to have a look.

Both usages seem equivalent in the context. Couldn't explain how the leak happens.
 
Did you notice that the next code sequence is part of the infamous 'drop_caches' change on ARM? Are you sure you aren't just seeing an increase in page cache use?
 

I saw that :) My immediate reaction.. it's not a good idea to put the cache purge there. Waaayyy too frequent.

On the leakage, in both good and bad cases, I have drop_caches=0.
 
Kvic, your test (the 2000 loops) in my case makes VmRSS jump from 1740 kB to 2756 kB.
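
To put those two readings in perspective, a small sketch of the arithmetic follows. It assumes all of the growth comes from the loop itself, which is an assumption rather than something the test proves, and the function name bytes_per_iteration is made up for illustration.

```shell
#!/bin/sh
# Average growth in bytes per loop iteration (integer division).
# Usage: bytes_per_iteration <before_kB> <after_kB> <loops>
bytes_per_iteration() {
    echo $(( ($2 - $1) * 1024 / $3 ))
}

# Readings reported above: 1740 kB before, 2756 kB after, 2000 loops.
bytes_per_iteration 1740 2756 2000
```

With the figures above this works out to roughly half a kilobyte of growth per killall hotplug2.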
 
I saw that :) My immediate reaction.. it's not a good idea to put the cache purge there. Waaayyy too frequent.

I never understood the technical reason behind Asus doing this, aside from always showing the highest amount of free memory possible. Even the Kernel documentation advises against using that proc interface except for debugging purposes.
 
https://github.com/RMerl/asuswrt-merlin/blame/master/release/src/router/rc/init.c#L6677

In rc/init.c, Asus changed to use sigwaitinfo() in 380.x instead of sigwait() in 378.x. Apparently in an attempt to debug some mysterious issues known to them.

Reverting commit bf42308 [Merged with 380_858 (AC88U)] in this function, so that it uses sigwait() as in 378.x, solved the problem for me. @RMerlin, you may want to have a look.

Both usages seem equivalent in the context. Couldn't explain how the leak happens.

Wow, that's not very bright of them - I could see perhaps in debug, but not in shipping code - no wonder...

the two do pretty much the same thing, but it's the outcome that is different, in that sigwaitinfo() stores results..

good catch.
 
