Aegis 1.7.x

HELLO_wORLD

Very Senior Member
Release 1.7.8

In the web interface, added a DOC tab, which includes the Read Me, a Web Companion Doc, the Change Log, and Links.
 

n1llam1

Regular Contributor
@HELLO_wORLD,

On my R7800 device in router mode with Voxel V1.0.2.83SF, Kamoj Add-on V5.4b24, Aegis 1.7.8 (logging enabled), after around 20 days of continuous up time, I noticed that the CPU usage became pegged near 100% most of the time.

Using top2, I saw that iptables remained in the process list. On a hunch, I used the Aegis web companion to stop Aegis and then start it back up again. After that, the CPU usage dropped and stabilized at the low end in the range of 3% - 20% most of the time. iptables also disappeared from the process list.

During the 20 days of uptime, the enabled Kamoj cron job to refresh Aegis was running daily and my wlan was restarted multiple times (whenever either 2.4G or 5G band appeared to have stopped working properly). I don't know if wlan restarts would have contributed to this scenario.

Any idea on what could trigger this scenario to occur and how to capture more information to help determine the root cause if it happens again? Thanks.
 

HELLO_wORLD

Very Senior Member
It is hard to figure out what was going on, and whether it was related to Aegis or not (stopping and restarting Aegis restarts the whole iptables firewall, and that could have solved a problem coming from somewhere else).
It would have helped to have the output of aegis debug (or the debug output from the web companion), to see the iptables rules.

The cron job is supposed to reset the Aegis rules. In your case, the only scenario where Aegis would be involved would be a bug where multiple similar rules were being added again each time the job ran. Or maybe other rules (not from Aegis) were duplicating...

Putting this in the cron job would prevent it (if that was the cause) by forcing the firewall to restart: /opt/bolemo/scripts/aegis up -refresh -net-wall

But this should not happen anyway, as Aegis is supposed to prevent multiple identical rules.

I would really like to know what is going on and, if it comes from Aegis, be able to fix it.

So, could you wait a few days before making any change and then run the debug output? The idea is to let the problem reproduce so we can see what is going on.
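
One way to check the duplicate-rule theory the next time the load climbs is to count identical rule lines in the iptables listing. This is only a minimal sketch: the function name list_duplicate_rules is made up for illustration, it looks only at the default (filter) table as printed by iptables -S, and it assumes sort, uniq and awk are available on the router.

```shell
#!/bin/sh
# Hypothetical helper (not part of Aegis): print every rule line that appears
# more than once, prefixed by its count. Input is text such as `iptables -S`.
list_duplicate_rules() {
    # Keep only rule-append lines, group identical ones, keep counts above 1.
    grep '^-A ' | LC_ALL=C sort | uniq -c | awk '$1 > 1'
}

# On the router (as root) this could be used as:
#   iptables -S | list_duplicate_rules
```

An empty result would suggest that duplicated rules are not the cause, at least in that table.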
 

n1llam1

Regular Contributor
Definitely, I will not make any change and if it happens again then I will capture the aegis debug output and follow up in this thread. Thanks.
 

HELLO_wORLD

Very Senior Member
You’re welcome @kamoj and all others as well :)

@n1llam1: I will be releasing a minor upgrade soon, and before doing so, I wanted to check with you whether your problem reappeared or not?
 

n1llam1

Regular Contributor
@HELLO_wORLD, thanks for following up. That problem has not reoccurred.

I think the reason that I saw iptables showing up in the process list from time to time is that it is used periodically by the @kamoj add-on to gather information and check on things. I'm wondering whether, on such periodic checks, the iptables command could get hung up for some reason and thus show up in the top2 process list when I looked. It may just have been a one-off thing.

Thanks again for continuing to work on and providing support for Aegis.
 

HELLO_wORLD

Very Senior Member
Version 1.7.9

  • calling the net-scan daemon directly to get LAN device names, instead of using the NG web page net-cgi trick.
  • added a pre-upgrade process.
  • improved upgrade process output.
  • improved CSS for documentation display.
 

n1llam1

Regular Contributor
Thanks @HELLO_wORLD. Upgrade went smoothly and seamlessly as usual on my R7800 device running in router mode with Voxel V1.0.2.84SF and Kamoj Add-on V5.4b26. Documentation appearance is nice and easy to read.
 

n1llam1

Regular Contributor
@HELLO_wORLD,

Circling back to the issue that I previously ran into with the CPU usage on my R7800 device in router mode becoming pegged near 100% most of the time, I have again noticed that the System Load Average numbers started creeping up after about 12 days of continuous up time.

Again, using top2, I saw that iptables remained in the process list. This time I was able to track it down to the repeated use of the "iptables -nL RRDIPT" command by the @kamoj add-on, particularly for bandwidth monitoring. Since I do not have a need for bandwidth monitoring, I have disabled that feature and observed that the System Load Average numbers dropped and remained relatively lower.

For example, currently I see (System Load Average last: 1/5/15 minutes: 0.37 / 0.83 / 0.86). Prior to disabling the bandwidth monitoring, the load average numbers were all above 2.00.

I have around 50 devices (wired and wireless) connected to the router so bandwidth monitoring would need more CPU time to process the list of connected devices. In my case, disabling bandwidth monitoring helps to keep the CPU usage as low as possible.
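
To catch the condition while it is happening, the 1-minute load average can be compared against a threshold such as the 2.00 mentioned above. A minimal sketch, assuming a Linux-style /proc/loadavg; the function name load_exceeds is hypothetical and the threshold is only an example.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when the 1-minute load average in a
# /proc/loadavg-style line is above the given threshold.
load_exceeds() {
    # $1: line such as "0.37 0.83 0.86 1/123 4567"; $2: threshold, e.g. 2.00
    echo "$1" | awk -v limit="$2" '{ exit !($1 + 0 > limit + 0) }'
}

# On the router, something like this could be run when the load looks high:
#   if load_exceeds "$(cat /proc/loadavg)" 2.00; then
#       ps w | grep '[i]ptables'   # show which iptables invocations are running
#   fi
```

The comparison is done in awk because plain sh arithmetic does not handle decimal numbers.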
 

n1llam1

Regular Contributor
What is the disabled option (for bandwidth monitoring) that you refer to?
Is it this one: "Disable: Traffic Meter service"?
It is at the top of the Kamoj Menu->Settings page, as seen in the screen snippet below.

[Attachment: Disable_Bandwidth_Monitoring.JPG]
 

HELLO_wORLD

Very Senior Member
Thank you for the follow up :)


So the issue seems to be unrelated to Aegis.

Since it is linked to the bandwidth monitoring feature of the @kamoj addon, have you reported this on the thread for the addon?

I am a bit less available lately to check all threads as often.
 

n1llam1

Regular Contributor
@kamoj has made available a newer version of the add-on beta. When I get some time over the next week or so I will upgrade the add-on to 5.4b27 and look more into the bandwidth monitoring function. Hopefully there will be something more useful to share on the new thread for the add-on after that.
 

kamoj

Very Senior Member
I love your Aegis and it's very impressive to me.

I just missed the statistics, so I added a list of the worst abusers (i.e., those blocked by Aegis) to the add-on.
It would make sense to have this as part of Aegis, but I know you want to keep Aegis minimal.
However, I hope you will consider implementing something like that, with the possibility to sort by clicking the headings, etc.
Until then I'll continue to support Aegis as much as I can.
I mean, who wants to run these routers without Aegis?
And I'm so happy to bring up the abuse list, because only then do I get an understanding of how good Aegis is!
Aegis rocks!
Example:
Code:
 POS   NUM   FIRST LAST       BPM IP              ORG                                 TIMEZONE             COUNTRY REGION               CITY                 LOC                  HOSTNAME                 
____ _____ _______ _______ ______ _______________ ___________________________________ ____________________ _______ ____________________ ____________________ ____________________ _________________________
   1   122   28485-43830     0.48 45.143.200.102  AS212283 ROZA HOLIDAYS EOOD         Europe/Sofia         BG      Sofia-Capital        Sofia                42.697,23.3241                                 
   2   102   28523-41026     0.49 125.64.94.134   AS38283 CHINANET SiChuan Telecom In Asia/Shanghai        CN      Sichuan              Deyang               31.130,104.3820                               
   3    96   29014-43407     0.40 146.88.240.4    AS20052 Arbor Networks              America/Detroit      US      Michigan             Southfield           42.473,-83.2219      "www.arbor-observatory.com"
   4    81   28600-37351     0.56 89.248.165.98   AS202425 IP Volume inc              Europe/Amsterdam     NL      North Holland        Amsterdam            52.374,4.8897        "recyber.net"             
   5    74   33079-44000     0.41 167.99.243.100  AS14061 DigitalOcean                Europe/Berlin        DE      Hesse                Frankfurt am Main    50.115,8.6842                                 
   6    71   26561-39751     0.32 89.248.165.69   AS202425 IP Volume inc              Europe/Amsterdam     NL      North Holland        Amsterdam            52.374,4.8897        "recyber.net"             
   7    66   27303-44229     0.23 183.136.225.16  AS58461 CT-HangZhou-IDC             Asia/Shanghai        CN      Shanghai             Shanghai             31.222,121.4581                               
   8    65   29758-40300     0.37 89.248.165.63   AS202425 IP Volume inc              Europe/Amsterdam     NL      North Holland        Amsterdam            52.374,4.8897        "recyber.net"             
  ..
  32    27   32843-32846   405.00 81.161.63.100   AS202984 Chernyshov Aleksandr Aleks Europe/Moscow        RU      Moscow               Moscow               55.752,37.6156                         
  ..
  56    18   32765-32768   270.00 193.123.70.211  AS31898 Oracle Corporation          Asia/Dubai           AE      Dubai                Dubai                25.077,55.3093
  ..
  86    11   28250-28250   660.00 172.105.77.209  AS63949 Linode                      Europe/Berlin        DE      Hesse                Frankfurt am Main    50.115,8.6842        "li2038-209.members.linode.com"
  87    11   27946-33767     0.11 167.248.133.25  AS398722 Censys                     America/Chicago      US      Illinois             Chicago              41.850,-87.6500      "scanner-03.ch1.censys-scanner.com"
  88    10   33339-33339   600.00 74.82.47.59     AS6939 Hurricane Electric LLC       America/Los_Angeles  US      California           San Jose             37.339,-121.8950     "scan-10n.shadowserver.org"
  89    10   29263-43667     0.04 5.188.206.157   AS59900 Balkan Internet Exchange Lt Europe/Sofia         BG      Sofia-Capital        Sofia                42.697,23.3241                                 
  90    10   42167-43877     0.35 193.27.229.47   AS49505 OOO Network of data-centers Europe/Moscow        RU      St.-Petersburg       Saint Petersburg     59.938,30.3141                                 
  91    10   33248-33248   600.00 150.109.182.140 AS132203 Tencent Building           Asia/Bangkok         TH      Bangkok              Bangkok              13.754,100.5014                               
  92     9   35917-35917   540.00 89.40.70.51     AS3280 LayerBridge SRL              Europe/Bucharest     RO      București            Bucharest            44.432,26.1063       "hecadigi.co.uk"         
  93     9   30149-30149   540.00 89.248.169.12   AS202425 IP Volume inc              Europe/Amsterdam     NL      North Holland        Amsterdam            52.374,4.8897                                 
  94     9   30616-30616   540.00 89.248.168.220  AS202425 IP Volume inc              Europe/Amsterdam     NL      North Holland        Amsterdam            52.374,4.8897        "security.criminalip.com"
  95     9   28161-39566     0.05 89.248.165.73   AS202425 IP Volume inc              Europe/Amsterdam     NL      North Holland        Amsterdam            52.374,4.8897        "recyber.net"
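
Until sorting by heading exists in a web page, a saved copy of a listing like the one above can be re-sorted on the command line. A minimal sketch, assuming the columns are whitespace-separated as shown (FIRST-LAST is a single token, so BPM is the 4th field); the function name sort_by_bpm is made up, and a sort that supports -k is assumed.

```shell
#!/bin/sh
# Hypothetical helper: sort abuse-list rows (read from stdin) by the BPM
# column, highest first.
sort_by_bpm() {
    # Keep only data rows (first field is a number), dropping the header,
    # the underline row and the ".." placeholders, then sort numerically.
    awk '$1 ~ /^[0-9]+$/' | LC_ALL=C sort -k4,4nr
}

# Example, with the listing saved to a file (the file name is an assumption):
#   sort_by_bpm < abuse_list.txt | head
```

Changing the key to -k2,2nr would sort by the NUM column instead.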
 

kamoj

Very Senior Member
Could it be that the Aegis "nightly update" is run by a cron job at the same time as the Bandwidth monitoring?
Both "programs" use the net-wall to update the iptables.
You can turn on the net-wall log with:
Code:
touch /var/log/net-wall.log
If you use the add-on default, try changing the Aegis cron job so that it does not run at the same time as the Bandwidth Monitor:
Code:
[ -x /opt/bolemo/scripts/aegis ] && sleep 29 && /bin/sh /opt/bolemo/scripts/aegis refresh -html

(The Bandwidth Monitoring code is almost unchanged since it was implemented.
It scans for new devices every minute and
updates the counters every 2nd minute (every 30th minute at night).
)
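
A different technique from the sleep offset above is a simple lock, so that a job is skipped while another one is still running. This is a minimal sketch only: with_lock and the lock path are hypothetical, and it helps only if every job that touches the firewall is started through the same wrapper, which may not be possible for jobs managed by the add-on.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command only if no other job holds the lock.
# mkdir is atomic, so only one caller can create the lock directory.
with_lock() {
    lock_dir="$1"; shift
    if mkdir "$lock_dir" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$lock_dir"      # release the lock once the command is done
        return "$status"
    fi
    echo "lock $lock_dir is held, skipping: $*" >&2
    return 1
}

# Possible cron usage (the lock path is an assumption):
#   with_lock /tmp/net-wall.lock /opt/bolemo/scripts/aegis refresh -html
```

If a job is killed while holding the lock, the directory has to be removed by hand before the next run can proceed.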
 
