Scribe v3.2.6 [2025-Dec-22] - Entware syslog-ng and logrotate installer for Asuswrt-Merlin

Yep, the message has already been modified accordingly.
That line in the .asp must set a record for length!

If I follow, this message will show if the timeout is triggered. That would also happen if the file is too big (takes too long) to load. And, can the log file ever not exist for uiScribe? I know syslog-ng won't create the file if it doesn't exist until it wants to write to the destination file.

One of the problems with scribe is its dependence on logrotate being configured exactly right. You have to have a logrotate.d configuration file for every log file. Otherwise, the overall maxsize of 4mb contained in A00global will trigger a rotation of the log but no restart of syslog-ng, and no logging at all. I've been caught out with that error a number of times, and your implementation of showing the size of a log file is going to be a huge help. I have been noodling, though, on the idea of implementing syslog-ng's native rotation, with a setting of
Code:
file("/var/log/my-logfile.log" logrotate(enable(yes), size(4MB), rotations(5)));
which would need to be in every syslog.d destination. But that would eliminate the problem of a non-existent logrotate.d file, the problem of syslog-ng hanging, and the problem of any log file becoming too big, the need to adjust the cron job, and maybe some other user errors. Dunno.
 
That line in the .asp must set a record for length!
LOL!! 😄👍

... That would also happen if the file is too big (takes too long) to load.
Nope, that's incorrect. There's a separate message for the case where the file cannot be loaded for some reason (e.g. being too large - well over 4MB).

And, can the log file ever not exist for uiScribe?
It's unusual, but it can certainly happen. A friend of mine was running the BETA version last night, and right after the initial installation, he found that the wlceventd.log log file did not exist for some time.

I know syslog-ng won't create the file if it doesn't exist until it wants to write to the destination file.
Yep, another case in point.

One of the problems with scribe is its dependence on logrotate being configured exactly right. You have to have a logrotate.d configuration file for every log file. Otherwise, the overall maxsize of 4mb contained in A00global will trigger a rotation of the log but no restart of syslog-ng, and no logging at all. I've been caught out with that error a number of times, and your implementation of showing the size of a log file is going to be a huge help. I have been noodling, though, on the idea of implementing syslog-ng's native rotation, with a setting of
Code:
file("/var/log/my-logfile.log" logrotate(enable(yes), size(4MB), rotations(5)));
which would need to be in every syslog.d destination. But that would eliminate the problem of a non-existent logrotate.d file, the problem of syslog-ng hanging, and the problem of any log file becoming too big, the need to adjust the cron job, and maybe some other user errors. Dunno.
Some people would say the combination of the logging utility plus the log rotating tool makes it more powerful because you can fine-tune the combo to meet your own specific needs and requirements, based on your system resources, the volume of log entries being generated on a daily basis, and the purpose of keeping more than 3MB to 4MB worth of log files - especially if you're not a system admin doing weekly reviews and historical analysis of the system logs looking for any signs of system malfunctions, external intrusions, unusual resource consumption, etc.

OTOH, I do understand that using those tools properly and to their full potential requires users to be a bit more technically savvy about their system and Linux in general. It's really a double-edged sword. ;)

I'm still relatively new to scribe and logrotate myself and still learning the ins and outs, so I can't really help you with the details of all possible configurations.

Just my 2 cents.
 
Well, at current exchange rates the 2 cents of the maintainer are worth more than the 2 cents of this here gawker.

I just harbor the suspicion that no one is paying the slightest attention to the logrotate options. Not minsize, not compression, not the rotate number, and certainly not daily. 10 log files, max 4MB, with 4 prior versions, is only 200MB total. And folks actively scrutinizing the log files would be better off sending them to one of the services with more analytics. We fooled around a bit with loggly once.
 
Holy Smoke, Batman!! You seem to want "credit" for posting or even being on this forum!!
LOL, I'm not slipping down again into another argument with such a rude and disrespectful person. Especially knowing that you have the higher hand, and people usually cheer bosses rather than rights! That's just pathetic and such a shame.
it was agreed that you would submit a PR with your solution so we could review it for integration.
Well, after the last few incidents, it's obvious that I shouldn't invest any more time/effort into a PR (other than convincing you and explaining the feature in FULL details) to not be "credited" -the word that you, and some others, hate- in the end!

I'll take your reply as a rude admission that you used my proposed idea in details, and the idea from @penguin22 for log rotation:
It would be very helpful to have the ability to clear specific logs or logrotate from the uiScribe GUI. I was hunting for this capability, which appears to only be without Scribe installed and had to go hunting through SSH to get it cleared for the router to be responsive again.
Crediting it to "some people" as if they were too many and hard to mention! I feel just sorry for such mentalities!

worth more than the 2 cents of this here gawker
LOL, congrats for your 2 cents! Hoping that you find another two specific issues with Scribe for the benefit of everyone! That gawker already acknowledges more bugs and flaws within Scribe, he's just glad now that he didn't and won't report them.

I won't comment anymore in this regard.
 
Well, at current exchange rates the 2 cents of the maintainer are worth more than the 2 cents of this here gawker.

I just harbor the suspicion that no one is paying the slightest attention to the logrotate options. Not minsize, not compression, not the rotate number, and certainly not daily. 10 log files, max 4MB, with 4 prior versions, is only 200MB total. And folks actively scrutinizing the log files would be better off sending them to one of the services with more analytics. We fooled around a bit with loggly once.
You're probably correct; the vast majority of users are likely leaving the default options & settings assigned in the configuration files and don't check other possible options. For ASUS routers, though, I can only assume that @cmkelley, with the help of @Jack Yaz and other forum users who participated in testing and troubleshooting the early development and production releases, arrived at those default settings after some experimentation and consideration of how most users configure their home routers.
 
The screenshots and output shown in your latest post look correct, and they indicate that your Scribe setup is currently configured as expected. If your JFFS partition fills up again, I suspect the cause is something else, although a full partition could in turn corrupt the Scribe functionality.

I have a custom shell script that I've used before as a diagnostic tool to monitor cases when either the TMPFS or the JFFS file system was being filled up slowly with large files over a period of a few weeks. IIRC, the last time was about 2 years ago, where the Traffic Monitor (or Traffic Analyzer??) was generating a very large database file (a little over 40MB). The log file generated by the diagnostic script and stored in the USB-attached drive showed the database file slowly increasing and filling the JFFS partition during the previous 3 weeks.

In your particular case, I don't really know what's causing the problem, but you could set up the diagnostic script to run as a cron job at a fixed interval (e.g. every 20 mins or every 4 hours), depending on how fast the JFFS is getting filled up and, hopefully, the log file will capture some rogue file(s) getting larger over time.

You can use the following commands to download the custom script from my personal GitHub repo:
Bash:
curl -LSs --retry 3 --retry-delay 5 --retry-connrefused \
https://raw.githubusercontent.com/Martinski4GitHub/CustomMiscUtils/master/Diags/LogMemoryStats.sh \
-o /jffs/scripts/LogMemoryStats.sh && chmod a+x /jffs/scripts/LogMemoryStats.sh

Once downloaded into the router, type the following command:
Bash:
/jffs/scripts/LogMemoryStats.sh -help
The output will provide some useful CLI syntax to set up the cron job and other parameters to configure the diagnostic script for your own specific needs. Try to run the script by itself first to see the kind of output that gets generated, and if you're willing to let it run as a cron job, execute the call to set it up with your preferred time interval. If you want the cron job to persist across reboots, copy & paste the given command at the end of your '/jffs/scripts/post-mount' hook script.

Example call:
Bash:
/jffs/scripts/LogMemoryStats.sh -setcronjob 1hour
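
For the persistence step, the setup command should end up in '/jffs/scripts/post-mount' only once, even if you repeat the procedure. Below is a minimal sketch of an idempotent append. The helper name `append_once` is hypothetical (it is not part of LogMemoryStats.sh), and the check is a plain whole-line match:

```shell
#!/bin/sh
# append_once FILE LINE
# Hypothetical helper: adds LINE to the end of FILE unless an identical
# line is already present, so repeated runs do not duplicate the entry.
append_once() {
    file=$1
    line=$2
    # Create the hook script with a shebang if it does not exist yet.
    if [ ! -f "$file" ]; then
        printf '#!/bin/sh\n' > "$file"
        chmod a+x "$file"
    fi
    # Make sure the existing content ends with a newline before appending.
    # (Command substitution strips a trailing newline, so a non-empty
    # result means the last byte is something else.)
    if [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    # -x: whole-line match, -F: fixed string, -q: quiet
    if ! grep -qxF -e "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

A call would look like `append_once /jffs/scripts/post-mount "/jffs/scripts/LogMemoryStats.sh -setcronjob 1hour"`, using whatever command the script's help output gives you.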

HTH.
Thank you, I will download and run the script. I really appreciate your time and effort.

David
 
Release Notes for Scribe v3.2.6_25120600 BETA-2 development version now available
[2025-Dec-06]


1) NEW: Added a new 'A01global' configuration file to set more default global directives for log rotation.
These default global directives take effect unless overridden by options set in a separate configuration file associated with a specific log file (e.g. "/opt/etc/logrotate.d/messages").

2) IMPROVED: Cron job to rotate logs was modified. All calls to logrotate will now include filtered log files that don't have a corresponding configuration file in the "/opt/etc/logrotate.d" directory.
This means that users don't need to add a logrotate configuration file for each filtered log file unless they want to apply additional or different options/directives from the global defaults. Upon a fresh installation of Scribe, the only individual configuration files included in the "/opt/etc/logrotate.d" directory by default are messages, logrotate, and syslogng in addition to the global files (A00global, A01global).

3) IMPROVED: Added a mutually exclusive lock for the calls to logrotate made from Scribe and uiScribe to prevent a possible case where simultaneous executions will attempt to rotate the same set of log files.
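
To picture the mutually exclusive lock from item 3: a common shell technique is a directory-based lock, because `mkdir` either creates the directory or fails in a single step. This is only a sketch of that general technique, not Scribe's actual implementation; the function names and the lock path are hypothetical.

```shell
#!/bin/sh
# Hypothetical lock location; Scribe's real lock may differ.
LR_LOCK_DIR="${LR_LOCK_DIR:-/tmp/logrotate_demo.lock}"

# Try to take the lock. Returns 0 on success, 1 if someone else holds it.
acquire_lr_lock() {
    if mkdir "$LR_LOCK_DIR" 2>/dev/null; then
        return 0
    fi
    return 1
}

# Release the lock so the next caller can proceed.
release_lr_lock() {
    rmdir "$LR_LOCK_DIR" 2>/dev/null
    return 0
}

# Run a command only while holding the lock; skip it otherwise.
run_locked() {
    if ! acquire_lr_lock; then
        echo "rotation already in progress, skipping" >&2
        return 1
    fi
    rc=0
    "$@" || rc=$?
    release_lr_lock
    return "$rc"
}
```

With this pattern, two callers that both invoke `run_locked <rotate command>` at the same moment cannot both proceed: the second one sees the existing lock directory and backs off.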

IMPORTANT NOTE:
Installation of this latest BETA development version of Scribe is required for any future uiScribe BETA development versions.

To get the latest BETA development v3.2.6_25120600 version, run the following commands:
Bash:
/jffs/scripts/scribe develop
/jffs/scripts/scribe forceupdate

Scribe_v3.2.6_HelpDevelop.jpg


P.S.
I ran out of time to test and validate the latest changes of uiScribe before making the next BETA-2 release, so I'll do that when I get some free time over the weekend.
 
Beta-2-dev running sweetly on my GT-AX6000.

After a full reset and rebuild from scratch a while back I had forgotten to add a logrotate file for my custom "notmesh" syslog-ng filter ... had to uninstall and reinstall after runaway on my webui access to System Log tab.

At least now - when I forget next time - your A01global will be there as a catch-all - thanks 👍
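
For anyone who still wants to know which filtered logs have no per-file logrotate configuration, here is a rough sketch of a checker. It assumes that a log counts as "configured" when its full path appears literally in some file of the configuration directory, so wildcard entries are not recognized. The function name is hypothetical and both directories are passed in by the caller.

```shell
#!/bin/sh
# list_unconfigured_logs LOG_DIR CONF_DIR
# Prints the path of every *.log file in LOG_DIR whose full path is not
# mentioned literally in any file under CONF_DIR.
# Assumption for this sketch: configs list log paths without wildcards.
list_unconfigured_logs() {
    log_dir=$1
    conf_dir=$2
    for log_file in "$log_dir"/*.log; do
        # Skip the literal pattern when no log files exist.
        if [ ! -e "$log_file" ]; then
            continue
        fi
        # -q: quiet, -s: ignore missing files, -F: fixed-string match
        if ! grep -qsF -e "$log_file" "$conf_dir"/*; then
            printf '%s\n' "$log_file"
        fi
    done
}
```

On the router, the second argument would be "/opt/etc/logrotate.d"; the first is wherever your syslog-ng destinations write their files.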
 
Release Notes for Scribe v3.2.6 production version now available
[2025-Dec-22]


1) NEW: CLI menu option to set the hour interval to run the logrotate cron job.
Users can now set the cron job to run every 6, 8, 12, or 24 hours.
This should help with automatically rotating log files that tend to grow faster and larger before the default 24-hour interval is reached.

2) NEW: Added a new 'A01global' configuration file to define more default global directives for log rotation.
These default global directives take effect unless overridden by options set in a separate configuration file associated with a specific log file (e.g. "/opt/etc/logrotate.d/messages").

3) IMPROVED: The cron job to rotate logs was modified. Calls to logrotate will now include filtered log files that don't have a corresponding configuration file in the "/opt/etc/logrotate.d" directory.
This change in effect means that users no longer have to add a specific logrotate configuration file for each filtered log file unless they want to apply additional or different directives from the global defaults. During a fresh installation, the only individual configuration files included in the "/opt/etc/logrotate.d" directory by default are messages, logrotate, and syslogng in addition to the global files (A00global, A01global). Also, note that those 3 config files have had their individual "postrotate/endscript" directive removed, since that's no longer required for each configuration file. The default global configuration takes care of that.

4) IMPROVED: Added a mutually exclusive lock for the calls to logrotate made from Scribe and uiScribe to prevent a possible scenario where simultaneous executions will attempt to rotate the same set of log files.

5) Modifications to support calls from uiScribe to rotate and clear logs from its WebUI.

6) Miscellaneous code improvements.
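
As an illustration of item 1, the supported hour intervals map naturally onto the hour field of a cron schedule. The sketch below uses a hypothetical helper name and assumes the job runs at minute 0; the exact minute and hour that Scribe picks are not stated in these notes.

```shell
#!/bin/sh
# cron_schedule_for_interval HOURS
# Prints a 5-field cron schedule for the supported intervals (6, 8, 12, 24).
# Running at minute 0 is an assumption made for this sketch.
cron_schedule_for_interval() {
    case "$1" in
        6|8|12) printf '0 */%s * * *\n' "$1" ;;
        24)     printf '0 0 * * *\n' ;;
        *)      echo "unsupported interval: $1" >&2; return 1 ;;
    esac
}
```

For example, `cron_schedule_for_interval 6` prints `0 */6 * * *`, which fires at 00:00, 06:00, 12:00, and 18:00.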

IMPORTANT NOTE:
Installation of the v3.2.6 production version of Scribe is required for uiScribe v1.4.10 (or later).

For users currently running any of the BETA development versions, run the following commands to switch to the latest production release:
Bash:
/jffs/scripts/scribe stable
/jffs/scripts/scribe forceupdate

Sample screenshots:

Scribe_v3.2.6_CLI_MainMenu.jpg


Scribe_v3.2.6_LR_CronJobInterval.jpg


Enjoy and Happy Holidays!!! 🎄
 
RT-AX86U_Pro (aarch64) Kernel-4.19.183 - FW-3006.102.6

I installed the latest scribe for the first time, but now after a reboot I don't see those initial kernel log messages. Is that expected?

Example of the missing reboot log messages:

Code:
Dec 31 19:00:21 kernel: broadcomThermalDrv ubus@ff800000:brcm-therm: init (CPU count 4 4 4 4)
Dec 31 19:00:21 kernel: brcm_otp_init entry
Dec 31 19:00:21 kernel: Loading wlshared Module...
Dec 31 19:00:21 kernel: platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
Dec 31 19:00:21 kernel: Loading HND Module
Dec 31 19:00:21 kernel: igs_module_init:840:     IGS 29 create network socket successful
Dec 31 19:00:21 kernel: osl_skb_audit: PASS
Dec 31 19:00:21 kernel: wl_module_init: passivemode set to 0x1
Dec 31 19:00:21 kernel: wl_module_init: txworkq set to 0x0
Dec 31 19:00:21 kernel: ^[[0;32mwl_pktfwd_sys_init:  ^[[1m^[[34mWL_PKTFWD[1.1.0]^[[0m Success^[[0m
Dec 31 19:00:21 kernel: BME Service Initialization
Dec 31 19:00:21 kernel: BME: M2M Core id 0x844 rev 129
Dec 31 19:00:21 kernel: M2M Service Initialization: M2M_DD_ENAB 0x00000003
Dec 31 19:00:21 kernel: M2M: M2M Core id 0x844 rev 129
Dec 31 19:00:21 kernel: M2M: SUCCESS eng 0 alloc rings <0000000059a97c6d,00000000b1be07dd>,00000000aff01e46 depth 256
Dec 31 19:00:21 kernel: M2M: SUCCESS eng 1 alloc rings <00000000e88fe7e6,000000005ae4836e>,00000000fbbbcb56 depth 256
Dec 31 19:00:21 kernel: M2M: DMA 0 IntRcvLazy 01000000
Dec 31 19:00:21 kernel: M2M: DMA 0 RcvCtrl XmtCtrl enabled
Dec 31 19:00:21 kernel: M2M: DMA 1 IntRcvLazy 01000000
Dec 31 19:00:21 kernel: M2M: DMA 1 RcvCtrl XmtCtrl enabled
Dec 31 19:00:21 kernel: wl0: wlc_ap_attach dynamic_ed_thresh_enable = 0
Dec 31 19:00:21 kernel: BME:register key<1010802> user RLM sel IDX set 1 mem 2 8 hi 0 0
Dec 31 19:00:21 kernel: CSIMON module registered
Dec 31 19:00:21 kernel: wfd0-thrd WL 0 FLowControl total<55924> lo<16777> hi<8388> favor prio<4>
Dec 31 19:00:21 kernel: ^[[0;31mwfd_bind: wfd0-thrd initialized pktlists: radio 0 nodes 1032 xfer wl_pktfwd_xfer_callback+0x0/0x450 [wl]
Dec 31 19:00:21 kernel: ^[[0m
Dec 31 19:00:21 kernel: Creating CPU ring for queue number 5 with 1024 packets descriptor=0xffffff8000d86f40, size_of_entry 16
Dec 31 19:00:21 kernel: Done initializing Ring 5 Base=0xffffffc028200000 num of entries= 1024 RDD Base=28200000 descriptor=0xffffff8000d86f4
Dec 31 19:00:21 kernel: Creating CPU ring for queue number 6 with 1024 packets descriptor=0xffffff8000d86fc0, size_of_entry 16
Dec 31 19:00:21 kernel: Done initializing Ring 6 Base=0xffffffc028208000 num of entries= 1024 RDD Base=28208000 descriptor=0xffffff8000d86fc
Dec 31 19:00:21 kernel: ^[[1m^[[34m wfd_bind: Dev eth%d wfd_idx 0 wl_radio_idx 0 Type skb:sll configured WFD thread wfd0-thrd minQId/maxQId
Dec 31 19:00:21 kernel: Instantiating WFD 0 thread
Dec 31 19:00:21 kernel: ^[[0;32mwl_wfd_bind: wl0 wfd_idx 0 success^[[0m
Dec 31 19:00:21 kernel: alloc_netinfo enter bssidx=0 wdev=00000000e92043e4 ndev=00000000381f5104
Dec 31 19:00:21 kernel: alloc_netinfo exit iface_cnt=1
Dec 31 19:00:21 kernel: wfd_registerdevice Successfully registered dev eth6 ifidx 0 wfd_idx 0
Dec 31 19:00:21 kernel: eth6: Broadcom BCM6710 802.11 Wireless Controller 17.10.188.6401 (r808804)
Dec 31 19:00:21 kernel: dgasp: kerSysRegisterDyingGaspHandler: eth6 registered
Dec 31 19:00:21 kernel: DHD_FKB_POOL size is:1280 and entry size:2816
Dec 31 19:00:21 kernel: DHD_PKTTAG POOL size is:2592 and entry size:128
Dec 31 19:00:21 kernel: ^[[0;32mdhd_pktfwd_sys_init:  ^[[1m^[[34mdhd_PKTFWD[1.0.0]^[[0m Success^[[0m
Dec 31 19:00:21 kernel: PCI_PROBE:  bus 1, slot 0,vendor 14E4, device 6715(good PCI location)
Dec 31 19:00:21 kernel: dor1: runner supported ring format types TxP 0x5, RxP 0x3 TxC 0x3 RxC 0x3^M
Dec 31 19:00:21 kernel: dhd_runner_attach: Tx Offload - Enabled, Ring Size = 1024
Dec 31 19:00:21 kernel: dhd_runner_attach: Rx Offload - Enabled, Ring Size = 1024
Dec 31 19:00:21 kernel: TX wakeup info: reg = <0x82801004>, val = <0x00000006>
Dec 31 19:00:21 kernel: RX wakeup info: reg = <0x82801004>, val = <0x0000000d>
Dec 31 19:00:21 kernel: Registering Vendor80211
Dec 31 19:00:21 kernel: alloc_netinfo enter bssidx=0 wdev=00000000eefb5199 ndev=0000000029f7f26f
Dec 31 19:00:21 kernel: alloc_netinfo exit iface_cnt=1
Dec 31 19:00:21 kernel: Get event_msgs error (-19)
Dec 31 19:00:21 kernel: failed to set WLC_E_CAC_STATE_CHANGE bit
Dec 31 19:00:21 kernel: dhd_attach(): thread:dhd_watchdog_thread:3d8 started
Dec 31 19:00:21 kernel: ^[[0;31mwfd_bind: wfd1-thrd initialized pktlists: radio 1 nodes 1032 xfer dhd_pktfwd_xfer_callback+0x0/0x8a0


Below is my config:

Code:
     /',__) /'___)( '__)| || '_`\  /'__`\
     \__, \( (___ | |   | || |_) )(  ___/
     (____/`\____)(_)   (_)(_,__/'`\____)
     syslog-ng and logrotate installation
           v3.2.6 [Branch: master]
     https://github.com/AMTM-OSR/scribe
          Original author: cmkelley

 =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

      s.   Show scribe status
     rl.   Reload syslog-ng.conf
     lr.   Run logrotate now
     rs.   Restart syslog-ng
     st.   Stop syslog-ng & logrotate cron
     ct.   Set logrotate cron job time interval

      u.   Check for script updates
     uf.   Force update scribe with latest version
     ft.   Update filters
     su.   scribe utilities
      e.   Exit scribe

     is.   Reinstall scribe
     zs.   Remove scribe

 =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

 Please select an option: s

      checking syslog-ng daemon ... alive.
    syslog.log default location ... /jffs/syslog.log
  ... & agrees with config file ... okay!

 checking system for necessary scribe hooks ...

          checking S01syslog-ng ... present.
         checking service-event ... present.
            checking post-mount ... present.
               checking unmount ... present.
    checking logrotate cron job ... present.
       checking directory links ... present.

 checking syslog-ng configuration ...

   syslog-ng.conf version check ... in sync. (4.7)
    syslog-ng.conf syntax check ... okay!

          scribe installed version: v3.2.6 (master)
             scribe GitHub version: v3.2.6 (master)
                    scribe is up to date!
 


The implementation of syslog-ng has grown immensely in its capabilities, first with the first implementations of scribe, and then the code for handling different routers, and now still more. That takes a bit of time before syslog-ng is up and running.

My routers generate over a thousand messages on boot up, and scribe's syslog-ng.conf sets a fifo queue size of 256 messages. I don't know how much the klog and syslogd devices can hold before they are harvested.

So I think there is some tension between when klogd and syslogd are killed and when syslog-ng is ready to start harvesting messages, and, once it starts, how much it picks up of whatever was retained while nothing was being harvested. What you are seeing may be that.

I'm sensitized to this because I've tried to push the reboot messages off to their own file and start a clean new syslog on each boot, which worked years ago but not recently. I've boosted the fifo size to 2048, and I mean to fool around with S01 and the function add-on to speed up the starting process.
 
Thanks for the response. Any idea why df and du are reporting such a big difference?

Filesystem      Size     Used  Available  Use%  Mounted on
ubi:jffs2      44.5M    15.4M      26.8M   37%  /jffs


du -a /jffs | sort -nr | head
3116 /jffs
980 /jffs/signature/rule.trf
980 /jffs/signature
848 /jffs/addons
728 /jffs/addons/shared-jy
564 /jffs/scripts


lsof | grep jffs
2514 /usr/sbin/awsiot /jffs/awsiot_log
15713 /jffs/scripts/arp/mhttp_arm64 /dev/null
15713 /jffs/scripts/arp/mhttp_arm64 /tmp/mhttp_arm64.log
15713 /jffs/scripts/arp/mhttp_arm64 /tmp/mhttp_arm64.log
15713 /jffs/scripts/arp/mhttp_arm64 socket:[75160]
 
FYI,

There's a new 'develop' branch v3.2.7_26010323 version that includes 3-hour and 4-hour intervals for the logrotate cron job schedule. Also, it increases the log FIFO queue configuration option for the number of messages to 1024 (based on input from @elorimer).

Scribe_v3.2.7_LR_CronJobInterval.jpg


Scribe_v3.2.7_HelpDevelop.jpg
 
FYI,

There's a new 'develop' branch v3.2.7_26010323 version that includes 3-hour and 4-hour intervals for the logrotate cron job schedule. Also, it increases the log FIFO queue configuration option for the number of messages to 1024 (based on input from @elorimer).

Giving this a try. The last 12 hrs with the 6-hr logrotate cron have been sweet though. The only thing I did differently during this add-on buildup was leaving the DNS watchdog disabled until the last moment. It's been running without any problems and no page-load lag since.
I've added all my filters back for the next 24hr run
 
If I follow, in the function add-on, the hunt for the syslog location within kill_logger only happens on start up, but not thereafter, and that happens before syslogd and klogd are killed. That saves a bit of time. I had a few notions that I will share that I mean to try out, but don't actually have time in the near future:

1. The hunt for the syslog location is specific to router models. It should be possible to do this on install, and not make it part of kill_logger at all. Set and forget.
2. There are a fair number of commands after the processes are killed before syslog-ng starts. Would it not be possible to move some of those up in the sequence? It really doesn't matter how long it takes syslog-ng to get going (it could be S99 instead of S01, and maybe should be); I think what matters is the delay between killing one and starting the other. This only really matters on startup, when the message rate is fastest.
3. The log_msg_size of 10K is kind of large. For the kind of messages the routers kick off, I think 512 is enough. The product of log_msg_size and log_fifo_size establishes the buffer size for each destination, so 1024 and 512 would give you 512k (this is what I use), while 1024 and 10240 would give you 10M for each destination. Even so, htop gives me 368M as the virtual memory used by syslog-ng.
4. While it doesn't really matter for the rate of logging in the routers, the filter/destinations are processed alphabetically. That is to say, if the noisiest programs are last (unbound?) syslog-ng processes all the other possibilities first before it gets to the relevant filter. If you call the same filter/destination file "1unbound", syslog-ng will process it first, match, and stop further processing. So I've numbered my files by number of log statements. But as I say, syslog-ng can handle logging rates a hundred times faster than the router spins off.
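
The buffer arithmetic in point 3 is easy to check with shell integer math. This sketch only multiplies the two settings as described there (it does not query syslog-ng), and the function name is hypothetical:

```shell
#!/bin/sh
# fifo_buffer_kib LOG_FIFO_SIZE LOG_MSG_SIZE
# Prints log_fifo_size * log_msg_size (messages * bytes per message) in KiB.
fifo_buffer_kib() {
    echo $(( $1 * $2 / 1024 ))
}

fifo_buffer_kib 1024 512     # prints 512   (512 KiB)
fifo_buffer_kib 1024 10240   # prints 10240 (10 MiB)
```

So with a FIFO of 1024 messages, dropping the maximum message size from 10240 to 512 bytes shrinks the per-destination ceiling by a factor of 20.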
 
If I follow, in the function add-on, the hunt for the syslog location within kill_logger only happens on start up, but not thereafter, and that happens before syslogd and klogd are killed. That saves a bit of time. I had a few notions that I will share that I mean to try out, but don't actually have time in the near future:

1. The hunt for the syslog location is specific to router models. It should be possible to do this on install, and not make it part of kill_logger at all. Set and forget.
The path of the built-in system log file is obtained during initial installation, and it gets stored in a configuration file for future retrieval. The "kill_logger()" function knows about and has access to this configuration file, but it also has code to figure out the path on its own, just in case the configuration file is not found or it doesn't contain the expected path variable. Setting up this redundancy to be able to obtain a critical file path is good coding practice, especially for a function that must also execute independently of the shell script during the reboot sequence when the USB-attached drive containing Entware is mounted, and the S01syslog-ng service is started. If everything is set up and working as it should, the "kill_logger()" function simply reads the file path from the configuration file, and the rest of the redundant code may never execute after the initial installation.

Given the above, I fully disagree with your assessment that the "kill_logger()" function should not include the code to be able to obtain the path of the built-in system log file.
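
The read-then-fall-back pattern described above can be sketched roughly as follows. This is not the actual "kill_logger()" code: the variable name SYSLOG_LOC, the one-line "NAME=value" file format, and the idea of passing candidate paths as arguments are all assumptions made for illustration.

```shell
#!/bin/sh
# get_syslog_path CONF_FILE [CANDIDATE...]
# 1) Use the path stored in CONF_FILE (a line like SYSLOG_LOC=/path) if the
#    file exists and the value is non-empty.
# 2) Otherwise fall back to the first candidate path that exists.
# Returns 1 when neither source yields a path.
get_syslog_path() {
    conf_file=$1
    shift
    if [ -f "$conf_file" ]; then
        stored=$(sed -n 's/^SYSLOG_LOC=//p' "$conf_file" | head -n 1)
        if [ -n "$stored" ]; then
            printf '%s\n' "$stored"
            return 0
        fi
    fi
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

In the normal case only the first branch runs, which matches the point that the redundant discovery code may never execute after the initial installation.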

2. There are a fair number of commands after the processes are killed before syslog-ng starts. Would it not be possible to move some of those up in the sequence? It really doesn't matter how long it takes syslog-ng to get going (it could be S99 instead of S01, and maybe should be); I think what matters is the delay between killing one and starting the other. This only really matters on startup, when the message rate is fastest.
The commands found after the code that kills the built-in system loggers are really very simple and take only microseconds to execute. IMO, any significant delay(s) may be due to the "sleep 2" call made within the loop that kills the built-in system loggers.

kill_logger_loop.jpg


As explained in the comment, the sleep call is made to double-check afterward that the logging services have not been restarted by the system; but it's very likely that within these 2 seconds (or more if the sleep command is executed multiple times) some log messages are getting "lost" if nobody is "listening."

3. The log_msg_size of 10K is kind of large. For the kind of messages the routers kick off, I think 512 is enough. The product of log_msg_size and log_fifo_size establishes the buffer size for each destination, so 1024 and 512 would give you 512k (this is what I use), while 1024 and 10240 would give you 10M for each destination. Even so, htop gives me 368M as the virtual memory used by syslog-ng.
The original size of the "log_msg_size" option was set to 16K (which is the current setting in the production release), and I have found no explanation or reason in any of the code (or old posts) for making it that large. I can only assume that @cmkelley set that value based on user feedback or some empirical evidence showing that there are some log messages that are nearly that long. If so, those log entries are likely outliers and occur only in some extremely rare scenarios, which is why I decided to decrease the value to 10K while running some debugging code to experiment.

Ultimately, I think it may not be worth setting the "log_msg_size" to those large values just to capture some very rare event(s), and a more conservative setting would be more useful in the long run. For now, I've settled for 2K as the maximum message size while I run the development branch version myself, and other people have volunteered to experiment with it as well.

4. While it doesn't really matter for the rate of logging in the routers, the filter/destinations are processed alphabetically. That is to say, if the noisiest programs are last (unbound?) syslog-ng processes all the other possibilities first before it gets to the relevant filter. If you call the same filter/destination file "1unbound", syslog-ng will process it first, match, and stop further processing. So I've numbered my files by number of log statements. But as I say, syslog-ng can handle logging rates a hundred times faster than the router spins off.
I have no comment here as I didn't quite understand the main point of what you were trying to explain.

Just my 2 cents.
 
Thanks for the detailed response. That all makes sense. To follow up a bit:

2. I never quite understood why the wait loop was introduced in the kill process. We had gone for years without needing it. There must have been some instance of a restart. But if I follow, there are two seconds when no polling of either socket takes place. I think the cure might be worse than the disease, and I might try without the loop.

3. I think before I reduced the message size I was seeing memory use on the order of 968M.

4. At the risk of over-explaining, let me try again. Scribe is set up for syslog-ng to process a waterfall of configurations. The configurations are processed alphabetically. Each configuration has a flags(final), so syslog-ng stops processing a message at the first match. A message that doesn't match any configuration drops to the messages bucket. If a configuration doesn't have a flags(final), a duplicate of the message will drop to the messages bucket. The sooner a message finds a match, the less processing time syslog-ng is using. So ordering the configurations to match the noisiest programs first will be more efficient. I gather messages from 4 other network hosts, so I deal with those configurations first. But as I say, the number of messages is nowhere near taxing the router, so this is academic in our use case.
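
To make the ordering point concrete, here is a toy first-match simulation. It is not syslog-ng; it only counts how many filters get tried before each message hits its match, assuming every filter is final and each filter matches exactly one program name. The function name and the sample filter and program names are made up for the example.

```shell
#!/bin/sh
# count_filter_checks "FILTER ORDER" "MESSAGE PROGRAMS"
# FILTER ORDER:     space-separated filter names, in processing order
# MESSAGE PROGRAMS: space-separated program names, one per message
# Prints the total number of filter checks done with first-match-wins.
count_filter_checks() {
    order=$1
    msgs=$2
    total=0
    for prog in $msgs; do
        for filt in $order; do
            total=$((total + 1))
            # A "final" match stops further processing of this message.
            if [ "$filt" = "$prog" ]; then
                break
            fi
        done
    done
    echo "$total"
}

msgs="unbound unbound unbound unbound dnsmasq"
count_filter_checks "acsd dnsmasq unbound" "$msgs"   # prints 14
count_filter_checks "unbound dnsmasq acsd" "$msgs"   # prints 6
```

Same messages, same filters; putting the noisiest program first cuts the number of checks from 14 to 6 in this toy case.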
 
Thanks for the detailed response. That all makes sense. To follow up a bit:

2. I never quite understood why the wait loop was introduced in the kill process. We had gone for years without needing it. There must have been some instance of a restart. But if I follow, there are two seconds when no polling of either socket takes place. I think the cure might be worse than the disease, and I might try without the loop.
This is only speculation, but perhaps @cmkelley ran into some scenario(s) where one or both of the system loggers (i.e. klogd & syslogd) were not shutting down completely after being killed, so he had to introduce a delay to double-check and make sure they were both fully gone before proceeding. Yes, it's not an ideal solution, but I suppose it was deemed "better" than possibly having syslog-ng *and* the system loggers all running simultaneously. :eek:🤷‍♂️

I have a couple of ideas to experiment with, but I don't know when I'll be able to focus on this specific task. The workload in my day job will gradually but surely ramp up in the coming days (as usual after the holidays), so I'll be pretty busy and tired by the time I get home...

3. I think before I reduced the message size I was seeing memory use on the order of 968M.
You likely have more than a handful of filtered log files getting created/updated by syslog-ng. You mentioned that after your changes, the virtual memory size was about 368M (down from 968M). On my RT-AX86U_PRO, htop currently shows 280M, but I have only a few filtered log files at the moment: logrotate.log, syslog-ng.log, wlceventd.log, acsd.log, hostapd.log, and messages.

4. At the risk of over-explaining, let me try again. Scribe is set up for syslog-ng to process a waterfall of configurations. The configurations are processed alphabetically. Each configuration has a flags(final), so syslog-ng stops processing a message at the first match. A message that doesn't match any configuration drops to the messages bucket. If a configuration doesn't have a flags(final), a duplicate of the message will drop to the messages bucket. The sooner a message finds a match, the less processing time syslog-ng is using. So ordering the configurations to match the noisiest programs first will be more efficient. I gather messages from 4 other network hosts, so I deal with those configurations first. But as I say, the number of messages is nowhere near taxing the router, so this is academic in our use case.
Ah, OK. I think I understand now. Thanks for the clarification.
 
This is only speculation, but perhaps @cmkelley ran into some scenario(s) where one or both of the system loggers (i.e. klogd & syslogd) were not shutting down completely after being killed, so he had to introduce a delay to double-check and make sure they were both fully gone before proceeding.
You're right. Not so much the daemons were not shutting down as syslog-ng not starting. I can't remember what I had for breakfast, so I looked. There were some issues two years ago that at least 3 people had with an AX88U and syslog-ng not starting, so this was a brute force fix: https://www.snbforums.com/threads/s...-logrotate-installer.84720/page-7#post-844843.
I took out the loop and I'll try a bit. I have an AX88 as well.
 