
BACKUPMON v1.5.10 -Mar 1, 2024- Backup/Restore your Router: JFFS + NVRAM + External USB Drive! (**Thread closed due to age**)


I truly have no earthly idea about this one... could it be that cru is crashing and taking its little database of scheduled jobs with it? I have a mystery instance as well where my backups didn't kick off for about 5 days. I may have rebooted on day 6, there's no telling. May need to bring this behavior up in the main Merlin thread. Are any other cron jobs missing?
I have seen some instances where cru and the crontab glitch. It is usually tied to more than one scheduled job disappearing, though, and not generally related to one single script.
 
I couldn't sleep on this so here's the before reboot and after (a lot more missing than I first thought)

Before:
Code:
12,27,42,57 * * * * /jffs/scripts/spdmerlin generate #spdMerlin#
30 19 */7 * * service restart_letsencrypt #LetsEncrypt#
0 1-23 * * * /jffs/scripts/uiDivStats generate #uiDivStats_generate#
1 0 * * * /jffs/scripts/uiDivStats trimdb #uiDivStats_trim#
* * * * * /jffs/scripts/uiDivStats querylog #uiDivStats_querylog#
4-59/5 * * * * /jffs/scripts/uiDivStats flushtodb #uiDivStats_flushtodb#
00 2 * * Fri /bin/sh /opt/share/diversion/file/update-bl.div reset #Diversion_UpdateBL#
20 5 * * * /bin/sh /opt/share/diversion/file/rotate-logs.div #Diversion_RotateLogs#
20 17 * * * diversion count_ads count #Diversion_CountAds#
30 1 * * Fri /bin/sh /opt/share/diversion/file/stats.div #Diversion_WeeklyStats#
50 */12 * * * sh /jffs/scripts/firewall debug genstats #Skynet_genstats#
*/10 * * * * /jffs/scripts/YazFi check #YazFi#

After:

Code:
45 */6 * * * /jffs/addons/amtm/routerdate cron #amtm_RouterDate#
10 7 * * Sun /bin/sh /jffs/addons/amtm/sc_update.mod -run #amtm_ScriptsUpdateNotification#
15 1 * * * sh /jffs/scripts/backupmon.sh -backup #RunBackupMon#
*/10 * * * * /jffs/scripts/YazFi check #YazFi#
5 0 * * * /opt/sbin/logrotate /opt/etc/logrotate.conf >> /opt/tmp/logrotate.daily 2>&1 #logrotate#
*/10 * * * * /jffs/scripts/ntpmerlin generate #ntpMerlin#
25 8 * * * sh /jffs/scripts/firewall banmalware #Skynet_banmalware#
4 1 * * Mon sh /jffs/scripts/firewall update #Skynet_autoupdate#
0 * * * * sh /jffs/scripts/firewall save #Skynet_save#
12,27,42,57 * * * * /jffs/scripts/spdmerlin generate #spdMerlin#
57 9 */7 * * service restart_letsencrypt #LetsEncrypt#
0 1-23 * * * /jffs/scripts/uiDivStats generate #uiDivStats_generate#
1 0 * * * /jffs/scripts/uiDivStats trimdb #uiDivStats_trim#
* * * * * /jffs/scripts/uiDivStats querylog #uiDivStats_querylog#
4-59/5 * * * * /jffs/scripts/uiDivStats flushtodb #uiDivStats_flushtodb#
00 2 * * Fri /bin/sh /opt/share/diversion/file/update-bl.div reset #Diversion_UpdateBL#
20 5 * * * /bin/sh /opt/share/diversion/file/rotate-logs.div #Diversion_RotateLogs#
20 17 * * * diversion count_ads count #Diversion_CountAds#
30 1 * * Fri /bin/sh /opt/share/diversion/file/stats.div #Diversion_WeeklyStats#
45 */12 * * * sh /jffs/scripts/firewall debug genstats #Skynet_genstats#

So that's
Code:
amtm_RouterDate
amtm_ScriptsUpdateNotification
RunBackupMon
logrotate
ntpMerlin
Skynet_banmalware
Skynet_autoupdate
Skynet_save
All disappeared. I can't find any commonality; RunBackupMon is the only one in services-start.
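
The comparison above was done by eye. A minimal sketch of doing it with two saved `cru l` listings follows; it assumes every job line ends with the `#tag#` marker seen in the listings, and the function names and temp-file path are made up for illustration.

```shell
#!/bin/sh
# Print the job tags (the text between the trailing '#' pair) found in a
# saved "cru l" listing, one per line, sorted and de-duplicated.
job_tags() {
  sed -n 's/.*#\([^#]*\)#[[:space:]]*$/\1/p' "$1" | sort -u
}

# List tags present in the reference listing ($1) but absent from the
# current listing ($2).
missing_tags() {
  job_tags "$2" > "/tmp/cron_tags_now.$$"
  job_tags "$1" | grep -vxFf "/tmp/cron_tags_now.$$"
  rm -f "/tmp/cron_tags_now.$$"
}
```

Possible usage: run `cru l > /tmp/cron_good.txt` while everything is healthy, then later `cru l > /tmp/cron_now.txt` and `missing_tags /tmp/cron_good.txt /tmp/cron_now.txt` (both file names are placeholders).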

EDIT: Added this to the AMTM thread.
This could be "related" to some other underlying issue you are not aware of. I suggest reviewing your logs for any abnormal entries. Also, make note of your router's current state, including internet connection, NTP accuracy, and firewall entries. Something is happening that is causing quite a disturbance in your router's ecosystem.
 
Is there any level of logging available that tracks what programs are adding and deleting cron jobs by chance? That might pinpoint the culprit?
 
First off -- HUGE thanks to @Jeffrey Young for sharing his original backup script. His script is the main engine of BACKUPMON, and all credit goes to him! BACKUPMON is simply a wrapper around Jeff's backup script functionality, adding easy-to-use menus, more status feedback, and the ability to launch a restore based on your previous backups. Also, huge props to @visortgw for contributing to the backup methodologies thread with his scripts and wisdom!

Couple of comments...

1) with Github - learn how to do version control - will make it easier for folks to make patches...

Screenshot 2024-02-03 at 7.05.00 PM.png


2) You might want to check your scripts for common errors - it will improve security...
 

Attachment: backupmon-1.46.txt (534.8 KB)
Is there any level of logging available that tracks what programs are adding and deleting cron jobs by chance? That might pinpoint the culprit?
Not that I know of; you would be relying on the logs of the scripts whose scheduled jobs have gone missing. There may be some clue there indicating a departure from normal service, such as a shutdown message or log entries that indicate a script failure of sorts. That is why a cron monitoring script will be helpful: you can log time stamps of the last time the number of scheduled jobs changed, then compare those time stamps to syslog entries around the time the issue happened.
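
A rough sketch of that kind of monitor, for illustration only: it counts the lines `cru l` prints and writes a time-stamped entry whenever the count differs from the previous run. The file paths and the function name are assumptions, not part of any existing script.

```shell
#!/bin/sh
# Hypothetical state and log locations; adjust for your router.
STATE_FILE="${STATE_FILE:-/tmp/cronwatch.count}"
LOG_FILE="${LOG_FILE:-/tmp/cronwatch.log}"

# Log a time-stamped line whenever the number of cron jobs differs from the
# count seen on the previous run. Returns 0 if unchanged (or first run),
# 1 if the count changed.
check_cron_count() {
  now=$(cru l | grep -c .)                 # count non-empty lines in the job list
  prev=$(cat "$STATE_FILE" 2>/dev/null)    # empty on the very first run
  echo "$now" > "$STATE_FILE"
  if [ -n "$prev" ] && [ "$now" != "$prev" ]; then
    echo "$(date '+%Y-%m-%d %H:%M:%S') cron job count changed: $prev -> $now" >> "$LOG_FILE"
    return 1
  fi
  return 0
}
```

The time stamps in the log can then be lined up against syslog entries from the same minute. Note that something other than cron has to call `check_cron_count` if cron itself is what is under suspicion.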
 
Couple of comments...

1) with Github - learn how to do version control - will make it easier for folks to make patches...
There have been a few that have submitted updates that I've let through and approved, but I primarily use github as a place to host the script, and keep version control locally.

2) You might want to check your scripts for common errors - it will improve security...
I've gone down the shellcheck rabbit hole before... but it becomes a game of excluding multitudes of bogus messages in order to weed it down to the ones that might truly count. In other words, whack-a-mole. But this doesn't improve security... it improves stability and conformity with shell standards.
 
Agreed. I looked at most of those suggestions and found them rather useless.

The cron issues being reported here and elsewhere appear to have started after the release of 388.6. One has to wonder if Asus's ASD daemon is now goofing around with cron in the name of security.
 
That is an interesting thought! You should add that tidbit to the official discussion... also, Martineau added an interesting wrapper script in that thread to keep an eye on what's happening with cru, so I've got my popcorn out and ready for that. ;)
 

Seems each version of 388 has had an issue that affects me (asd being aggressive with SDD, asd zapping arp entries, etc.). I've never upgraded from 386.7. I was hoping this would be the version. Between the certificate issue and this crontab issue, I will be passing on 388.6 as well.

Seems to me that you can't use a cron job to test for changes in crontab jobs, as the job used to check cron might itself disappear. You might need to resort to using your own daemon of sorts. That is: check things out, sleep for 5 minutes, check things out, sleep for 5 minutes, repeat...
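
As a sketch of that check/sleep/repeat idea, assuming jobs carry the `#tag#` marker that `cru l` shows: the loop below runs independently of cron and only reports a missing job. The function names and the `cronwatch` log tag are invented for this example.

```shell
#!/bin/sh
# Log a warning if the job tagged $1 is absent from the "cru l" listing.
# Returns 1 when the job is missing, 0 otherwise.
report_if_missing() {
  if ! cru l | grep -qF "#$1#"; then
    logger -t cronwatch "cron job $1 is missing"
    return 1
  fi
}

# Run a check command repeatedly without relying on cron itself.
# $1 = seconds to sleep between passes, $2 = number of passes (0 = forever),
# remaining arguments = the command to run on each pass.
watch_loop() {
  interval="$1"; passes="$2"; shift 2
  n=0
  while :; do
    "$@"
    n=$((n + 1))
    if [ "$passes" -gt 0 ] && [ "$n" -ge "$passes" ]; then
      break
    fi
    sleep "$interval"
  done
}
```

Something like `watch_loop 300 0 report_if_missing RunBackupMon &` in a startup script would check every 5 minutes until the next reboot.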
 
Seems each version of 388 has had an issue that affects me (asd being aggressive with SDD, asd zapping arp entries, etc.). I've never upgraded from 386.7. I was hoping this would be the version. Between the certificate issue and this crontab issue, I will be passing on 388.6 as well.
Living life on the bleeding edge here... it seems! lol ;) I can completely understand where you're coming from.

Seems to me that you can't use a cron job to test for changes in crontab jobs, as the job used to check cron might itself disappear. You might need to resort to using your own daemon of sorts. That is: check things out, sleep for 5 minutes, check things out, sleep for 5 minutes, repeat...
<sigh> You hit the nail on the head, @Jeffrey Young ... I'm just getting super tired of having to build things to check on things that you are reliant on the underlying OS for... The OS should be assumed to be foolproof and 100% stable. But we're seeing time and time again, like with the nvram issue that we've been dealing with since the AC86U and have had to use the "timeout" command on to help prevent lockups being caused by bugs in the OS... and now the GT-AX6000, while seemingly fixed, has now started showing symptoms of creating stuck nvram/wl commands after a certain span of time as well. Now a separate daemon to make sure stuff runs, or perhaps even just run it itself, because we are no longer able to trust cron? Ugh! :(

Guess I'll be reading up on the "Writing your own daemon 101" guide. lol
 
I apologize again for being off-topic. I share your frustration. That said, I don't blame Asus - or @RMerlin. What Merlin has done with these routers and the relationship he has been able to forge with Asus has been rather remarkable. However, these are SOHO routers and you have to keep in mind who the target market is. Also keeping in mind that more and more, the home user is the subject of attack, I don't blame Asus for hardening their products. Especially since the vast majority of their target market has no clue of what goes on under the hood.

Everyone should be thinking of a plan B. No one knows the future. Circumstances way beyond Merlin's control could cause him to cease development; a management change at Asus could shut down Asus's support of this project. I've already experimented with a few alternatives (pfSense, OpenWrt, building my own Ubuntu router on a RasPi) and have a plan for when staying on the 386 base is no longer feasible or secure (assuming 388 keeps on its current track).

OK, done now - back on topic. Cheers all.
 
I apologize again for being off-topic. I share your frustration. That said, I don't blame Asus - or @RMerlin. What Merlin has done with these routers and the relationship he has been able to forge with Asus has been rather remarkable. However, these are SOHO routers and you have to keep in mind who the target market is. Also keeping in mind that more and more, the home user is the subject of attack, I don't blame Asus for hardening their products. Especially since the vast majority of their target market has no clue of what goes on under the hood.
Listen, I agree completely. Much of the underlying stuff that we're dealing with here is probably off-limits to even touch... ie closed-source. And like you said, it is amazing that we're even able to do what we do as it is. I have nothing but the greatest respect for what @RMerlin has been able to accomplish alongside ASUS these many years... just wish that ASUS would be a bit more responsive when it comes to critical issues identified by their "hobbyist community" who punish their routers above & beyond the regular masses. I remember a year back or more, we tried to figure out to see what it would take to get ASUS's attention on the nvram bug on the AC86U... but in the end, it was pretty much a dead-end. Just wish they had more of that mentality to go out of their way to ensure their underlying OS is the most stable it can be, even if it means spending time, money & resources fixing bugs like these. I know they do this for security issues and whatnot, because those affect their entire base... but wishing they went beyond that. And for all I know, perhaps they do that... and I'm just unaware, and this nvram locking issue just wasn't high enough on the fix list.

But now with similar issues identified on the GT-AX6000, and now this weird cru issue... wishing they had a resource of sorts to determine if something needs to be patched. Heck... I wish they had some folks that work in this ASUS router OS development department as part of this community so they can see our struggles! :p

Everyone should be thinking of a plan B. No one knows the future. Circumstances way beyond Merlin's control could cause him to cease development; a management change at Asus could shut down Asus's support of this project. I've already experimented with a few alternatives (pfSense, OpenWrt, building my own Ubuntu router on a RasPi) and have a plan for when staying on the 386 base is no longer feasible or secure (assuming 388 keeps on its current track).
Kudos to you for playing around with the different options like that... I definitely need to start experimenting more. ;)
 
I got up in the middle of last night just to verify that my scripts and your Backupmon are NOT to blame.
Testing completed, I headed straight back to the bedroom and, knowing we are not responsible for the headaches, slept contentedly until this morning.
 
LOL :p ... believe me, I've similarly been checking my scripts as well and scrutinizing every cru command, playing what-if games. ;) Thanks much for helping me validate what I've concluded on my end. There must be something else going on. I certainly hope we can determine what, so we can put the necessary fixes/workarounds in place.
 
There have been a few that have submitted updates that I've let through and approved, but I primarily use github as a place to host the script, and keep version control locally.

To each their own - but github and community works best - you're likely missing out on contributions/bug fixes doing things the way you're managing things at present.
 
Long before backupmon.... I had issues with cron jobs disappearing. I would inject them via services-start, then later (days or weeks) notice they were missing.

I added various logging, but never determined a root cause. I did find that *sometimes* after the WAN had gone down, then come back up, those cron jobs were missing. Not consistently, but sometimes.

So, I moved my cru job injections to firewall-start. I wrapped them with a test that checks to see if each job is already there (i.e., `cru l | grep -q "$jobname"`) and then adds any that are missing. At boot, this gets them all added (just as in services-start). Whenever the firewall restarts, it checks and adds any missing jobs.

I haven't had a missing crontab job since. I don't know if you want to consider a "hook" in firewall-start to check for the crontab job or not, but thought I'd throw it out there.
 
Excellent advice, @ScottW! Apparently @bibikalka found that the post-mount is another "stable" place to put them. I'm intrigued with the test you're using to validate the cru jobs... Would you be so kind to please share a sample of your firewall-start script, and I'll get this tested... Thanks! :)
 
I don't think post-mount will help here. Like services-start, it (usually) only happens once, at boot, and doesn't get re-fired when the WAN goes down/up. But firewall-start gets called at boot, on WAN up/down, and in several other cases. It is a little later than services-start, but getting the cron jobs added isn't time-critical.

EDIT: To be clear, my thought was that you could add a function within backupmon.sh that checks for the backupmon cron job (if a schedule is defined in settings) and adds it if it is not in crontab. You could use that function "on demand" (i.e., when someone uses the backupmon.sh menus to set a schedule), *and* also add a line to firewall-start which would call backupmon.sh with a parameter (e.g., "/jffs/scripts/backupmon.sh -checkcron") that tells backupmon.sh to simply invoke that function, then exit. That would be more "resilient" than just adding the cru command to services-start or post-mount, but it does require checking for the job's existence before adding, since firewall-start gets called multiple times.
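
To make the suggestion concrete, here is a minimal sketch of what such a hook might look like. It is not taken from backupmon.sh: `CRON_TAG`, `CRON_SPEC`, `check_cron`, `handle_arg` and the `-checkcron` switch are all illustrative names, and the schedule is simply the RunBackupMon entry from the listing earlier in the thread.

```shell
#!/bin/sh
# Hypothetical settings; a real script would read these from its config.
CRON_TAG="RunBackupMon"
CRON_SPEC="15 1 * * * sh /jffs/scripts/backupmon.sh -backup"

# Add the backup job only if a schedule is configured and the job is not
# already listed by "cru l" (jobs are listed with a trailing "#tag#").
check_cron() {
  [ -n "$CRON_SPEC" ] || return 0     # no schedule set: nothing to do
  if cru l | grep -qF "#$CRON_TAG#"; then
    return 0                          # already present: leave it alone
  fi
  cru a "$CRON_TAG" "$CRON_SPEC"
}

# Dispatch on the first argument, the way a script's option handling might.
handle_arg() {
  case "$1" in
    -checkcron) check_cron ;;
    *) return 2 ;;                    # unknown option
  esac
}
```

With something like this in place, firewall-start would only need a single line such as `sh /jffs/scripts/backupmon.sh -checkcron`, and repeated calls are harmless because the job is added only when it is absent.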

I'll link my code below, with some caveats:
  1. I am not a pro script coder! It works, but it isn't necessarily pretty or efficient. :)
  2. The "checkjob ()" function is probably all that is of interest, but I included the entire script for context.
  3. I don't ask backupmon to add a cron job; rather I manually add that job to this script for the redundancy it provides (and to have all my jobs in one place).
  4. The reason for the $1 parameter in mainline script was solely for logging purposes. When I was calling this from multiple places (services-start, firewall-start), I wanted the log entries to indicate where it was called from, so the invocations from services-start and firewall-start passed "services-start" or "firewall-start", respectively, to this script so that the caller could be logged. I later concluded that firewall-start invocation was soon enough and that the call from services-start was redundant.
  5. There is lots of logging. I like logs. Lots and lots of them. :)
Code:
#!/bin/sh

## ADD PERSONAL CRON JOBS
## --   Called by /jffs/scripts/firewall-start, since crontab jobs are sometimes lost for unknown
## --   reasons when modem restarts, WAN goes down/up, etc., and services-start is not called
## --   again after these events.
##-
## --   The loss of cron jobs also affects scribe's logrotate cron, since it is added via
## --   service-event(postmount), so this script checks that job as well.

usbname=ASUS

#log to usb/logs if available, otherwise fallback to jffs
if [ -d "/mnt/$usbname/logs/" ]; then
  logfile=/mnt/$usbname/logs/UserCron.log
else
  logfile=/jffs/scripts/sw/UserCron.log
fi

scname=UserCron
caller=$1
sng_rng(){ if [ -n "$( pidof syslog-ng )" ]; then true; else false; fi; }

checkjob () {  #FUNCTION TO CHECK CRON JOB EXISTENCE, AND ADD IF NOT FOUND
  # param1 is jobname, param2 is cron-entry
  # cru l shows each job with a trailing "#jobname#" tag; match on that tag
  # so the name is not also matched inside some other job's command line.
  if cru l | grep -qF "#$1#"
    then
      msg="+ Job $1 is already in crontab."
      logger -t "$scname" "$msg"
      echo "$(date) $scname $msg" >>"$logfile"
    else
      msg="+ Job $1 NOT IN CRONTAB; adding now."
      logger -t "$scname" "$msg"
      echo "$(date) $scname $msg" >>"$logfile"
      set -f #disable shell expansion to prevent asterisks in cron schedule from being expanded
      cru a "$1" "$2"
      set +f #re-enable shell expansion
  fi
}

msg="Running user cron job check ($0)"
logger -t "$scname" "$msg"
echo >>"$logfile"
echo "$(date) $scname Called with option [$caller]" >>"$logfile"
echo "$(date) $scname $msg" >>"$logfile"

sec_up=$(cut -d "." -f 1 /proc/uptime)   # whole seconds since boot
echo "$(date) $scname Router has been up for $sec_up seconds." >>"$logfile"

jobname='SW-LogIP'
jobcmd='59 23 * * * /bin/sh /jffs/scripts/sw/ip_daily_log.sh >> /mnt/ASUS/logs/ip_daily_log.sh.log 2>&1'
checkjob $jobname "$jobcmd"

jobname='SW-BackASUS'
jobcmd='0 23 * * Sat /bin/sh /jffs/scripts/sw/backupASUS.sh >> /mnt/ASUS/logs/backupASUS.sh.log 2>&1'
checkjob $jobname "$jobcmd"

jobname='SW-BackupMon'
jobcmd='30 23 * * Sat /bin/sh /jffs/scripts/backupmon.sh -backup >/dev/null 2>&1' #backupmon logs to /jffs/addons/backupmon.d/backupmon.log
checkjob $jobname "$jobcmd"

jobname='SW-OvpnScan'
jobcmd='0 8,16,20 * * * /bin/sh /jffs/scripts/sw/ovpn-logscan.sh -n'
checkjob $jobname "$jobcmd"

#If syslog-ng is running, check / add logrotate cron if its gone. This isn't needed at first boot, and
#  checking/restarting scribe/syslog-ng before it is initially started can be problematic.  So skip this
#  part if just booted.
if [ "$sec_up" -lt 240 ]; then #4 minutes
  msg="+ Skipping logrotate check (router only up $sec_up seconds)"
  logger -t $scname "$msg"
  echo $(date) $scname "$msg" >>$logfile
else
  if sng_rng
   then
     if cru l | grep -q "logrotate"
       then
        msg="+ Job logrotate (for scribe) is already in crontab."
         logger -t $scname "$msg"
        echo $(date) $scname "$msg" >>$logfile
       else
        msg="+ Job logrotate is not in crontab; calling scribe(restart) to fix."
        logger -t $scname "$msg"
        echo $(date) $scname "$msg" >>$logfile
        sh /jffs/scripts/scribe restart >/dev/null
        sleep 3
        #probably should test for success, but going to assume it worked...
        msg="+ scribe restart completed."
        logger -t $scname "$msg"
        echo $(date) $scname "$msg" >>$logfile
     fi
   else
    msg="+ syslog-ng not running, skipping logrotate cron job check."
    logger -t $scname "$msg"
    echo $(date) $scname "$msg" >>$logfile
  fi
fi

msg="User cron job check complete."
logger -t $scname "$msg"
echo $(date) $scname "$msg" >>$logfile
echo >>$logfile
 