Frequent syslog entries "kernel: BUG: scheduling while atomic"

joegreat · Mar 5, 2015

Hi,

Since I switched to "ARM Entware" on my main router and enabled Transmission, I have quite frequently been getting strange entries in the syslog: "kernel: BUG: scheduling while atomic" - see details below.

The parameter is now always "mtdblock3/254/0x00000002", but before my last reboot I also had mtdblock4 (JFFS file system!) entries.

Searching the forum and the internet did not really turn up valuable information (other than re-installing the firmware - which I did multiple times in the last few days).

Any ideas how to solve this issue or how to debug it further (Transmission?) to find the root cause?

With kind regards
Joe

Full entry of the log event:

Code:

Mar  5 20:01:35 kernel: BUG: scheduling while atomic: mtdblock3/254/0x00000002
Mar  5 20:01:35 kernel: module:  wl     bf7c5000     4008462
Mar  5 20:01:35 kernel: module:  nf_nat_sip     bf7be000     5586
Mar  5 20:01:35 kernel: module:  nf_conntrack_sip     bf7b2000     16679
Mar  5 20:01:35 kernel: module:  nf_nat_h323     bf7ab000     5137
Mar  5 20:01:35 kernel: module:  nf_conntrack_h323     bf79a000     34844
Mar  5 20:01:35 kernel: module:  nf_nat_rtsp     bf794000     3400
Mar  5 20:01:35 kernel: module:  nf_conntrack_rtsp     bf78d000     4268
Mar  5 20:01:35 kernel: module:  nf_nat_ftp     bf787000     1314
Mar  5 20:01:35 kernel: module:  nf_conntrack_ftp     bf780000     5131
Mar  5 20:01:35 kernel: module:  ip6table_mangle     bf77a000     1093
Mar  5 20:01:35 kernel: module:  ip6t_LOG     bf773000     4705
Mar  5 20:01:35 kernel: module:  ip6table_filter     bf76d000     893
Mar  5 20:01:35 kernel: module:  sr_mod     bf764000     11507
Mar  5 20:01:35 kernel: module:  cdrom     bf755000     33318
Mar  5 20:01:35 kernel: module:  cdc_mbim     bf74f000     3313
Mar  5 20:01:36 kernel: module:  qmi_wwan     bf747000     5948
Mar  5 20:01:36 kernel: module:  cdc_wdm     bf740000     7856
Mar  5 20:01:36 kernel: module:  cdc_ncm     bf737000     9210
Mar  5 20:01:36 kernel: module:  rndis_host     bf730000     5209
Mar  5 20:01:36 kernel: module:  cdc_ether     bf72a000     3456
Mar  5 20:01:36 kernel: module:  asix     bf721000     11741
Mar  5 20:01:36 kernel: module:  usbnet     bf717000     11691
Mar  5 20:01:36 kernel: module:  mii     bf711000     3484
Mar  5 20:01:36 kernel: module:  usblp     bf708000     11162
Mar  5 20:01:36 kernel: module:  ohci_hcd     bf6fd000     19288
Mar  5 20:01:36 kernel: module:  ehci_hcd     bf6ee000     34220
Mar  5 20:01:36 kernel: module:  xhci_hcd     bf6d9000     54334
Mar  5 20:01:36 kernel: module:  thfsplus     bf6bc000     82205
Mar  5 20:01:36 kernel: module:  tntfs     bf642000     446031
Mar  5 20:01:36 kernel: module:  tfat     bf60c000     179421
Mar  5 20:01:36 kernel: module:  ext2     bf5f6000     55581
Mar  5 20:01:36 kernel: module:  ext4     bf5ad000     234233
Mar  5 20:01:36 kernel: module:  jbd2     bf596000     52386
Mar  5 20:01:36 kernel: module:  crc16     bf590000     1081
Mar  5 20:01:36 kernel: module:  ext3     bf569000     113117
Mar  5 20:01:36 kernel: module:  jbd     bf554000     45524
Mar  5 20:01:36 kernel: module:  mbcache     bf54c000     5156
Mar  5 20:01:36 kernel: module:  usb_storage     bf539000     35499
Mar  5 20:01:36 kernel: module:  sg     bf52c000     21138
Mar  5 20:01:36 kernel: module:  sd_mod     bf520000     23159
Mar  5 20:01:36 kernel: module:  scsi_wait_scan     bf51a000     502
Mar  5 20:01:36 kernel: module:  scsi_mod     bf4eb000     114857
Mar  5 20:01:36 kernel: module:  usbcore     bf4c0000     108358
Mar  5 20:01:36 kernel: module:  jffs2     bf49f000     94871
Mar  5 20:01:36 kernel: module:  zlib_deflate     bf495000     19990
Mar  5 20:01:36 kernel: module:  nf_nat_pptp     bf48f000     1796
Mar  5 20:01:36 kernel: module:  nf_conntrack_pptp     bf489000     3739
Mar  5 20:01:36 kernel: module:  nf_nat_proto_gre     bf483000     1047
Mar  5 20:01:36 kernel: module:  nf_conntrack_proto_gre     bf47d000     3599
Mar  5 20:01:36 kernel: module:  igs     bf02d000     12935
Mar  5 20:01:36 kernel: module:  emf     bf023000     16346
Mar  5 20:01:36 kernel: module:  et     bf00b000     64092
Mar  5 20:01:36 kernel: module:  ctf     bf000000     18051
Mar  5 20:01:36 kernel: Modules linked in: wl(P) nf_nat_sip nf_conntrack_sip nf_nat_h323 nf_conntrack_h323 nf_nat_rtsp nf_conntrack_rtsp nf_nat_ftp nf_conntrack_ftp ip6table_mangle ip6t_LOG ip6table_filter sr_mod cdrom cdc_mbim qmi_wwan cdc_wdm cdc_ncm rndis_host cdc_ether asix usbnet mii usblp ohci_hcd ehci_hcd xhci_hcd thfsplus tntfs(P) tfat(P) ext2 ext4 jbd2 crc16 ext3 jbd mbcache usb_storage sg sd_mod scsi_wait_scan scsi_mod usbcore jffs2 zlib_deflate nf_nat_pptp nf_conntrack_pptp nf_nat_proto_gre n
Mar  5 20:01:36 kernel: [<c0044000>] (unwind_backtrace+0x0/0xf8) from [<c02c7524>] (schedule+0x434/0x75c)
Mar  5 20:01:36 kernel: [<c02c7524>] (schedule+0x434/0x75c) from [<c02c7c3c>] (schedule_timeout+0x130/0x1c0)
Mar  5 20:01:36 kernel: [<c02c7c3c>] (schedule_timeout+0x130/0x1c0) from [<c02c7acc>] (io_schedule_timeout+0x5c/0x84)
Mar  5 20:01:36 kernel: [<c02c7acc>] (io_schedule_timeout+0x5c/0x84) from [<c00aa610>] (congestion_wait+0x74/0x94)
Mar  5 20:01:36 kernel: [<c00aa610>] (congestion_wait+0x74/0x94) from [<c00a46a4>] (try_to_free_pages+0x2ac/0x33c)
Mar  5 20:01:36 kernel: [<c00a46a4>] (try_to_free_pages+0x2ac/0x33c) from [<c009de88>] (__alloc_pages_nodemask+0x31c/0x698)
Mar  5 20:01:36 kernel: [<c009de88>] (__alloc_pages_nodemask+0x31c/0x698) from [<c0371964>] (__slab_alloc+0x1b4/0x814)
Mar  5 20:01:36 kernel: [<c0371964>] (__slab_alloc+0x1b4/0x814) from [<c00c3f20>] (__kmalloc+0xf4/0x104)
Mar  5 20:01:36 kernel: [<c00c3f20>] (__kmalloc+0xf4/0x104) from [<c01b6fa0>] (_nflash_mtd_read+0x9c/0x3a0)
Mar  5 20:01:36 kernel: [<c01b6fa0>] (_nflash_mtd_read+0x9c/0x3a0) from [<c01b7788>] (nflash_mtd_read+0x4c/0x64)
Mar  5 20:01:36 kernel: [<c01b7788>] (nflash_mtd_read+0x4c/0x64) from [<c019a3fc>] (part_read+0x64/0xe8)
Mar  5 20:01:36 kernel: [<c019a3fc>] (part_read+0x64/0xe8) from [<c019d658>] (mtdblock_readsect+0x40/0x114)
Mar  5 20:01:36 kernel: [<c019d658>] (mtdblock_readsect+0x40/0x114) from [<c019ce58>] (mtd_blktrans_thread+0x204/0x294)
Mar  5 20:01:36 kernel: [<c019ce58>] (mtd_blktrans_thread+0x204/0x294) from [<c007a0c4>] (kthread+0x88/0x90)
Mar  5 20:01:36 kernel: [<c007a0c4>] (kthread+0x88/0x90) from [<c003eb8c>] (kernel_thread_exit+0x0/0x8)

joegreat · Mar 5, 2015

List of events during the last two days (after last reboot):

Code:

Mar  4 20:52:35 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  4 20:52:36 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  4 21:02:56 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  4 21:02:56 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  4 21:36:20 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  4 22:31:52 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  4 23:09:56 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  4 23:09:57 kernel: BUG: scheduling while atomic: mtdblock3/254/0x00000002
Mar  4 23:13:08 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  4 23:13:09 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  4 23:19:28 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  4 23:57:56 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  4 23:57:56 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 00:12:56 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 00:28:55 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 00:48:25 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 00:58:55 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 01:22:25 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 01:48:56 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 01:48:57 kernel: BUG: scheduling while atomic: mtdblock3/254/0x00000002
Mar  5 02:39:51 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 02:41:19 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 02:44:00 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 03:06:00 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 03:21:30 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 03:30:44 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 03:38:30 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 03:56:05 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 03:56:48 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 03:56:48 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 03:56:49 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 04:19:38 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 04:38:59 kernel: BUG: scheduling while atomic: mtdblock3/254/0x00000002
Mar  5 04:54:28 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 05:05:24 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 05:08:06 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 05:08:06 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 05:27:04 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 05:34:00 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 05:54:35 kernel: BUG: scheduling while atomic: mtdblock3/254/0x00000002
Mar  5 06:14:57 kernel: BUG: scheduling while atomic: mtdblock3/254/0x00000002
Mar  5 06:53:10 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 06:55:34 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 06:55:35 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 07:04:18 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 07:50:07 kernel: BUG: scheduling while atomic: mtdblock3/254/0x00000002
Mar  5 08:08:35 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 08:08:35 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 11:12:33 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 12:40:05 kernel: BUG: scheduling while atomic: mtdblock3/254/0x00000002
Mar  5 13:53:35 kernel: BUG: scheduling while atomic: mtdblock3/254/0x40000002
Mar  5 18:06:28 kernel: BUG: scheduling while atomic: mtdblock3/254/0x00000002
Mar  5 20:01:35 kernel: BUG: scheduling while atomic: mtdblock3/254/0x00000002
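
A list like the one above is easier to compare between test runs when it is tallied per day and per trailing hex value (the last part of "mtdblock3/254/0x..."). The following is only a minimal sketch: the function name is made up for illustration, the default log path /tmp/syslog.log is an assumption, and the line format is assumed to be exactly the one shown above.

```shell
#!/bin/sh
# summarize_atomic_bugs is a hypothetical helper name.
# The default log path is an assumption; pass the real path as first argument.
summarize_atomic_bugs() {
    log=${1:-/tmp/syslog.log}
    # Keep only the BUG lines, then count them per "month day hexvalue".
    grep 'BUG: scheduling while atomic' "$log" |
        awk '{
                 # $NF looks like mtdblock3/254/0x40000002; keep the last part.
                 n = split($NF, part, "/")
                 count[$1 " " $2 " " part[n]]++
             }
             END { for (k in count) print k, count[k] }' |
        sort
}
```

Calling `summarize_atomic_bugs /path/to/syslog.log` prints one line per day and value, followed by the number of occurrences.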

john9527 · Mar 5, 2015

Two possibilities that come to mind...
(1) Your flash memory is starting to go bad, and when you switched over to Entware-ARM you started to access a bad cell where maybe you weren't before.
(2) There's a bug in Transmission where it's trying to read from somewhere it's not supposed to. You might try altering the Transmission runtime parameters and see if it changes the error.

Nullity · Mar 5, 2015

Have you tried reformatting the JFFS partition?

joegreat · Mar 6, 2015

Hi,

Nullity said:
Have you tried reformatting the JFFS partition?

Hmm, interesting... I was trying to delete the JFFS partition by reformatting it, but it did not work (see posting here). Does this mean anything? Maybe I have an issue with my flash memory...

john9527 said:
Two possibilities that come to mind...
(1) Your flash memory is starting to go bad, and when you switched over to Entware-ARM you started to access a bad cell where maybe you weren't before.

Is there a way to test the flash memory, by reading all of it and getting "good error or success diag messages"?

With kind regards
Joe

john9527 · Mar 6, 2015

joegreat said:
Hi,

Hmm, interesting... I was trying to delete the JFFS partition by reformatting it, but it did not work (see posting here). Does this mean anything? Maybe I have an issue with my flash memory...

It may be the same quirk as when you enable jffs......try rebooting twice without changing settings after you disable jffs.

Is there a way to test the flash memory, by reading all of it and getting "good error or success diag messages"?

With kind regards
Joe

There's a basic test run during boot.....do a search for 'Bad eraseblock' in the syslog. A couple are OK (I've had two since I got the router)...but if you see a lot, it's not a good sign.
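
The search suggested above can be sketched as follows. Because the same bad block is reported again on every boot, the sketch lists each distinct block only once. The function name is hypothetical and the default log path /tmp/syslog.log is an assumption; the message format is taken from the kernel line quoted in this thread.

```shell
#!/bin/sh
# list_bad_eraseblocks is a hypothetical helper name.
# The default log path is an assumption; pass the real path as first argument.
list_bad_eraseblocks() {
    log=${1:-/tmp/syslog.log}
    # Extract just the "Bad eraseblock N at 0x..." part and drop duplicates.
    sed -n 's/.*\(Bad eraseblock [0-9]* at 0x[0-9a-fA-F]*\).*/\1/p' "$log" | sort -u
}
```

Piping the result through `wc -l` gives the number of distinct bad blocks.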

joegreat · Mar 6, 2015

john9527 said:
It may be the same quirk as when you enable jffs......try rebooting twice without changing settings after you disable jffs.

Ahh, yes! I forgot about this. Will re-try later!

john9527 said:
There's a basic test run during boot.....do a search for 'Bad eraseblock' in the syslog. A couple are OK (I've had two since I got the router)...but if you see a lot, it's not a good sign.

OK. I have only one, but recurring, entry in the recent syslogs:

Code:

Jan  1 01:00:13 kernel: Bad eraseblock 664 at 0x000005300000

On the other hand I have found two strange JFFS related entries:

Code:

Jan  1 01:00:13 kernel: jffs2_scan_eraseblock(): Node at 0x00ffcff8 {0x1985, 0xe001, 0x00000036) has invalid CRC 0xffffffff (calculated 0xdf55e1a4)
Jan  1 01:00:13 kernel: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00ffcffc: 0x0036 instead
Jan  1 01:00:13 kernel: jffs2_scan_eraseblock(): Node at 0x00ffcff8 {0x1985, 0xe001, 0x00000036) has invalid CRC 0xffffffff (calculated 0xdf55e1a4)
Jan  1 01:00:13 kernel: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00ffcffc: 0x0036 instead

Will check later how old these entries are (I have a backup folder for syslogs with many files in it).

With kind regards
Joe

john9527 · Mar 6, 2015

The first error is a common bad cell....I would expect you to find it as far as you go back.

The second set of errors are more troubling and could be an indicator that the flash is weak/going bad.

One thing you might try....do you have another power adapter you could try? (of course, check that the specs are the same) A flakey power supply can cause all sorts of weird symptoms.

joegreat · Mar 7, 2015

john9527 said:
One thing you might try....do you have another power adapter you could try? (of course, check that the specs are the same) A flakey power supply can cause all sorts of weird symptoms.

Hi,

I am still trying to get rid of the "kernel: BUG: scheduling while atomic" issue: Looks like I am making progress...
During the day I had only one occurrence after I increased the reserved free memory to 32 MBytes, and now I am testing with 48 MBytes - maybe this will make it go away.

The parameter "min_free_kbytes" is set by the init-start script:

Code:

echo 49152 > /proc/sys/vm/min_free_kbytes
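
The value is given in kilobytes, so 48 MBytes corresponds to 48 * 1024 = 49152. A tiny sketch of the conversion, with a made-up helper name; the commented lines only illustrate how it could be used on the router:

```shell
#!/bin/sh
# mb_to_kb is a hypothetical helper: convert MBytes to the kByte value
# expected by /proc/sys/vm/min_free_kbytes.
mb_to_kb() {
    echo $(( $1 * 1024 ))
}

# Possible use on the router (needs root, therefore left commented out):
# mb_to_kb 48 > /proc/sys/vm/min_free_kbytes
# cat /proc/sys/vm/min_free_kbytes
```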

Will post results tomorrow after having Transmission run for several hours.

The power supply hint will also be tested tomorrow: I have a universal power supply which can be configured regarding voltage and polarity and has several plugs.

Thank you for your kind support!

With kind regards
Joe

joegreat · Mar 9, 2015

joegreat said:
Hi,
I am still trying to get rid of the "kernel: BUG: scheduling while atomic" issue: Looks like I am making progress...
The parameter "min_free_kbytes" is set by the init-start script:

Code:

echo 49152 > /proc/sys/vm/min_free_kbytes

Hi,

After some improvement with the min_free_kbytes setting above I searched further (with my friend Google) and got the impression that the error is caused by a simple memory shortage!

My theory: It looks like Transmission (via heavy I/O) causes the buffer and cache memory to grow until all memory is used up and none is left for other processes - or it is not freed up inside a scheduled function (see error message above).
If you want more details look here and here for kmalloc with the options GFP_KERNEL vs. GFP_ATOMIC.

To release the always-full memory, I searched further on cache and buffer memory usage and tried to understand the parameters for memory handling and how to improve it (see init-start script below). I also did some testing with swappiness, but it did not improve the situation.

The result is that the cache memory is now flushed much earlier (more memory is freed, and sooner) and I think other tasks get a better chance to grab (or release) the memory they need...
With the last tweaks Transmission has now been running for more than 10 hours without a "scheduling while atomic" error!

That's a really big improvement over the 1-2 hours between errors before!!!

The network limits in Transmission are currently set to 25 MBit down and 5 MBit up - during the night they rise to 50 MBit down and 10 MBit up - let's see if the setup (and my theory) survives the additional stress during the night...

With kind regards
Joe

Code:

#!/bin/sh
# init-start: tune network buffers, memory handling and I/O schedulers at boot
SCRIPT=$(basename "$0")
SCRLOG=/tmp/$SCRIPT.log
RC=0    # set to 1 as soon as one of the settings below cannot be written
/usr/bin/logger -t "START_$SCRIPT" "started [$*]"
touch "$SCRLOG"
NOW=$(date +"%Y-%m-%d %H:%M:%S")
echo "$NOW START_$SCRIPT started [$*]" >> "$SCRLOG"
# exec 3>&1 4>&2 >$SCRLOG 2>&1
#
# Receive and send buffers for Transmission (added 1/0.5 MByte over low limit)
echo 5242880 > /proc/sys/net/core/rmem_max || RC=1
echo 1572864 > /proc/sys/net/core/wmem_max || RC=1
# Memory and Cache/Buffer handling changes to manage the memory better
echo 49152 > /proc/sys/vm/min_free_kbytes || RC=1
echo 1 > /proc/sys/vm/overcommit_memory || RC=1
echo 5 > /proc/sys/vm/dirty_background_ratio || RC=1
# Note: overcommit_ratio only takes effect when overcommit_memory is 2
echo 384 > /proc/sys/vm/overcommit_ratio || RC=1
echo 10 > /proc/sys/vm/dirty_ratio || RC=1
echo 500 > /proc/sys/vm/vfs_cache_pressure || RC=1
# IO Scheduler change on all disks
echo deadline > /sys/block/sda/queue/scheduler || RC=1
echo deadline > /sys/block/mtdblock3/queue/scheduler || RC=1
echo deadline > /sys/block/sdb/queue/scheduler || RC=1
#
# Report the collected result (the old check tested the exit status of "date")
NOW=$(date +"%Y-%m-%d %H:%M:%S")
if [ "$RC" -ne 0 ]
then
echo "$NOW Error in init-start execution! Script: $0" >> "$SCRLOG"
else
echo "$NOW Init-start execution OK. Script: $0" >> "$SCRLOG"
fi
/usr/bin/logger -t "STOP_$SCRIPT" "return code $RC"
# exec 1>&3 2>&4
exit $RC
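
To confirm that the values written by an init-start script like this were actually accepted by the kernel, they can be read back after boot. A minimal sketch under stated assumptions: the function name is made up, and the optional first argument (a root prefix, empty on the router) exists only so the function can be tried against a fake directory tree.

```shell
#!/bin/sh
# print_vm_settings is a hypothetical helper name.
# $1 is an optional root prefix (assumption, for dry runs); leave it empty
# on the router so the real /proc/sys files are read.
print_vm_settings() {
    root=${1:-}
    for f in net/core/rmem_max net/core/wmem_max \
             vm/min_free_kbytes vm/overcommit_memory vm/overcommit_ratio \
             vm/dirty_background_ratio vm/dirty_ratio vm/vfs_cache_pressure; do
        path="$root/proc/sys/$f"
        if [ -r "$path" ]; then
            echo "$f = $(cat "$path")"
        else
            echo "$f = (not readable)"
        fi
    done
}
```

Running `print_vm_settings` without arguments prints one "name = value" line per tunable.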

john9527 · Mar 9, 2015

Do you have a swap file set up?

Scroll down a bit in this post and you'll find instructions on setting up a swap file (edit the directory as needed).....

http://forums.smallnetbuilder.com/showpost.php?p=52894&postcount=1

joegreat · Mar 9, 2015

john9527 said:
Do you have a swap file set up?

Yes, yes - even two swap partitions - on each USB device one...

The router swaps max. 20 MByte during the day - as said: changes in swappiness did not make a difference.

chief@RT-AC68U:/tmp/home/root# free
             total       used       free     shared    buffers
Mem:        255744     222040      33704          0       3052
-/+ buffers:           218988      36756
Swap:       260300      19880     240420

With kind regards
Joe

joegreat · Mar 11, 2015

Hi,

The theory from the posting above is proven, but slightly differently than expected:

The settings shown in the init-start script lead to a nicely bumpy memory curve (memory is used and freed up frequently).
But under heavy load (higher bandwidth with many torrents in parallel) the memory is still fully utilized and not freed up fast enough.
The fully utilized memory leads to rare "kernel: BUG: scheduling while atomic" errors (the last test showed three occurrences in 24 hours).

The real (and maybe final) solution seems to be rather simple:
Reduce the bandwidth for upload and download to a level where the memory is NOT fully utilized!
In my case that is an upload limit of ~7 MBit and (currently) no downloads - to be tested further.

With the limited upload bandwidth I can have 30-40 torrents seeding with up to 200 peers connected with no problems.
The memory curve has nice ups and downs, but never hits the limit of the min_free_kbytes=48 MBytes!
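
Whether the free memory stays above that limit can also be checked from a script instead of a graph. The following is a rough sketch, not a tested monitoring tool: both function names are made up, and it assumes the usual "MemFree:   <number> kB" line in /proc/meminfo.

```shell
#!/bin/sh
# free_kb and check_free_memory are hypothetical helper names.
# The optional meminfo path argument exists so the check can be tried
# against a saved copy of /proc/meminfo.
free_kb() {
    awk '/^MemFree:/ { print $2 }' "${1:-/proc/meminfo}"
}

check_free_memory() {
    limit_kb=$1
    meminfo=${2:-/proc/meminfo}
    free=$(free_kb "$meminfo")
    if [ "$free" -lt "$limit_kb" ]; then
        echo "LOW: MemFree ${free} kB is below ${limit_kb} kB"
        return 1
    fi
    echo "OK: MemFree ${free} kB"
}
```

For example, `check_free_memory 49152` compares the current MemFree value against the 48 MByte limit; the output could be passed to logger from a periodic job.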

I am more and more convinced that this is a bug in the firmware (most likely the kernel in the area of GFP_KERNEL vs. GFP_ATOMIC) but I am not able to prove it.

With kind regards
Joe

PS.: The issue also leads to corrupted parts in the downloaded files - now I could finally safely download a 350 GByte sized file without errors (and download speed up to 40 MBit)!

PPS.: The statistics below show a pretty relaxed router - with only 3.5 MBit upload during the day (nightly doubled), but still 45 torrents seeding - with used/free memory as it should be...

Code:

----total-cpu-usage---- ------memory-usage----- ----swap--- -dsk/total- --io/total- -net/total-
usr sys idl wai hiq siq| used  buff  cach  free| used  free| read  writ| read  writ| recv  send
  2   1  81  14   0   2|75.2M 3080k 78.5M 93.0M|3380k  251M|1824k    0 |21.8     0 |  52k  447k
  3   1  82  12   0   2|75.5M 3224k 86.1M 85.1M|3380k  251M|1583k    0 |17.0     0 |  41k  441k
  2   1  85  10   0   2|75.6M 3256k 93.9M 77.1M|3380k  251M|1610k 2458B|16.4  0.40 |  41k  440k
  5   5  72  17   0   2|83.7M 5676k 99.1M 61.4M|3380k  251M|1298k    0 |61.4     0 |  40k  358k
  3   5  44  47   0   2|76.3M 4988k 84.6M 84.0M|3380k  251M|1066k  276k|20.4  2.00 |  44k  446k
  2   1  37  58   0   2|76.1M 4896k 72.9M 96.0M|3380k  251M|1732k    0 |17.0     0 |  41k  429k
  3   1  33  62   0   2|76.1M 4876k 64.8M  104M|3380k  251M|2265k    0 |23.4     0 |  40k  434k
  2   1  73  23   0   1|76.0M 4868k 66.9M  102M|3380k  251M|1261k    0 |17.8     0 |  40k  433k
  3   1  81  14   0   2|76.0M 4884k 73.2M 95.8M|3380k  251M|1281k  819B|20.2  0.20 |  41k  446k
  3   6  74  15   0   2|76.2M 5252k 82.7M 85.7M|3376k  251M|1963k    0 |21.8     0 |  42k  422k
  4   5  75  14   0   2|76.2M 5660k 96.2M 71.8M|3376k  251M|2509k   24k|36.6  3.40 |  69k  558k
  2   1  85  11   0   2|76.2M 5680k  100M 68.0M|3376k  251M| 943k    0 |14.8     0 |  52k  440k
  2   5  79  12   0   2|76.2M 5968k  106M 61.3M|3376k  251M|1283k    0 |18.6     0 |  54k  443k
  2   1  37  58   0   2|76.4M 5716k 94.8M 72.9M|3376k  251M|1736k 2458B|20.6  0.40 |  56k  454k
  2   1  33  62   0   1|76.4M 5200k 84.4M 83.8M|3376k  251M|1572k 3277B|19.0  0.40 |  38k  435k
  2   1  32  63   0   1|76.5M 5184k 75.3M 92.9M|3376k  251M|1327k    0 |19.8     0 |  39k  442k
  2   1  52  43   0   1|76.4M 4812k 70.6M 98.0M|3376k  251M|1298k    0 |18.6     0 |  38k  441k
  2   1  81  14   0   1|76.4M 4836k 78.0M 90.6M|3376k  251M|1510k    0 |20.0     0 |  43k  444k
  2   1  82  14   0   1|76.3M 4856k 86.9M 81.7M|3376k  251M|1838k    0 |18.8     0 |  42k  445k

Nullity · Mar 11, 2015

Thanks for sharing your results.

Just wondering, why not set up some old computer as your dedicated seedbox/NAS? Putting your gateway router under such a heavy system load seems overly risky.

joegreat · Mar 12, 2015

Nullity said:
Just wondering, why not set up some old computer as your dedicated seedbox/NAS? Putting your gateway router under such a heavy system load seems overly risky.

Hi,

I simply do not want to have another 24x7 system running for this (small task).

As stated above: The problem looks to me like an error in memory handling - maybe somebody picks it up and is able to fix it...

By the way: On the N66U router Transmission runs completely stable (with even higher transfer rates but on a single core CPU) - which is another indication that we have a problem that needs attention!

With kind regards
Joe

joegreat · Apr 18, 2015

Hi,

No issues for quite a long time, but yesterday it came back: one "scheduling while atomic: mtdblock3" during the day...

Looking at the traceback, I am still convinced that this is a memory handling error:

Apr 17 09:38:36 kernel: [<c0044000>] (unwind_backtrace+0x0/0xf8) from [<c02c796c>] (schedule+0x434/0x75c)
Apr 17 09:38:36 kernel: [<c02c796c>] (schedule+0x434/0x75c) from [<c005bb08>] (__cond_resched+0x24/0x34)
Apr 17 09:38:36 kernel: [<c005bb08>] (__cond_resched+0x24/0x34) from [<c02c7db0>] (_cond_resched+0x34/0x44)
Apr 17 09:38:36 kernel: [<c02c7db0>] (_cond_resched+0x34/0x44) from [<c00a314c>] (shrink_page_list+0x48/0x728)
Apr 17 09:38:36 kernel: [<c00a314c>] (shrink_page_list+0x48/0x728) from [<c00a3b28>] (shrink_inactive_list+0x104/0x1dc)
Apr 17 09:38:36 kernel: [<c00a3b28>] (shrink_inactive_list+0x104/0x1dc) from [<c00a3fa4>] (shrink_zone+0x3a4/0x418)
Apr 17 09:38:36 kernel: [<c00a3fa4>] (shrink_zone+0x3a4/0x418) from [<c00a46b0>] (try_to_free_pages+0x150/0x33c)
Apr 17 09:38:36 kernel: [<c00a46b0>] (try_to_free_pages+0x150/0x33c) from [<c009dff0>] (__alloc_pages_nodemask+0x31c/0x698)
Apr 17 09:38:36 kernel: [<c009dff0>] (__alloc_pages_nodemask+0x31c/0x698) from [<c0371fc4>] (__slab_alloc+0x1b4/0x814)
Apr 17 09:38:36 kernel: [<c0371fc4>] (__slab_alloc+0x1b4/0x814) from [<c00c4088>] (__kmalloc+0xf4/0x104)
Apr 17 09:38:36 kernel: [<c00c4088>] (__kmalloc+0xf4/0x104) from [<c01b7100>] (_nflash_mtd_read+0x9c/0x3a0)
Apr 17 09:38:36 kernel: [<c01b7100>] (_nflash_mtd_read+0x9c/0x3a0) from [<c01b78e8>] (nflash_mtd_read+0x4c/0x64)
Apr 17 09:38:36 kernel: [<c01b78e8>] (nflash_mtd_read+0x4c/0x64) from [<c019a55c>] (part_read+0x64/0xe8)
Apr 17 09:38:36 kernel: [<c019a55c>] (part_read+0x64/0xe8) from [<c019d7b8>] (mtdblock_readsect+0x40/0x114)
Apr 17 09:38:36 kernel: [<c019d7b8>] (mtdblock_readsect+0x40/0x114) from [<c019cfb8>] (mtd_blktrans_thread+0x204/0x294)
Apr 17 09:38:37 kernel: [<c019cfb8>] (mtd_blktrans_thread+0x204/0x294) from [<c007a22c>] (kthread+0x88/0x90)
Apr 17 09:38:37 kernel: [<c007a22c>] (kthread+0x88/0x90) from [<c003eb8c>] (kernel_thread_exit+0x0/0x8)
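
A traceback like this is easier to read when it is reduced to the bare call chain, innermost function first. Read from bottom to top, the chain above goes from mtd_blktrans_thread through _nflash_mtd_read and __kmalloc into page reclaim and finally schedule, which appears consistent with the GFP_KERNEL vs. GFP_ATOMIC theory, although that remains unproven. The sketch below only extracts the function names; the helper name is made up and the line format is assumed to match the trace above.

```shell
#!/bin/sh
# call_chain is a hypothetical helper name.
# It prints the first function of every "(func+0x../0x..) from" trace line.
call_chain() {
    sed -n 's|.*kernel: \[<[0-9a-f]*>] (\([A-Za-z0-9_.]*\)+0x[0-9a-f]*/0x[0-9a-f]*) from .*|\1|p' "$1"
}
```

Calling `call_chain saved_trace.log` on a file containing the trace prints one function name per line, starting with unwind_backtrace.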

Still nobody who can help to debug this?

With kind regards
Joe

Nullity · Apr 18, 2015

Stop abusing your dang router!

bROTHER · Apr 8, 2020

Hi,
Old thread but same problem.
I'm also suffering from this. RT-AC68U with Merlin 384.5, Transmission and swap partition.
I just lowered download / upload from 100 Mbps to 13 Mbps in successive stages and limited simultaneous downloads.
The problem appears again when I try to copy the downloaded files to another place; then the router has a CPU load average of around 3 and the errors appear again until the copy finishes. It is very frustrating.
I'm now searching for a way to slow down the copy process...
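
One possible way to slow down a copy with nothing but standard tools is to copy in chunks and pause between them. This is only a sketch and it is not known whether it avoids the error: the function name is made up, `stat -c %s` is assumed to be available on the router, and the chunk size and pause are arbitrary starting values.

```shell
#!/bin/sh
# throttled_copy is a hypothetical helper: copy SRC to DST in chunks of
# CHUNK_KB kilobytes (default 1024) and sleep PAUSE seconds (default 1)
# between chunks.
throttled_copy() {
    src=$1
    dst=$2
    chunk_kb=${3:-1024}
    pause=${4:-1}
    # Assumption: stat supports "-c %s" (file size in bytes).
    size=$(stat -c %s "$src") || return 1
    # Create or empty the destination file.
    true > "$dst" || return 1
    chunk=0
    while [ $(( chunk * chunk_kb * 1024 )) -lt "$size" ]; do
        # Copy one chunk to the same offset in the destination.
        dd if="$src" of="$dst" bs=1024 count="$chunk_kb" \
           skip=$(( chunk * chunk_kb )) seek=$(( chunk * chunk_kb )) \
           conv=notrunc 2>/dev/null || return 1
        chunk=$(( chunk + 1 ))
        if [ $(( chunk * chunk_kb * 1024 )) -lt "$size" ]; then
            sleep "$pause"
        fi
    done
    return 0
}
```

For example, `throttled_copy big.file /mnt/target/big.file 4096 1` would copy 4 MBytes, wait one second, and repeat; both paths here are placeholders.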

Regards.

L&LD · Apr 8, 2020

384.5? Why?

You should flash the latest stable 384.16_0 release final.

After a full reset and minimal and manual configuration, this may already be fixed.

bROTHER · Apr 9, 2020

L&LD said:
384.5? Why?

You should flash the latest stable 384.16_0 release final.

After a full reset and minimal and manual configuration, this may already be fixed.

Oops, I meant 384.15. And probably 384.16 within a few hours now that you reminded me to see what's new ...
Regards
