What's new

Locking/unlocking and page allocation failure errors

  • SNBForums Code of Conduct

    SNBForums is a community for everyone, no matter what their level of experience.

    Please be tolerant and patient of others, especially newcomers. We are all here to share and learn!

    The rules are simple: Be patient, be nice, be helpful or be gone!

TonyK132

Senior Member
I was looking through my logs, and I found these 2 events that seem extraordinary.

May 31 09:03:06 RT-AC86U-1BD0 amas_lib[1198]: Error locking /var/lock/clientlist.lock: 0 Already locked
May 31 09:03:06 RT-AC86U-1BD0 amas_lib[1198]: Error locking /var/lock/clientlist.lock: 0 Already locked
May 31 09:03:06 RT-AC86U-1BD0 amas_lib[1198]: Error unlocking -1: 9 Bad file descriptor
May 31 09:03:06 RT-AC86U-1BD0 amas_lib[1198]: Error unlocking -1: 9 Bad file descriptor

Then later.

Looks like there were 3 instances of this, 2-3 secs apart. Could both of these indicate that my memory stick is going bad?

May 31 11:29:11 RT-AC86U-1BD0 kernel: httpds: page allocation failure: order:5, mode:0x2040d0
May 31 11:29:11 RT-AC86U-1BD0 kernel: CPU: 1 PID: 19531 Comm: httpds Tainted: P O 4.1.27 #2
May 31 11:29:11 RT-AC86U-1BD0 kernel: Call trace:
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc0000876d8>] dump_backtrace+0x0/0x150
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc00008783c>] show_stack+0x14/0x20
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc00050873c>] dump_stack+0x90/0xb0
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc000101e28>] warn_alloc_failed+0xd8/0x120
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc0001044ec>] __alloc_pages_nodemask+0x404/0x6a0
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc000138ce0>] cache_alloc_refill+0x350/0x610
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc000139230>] __kmalloc+0xb8/0x100
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffbffc00025c>] wlcsm_nl_rcv_skb+0x25c/0x3c0 [wlcsm]
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc000401c50>] netlink_unicast+0x150/0x228
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc000402118>] netlink_sendmsg+0x2f0/0x350
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc0003a1ee0>] sock_sendmsg+0x18/0x58
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc0003a2a5c>] ___sys_sendmsg+0x26c/0x280
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc0003a5320>] __sys_sendmsg+0x40/0x80
May 31 11:29:11 RT-AC86U-1BD0 kernel: [<ffffffc0003e12f8>] compat_SyS_sendmsg+0x10/0x18
May 31 11:29:11 RT-AC86U-1BD0 kernel: Mem-Info:
May 31 11:29:11 RT-AC86U-1BD0 kernel: active_anon:1522 inactive_anon:1573 isolated_anon:0
May 31 11:29:11 RT-AC86U-1BD0 kernel: active_file:1469 inactive_file:778 isolated_file:0
May 31 11:29:11 RT-AC86U-1BD0 kernel: unevictable:0 dirty:5 writeback:0 unstable:0
May 31 11:29:11 RT-AC86U-1BD0 kernel: slab_reclaimable:807 slab_unreclaimable:67517
May 31 11:29:11 RT-AC86U-1BD0 kernel: mapped:1918 shmem:17 pagetables:552 bounce:0
May 31 11:29:11 RT-AC86U-1BD0 kernel: free:7941 free_pcp:3 free_cma:0
May 31 11:29:11 RT-AC86U-1BD0 kernel: DMA free:31764kB min:20480kB low:25600kB high:30720kB active_anon:6088kB inactive_anon:6292kB active_file:5876kB inactive_file:3112kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:495616kB managed:440420kB mlocked:0kB dirty:20kB writeback:0kB mapped:7672kB shmem:68kB slab_reclaimable:3228kB slab_unreclaimable:270068kB kernel_stack:2576kB pagetables:2208kB unstable:0kB bounce:0kB free_pcp:12kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:8 all_unreclaimable? no
May 31 11:29:11 RT-AC86U-1BD0 kernel: lowmem_reserve[]: 0 0 0
May 31 11:29:11 RT-AC86U-1BD0 kernel: DMA: 1499*4kB (UEM) 1565*8kB (UEM) 643*16kB (UEM) 55*32kB (UM) 1*64kB (R) 1*128kB (R) 0*256kB 0*512kB 1*1024kB (R) 0*2048kB 0*4096kB = 31780kB
May 31 11:29:11 RT-AC86U-1BD0 kernel: 3878 total pagecache pages
May 31 11:29:11 RT-AC86U-1BD0 kernel: 1556 pages in swap cache
May 31 11:29:11 RT-AC86U-1BD0 kernel: Swap cache stats: add 1065931, delete 1064375, find 15010041/15614179
May 31 11:29:11 RT-AC86U-1BD0 kernel: Free swap = 2051076kB
May 31 11:29:11 RT-AC86U-1BD0 kernel: Total swap = 2097148kB
May 31 11:29:11 RT-AC86U-1BD0 kernel: 123904 pages RAM
May 31 11:29:11 RT-AC86U-1BD0 kernel: 0 pages HighMem/MovableOnly
May 31 11:29:11 RT-AC86U-1BD0 kernel: 13799 pages reserved
May 31 11:29:14 RT-AC86U-1BD0 kernel: httpds: page allocation failure: order:5, mode:0x2040d0
May 31 11:29:14 RT-AC86U-1BD0 kernel: CPU: 1 PID: 19531 Comm: httpds Tainted: P O 4.1.27 #2
May 31 11:29:14 RT-AC86U-1BD0 kernel: Call trace:
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc0000876d8>] dump_backtrace+0x0/0x150
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc00008783c>] show_stack+0x14/0x20
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc00050873c>] dump_stack+0x90/0xb0
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc000101e28>] warn_alloc_failed+0xd8/0x120
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc0001044ec>] __alloc_pages_nodemask+0x404/0x6a0
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc000138ce0>] cache_alloc_refill+0x350/0x610
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc000139230>] __kmalloc+0xb8/0x100
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffbffc00025c>] wlcsm_nl_rcv_skb+0x25c/0x3c0 [wlcsm]
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc000401c50>] netlink_unicast+0x150/0x228
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc000402118>] netlink_sendmsg+0x2f0/0x350
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc0003a1ee0>] sock_sendmsg+0x18/0x58
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc0003a2a5c>] ___sys_sendmsg+0x26c/0x280
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc0003a5320>] __sys_sendmsg+0x40/0x80
May 31 11:29:14 RT-AC86U-1BD0 kernel: [<ffffffc0003e12f8>] compat_SyS_sendmsg+0x10/0x18
May 31 11:29:14 RT-AC86U-1BD0 kernel: Mem-Info:
May 31 11:29:14 RT-AC86U-1BD0 kernel: active_anon:1473 inactive_anon:1506 isolated_anon:0
May 31 11:29:14 RT-AC86U-1BD0 kernel: active_file:1568 inactive_file:981 isolated_file:0
May 31 11:29:14 RT-AC86U-1BD0 kernel: unevictable:0 dirty:13 writeback:0 unstable:0
May 31 11:29:14 RT-AC86U-1BD0 kernel: slab_reclaimable:810 slab_unreclaimable:67568
May 31 11:29:14 RT-AC86U-1BD0 kernel: mapped:1975 shmem:38 pagetables:552 bounce:0
May 31 11:29:14 RT-AC86U-1BD0 kernel: free:7700 free_pcp:39 free_cma:0
May 31 11:29:14 RT-AC86U-1BD0 kernel: DMA free:30800kB min:20480kB low:25600kB high:30720kB active_anon:5892kB inactive_anon:6024kB active_file:6272kB inactive_file:3924kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:495616kB managed:440420kB mlocked:0kB dirty:52kB writeback:0kB mapped:7900kB shmem:152kB slab_reclaimable:3240kB slab_unreclaimable:270272kB kernel_stack:2592kB pagetables:2208kB unstable:0kB bounce:0kB free_pcp:156kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:364 all_unreclaimable? no
May 31 11:29:14 RT-AC86U-1BD0 kernel: lowmem_reserve[]: 0 0 0
May 31 11:29:14 RT-AC86U-1BD0 kernel: DMA: 1251*4kB (UEM) 1576*8kB (UEM) 639*16kB (UEM) 55*32kB (UM) 1*64kB (R) 1*128kB (R) 0*256kB 0*512kB 1*1024kB (R) 0*2048kB 0*4096kB = 30812kB
May 31 11:29:14 RT-AC86U-1BD0 kernel: 4105 total pagecache pages
May 31 11:29:14 RT-AC86U-1BD0 kernel: 1512 pages in swap cache
May 31 11:29:14 RT-AC86U-1BD0 kernel: Swap cache stats: add 1066253, delete 1064741, find 15010360/15614657
May 31 11:29:14 RT-AC86U-1BD0 kernel: Free swap = 2050728kB
May 31 11:29:14 RT-AC86U-1BD0 kernel: Total swap = 2097148kB
May 31 11:29:14 RT-AC86U-1BD0 kernel: 123904 pages RAM
May 31 11:29:14 RT-AC86U-1BD0 kernel: 0 pages HighMem/MovableOnly
May 31 11:29:14 RT-AC86U-1BD0 kernel: 13799 pages reserved

These was 1 more of these instances in the logs, but I ran out of space to show it.
 
I was looking through my logs, and I found these 2 events that seem extraordinary.

May 31 09:03:06 RT-AC86U-1BD0 amas_lib[1198]: Error locking /var/lock/clientlist.lock: 0 Already locked
May 31 09:03:06 RT-AC86U-1BD0 amas_lib[1198]: Error locking /var/lock/clientlist.lock: 0 Already locked
May 31 09:03:06 RT-AC86U-1BD0 amas_lib[1198]: Error unlocking -1: 9 Bad file descriptor
May 31 09:03:06 RT-AC86U-1BD0 amas_lib[1198]: Error unlocking -1: 9 Bad file descriptor

Don't know about these, possibly a bug, might be harmless or not, don't know.

Then later.

Looks like there were 3 instances of this, 2-3 secs apart. Could both of these indicate that my memory stick is going bad?

These traces look like kernel kmalloc() allocation failures.

The order indicates the size of the requested memory (or rather the size of the slab chunk that would be returned) and the failure says it can't be satisfied (obvious), it's been a long time since I looked into one of these so I don't know what memory size is being requested.

It could be because the kernel code is within a time critical section where it cannot wait so the allocation is tried and a flag is used to tell the kernel not to reclaim/reorganize the slab (or slub or whatever's being used) cache to satisfy the allocation which causes it to fail even if there would be enough memory around after a re-org. Or there simply might not be a suitable chunk of memory even after re-org. we can't know that. This can happen if the slab cache gets fragmented and re-org. can't free/merge enough memory perhaps due to a memory leak for example.

So it might not be your memory stick but I don't know how swap figures in this.
If you have a swap file on your memory stick it might be related to IO errors in some way but there's nothing here to indicate that.

The most likely cause is not enough available RAM.

Ian
 
Don't know about these, possibly a bug, might be harmless or not, don't know.



These traces look like kernel kmalloc() allocation failures.

The order indicates the size of the requested memory (or rather the size of the slab chunk that would be returned) and the failure says it can't be satisfied (obvious), it's been a long time since I looked into one of these so I don't know what memory size is being requested.

It could be because the kernel code is within a time critical section where it cannot wait so the allocation is tried and a flag is used to tell the kernel not to reclaim/reorganize the slab (or slub or whatever's being used) cache to satisfy the allocation which causes it to fail even if there would be enough memory around after a re-org. Or there simply might not be a suitable chunk of memory even after re-org. we can't know that. This can happen if the slab cache gets fragmented and re-org. can't free/merge enough memory perhaps due to a memory leak for example.

So it might not be your memory stick but I don't know how swap figures in this.
If you have a swap file on your memory stick it might be related to IO errors in some way but there's nothing here to indicate that.

The most likely cause is not enough available RAM.

Ian
Ian - Thanks for the good explanation. In order to debug this, if anyone needs any additional info, let me know.

I looked at my logs this morning. Here's something else I saw.

Jun 1 06:54:42 kernel: dcd[11720]: unhandled level 3 translation fault (11) at 0x00000000, esr 0x92000007
Jun 1 06:54:42 kernel: pgd = ffffffc017611000
Jun 1 06:54:42 kernel: [00000000] *pgd=000000001101c003, *pud=000000001101c003, *pmd=0000000014370003, *pte=0000000000000000
Jun 1 06:54:42 kernel: CPU: 1 PID: 11720 Comm: dcd Tainted: P O 4.1.27 #2
Jun 1 06:54:42 kernel: Hardware name: Broadcom-v8A (DT)
Jun 1 06:54:42 kernel: task: ffffffc0193daac0 ti: ffffffc01121c000 task.ti: ffffffc01121c000
Jun 1 06:54:42 kernel: PC is at 0xf6eb0f44
Jun 1 06:54:42 kernel: LR is at 0x1dc74
Jun 1 06:54:42 kernel: pc : [<00000000f6eb0f44>] lr : [<000000000001dc74>] pstate: 600e0010
Jun 1 06:54:42 kernel: sp : 00000000ff7f7d18
Jun 1 06:54:42 kernel: x12: 000000000009ff08
Jun 1 06:54:42 kernel: x11: 00000000f61ff024 x10: 00000000000a02ac
Jun 1 06:54:42 kernel: x9 : 00000000f61ffee0 x8 : 00000000000a0764
Jun 1 06:54:42 kernel: x7 : 00000000f61fff10 x6 : 00000000000a075e
Jun 1 06:54:42 kernel: x5 : 0000000000000000 x4 : 00000000f61ffec4
Jun 1 06:54:42 kernel: x3 : 0000000000000000 x2 : 0000000000000000
Jun 1 06:54:42 kernel: x1 : 000000000007c66c x0 : 0000000000000000

Here's another example. I actually see these regularly.

Jun 1 06:10:09 kernel: dcd[8141]: unhandled level 3 translation fault (11) at 0x00000000, esr 0x92000007
Jun 1 06:10:09 kernel: pgd = ffffffc015023000
Jun 1 06:10:09 kernel: [00000000] *pgd=0000000010d0a003, *pud=0000000010d0a003, *pmd=0000000011db7003, *pte=0000000000000000
Jun 1 06:10:09 kernel: CPU: 1 PID: 8141 Comm: dcd Tainted: P O 4.1.27 #2
Jun 1 06:10:09 kernel: Hardware name: Broadcom-v8A (DT)
Jun 1 06:10:09 kernel: task: ffffffc01e00e140 ti: ffffffc011264000 task.ti: ffffffc011264000
Jun 1 06:10:09 kernel: PC is at 0xf7552f44
Jun 1 06:10:09 kernel: LR is at 0x1dc74
Jun 1 06:10:09 kernel: pc : [<00000000f7552f44>] lr : [<000000000001dc74>] pstate: 600e0010
Jun 1 06:10:09 kernel: sp : 00000000ffd1c228
Jun 1 06:10:09 kernel: x12: 000000000009ff08
Jun 1 06:10:09 kernel: x11: 00000000f67ff024 x10: 00000000000a02ac
Jun 1 06:10:09 kernel: x9 : 00000000f67ffee0 x8 : 00000000000a0764
Jun 1 06:10:09 kernel: x7 : 00000000f67fff10 x6 : 00000000000a075e
Jun 1 06:10:09 kernel: x5 : 0000000000000000 x4 : 00000000f67ffec4
Jun 1 06:10:09 kernel: x3 : 0000000000000000 x2 : 0000000000000000
Jun 1 06:10:09 kernel: x1 : 000000000007c66c x0 : 0000000000000000

I also notice that Scribe is now not working. All the messages are in the System Messages tab with no messages in any of the other tabs. Guess I'll need to reboot.

I also just found that the system rebooted itself at 1:30am this morning, since it is now about 7am and the router has only been up for >5 hrs. Hmmm, what's that all about?

Before the reboot, I putty'ed into the router, did a cru l, and saw that almost none of the added scripts have their cron jobs registered. That explains Scribe not working.
 
I just rebooted. Looks like post-mount is not executing. Diversion got blown away. amtm does not think its there. Skynet just threw up a message that USB is not found. I guess that means something bad did happen to my memory stick.

Time to reinstall. But I just did it about a week ago.
 
I just rebooted. Looks like post-mount is not executing. Diversion got blown away. amtm does not think its there. Skynet just threw up a message that USB is not found. I guess that means something bad did happen to my memory stick.

Time to reinstall. But I just did it about a week ago.
I think I found the problem. I was stubbornly using the USB3 port for my flash drive. I'm thinking now that that was the root cause of all the problems, and that the flash drive itself is OK. I merely moved the flash from the USB3 port to the USB2 port, turned on the router, and all seems to be OK now. I'll see how long that lasts but I expect all that weirdness is behind me.
 
I think I found the problem. I was stubbornly using the USB3 port for my flash drive. I'm thinking now that that was the root cause of all the problems, and that the flash drive itself is OK. I merely moved the flash from the USB3 port to the USB2 port, turned on the router, and all seems to be OK now. I'll see how long that lasts but I expect all that weirdness is behind me.

Those errors look bad.

I'm not at all familiar with how the page tables work but have to think it could be a problem with the USB3 driver code, perhaps with how much RAM it wants, and the page table errors could be due to lack of RAM too, I'm just not sure about that.

Maybe you've got a bit too much running on there for the RAM you have to play with ...

Do you have a swap file configured?
Not sure swap on a USB device is a good idea either but there's no other choice.
 
Do you have a swap file configured?
Not sure swap on a USB device is a good idea either but there's no other choice.

Oh wait you have 2GB swap configured, ok.
I wonder if there's some problem with the buffer cache releasing pages when needed ... mmm ... really don't know ...
 

Latest threads

Sign Up For SNBForums Daily Digest

Get an update of what's new every day delivered to your mailbox. Sign up here!
Top