RT-AC68U - Interrupts only handled by core 0 & router slowness


paulmarsy

New Around Here
I have a strange issue that has spanned the last few builds (stable, beta, and now the latest alpha test build) where accessing the router is slow: SSH connections are prone to timing out and dropping, or pausing for 30+ seconds before suddenly becoming responsive again.

It doesn't appear to be affecting internet or general network (wired or wireless) usage, though I have things like QoS turned off to reduce the workload.

I believe the problem is caused by nearly all interrupts being handled by CPU0 instead of being evenly distributed:
[Screenshot: /proc/interrupts showing nearly all interrupt counts on CPU0]
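
For anyone who wants to compare on their own router, this is roughly how I'm pulling the per-CPU counts shown above; the IRQ numbers and device names will differ between builds:

Code:
# Second column is the count for CPU0, third column is the count for CPU1
cat /proc/interrupts

# Take two snapshots a few seconds apart to see which core the counts are growing on
cat /proc/interrupts; sleep 5; cat /proc/interrupts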


The smp_affinity for every IRQ is set to '3' (cat /proc/irq/179/smp_affinity returns 3). I can manually change the smp_affinity to 2 for one of the interrupts and CPU1 will then process it, but my understanding is that the default value of 3 should cause them to be shared evenly between the two cores, as it is a binary mask of '11'.
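
These are the commands I'm referring to; 179 is just one example IRQ, and the value is a hex bitmask of allowed CPUs (1 = CPU0 only, 2 = CPU1 only, 3 = either core):

Code:
# Read the current affinity mask for one IRQ
cat /proc/irq/179/smp_affinity

# Pin that IRQ to CPU1 only (mask 2); the change does not survive a reboot
echo 2 > /proc/irq/179/smp_affinity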

The CPU/core mappings appear correct:
$ cat /sys/devices/system/cpu/cpu0/topology/core_id
0
$ cat /sys/devices/system/cpu/cpu0/topology/core_siblings
1
$ cat /sys/devices/system/cpu/cpu0/topology/thread_siblings
1
$ cat /sys/devices/system/cpu/cpu1/topology/core_id
0
$ cat /sys/devices/system/cpu/cpu1/topology/core_siblings
2
$ cat /sys/devices/system/cpu/cpu1/topology/thread_siblings
2

I have made the assumption that for ASUS routers the IRQ workload should be distributed between the two cores and that it isn't a specific design choice for one core to handle them, as I don't have a correctly performing router to check against.

The reason I believe the interrupts are causing the performance issues is that in htop kworker/1:2 is the third-highest user of CPU time. mtdblock3 being the highest by a significant margin does make me wonder whether the internal flash memory is starting to fail, or whether it being the highest is expected; again, I don't have another router to compare against.
[Screenshot: htop sorted by CPU time, with mtdblock3 at the top and kworker/1:2 third]


Viewing just the kernel workers, you can see kworker/1:2 is doing all the work while the others are doing pretty much nothing.

[Screenshot: htop filtered to kworker threads, with kworker/1:2 accumulating nearly all the CPU time]
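
If it helps, this is a rough way to see what a busy kworker thread is actually doing, assuming the firmware's kernel exposes /proc/<pid>/stack (not every build does):

Code:
# Find the PID of the busy worker thread (first column of the busybox ps output)
ps | grep 'kworker/1:2'

# Dump its current kernel stack a few times to see where it is spending its time.
# Replace 1234 with the PID found above.
cat /proc/1234/stack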


If anyone is able to check whether what I am seeing is normal behaviour, so I know I am looking at the right things, it would be much appreciated, as would any advice or guidance on what else to look into :)
 
... though mtdblock3 being the highest by a significant margin does make me wonder if the internal flash memory is starting to fail, or if it being the highest is expected, again I don't have another router to compare against.

Maybe the mtdblock3 activity indicates JFFS reads, not writes? Are you running a web server on the router that is reading web pages from JFFS but not writing back to it? Here's my htop:

Code:
 PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
  254 admin      20   0     0     0     0 S  0.0  0.0  4:06.42 mtdblock3
13161 admin      20   0 27708 25032  7956 S  0.0  9.8  1:23.21 tor -f /opt/etc/tor/torrc
  499 admin      20   0  1340   544   320 S  0.0  0.2  0:49.99 /usr/sbin/acsd
  522 admin      20   0  6120  1504   984 S  0.0  0.6  0:33.28 watchdog
  277 admin      20   0     0     0     0 S  0.0  0.0  0:31.31 kworker/1:1
  903 admin      20   0     0     0     0 S  0.0  0.0  0:23.85 usb-storage
  516 admin      20   0  6064  1816  1216 S  0.0  0.7  0:18.71 httpd
12489 admin      20   0 46960  1868  1008 S  0.0  0.7  0:13.67 privoxy /opt/etc/privoxy/config
1192 admin      20   0  5316   980   872 S  0.0  0.4  0:13.56 nmbd -D -s /etc/smb.conf
    1 admin      20   0  5880  1624  1308 S  0.0  0.6  0:11.32 /sbin/preinit
1080 admin      20   0  1136   460   352 S  0.0  0.2  0:09.79 /jffs/bin/dnscrypt-proxy --local-address=127.0.0.1:xxxxx
1185 admin      20   0 10456  2868  2132 S  0.0  1.1  0:08.30 /opt/bin/XMail
    3 admin      20   0     0     0     0 S  0.0  0.0  0:03.32 ksoftirqd/0
    9 admin      20   0     0     0     0 S  0.0  0.0  0:03.12 ksoftirqd/1
1228 admin      20   0 10456  2868  2132 S  0.0  1.1  0:02.46 /opt/bin/XMail
  544 admin      20   0  1356   652   336 S  0.0  0.3  0:02.18 rstats
  357 admin      30  10     0     0     0 S  0.0  0.0  0:02.12 jffs2_gcd_mtd4
20888 admin      20   0  1284   688   404 S  0.0  0.3  0:01.74 /usr/bin/dropbear -p xxxxx -s -j -k -b /etc/dropbear/logi
1082 admin      20   0  1128   448   348 S  0.0  0.2  0:01.69 /jffs/bin/dnscrypt-proxy --local-address=127.0.0.1:xxxxx
  520 admin      20   0  1292   476   364 S  0.0  0.2  0:01.40 networkmap --bootwait
1237 admin      20   0 10456  2868  2132 S  0.0  1.1  0:01.24 /opt/bin/XMail
  478 admin      20   0  5872  1292  1016 S  0.0  0.5  0:01.08 /sbin/wanduck
  494 admin      20   0  1764   656   496 S  0.0  0.3  0:00.88 /bin/wps_monitor
1138 admin      20   0  2264   852   536 S  0.0  0.3  0:00.84 avahi-daemon: running [RT-AC68U-5188.local]
  276 admin      20   0     0     0     0 S  0.0  0.0  0:00.75 kworker/0:1
  244 admin      20   0     0     0     0 S  0.0  0.0  0:00.73 mtdblock1
  249 admin      20   0     0     0     0 S  0.0  0.0  0:00.70 mtdblock2
21038 admin      20   0  4960  1324  1056 R  0.5  0.5  0:00.69 htop
  239 admin      20   0     0     0     0 S  0.0  0.0  0:00.68 mtdblock0
19917 admin      20   0  1092   600   372 S  0.0  0.2  0:00.63 dnsmasq --log-async
  497 admin      20   0  1872  1028   360 S  0.0  0.4  0:00.49 nas
11843 admin      20   0  5872  1184   916 S  0.0  0.5  0:00.42 ntp
    6 admin      RT   0     0     0     0 S  0.0  0.0  0:00.41 migration/0
   57 admin      20   0     0     0     0 S  0.0  0.0  0:00.40 sync_supers
  113 admin      20   0     0     0     0 S  0.0  0.0  0:00.37 kswapd0
  486 admin      20   0  1380   352   272 S  0.0  0.1  0:00.37 telnetd
1024 admin      20   0     0     0     0 S  0.0  0.0  0:00.37 kjournald
11884 admin      20   0  5872  1144   876 S  0.0  0.4  0:00.36 disk_monitor
  490 admin      20   0  1156   352   264 S  0.0  0.1  0:00.30 /bin/eapd
    7 admin      RT   0     0     0     0 S  0.0  0.0  0:00.28 migration/1
  518 admin      20   0  1400   408   308 S  0.0  0.2  0:00.28 crond
20777 admin      20   0  7092  2716  1044 S  0.0  1.1  0:00.25 minidlna -f /etc/minidlna.conf
 
Is mtdblock3 the nvram? That is strangely high usage...

Just to make sure, have you Reset the config to Default?
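
One quick way to check what mtdblock3 actually holds is to list the MTD partition table; partition names and order vary between firmware builds, so treat this as a starting point rather than a definitive map:

Code:
# Match the mtd3 line against its partition name
cat /proc/mtd

# The kernel usually prints the same partition table at boot
dmesg | grep -i mtd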
 
I realize this is an ancient thread, but I just noticed the same issue: high (maxed) usage on core 1, low usage on core 2, and very high ksoftirqd in top. This is with QoS on and a very large download going from multiple addresses.

It appears the default default_smp_affinity is 3, as pointed out above. Might it be better to set the default to "f" (meaning the IRQ can be serviced on any CPU in the system)? I'm not sure what the difference is between 3 and f on a two-core system.
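
For what it's worth, my reading is that the mask is simply one bit per CPU, so on a two-core box only the low two bits can ever take effect and 3 and f should behave the same; bits for CPUs that don't exist are ignored. Roughly:

Code:
# Default mask applied to IRQs registered from now on (one bit per CPU, in hex)
cat /proc/irq/default_smp_affinity

# Writing a new default only affects IRQs registered after the change;
# existing IRQs keep their per-IRQ mask under /proc/irq/<n>/smp_affinity
echo f > /proc/irq/default_smp_affinity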
 
