[Bug/Problem] Missing UDP packets on RT-AC68U breaks video sharing in Webex application

Atais

Occasional Visitor
Hello.

This is quite a complicated issue, so I will divide it into smaller parts to describe it as briefly and precisely as possible.
I have tried hard-resetting the router, the application, the system, etc., with no luck.

0. The problem

The issue occurs only on my home network, which is based on an Asus RT-AC68U, over both Wi-Fi and wired connections.
On Unix-based systems (Linux and/or Mac), the Cisco Webex application cannot establish screen-sharing sessions, either incoming or outgoing.

Exceptions:
  • If I use a browser (Chrome/Firefox/Safari) on a Unix system, screen sharing works fine.
  • On Windows, both the application and browser sharing work fine, which makes it even stranger.

With help from Cisco support we narrowed down the issue to missing/cropped UDP packets that the application is using (more on that later).

1. My network setup

I am using an Asus RT-AC68U with the newest Asuswrt-Merlin firmware, 386.7_2, but other versions are affected as well (I first observed this issue 6 months ago).

I am using Orange Polska FTTH Internet together with a custom ONT router. For this setup to work properly I had to use VLAN 35 and PPPoE authentication. This setup is described (in English) here if you need more details.
When using the original Orange router (Funbox 3.0), with or without the ONT router, the application (and screen sharing) works 100% fine.
Also, when using an LTE hotspot, everything works fine.

That is why I am 100% certain the issue lies in the Asus RT-AC68U router or its configuration.

2. Computer setup

As mentioned above, I have tested and confirmed this issue (both wireless and wired) on:
  • MacBook M1 2021 / macOS Monterey
  • Lenovo Legion 5 with Ryzen 7 4800H / Ubuntu 20.04
On the very same Lenovo, but running Windows 10, the issue does not occur.
So the issue only occurs on Unix systems.

3. Cisco support packet analysis

Cisco support prepared a Wireshark comparison of an incoming screen-sharing session to my Mac, once over LTE (where it works) and once over Wi-Fi (where it does not).

Results:
content server IP: 62.109.229.48

1. There are far fewer UDP packets on Wi-Fi. What is more, the packets vary widely in size, with a maximum of 300 bytes. On LTE the packet size is more consistent, at around 1152 bytes.

Screenshot 2022-08-25 at 12.39.31.png


The larger packets are most likely the missing video. Maybe the router is blocking them?
I have experimented with the MTU before, but increasing it from 1492 (current) to 1500 (max) did not change anything.
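
For reference, a rough sketch of the size arithmetic behind that MTU test. It assumes plain IPv4 (20-byte header, no options) and the usual 8 bytes of PPPoE overhead; the function names are only illustrative. The ~1152-byte packets seen on LTE are well below either limit, which suggests the MTU setting alone would not explain the missing packets.

```shell
#!/bin/sh
# PPPoE adds 8 bytes (6-byte PPPoE header + 2-byte PPP protocol ID)
# to every Ethernet payload, so the usable MTU shrinks by that much.
pppoe_mtu() {
    echo $(( $1 - 8 ))
}

# Largest UDP payload that fits in one unfragmented IPv4 packet:
# subtract the IPv4 header (20 bytes, no options) and the UDP header (8 bytes).
max_udp_payload() {
    echo $(( $1 - 20 - 8 ))
}

wan_mtu=$(pppoe_mtu 1500)
echo "PPPoE WAN MTU:   $wan_mtu"                       # 1492
echo "Max UDP payload: $(max_udp_payload "$wan_mtu")"  # 1464
```

On Linux, `ping -M do -s 1464 <host>` sends a packet with exactly that payload size and fragmentation prohibited, which is one way to check that full-size packets actually make it through the link.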

Screenshot 2022-08-25 at 12.36.16.png


2. Another thing is that the router asks for an IP that did not send any content back (192.168.1.24). It sends ARP requests and gets no answer. What could cause that?
12_45_25.jpg


While using LTE it looks more normal. It got some content from 192.168.64.241 and also gets ARP response:
12_45_57.jpg


4. Summary

I am more than happy to capture other scenarios or enable some debugging on the router; the thing is, I don't really know how to do so.
I can also share some of my router configuration. To start with, I guess these are the most important settings:

WAN
  • WAN Connection Type: PPPoE
  • MTU: 1492
  • MRU: 1492
Firewall
  • Enable Firewall: True (disabling did not help)
  • Enable DoS protection: False
LAN / IPTV
  • Internet VID: 35

So if you have any idea where I should start digging, let's try.
Thanks!
 

ColinTaylor

Part of the Furniture
Try disabling hardware acceleration in LAN - Switch Control.
 

Atais

Occasional Visitor
Wow... OK that was fast... and it solved the issue!
But the thing is, my connection is 1000 Mbps / 300 Mbps and without hardware acceleration the router's CPU just can't handle it o_O

So......... it's a hardware issue :eek: ?

Do I need to upgrade my router or what :D
 

RMerlin

Asuswrt-Merlin dev
Most likely a compatibility issue with Broadcom's CTF NAT acceleration.

It's possible one of the newer models might work better, as CTF was replaced by a completely different NAT acceleration engine. You'd have to test it however.
 

Atais

Occasional Visitor
@RMerlin sorry to bother, but this knowledge is rather hard to find :)

Which models already have CTF replaced?
We are talking RT-AX86U and newer?
What is the discriminator I should be looking for? Like some specific chip or CPU?

Quite frankly, I like the Asus router family and would like to stick with it.
 

RMerlin

Asuswrt-Merlin dev
We are talking RT-AX86U and newer?
RT-AC86U and newer.

What is the discriminator I should be looking for? Like some specific chip or CPU?
They must use the newer HND SDK, which is used by the RT-AC86U, GT-AC2900, and all of their AX models.
 

Atais

Occasional Visitor
OK. I will do some research, order one, and report back the results.

Maybe it is just me, but finding a CTF NAT hardware bug in Broadcom's chip that causes UDP communication to fail on Unix systems... feels like super niche to me... and you guys helped me in 5 minutes.
I can't believe it, you're awesome!

Thanks so much!
 

ColinTaylor

Part of the Furniture
But the thing is, my connection is 1000 Mbps / 300 Mbps and without hardware acceleration the router's CPU just can't handle it o_O

So......... it's a hardware issue :eek: ?
It's strange that it only affects the application, and then only on the Unix platforms.

It's a bit of a stretch but it's possible that the Unix application is using NAT loopback. There is a known bug in the UDP loopback code that was described in this thread. You could easily test if that's the case. If it is then you would be able to leave CTF enabled.

So to test, re-enable hardware acceleration and wait for the router to reboot. Then SSH into it and issue the following command:
Code:
iptables -t mangle -A PREROUTING -p udp -m state --state NEW -j MARK --set-mark 0x1/0x7
Then test the application. But like I said, this is a bit of a long shot.
 

capncybo

Senior Member
I'm not sure how often you actually need to use Webex, but there is also the option of running a shell script to enable or disable CTF as needed.
In an SSH terminal you can type the following to see the values:
nvram show | grep ctf
Example output:
ctf_fa_mode=0
ctf_disable=0
ctf_disable_force=0
ctf_fa_cap=1
ctf_nonat_force=0
ctf_pt_udp=0

And if you made a script with the following lines for example...
#!/bin/sh
nvram unset ctf_fa_cap
nvram set ctf_fa_mode=2
nvram commit

You would be changing some of the ctf settings.
 

Atais

Occasional Visitor
So to test, re-enable hardware acceleration and wait for the router to reboot. Then SSH into it and issue the following command:
Code:
iptables -t mangle -A PREROUTING -p udp -m state --state NEW -j MARK --set-mark 0x1/0x7
Then test the application. But like I said, this is a bit of a long shot.
How do I set it back to defaults, whatever it is I am doing here :)?
I will be able to test it on Monday.


I'm not sure how often you actually need to use Webex, but there is also the option of running a shell script to enable or disable CTF as needed.
In an SSH terminal you can type the following to see the values...

Right, but this requires 1 minute of network downtime, which is not really acceptable. I would switch it way too often. Thanks anyway.
 

ColinTaylor

Part of the Furniture
How do I set it back to defaults, whatever it is I am doing here :)?
I will be able to test it on Monday.
It's just a temporary change. Rebooting the router, or any change on the router that affects the WAN, would wipe it out. But to remove it immediately, just issue the same command with -D (for delete) instead of -A (for append), i.e.:

Code:
iptables -t mangle -D PREROUTING -p udp -m state --state NEW -j MARK --set-mark 0x1/0x7
 

RMerlin

Asuswrt-Merlin dev
I'm not sure how often you actually need to use Webex, but there is also the option of running a shell script to enable or disable CTF as needed.
In an SSH terminal you can type the following to see the values:
nvram show | grep ctf
Example output:
ctf_fa_mode=0
ctf_disable=0
ctf_disable_force=0
ctf_fa_cap=1
ctf_nonat_force=0
ctf_pt_udp=0

And if you made a script with the following lines for example...
#!/bin/sh
nvram unset ctf_fa_cap
nvram set ctf_fa_mode=2
nvram commit

You would be changing some of the ctf settings.
The problem is, unlike archer/runner, CTF requires a reboot to enable/disable.
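
A minimal sketch for checking how CTF is currently configured before deciding whether a reboot is worthwhile. It assumes `ctf_disable=1` means acceleration is switched off, and the function name is hypothetical. It reads the stored setting only, which may differ from the running state until the router has rebooted.

```shell
#!/bin/sh
# Report the stored CTF setting.
# Assumption: ctf_disable=1 means NAT acceleration is turned off.
ctf_setting() {
    if [ "$(nvram get ctf_disable)" = "1" ]; then
        echo "CTF disabled"
    else
        echo "CTF enabled"
    fi
}

# nvram only exists on the router, so skip the call anywhere else.
if command -v nvram >/dev/null 2>&1; then
    ctf_setting
fi
```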
 

Atais

Occasional Visitor
It's strange that it only affects the application, and then only on the Unix platforms.

It's a bit of a stretch but it's possible that the Unix application is using NAT loopback. There is a known bug in the UDP loopback code that was described in this thread. You could easily test if that's the case. If it is then you would be able to leave CTF enabled.

So to test, re-enable hardware acceleration and wait for the router to reboot. Then SSH into it and issue the following command:
Code:
iptables -t mangle -A PREROUTING -p udp -m state --state NEW -j MARK --set-mark 0x1/0x7
Then test the application. But like I said, this is a bit of a long shot.

@ColinTaylor I have just used your advice and actually now the screensharing works with CTF enabled!

What have I actually done :)? Can I use it as a long term solution?

If I follow your response correctly, it means Webex on Unix uses NAT loopback, and there is a known issue with it.
I suppose there are alternatives to NAT loopback, so would it be a valid suggestion to Cisco that they change this behaviour? Or is it a commonly used practice?
 

ColinTaylor

Part of the Furniture
Well that's good news.

I don't think this is something that Cisco would change their product for as the problem is with the router. Maybe they would if lots of other routers suffered from the same problem. The real solution in this case is to fix the bug in the router's firmware.

In the meantime you can create a workaround script that runs each time the router boots up. See the wiki for creating scripts. In summary:

1. Enable custom scripts and configs in the GUI (Administration - System).
2. Use vi or nano to create a script called /jffs/scripts/firewall-start
3. The contents of the script should look like this:
Code:
#!/bin/sh
iptables -t mangle -A PREROUTING -p udp -m state --state NEW -j MARK --set-mark 0x1/0x7
4. Make the script executable
Code:
chmod 755 /jffs/scripts/firewall-start
5. Run the script (or reboot the router)
Code:
service restart_firewall
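
A slightly more defensive variant of the script in step 3, as a sketch: it deletes any existing copy of the rule before appending it, so running the script more than once cannot leave duplicates in the chain. The `RULE` variable and the `add_mark_rule` function are illustrative names, not part of the firmware.

```shell
#!/bin/sh
# /jffs/scripts/firewall-start (sketch)
# The rule is kept in one variable so the delete and the append always match.
RULE="PREROUTING -p udp -m state --state NEW -j MARK --set-mark 0x1/0x7"

add_mark_rule() {
    # Remove an existing copy first; the error is ignored when there is none.
    # $RULE is left unquoted on purpose so it splits into separate arguments.
    iptables -t mangle -D $RULE 2>/dev/null
    # Append exactly one copy of the rule.
    iptables -t mangle -A $RULE
}

# Only touch the firewall where iptables is actually available (the router).
if command -v iptables >/dev/null 2>&1; then
    add_mark_rule
fi
```

Afterwards, `iptables -t mangle -L PREROUTING -v -n` should list the MARK rule exactly once, and its packet counter should grow as new UDP connections are opened.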

@RMerlin Can you have a look at this again and consider applying the fix that was proposed before? John applied it to his firmware but it never made it into your firmware. Obviously in your case the bitmask is 0x7 rather than John's 0x1ff.
 

Atais

Occasional Visitor
I would really like to help, but C is too far from my field of expertise, and one definitely does not want to modify low-level network settings blindly :)

For the moment I followed your workaround steps and it seems fine.
Thanks so much @ColinTaylor !
 
