IPv6 breakage caused by ipv6_neighsol_drop

Heath Kehoe

tldr: if IPv6 isn't working for you, it could be because of the ipv6_neighsol_drop nvram setting.

So IPv6 just wasn't working for me. We have an allocation from our ISP and my contact there said everything looked good on their end. But no v6 traffic was making it past the AC68.

I could configure my laptop with the router's v6 address and connect it directly to the uplink Ethernet, and that worked fine. So the problem was definitely in asuswrt. To help troubleshoot, I made a custom build of asuswrt-merlin with tcpdump enabled. From the packet captures, I determined that the ISP's router was sending Neighbor Solicitations (NS) to find my router, but my router was not replying to them.
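For anyone repeating this kind of capture, here is a minimal sketch. It assumes tcpdump is present on the router (the post above needed a custom build for that) and that eth0 is the WAN interface. The helper names nd_filter and capture_nd are made up for illustration. ICMPv6 types 135 and 136 are Neighbor Solicitation and Neighbor Advertisement; the ip6[40] offset only works when the ICMPv6 header directly follows the fixed IPv6 header, which is the usual case for these packets.

```shell
# Hypothetical helper: pcap filter matching only NS (135) and NA (136).
nd_filter() {
    printf '%s\n' 'icmp6 and (ip6[40] == 135 or ip6[40] == 136)'
}

# Hypothetical helper: capture neighbor discovery traffic on one interface.
# -n skips name resolution, -e prints the link-layer (MAC) addresses.
capture_nd() {
    tcpdump -n -e -i "$1" "$(nd_filter)"
}

# On the router (assumption: eth0 is the WAN side):
#   capture_nd eth0
```

If solicitations show up on the WAN side with no advertisements going back out, that matches the symptom described above.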

Very weird, especially since ip6tables -L showed explicit rules allowing neighbor-solicitations and neighbor-advertisements. I dug around in /proc/sys/net, read docs, pulled out hair, but couldn't find any reason why this router wasn't responding to an NS.

About to give up, I finally thought to look at the "mangle" table (ip6tables -L -t mangle) and there it was: a rule explicitly dropping neighbor solicitations from eth0.

I deleted that rule, and BAM! IPv6 was back!
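A rough sketch of how to look for such a rule without reading the whole table by eye. The function name find_ns_drop_rules is hypothetical, and the patterns are assumptions about how ip6tables prints the rule: by name in -L output and as ICMPv6 type 135 in -S output. Treat it as a starting point, not as the exact rule the firmware installs.

```shell
# Hypothetical helper: read ip6tables listing text on stdin and keep only
# lines that mention neighbor solicitation (by name or as type 135) and DROP.
find_ns_drop_rules() {
    grep -E -i 'neighbou?r-solicitation|icmpv6-type 135|ipv6-icmptype 135' |
        grep 'DROP'
}

# On the router:
#   ip6tables -t mangle -S | find_ns_drop_rules
#   ip6tables -t mangle -L -v --line-numbers | find_ns_drop_rules
# A rule found in the second listing can then be deleted by its number, e.g.
#   ip6tables -t mangle -D PREROUTING <number>
# (PREROUTING is an assumption; use whichever chain the listing shows.)
```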

Since I already had asuswrt-merlin sources, I grepped the code looking for where that mangle rule came from and found the bit commented "Workaround for neighbor solicitation flood from Comcast", controlled by nvram variable ipv6_neighsol_drop, which defaults to 1.

It was then that I found the mention in the ChangeLog (from nearly a year ago) about that workaround. So I maybe could have saved myself a bunch of time if I had (1) happened to see it, and (2) realized how it applied.

Anyway, why is the default for that 1? Because it seems to me that NS is pretty fundamental to IPv6 operation and nuking it just to work around one broken provider seems like the wrong approach.
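For reference, a minimal sketch of turning the workaround off through the nvram variable rather than deleting the rule by hand. The variable name comes from the post above; the function name disable_neighsol_drop is made up, and the assumption that restarting the firewall service is enough to rebuild the rules has not been checked against every firmware version.

```shell
# Hypothetical helper: disable the NS drop workaround and keep it disabled.
disable_neighsol_drop() {
    nvram set ipv6_neighsol_drop=0   # 1 is the default described above
    nvram commit                     # write nvram to flash so it survives a reboot
    service restart_firewall         # assumption: regenerates the ip6tables rules
}

# Check the current value first with:
#   nvram get ipv6_neighsol_drop
# then run disable_neighsol_drop on the router.
```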
 
Since you already did some analysis....what is the 'normal' rate of neighbor-solicitations you are seeing?
 
Because AFAIK, the type of packets that I drop should normally never reach your LAN in the first place (notice that I'm not dropping ALL NS packets). And in the numerous months with that setting there and enabled by default, nobody ever reported any issue related to it, so I kept it enabled by default. If you are positively certain they are causing issues, I'd like to get other users' feedback there as well, in case your ISP might just end up being as broken as Comcast ;)
 
I'm a little surprised that no one else ran into this issue.

My guess is that some v6 implementations populate their neighbor cache using received packets; so as long as our router sent a v6 packet recently enough then the remote router won't need to send an NS. In fact, it seems that in at least one version of Cisco IOS, that behavior is configurable (ipv6 nd na glean).

Since my ISP's router is some type of Cisco (at least, it has a Cisco MAC addr), it must be configured with 'na glean' off, which means that it won't talk to our router until it actually receives an NA from it. Since the NSs were being dropped by the mangle rule, that meant no NAs were going out and thus no worky.

Looks like john9527 suggests rate-limiting the NSs instead of just dropping them, which I think is a good idea.
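A rough sketch of what rate-limiting could look like. This is not necessarily what john9527 proposed. The function name ns_rate_limit_rules, the PREROUTING chain, and the rate and burst numbers are all assumptions, and it presumes the router's ip6tables has the limit match available. It also matches every NS arriving on the WAN interface, which is broader than the firmware's rule (Merlin notes above that not all NS packets are dropped). The function only prints the two commands so they can be reviewed before being run.

```shell
# Hypothetical helper: print (not run) two rules that accept NS up to a
# rate and drop the excess.
# $1 = WAN interface, $2 = rate such as 20/second, $3 = burst size
ns_rate_limit_rules() {
    base="ip6tables -t mangle -A PREROUTING -i $1 -p icmpv6 --icmpv6-type 135"
    # First rule: NS packets within the rate leave the chain as accepted.
    printf '%s\n' "$base -m limit --limit $2 --limit-burst $3 -j ACCEPT"
    # Second rule: anything beyond the rate falls through to here.
    printf '%s\n' "$base -j DROP"
}

# Example (values are placeholders, not tested recommendations):
#   ns_rate_limit_rules eth0 20/second 40
```

The built-in drop rule would need to be off (ipv6_neighsol_drop set to 0) for this to have any effect, since a DROP that matches first ends processing for the packet.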
 
My IPv6 works with the ip6tables set by Merlin for a while, maybe 2-10 hours but at some point I lose route to the WAN and can't ping6 / traceroute6 etc. The logs don't show anything but I do notice that the "WAN IPv6 Gateway" will disappear in the System Log screen. This is using Comcast with native ipv6, DHCP-PD/Router announcement enabled and stateless configuration.

What usually fixes it is if I unplug & plug power to my cable modem (DPC3008)
 
Thanks.....seems like we can re-craft that mangle rule then.

Not sure I understand 100% here, but John, please don't remove Merlin's overflow fix for Comcast IPv6. If you do, I will not be able to use your firmware, as my syslog will EXPLODE with neighbor table errors. :eek:
 
Don't worry, I would never want anyone's router to EXPLODE! :D
 
Flushing the ipv6-icmp neighbour-solicitation rule worked for me as well, although I'm not sure why.

I'm on TWC with native IPv6 and I would frequently lose IPv6 connectivity but not to all destinations. ping -6 www.google.com would time out (I could see the request packet going from the AC68U to the cable modem) but ping -6 www.comcast.net would work. Then a few minutes later the behavior would reverse or both would fail or both would pass.
 
I see the same issue with U-Verse and the NV589. It is quite difficult to pinpoint. This ipv6_neighsol_drop nvram setting seems like an interesting parameter, but with all the different ISP implementations, I always remain skeptical.

IPv6 has been a pain. When it does work, though, those routes seem less congested (YouTube over IPv6 rarely stutters, etc.). I have been striving to make it work for that tangible benefit.

I'll flush this rule and observe the reliability.
 
Merlin/ John

Would you consider moving this rule from mangle to the INPUT chain? I think that mangle rules get processed for all packets, while the INPUT chain processes only a small subset of them.
 
Someone on Comcast would have to confirm to me that the packet does reach the INPUT chain - which I doubt, as it's not targeting the router's IP specifically.
 
Thought I'd be clever and use the wan-start user script to remove this entry. It kept reappearing. Going to try the latest beta stock firmware with my connection to see if IPv6 is any better.
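One possible reason the entry kept coming back is that the firewall rules are rebuilt after wan-start has already run; that is a guess, not something confirmed in this thread. A sketch of a removal that could run from a firewall-start user script instead follows. It assumes /jffs/scripts/firewall-start runs after the rules are applied, that ip6tables -S prints the rule with ICMPv6 type 135, and the function name ns_drop_delete_cmds is made up.

```shell
# Hypothetical helper: read `ip6tables -t mangle -S` output on stdin and
# print a matching delete command for every rule that drops ICMPv6
# type 135 (neighbor solicitation).
ns_drop_delete_cmds() {
    sed -n '/--icmpv6-type 135.*-j DROP/s/^-A /ip6tables -t mangle -D /p'
}

# In the user script (assumption: /jffs/scripts/firewall-start), review the
# output first, then pipe it to sh once it looks right:
#   ip6tables -t mangle -S | ns_drop_delete_cmds
#   ip6tables -t mangle -S | ns_drop_delete_cmds | sh
```

Setting ipv6_neighsol_drop to 0 is likely the simpler route, since it stops the rule from being created at all.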

Shame there are some implementations that are just rude, and fixes like this are needed.
 
I can verify that for Time Warner (TWC), deleting the mangle rule and also changing the ipv6_neighsol_drop setting to 0 finally allows native IPv6 to work properly for me :)!

Previously, Windows machines would hold on for a while, and reboots of the modem and router would get it back for a while (a couple of days at most).

Apple devices simply broke within less than an hour.

Now all devices are happy and passing all the IPv6 tests I've thrown at them this afternoon. When the mangle rule was in place, things broke within less than an hour today.

The errors that typically showed up on the test sites were primarily due to PMTUD problems. Now there are no errors!

I would strongly endorse this as a potential fix for IPv6 problems with TWC.

At one point, I was worried that the nvram setting was getting set back to 1 during reboots. In what startup script can I change this setting so that it survives reboots and firmware updates?

I'm on RT-AC56U, 378.52_2.

Merlin, thanks so much for the updated firmware and your efforts!
They are very much appreciated!

Pablo
 
Yeah, NS should be allowed for proper IPv6 operation.

I suggest being very careful with all these Comcast fixes, as they seem to risk breaking operation on other ISPs.

Comcast really needs to sort out their IPv6 implementation.
 
I know. I was also leery of it, that's why I made it manually configurable through an nvram setting.

As I often say, IPv6 is one big experiment that shouldn't have been allowed outside of a lab in its current state. It's a huge mess.
 