Issues with DHCP Server on AC87U

Status
Not open for further replies.

Guardian Hope

Occasional Visitor
Hey everyone, it's been a while. I have been using ASUSWRT Merlin for a very long time now, and recently I switched the access points from AP Mode to Router Mode to take over the task of issuing DHCP on a different VLAN due to inadequacies in newly installed equipment. However, I think I may have something misconfigured on the ASUS unit.

According to the system log, "dnsmasq-dhcp-server" is rejecting connecting clients with a DHCPDECLINED message, with an error stating that it did not assign the connecting client a DHCP lease because it has already been registered in /etc/hosts and one other file (I'm on one of the problematic clients at the moment so I can't pull up the exact files).

So apart from one single client equipped with an ASUS PCI-AC68 and extremely specific settings (static IP, DNS, NetBIOS enabled), which we then tried to replicate on the other clients without success, no clients can connect to the routers.

Whether it's Windows or OS X, both OSes report IP address conflicts even though the IP addresses aren't even in use.

Some say that it's likely a lost cause and time to just replace them with new Ubiquitis, but given the number of routers that are concurrently showing this error, my belief is that some setting is ticked wrong either on the connecting clients or on the routers themselves.

Of important note, I flashed DD-WRT onto one of the routers just to see, as it has the ability to turn off dnsmasq as the DHCP server, and that does resolve it. Unfortunately, compared to Merlin, DD-WRT is nowhere near adequate (can't control 5GHz channel; no WAN load balancing; etc.).

After spending many hours on the problem I feel like I just missed something so simple but at this point I could use any assistance before I just turn in the routers for new ones to handle the new tasks of having to route DHCP.
 
Update - I have narrowed down the offending settings via the router's limited shell. Via the shell I ran:

Code:
dnsmasq -2

On every LAN interface the router had, including br0 and lo. It claims to turn off dnsmasq for DHCP, but it didn't work. So after reading the errors causing DHCPDECLINED, I saw that it was a result of two files:

Code:
/etc/hosts
/etc/hosts.dnsmasq

I ran the following in the shell:

Code:
dnsmasq -h
dnsmasq -R

To prevent the router from reading those files. This for the most part fixed it but I do believe there's a bug in ASUSWRT Merlin.
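For reference, here is a small sketch of what the short flags used above mean according to the dnsmasq man page. The helper name `describe_dnsmasq_flag` is made up for illustration. Note that `-R` concerns /etc/resolv.conf rather than a hosts file; extra hosts files are normally loaded through `--addn-hosts`, and whether this firmware loads /etc/hosts.dnsmasq that way is an assumption worth checking.

```shell
# describe_dnsmasq_flag FLAG  (hypothetical helper, for illustration only)
# Print the long option name and meaning of a dnsmasq short flag from this thread.
describe_dnsmasq_flag() {
  case "$1" in
    -2) echo "--no-dhcp-interface=<interface>: no DHCP on that interface, DNS still served" ;;
    -h) echo "--no-hosts: do not read /etc/hosts" ;;
    -R) echo "--no-resolv: do not read /etc/resolv.conf" ;;
    *)  echo "unknown flag: $1" >&2; return 1 ;;
  esac
}

# Describe each flag mentioned so far.
for flag in -2 -h -R; do
  describe_dnsmasq_flag "$flag"
done
```

These options are read when dnsmasq starts, so they would normally go on the command line or in the config file of the instance the firmware launches.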

When you attempt to assign IP addresses manually through the router, it locks out those IP addresses altogether and does not assign them to the matching MAC address. In addition, the designated client can't even obtain a lease by statically setting it on the client unless it's something other than the specified IP you had put in.

I have never had so many problems with a DHCP Server before. Is there an alternative one available in the firmware already or something that I can spool up a Linux system to build in if there's not?
 
Sounds more like something with dnsmasq than with the firmware to me. The firmware doesn't do anything related to DHCP, everything is handled by dnsmasq.
 

Well, perhaps you can help with a little bit of a better explanation of my new current setup:

[Attached image: HzsEwSc.jpg]


Now the entire left side is functioning fine under the new setup (and before you ask, yes, I disconnected the left side just to see). The right side on the other hand, isn't functioning anywhere near correctly.

I turned off another DNSMasq setting (something about another file; it was -9 but I have to check as I did it from my Mac).

In either case, I have been able to get every system except one back up and running on the network; ironically, this was also the system that worked before disabling all the dnsmasq-dhcp settings that I had quoted above. Every other system is able to switch IPs on a whim, so I can set up static addresses without the use of the ASUS interface, but one Windows 10 Pro device will accept all IPs except one as a static address (for the sake of this example let's go with 10.0.55.55). In this example, it'll take 56, 57, 200, and literally any other available address, but not the one it needs to be assigned (in this example, 55).

I have tried rebooting the router and everything else in between to try and get this particular system to accept the IP but no matter what, when I assign the system the IP it needs via Windows it can't obtain it and goes to "No Network Access." Same if you try from the ASUS interface.

However, if I switch the Verizon router back to being the DNS Server for the right side and turn the acting router back into an access point, everything works as it should. The Verizon router can hand out the IPs perfectly fine (there are just security concerns with it). Move the same device to the left side and it's the same case: the system can retrieve the manually configured IP.

What's also odd is that the ASUS router seems to be recognizing this system as "WORKGROUP" while the other devices recognize it by the computer's actual hostname.
 
That network setup is way beyond what these routers were designed to handle (a normal home LAN, with no VLANs), and not something I really feel like spending an hour trying to understand and troubleshoot, sorry... Your usage scenario is atypical to what these routers were designed to handle.
 

Normal fiber optic wired homes (such as Verizon FiOS in the United States) normally have VLANs. You shouldn't even be concerning yourself with the rest of the network, and I wasn't even going to put it there because that's all working fine. The Quantum Gateway (G1100) essentially acts as a glorified modem. The only problem is on the right side, where one (1) Windows 10 Pro computer can't obtain a specific IP address from the ASUS RT87U while every other wirelessly connected device can, while also factoring in that on DD-WRT there are no DHCP Server issues. So let's break down the problem: is there a file somewhere that stores formerly explicitly stated reserved IP addresses that are inputted into the ASUSWRT interface? Can that be cleared? Restarting the router doesn't clear it. There's no need to look at the whole picture when we only need to look at one instance.
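One way to look for a leftover reservation is to search the dnsmasq configuration for `dhcp-host=` entries, which is the directive dnsmasq uses for static leases. The sketch below assumes the config lives at a path such as /etc/dnsmasq.conf (an assumption; the firmware may generate it elsewhere), and the function names are hypothetical.

```shell
# list_dhcp_reservations FILE
# Print every static reservation (dhcp-host=...) found in a dnsmasq config file.
list_dhcp_reservations() {
  conf="$1"
  [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
  grep -E '^[[:space:]]*dhcp-host=' "$conf"
}

# find_reservation_for_ip FILE IP
# Print only the reservations whose comma-separated fields contain exactly IP.
find_reservation_for_ip() {
  list_dhcp_reservations "$1" |
    awk -F'[=,]' -v ip="$2" '{ for (i = 2; i <= NF; i++) if ($i == ip) { print; next } }'
}

# Example (path is an assumption):
#   find_reservation_for_ip /etc/dnsmasq.conf 10.0.55.55
```

If an entry shows up for the problem address with a different MAC, that would explain why only that one IP is refused.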
 
Final Update:

I have reverted the ASUS RT87U back to "AP Mode" and put a Windows Server 2012 R2 Datacenter Edition machine in the DHCP/DNS Server role(s). From there it was pretty easy to determine that the Windows networking stack needed to be reset, and after doing so Windows was able to acquire the proper IP address from the DHCP Server.

It also seems the ASUS routers might be going bad. They are detecting bad blocks at specific offsets of the NAND and reporting them in the log.

In addition to that, now that the RT87U's are in Access Point mode again and I reviewed the logs in more detail I have noticed this repeating since "AP Mode" was re-enabled:

Code:
kernel: TCP: time wait bucket table overflow

Which is constantly repeating itself. For the uninitiated, that particular message means that the kernel can't allocate any data to put in the "TIME_WAIT" state for a socket. On networking equipment and servers this can come up for several reasons such as misconfiguration, hardware too slow to keep up with the network traffic, and even network attacks (along with a lot more).
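To get a rough idea of how close a Linux box is to that limit, the number of sockets in TIME_WAIT can be compared with the kernel's `net.ipv4.tcp_max_tw_buckets` setting. This is only a sketch: the function name is made up, it counts IPv4 sockets only, and it assumes /proc/net/tcp is readable on the device.

```shell
# count_time_wait [FILE]
# Count sockets in TIME_WAIT in a /proc/net/tcp-style table.
# The fourth column is the socket state; 06 means TIME_WAIT.
count_time_wait() {
  awk 'NR > 1 && $4 == "06" { n++ } END { print n + 0 }' "${1:-/proc/net/tcp}"
}

# Report the current count next to the kernel limit, if both are available.
limit_file=/proc/sys/net/ipv4/tcp_max_tw_buckets
if [ -r /proc/net/tcp ] && [ -r "$limit_file" ]; then
  echo "TIME_WAIT sockets: $(count_time_wait) (limit: $(cat "$limit_file"))"
fi
```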

In this case it's two things: misconfiguration and hardware. Apparently, the latest release of ASUSWRT Merlin defaults these settings to a ridiculously low level, so when you reset the router it applies an extremely low setting to "TCP Timeout Establish." The old default was 432000 with 380.57. As to why this change occurred, I can only wonder, but logically speaking, the TCP timeout is set extremely low and anyone with a decent internet connection should go and check their settings. Actually, anyone with an internet connection should readjust that setting, unless you are unfortunate enough to be on non-broadband (read: DSL).

And finally hardware. I have established that several areas on the NAND are reporting bad blocks at specific addresses (not CRC errors but actual bad blocks). Several of the other RT87Us are reporting the same thing although in different addresses.

It's clear that I will need to start looking for new devices to replace the ASUS RT87Us (at least they lasted longer than my R7000 Nighthawks and didn't burn themselves out!). I will probably go with Ubiquiti. It's even higher end hardware (these were "high end" for the time) and at the very least, the people developing the firmware for it actually know basic networking like how to set up VLANs.

So yes, Solution: put a Win2K12 R2 DC Edition server with DHCP/DNS Role(s) in front of the ASUS routers and restore them to acting as "Access Points." Bonus Solution: Fix the TCP Timeout Establish setting. There are those of us with real broadband connections, you know. Heck, once you pass the 75/75 mark you're faster than 97% of the United States, and by 200+/200+ you're faster than 99% of the United States, and every major broadband carrier including Comcast (*shivers*) offers those speeds. Couple that with the growing number of devices people are connecting to WiFi. This event was and is a disaster waiting to happen.
 
Seriously, remind me to stick with more well supported DD-WRT devices. Hideous interface but at least they can grasp basic network concepts.

If you need all these dnsmasq configuration tweaks to get your network setup to work, then no, it's not just a "basic network setup". A basic network is pretty much fire-and-forget when it comes to DHCP - it's working for everyone else using these routers. Therefore, since you are encountering issues, your network DOES have something out of the ordinary, and I can't find it at a quick glance of your posts. That's the part where I would have to start going back and forth with you, asking questions to fill in the blanks. And that's the part where I'm not willing to do it for you - I already have to do that for a living for my customers. Not willing to ALSO do it in my spare time, sorry.

When you attempt to assign IP addresses manually through the router it locks out those IP addresses altogether and does not assign it to the matching MAC address.

What you describe here is a very basic functionality that's working for everyone else. If it doesn't work for you, then that's another sure sign that something in your setup isn't just "a basic network setup". And that's once again something I don't see explained in your lengthy posts.

You also initially said that:

to take over the task of issuing DHCP on a different VLAN

That's not the same thing as having your ISP provide you service over a VLAN. This quote from you points at having a VLAN configured on the LAN side - and Asuswrt does not allow configuring VLANs on the switch ports, beyond its very limited capabilities of setting up VLANs for WAN + IPTV + VoIP services. That was why I said that you would need a more advanced device to handle this. Your quote implied you need a DHCP server to be bound to a specific VLAN - that's not possible through Asuswrt, unless you start configuring everything manually yourself. That is something that would be much simpler to configure using a business-class router instead.

Finally, the TCP timeout was actually changed by Asus a few months ago... The older 2.6.22.19 kernel was using a lower value, and it got increased to 5 days with the 2.6.36 kernel. This was causing issues (conntrack tables filling up due to the high number of established sessions that wouldn't time out quickly enough), so Asus reduced it back to a lower value.

If you do a search on the web, you will see it's actually quite common for people to have to reduce this value from the 5-day default to a lower value. 5 days might work fine for a computer, but not for a router.
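As a quick sanity check, the 432000 figure quoted earlier in the thread is exactly the 5-day value mentioned here. The sketch below does the conversion and, where the files exist, prints the current conntrack timeout. Which of the two /proc paths a given firmware kernel exposes is an assumption, and `seconds_to_days` is a made-up helper name.

```shell
# seconds_to_days SECONDS
# Convert a timeout in seconds to whole days (86400 seconds per day).
seconds_to_days() {
  echo $(( $1 / 86400 ))
}

seconds_to_days 432000   # prints 5

# Print the established-connection timeout if the kernel exposes it.
# Newer kernels use the nf_conntrack path, older ones the ip_conntrack path.
for f in /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established \
         /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_timeout_established; do
  if [ -r "$f" ]; then
    echo "$f: $(cat "$f") seconds"
  fi
done
```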
 