RT-AC68P in Media Bridge Mode - DHCP/dnsmasq problems

james_c

Occasional Visitor
Summary: I have two Asus routers running Merlin firmware on a home network. One is in media bridge mode.
The clients on the media bridge have issues getting the correct IP and gateway using DHCP.

**** Main Router
Asus 1900P running Merlin 384.7_2
Connects to WAN.
Wired and wireless clients.
IP 192.168.1.1

**** Media Bridge
RT-AC68P running Merlin 384.7_2
Media Bridge Mode.
Wireless connection back to Main Router.
Wired Clients only.
IP 192.168.1.2 (statically assigned)
Default GW: 192.168.1.1 (static)
Subnet mask: 255.255.255.0
DNS: Blank.

Both routers have 'Enable JFFS custom scripts and configs' set to YES.

On the Main router I have DHCP enabled, the pool starts at 192.168.1.3 and ends at 192.168.1.254
The Default gateway and 'DNS Server 1' are set to 192.168.1.1 (this is all under LAN --> DHCP server, to be clear, not talking about the WAN side)
I have manually assigned 4 IPs, 192.168.1.2 through 192.168.1.5.
192.168.1.2 is the Media Bridge and it has its IP statically assigned, but it's in the manual reservation list because that's what populates etc/hosts.dnsmasq on the main router and lets me resolve the media bridge hostname.
The clients with manual IP reservations, that connect to the main router, either wired or wireless, always seem to get the correct IP/Gateway via DHCP.
There is a Win 10 client that connects via the media bridge and this always seems to have problems with both IP and gateway.

If dnsmasq is running on the media bridge AND main router, and I do a DHCP release and renew from the Win 10 pc, then I see something like this in the logs of the media bridge - edited/formatted for clarity:

Code:
dnsmasq-dhcp[245]: DHCPREQUEST(br0)     192.168.1.254 ma:ca:dd:re:ss:01
dnsmasq-dhcp[245]: DHCPACK(br0)         192.168.1.254 ma:ca:dd:re:ss:01 win10pc
dnsmasq-dhcp[245]: DHCPRELEASE(br0)     192.168.1.254 ma:ca:dd:re:ss:01
dnsmasq-dhcp[245]: DHCPDISCOVER(br0)    192.168.1.254 ma:ca:dd:re:ss:01
dnsmasq-dhcp[245]: DHCPOFFER(br0)       192.168.1.254 ma:ca:dd:re:ss:01
dnsmasq-dhcp[245]: DHCPREQUEST(br0)     192.168.1.254 ma:ca:dd:re:ss:01
dnsmasq-dhcp[245]: DHCPACK(br0)         192.168.1.254 ma:ca:dd:re:ss:01 win10pc

and I see this from the main router:

Code:
dnsmasq-dhcp[249]: DHCPDISCOVER(br0)  192.168.1.254 ma:ca:dd:re:ss:01
dnsmasq-dhcp[249]: DHCPOFFER(br0)     192.168.1.4   ma:ca:dd:re:ss:01 // this is what I want
dnsmasq-dhcp[249]: DHCPREQUEST(br0)   192.168.1.254 ma:ca:dd:re:ss:01
dnsmasq-dhcp[249]: DHCPNAK(br0)       192.168.1.254 ma:ca:dd:re:ss:01 wrong server-ID
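
To compare what the two dnsmasq instances told the client, the excerpts above can be boiled down to message type, address and MAC. This is only a minimal sketch, assuming the log lines look like the (edited) excerpts in this post. `dhcp_summary` is a made-up helper name, not something that ships with the firmware.

```shell
# Summarise dnsmasq-dhcp log lines as "TYPE IP MAC", one per line.
# Reads log text on stdin and ignores anything that is not a DHCP message.
# Assumes the "TYPE(iface) IP MAC ..." layout shown in the excerpts above.
dhcp_summary() {
    awk '/dnsmasq-dhcp.*: DHCP[A-Z]+\(/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^DHCP[A-Z]+\(/) {
                type = $i
                sub(/\(.*/, "", type)   # strip the "(br0)" interface suffix
                print type, $(i + 1), $(i + 2)
                break
            }
        }
    }'
}
```

Feeding each router's log through the function and reading the results side by side makes the disagreement easy to see: the bridge ACKs 192.168.1.254 while the main router offers 192.168.1.4.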

From what I have read and the results I'm seeing, it appears like I need to stop dnsmasq on the media bridge, even though there is no way that I can see to do this through the admin UI.

If I ssh into the media bridge and stop the service with 'service stop_dnsmasq' and repeat the process above I see this in the logs of the main router:

Code:
dnsmasq-dhcp[249]: DHCPDISCOVER(br0)  192.168.1.254 ma:ca:dd:re:ss:01
dnsmasq-dhcp[249]: DHCPOFFER(br0)     192.168.1.4   ma:ca:dd:re:ss:01
dnsmasq-dhcp[249]: DHCPDISCOVER(br0)  192.168.1.254 ma:ca:dd:re:ss:01
dnsmasq-dhcp[249]: DHCPOFFER(br0)     192.168.1.4   ma:ca:dd:re:ss:01
dnsmasq-dhcp[249]: DHCPDISCOVER(br0)  192.168.1.254 ma:ca:dd:re:ss:01
dnsmasq-dhcp[249]: DHCPOFFER(br0)     192.168.1.4   ma:ca:dd:re:ss:01
dnsmasq-dhcp[249]: DHCPREQUEST(br0)   192.168.1.4   ma:ca:dd:re:ss:01
dnsmasq-dhcp[249]: DHCPACK(br0)       192.168.1.4   ma:ca:dd:re:ss:01 win10pc

And this seems to get the win10pc to take the correct IP and gateway. However I lose all this on the reboot of the media bridge when the dnsmasq service restarts.
I tried 'service disable_dnsmasq' but that didn't help either. Then I resorted to stopping the service in the services-start jffs user script. (Exactly what @HariSeldon did as detailed in the last post on here: https://www.snbforums.com/threads/rt-ac68u-in-media-bridge-mode-responding-to-dhcp-requests.48259/)
The problem here is that dnsmasq has already started and given the bad IP/GW to win10pc before the services-start script executes and stops it - I can see that happening in the logs.
Then the dnsmasq is stopped on the media bridge but win10pc is left in a bad state thinking its gateway is 192.168.1.2 and it can't reach the dnsmasq service on the main router to rectify the situation.

My not so elegant solution to this is to pollute the dnsmasq.conf.add file with something that would stop it from ever starting: "dhcp-range="

This is what I see in the media bridge logs on restart and everything seems to work fine:

Code:
custom_config: Appending content of /jffs/configs/dnsmasq.conf.add.
dnsmasq[417]: bad dhcp-range at line 19 of /etc/dnsmasq.conf
dnsmasq[417]: FAILED to start up
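
For anyone wanting to reproduce this workaround, here is a hedged sketch of adding the line without duplicating it on repeated runs. The default path is the one named in the log above. `add_dnsmasq_blocker` is a hypothetical helper name, and the optional argument exists only so it can be tried against a scratch file first.

```shell
# Append the deliberately invalid "dhcp-range=" line to a dnsmasq add-on
# config unless it is already present.
# $1 (optional): config file to edit; defaults to the path from the log above.
add_dnsmasq_blocker() {
    conf="${1:-/jffs/configs/dnsmasq.conf.add}"
    touch "$conf" || return 1
    # -x matches whole lines only, so other dhcp-range settings are not confused with it
    grep -qx 'dhcp-range=' "$conf" || echo 'dhcp-range=' >> "$conf"
}
```

As the log shows, this makes dnsmasq fail to start at all, so anything else it would have done on the bridge (such as answering DNS queries) is lost as well.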

My questions are:

1. Am I taking totally the wrong approach to this setup?

2. If Q1 = No, Is there a better way to prevent dnsmasq from starting on the media bridge - is there a GUI setting I'm missing or is there a config option in a file somewhere to achieve this?

3. Does the 'service' command on the router ever say anything other than 'Done'? I can give it nonsense and it will say 'Done'. Can I see a list of currently running services? Can I see a list of services that I can manage with the service command?

4. What's the difference between 'service stop_dnsmasq' and 'service disable dnsmasq'? ... I have also seen it written 'service disable_dnsmasq' i.e. with and without the underscore. All 'service' ever tells me is 'Done'.

5. Should I just give up on it and statically configure the IP on each media bridge client? Oddly, when I do this, the media bridge clients in the 'Network Map' --> 'Client List' on the main router show up as having 'Manual' assignment, but if they get their address via DHCP they show up as 'Static'. I thought it would be the other way around, but I'm not particularly concerned. Maybe it's because they are connecting through the media bridge and that is static.

Any help would be greatly appreciated.

Thanks,

James.
 
1. The correct approach would be to make media bridge work properly rather than fudge some sort of workaround.

2. Media bridge shouldn't be running dnsmasq at all!

3. No, service always reports "done" :mad::rolleyes:.

4. There is no "disable" service (this isn't Linux).

5. See answer 1.

Side note: On the media bridge LAN setup page set the DNS to 192.168.1.1. Probably irrelevant to your problem but best to set it to a sensible value (so it can set its own clock at least).

If you haven't already, try doing a factory reset of the media bridge and setting it up again.
 
Colin - thanks for the response.

Q1. Yes, agreed - I was more asking if I'm missing the bigger picture on something or thinking about the network topology incorrectly and what services should be running where.

Q2 and your comment on the factory reset:
When you say "shouldn't be running dnsmasq", do you mean that's our goal or that the software is designed to never start the dnsmasq service when in media bridge mode and if it starts then something is wrong with my configuration?
I did what you said on the factory reset (re-initialize and nvram reset) just to be sure. No change in behavior. This is also the n-th version of firmware on this router where this has been an issue - it's not a new problem.
Running out of ideas in the 'reset/reboot/reinstall' end of things.
I can see the service starting on boot in the log:

Code:
dnsmasq[593]: started, version 2.80test7 cachesize 1500
dnsmasq[593]: asynchronous logging enabled, queue limit is 5 messages
dnsmasq-dhcp[593]: DHCP, IP range 192.168.1.2 -- 192.168.1.254, lease time 1d
dnsmasq[593]: read etc/hosts - 5 addresses
dnsmasq[593]: using nameserver 192.168.1.1#53
dnsmasq[593]: exiting on receipt of SIGTERM
dnsmasq[598]: started, version 2.80test7 cachesize 1500
dnsmasq[598]: asynchronous logging enabled, queue limit is 5 messages
dnsmasq-dhcp[598]: DHCP, IP range 192.168.1.2 -- 192.168.1.254, lease time 1d
dnsmasq[598]: read etc/hosts - 5 addresses
dnsmasq[598]: using nameserver 192.168.1.1#53
RT-AC68P: start httpd:80
dnsmasq-dhcp[598]: DHCPREQUEST(br0) 192.168.1.4 ma:ca:dd:re:ss:01
dnsmasq-dhcp[598]: DHCPACK(br0) 192.168.1.4 ma:ca:dd:re:ss:01 win10pc
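
The line that matters in the log above is the "DHCP, IP range" announcement, which dnsmasq prints only when its DHCP server is configured with a range. A small sketch for pulling that out of a saved log, assuming the message format shown above (`dhcp_ranges_from_log` is a hypothetical name):

```shell
# Print "START END" for every DHCP range a dnsmasq instance announced.
# Reads log text on stdin. Exit status is 0 if at least one range was found
# (the DHCP server part is active) and 1 otherwise.
dhcp_ranges_from_log() {
    sed -n 's/.*dnsmasq-dhcp.*: DHCP, IP range \([0-9.]*\) -- \([0-9.]*\),.*/\1 \2/p' | grep .
}
```

If dnsmasq were running on the bridge purely for DNS or hosts-file purposes, the function would be expected to print nothing and return a non-zero status.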

Re-reading the post that I linked to above, it would appear that I'm having the very same problem as @HariSeldon and several others - dnsmasq starting while the router is in media bridge mode.
I've also noticed that sometimes the main router wins and gets to set the IP in response to the DHCP request and sometimes it's the media bridge.
I also took your advice on the 'DNS Server 1' - as expected, no effect, but probably good to do.
 
IIRC, dnsmasq is started in MB mode to manage the local hosts file for the Linux instance running on the MB router. It shouldn't, however, start its DHCP server. Sounds as if a bug may have been introduced in detecting MB mode (maybe related to the introduction of the AiMesh code?). When I get a chance I'll try to take a look at the code and see if anything obvious jumps out.
 
Hi.

I'll have to preface my answers with the admission that I don't use the media bridge myself, so anything I say is my understanding of the theory rather than the practical realities.

That said there are a couple of fundamental points.

1. A media bridge (aka wireless Ethernet bridge) just connects two Ethernet devices together. That's all - nothing else. So both ends of the bridge are part of the same local network.

2. Given 1 above, there can only ever be one DHCP server on the local network*. If there is more than one DHCP server on the same network it will be entirely random which one the client uses for any particular DHCP lease request. Also, each DHCP server will be unaware of the others or which leases have been given out. This will lead to exactly the problem you are seeing.

So as you can see, for the media bridge to function correctly it should never be running a DHCP server because there will already be one running on the upstream router it is connected to.

Speaking hypothetically, I can only think of four reasons why a DHCP server (dnsmasq) might be running on the media bridge:

1. It's a bug.
2. The default boot process always starts it, but then shuts it down once it's fully booted.
3. dnsmasq runs but the DHCP server component of it is disabled.
4. There is an optional setting that allows you to enable the DHCP server in a situation where there are no other servers on the LAN. This sounds very unlikely.


* Yes, yes. I know that's not absolutely true, but it is in our case.
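
The effect of two servers answering independently can be checked from the logs. Here is a rough sketch, assuming the DHCPACK line layout quoted earlier in the thread and that the logs of both routers are concatenated on stdin. `conflicting_leases` is an invented helper name.

```shell
# List clients (by MAC) that were ACKed for more than one IP address.
# Input: dnsmasq-dhcp log lines from one or more servers on stdin.
# Output: "MAC IP1 IP2 ..." in order of first appearance.
conflicting_leases() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i ~ /^DHCPACK\(/) {
                ip = $(i + 1); mac = $(i + 2)
                if (!((mac, ip) in seen)) {
                    seen[mac, ip] = 1
                    if (mac in ips) {
                        ips[mac] = ips[mac] " " ip
                    } else {
                        order[++n] = mac    # remember first-seen order
                        ips[mac] = ip
                    }
                    count[mac]++
                }
            }
        }
    }
    END {
        for (k = 1; k <= n; k++)
            if (count[order[k]] > 1) print order[k], ips[order[k]]
    }'
}
```

A client that only ever talks to one DHCP server should not show up in the output unless its reservation was changed during the period the logs cover.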
 
The repeater/media bridge runs its own dnsmasq until it connects to the router. Not sure why, but that's the way it is.

You have a few options:

Set the DHCP server on the repeater/media bridge so it issues obnoxiously short lease times, so the devices get their DHCP lease from the router after the connection comes up.

Set the same lease reservations on the repeater/media bridge so the device gets the same IP address regardless of what hands it out.

Set up a cron job to disable dnsmasq on the repeater/media bridge device.

Do a DHCP release/renew after the repeater/media bridge establishes connectivity to the router.

Manually URL surf to the DHCP settings on the bridge/repeater and disable it (will probably re-enable after next restart).
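
As an illustration of the "same lease reservations" option, dnsmasq accepts reservations as `dhcp-host=<MAC>,<IP>[,<hostname>]` lines. The sketch below turns a simple "MAC IP hostname" list into such lines. The helper name is invented, and whether the output belongs in /jffs/configs/dnsmasq.conf.add on the bridge (the file mentioned earlier in the thread) is an assumption to verify on your own firmware.

```shell
# Convert "MAC IP [hostname]" lines on stdin into dnsmasq dhcp-host= entries.
# Blank or incomplete lines are skipped.
reservations_to_dhcp_host() {
    while read -r mac ip name; do
        [ -n "$mac" ] && [ -n "$ip" ] || continue
        if [ -n "$name" ]; then
            echo "dhcp-host=$mac,$ip,$name"
        else
            echo "dhcp-host=$mac,$ip"
        fi
    done
}

# Example (the MAC is the redacted placeholder used in this thread, so it
# would need replacing with a real address):
#   echo 'ma:ca:dd:re:ss:01 192.168.1.4 win10pc' | reservations_to_dhcp_host
```

Note that this only makes the address consistent. A client answered by the bridge could still be handed the bridge as its gateway, which is the other half of the problem described in the first post.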
 
The repeater/media bridge runs its own dnsmasq until it connects to the router. Not sure why, but that's the way it is.
I can imagine that being the case, perhaps as some kind of fail-safe for situations where it can't contact the upstream router. That was what I was alluding to in my previous post. However, my interpretation of what the OP described was that the DHCP server is always running, not just at startup. That would obviously be wrong and should not be happening.

As an experiment I tried out the media bridge on my router running John's fork and at no point in the boot process did dnsmasq get started. So there's definitely been a significant change between the old code base and the current Merlin in that regard.

I very much suspect that this is a bug and hopefully John or @RMerlin might find the time to investigate it. In the meantime an alternative approach for people like @james_c who have a compatible router could be to install John's fork instead as that appears to work properly.
 
I switched to stock on my repeaters when I was having wireless stability problems with the 384 migration and they all exhibit the same behavior with dnsmasq. It becomes quite problematic for devices that have static addresses, like cameras, that connect and get a bogus DHCP address from the repeater/media bridge before the connection to the router comes up. Frankly I was surprised that stock actually supports repeater/media bridge now.

I also found that if you URL surf to the DHCP settings menu, you can disable/enable it on the repeater bridge. Last time I checked though, I don't believe the settings persist between reboots.
 
What's the output from logging into the MB router and running
nvram get dnsqmode

if it returns '2' try
nvram set dnsqmode=1
nvram commit


and reboot the MB
 
Should I get a result? I got no results back for either of my repeaters (although they are now up and connected to the router). This problem only really occurs before the connection to the router is established from the repeater/media bridge.

stock repeater 1
admin@192.168.1.3's password:
admin@RT-AC68R:/tmp/home/root# nvram get dnsqmode
admin@RT-AC68R:/tmp/home/root#

stock repeater 2
admin@192.168.1.5's password:
admin@RT-AC68U:/tmp/home/root# nvram get dnsqmode
admin@RT-AC68U:/tmp/home/root#

hmm,
looks like nothing from my router either running the latest merlin

ASUSWRT-Merlin RT-AC88U 384.7-0 Sun Oct 7 16:43:24 UTC 2018
admin@RT-AC88U-17F0:/tmp/home/root# nvram get dnsqmode
admin@RT-AC88U-17F0:/tmp/home/root# nvram get dnsqmode
admin@RT-AC88U-17F0:/tmp/home/root#
 
should i get a result? i got no results back for either of my repeaters (although they are now up and connected to the router). This problem only really occurs before the connection to the router is established from the repeater/media bridge.
As I mentioned above, your problem doesn't appear to be the same as the OP's. You are also not running in MB mode, which might have a bearing. So it will be interesting to see whether the results from the OP are the same.
 
should i get a result? i got no results back for either of my repeaters
OK...thanks. The code around there is full of conditional compile options and I wasn't sure what was and wasn't being set. It's fine that you don't return a value. Just need to keep looking now.
 
This is pretty fantastic, John. I installed it on the MB only and have gone through several restarts of both routers and several of the client devices and everything seems to be working fine. I actually think this problem has been a huge source of network instability over the past several months even when it appeared like every client had the correct IP (I assume as a result of race conditions when leases expire - random machines losing the connection because they get the wrong gateway).
I have confirmed that dnsmasq does not show anywhere in the startup logs of the MB.
I'll continue to test over the coming days.

This may warrant a new post but I feel like it's related - the NTP time not working on routers in MB mode.
John you offered some help on this post and @HariSeldon had similar problem here.

I'll just give you all the commands that I ran and the problem is self evident.
I realized later that @HariSeldon arrived at basically the same conclusion but used a slightly different way of adding a default route - I could have saved a lot of time by reading his post more carefully.
I can add a user script to solve this but it would be better if the routing table was set up so that this worked by default. I have 192.168.1.1 as the default gw in the UI.
Here is the output (fresh boot of media bridge):

Code:
ASUSWRT-Merlin RT-AC68U 384.7-2test1-g95f534330 Sun Oct 28 19:46:13 UTC 2018

admin@bridge-router:/# date
Sat May  5 01:06:20 DST 2018

admin@bridge-router:/# ps | grep -i ntp
  429 admin     1412 S    grep -i ntp

admin@bridge-router:/# ping pool.ntp.org
PING pool.ntp.org (129.250.35.250): 56 data bytes
ping: sendto: Network is unreachable

admin@bridge-router:/# cat /etc/resolv.conf
nameserver 192.168.1.1

admin@bridge-router:/# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     *               255.255.255.0   U     0      0        0 br0
127.0.0.0       *               255.0.0.0       U     0      0        0 lo

admin@bridge-router:/# nvram show | grep gateway
dhcp1_gateway_x=
lan_gateway=192.168.1.1
wan_gateway=0.0.0.0
ipv6_gateway=
size: 51154 bytes (14382 left)
wan_gateway_x=0.0.0.0
ipv61_gateway=
wan0_xgateway=0.0.0.0
lan1_gateway=192.168.2.1
wan0_gateway=0.0.0.0
wan0_gateway_x=0.0.0.0
dhcp_gateway_x=
wan1_gateway=0.0.0.0
wan1_gateway_x=0.0.0.0

admin@bridge-router:/# ip route add default via 192.168.1.1
admin@bridge-router:/# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     *               255.255.255.0   U     0      0        0 br0
127.0.0.0       *               255.0.0.0       U     0      0        0 lo
default         router.asus.com 0.0.0.0         UG    0      0        0 br0

admin@bridge-router:/# ping pool.ntp.org
PING pool.ntp.org (172.98.193.44): 56 data bytes
64 bytes from 172.98.193.44: seq=0 ttl=49 time=35.664 ms
64 bytes from 172.98.193.44: seq=1 ttl=49 time=38.937 ms
64 bytes from 172.98.193.44: seq=2 ttl=49 time=36.416 ms
64 bytes from 172.98.193.44: seq=3 ttl=49 time=34.947 ms
--- pool.ntp.org ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 34.947/36.491/38.937 ms

admin@bridge-router:/# ntpd -q -p pool.ntp.org
admin@bridge-router:/# date
Mon Oct 29 00:40:17 DST 2018
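
The user-script workaround mentioned above could look roughly like the sketch below. It assumes `ip route show` prints a line beginning with "default" when a default route exists, and that 192.168.1.1 (the main router in this thread) is the gateway to use. `has_default_route` is a hypothetical helper, and which user script it should run from is left open.

```shell
# Read `ip route show` output on stdin; succeed if it contains a default route.
has_default_route() {
    grep -q '^default '
}

# Hypothetical use from a user script on the bridge:
#   ip route show | has_default_route || ip route add default via 192.168.1.1
```

Checking first avoids an error from `ip route add` on boots where the default route is already in place.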
 
@agilani @james_c

I think I found where there was an error in the logic as to when to start dnsmasq. I did a test build for the AC68 if you want to try it.
RT-AC68U_384.7_2test1-g95f534330.zip
https://1drv.ms/f/s!Ainhp1nBLzMJghJlb7j1wnqac97q

@john9527
I loaded this on one of the two repeaters. After the repeater came up, the wireless devices got IP addresses from the repeater instead of the router. I had to reset the wireless interface on the repeater a few times before the devices started getting their IP addresses from the router.
 
The fix should cover repeater mode as well. Do you now see dnsmasq NOT running on the repeater (ps | grep dnsmasq)?
It may have taken a while for the clients to 'forget' their old addresses, or reboot the clients.
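
One thing to watch with `ps | grep dnsmasq` is that the grep process itself can appear in the listing, as the `ps | grep -i ntp` output earlier in the thread shows. A small sketch that filters it out, assuming `ps` output is piped in (`dnsmasq_in_ps` is an invented name):

```shell
# Read `ps` output on stdin; succeed only if a dnsmasq process is listed,
# ignoring any line that belongs to a grep command.
dnsmasq_in_ps() {
    grep -v 'grep' | grep -q 'dnsmasq'
}

# Example on the repeater or bridge:
#   ps | dnsmasq_in_ps && echo "dnsmasq is running" || echo "dnsmasq is not running"
```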
 
This may warrant a new post but I feel like it's related - the NTP time not working on routers in MB mode.
Not sure I got this one right, but can you give
RT-AC68U_384.7_2test2-g95f534330.zip
a try (same download directory)

Also, we need to check I didn't inadvertently break things in router mode. Do you need a different build to test on your main router?
 
I installed RT-AC68U_384.7_2test2-g95f534330 - did not fix the problem with no route back to the main router.
Same solution I detailed above still works on the test2 build, so it's not any worse in that respect.

However, for whatever reason, the test2 version is vastly more responsive when using the UI. Some time ago, I'm not sure with what version of the firmware, the UI on the bridge became really slow and laggy - in bridge mode you could even see it briefly render the pages for the 'router-mode UI' and then redraw them, hiding certain router-mode features, I assume as it finished running the javascript on the page. I've gone through many, many firmware upgrades and nvram resets since the issue first appeared and the UI has always been laggy - this test2 version is drastically more responsive.

With respect to testing on the main router in router mode - that's no problem - I can do it, I just wanted to see this running stable for a while first since there are some critical clients hanging off the main router that I can't take down for an extended period of time.
 
I installed RT-AC68U_384.7_2test2-g95f534330 - did not fix the problem with no route back to the main router.
Hmm...was there any change in the routing table?

Can't think of a reason the change I made would affect the UI....
 
