RT-AC68P in Media Bridge Mode - DHCP/dnsmasq problems

I'll try test3 this evening when I get home.
I logged in remotely via ssh and rebooted the MB and when it came back up I could see the default route in the table.
When I first installed the test2 firmware and did a power off/power on cycle, the routing table did not have the default gw in there .... 100% certain of that ... so it's very odd.
But I'll let you know as soon as I install test3.
 
The fix should cover repeater mode as well. Do you now see dnsmasq NOT running on the repeater (ps | grep dnsmasq)?
It may have taken a while for the clients to 'forget' their old addresses, or reboot the clients.

admin@RT-AC68R-F2B0:/tmp/home/root# ps | grep dnsmasq
3245 admin 1412 S grep dnsmasq
admin@RT-AC68R-F2B0:/tmp/home/root#
admin@RT-AC68R-F2B0:/tmp/home/root#

Looks like it's still running. I'm still on the test1 build. I checked nvram and there are no dnsmasq settings.
 
No, it's not running.....it found your grep command not a running dnsmasq :) So looks like it's working for repeater mode as well.
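
The mix-up above is a common one: `ps | grep dnsmasq` always lists the grep process itself. A minimal sketch of a check that avoids it, assuming a `ps` listing like the one shown above; the helper name `proc_lines` is made up for illustration:

```shell
# proc_lines NAME: read `ps` output on stdin and print only the lines that
# mention NAME, dropping the line for the grep command itself.
proc_lines() {
    grep -- "$1" | grep -v 'grep'
}

# Example use on the router (not executed here):
#   ps | proc_lines dnsmasq || echo "dnsmasq is not running"
# A common one-liner with the same effect is: ps | grep '[d]nsmasq'
```

The function returns a non-zero status when nothing is left after filtering, so it can be used directly in an `if` or with `||`.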
 
lol,
I was wondering why it didn't show in ps.

Interestingly, after a few reboots, it looks like the cameras are effectively getting their dhcp from the router now without the need to bounce the wireless interface a few times. I may do a clean wipe just to be sure.

John, your regular builds are all on the 380 or earlier code base, right? Do we need these changes baked into the Merlin builds?

If not, I guess it's not the end of the world running out-of-date code on the repeaters.
 
OK, confirmed test3 is working: no running dnsmasq service, and ntp is running and can communicate.

I upgraded the firmware, ssh'ed in after it booted up, and routing table was:

Code:
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     *               255.255.255.0   U     0      0        0 br0
127.0.0.0       *               255.0.0.0       U     0      0        0 lo
default         192.168.1.1     0.0.0.0         UG    0      0        0 br0

... then I deleted the default route to make sure it wasn't a hold over from my prior manual settings, confirmed it was gone, hard reboot, and the routing table became this:

Code:
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     *               255.255.255.0   U     0      0        0 br0
127.0.0.0       *               255.0.0.0       U     0      0        0 lo
default         router.asus.com 0.0.0.0         UG    0      0        0 br0

... as you can see 192.168.1.1 became router.asus.com, which works, but I thought it wouldn't because:

Code:
admin@bridge-router:/# nslookup router.asus.com


Server:    192.168.1.1
Address 1: 192.168.1.1 router.asus.com

Name:      router.asus.com
Address 1: 192.168.1.2 router.asus.com

I deleted the default route again and added one back, this time explicitly pointing to 192.168.1.2 ... that does not work ... pings just never come back ... but I get 100% packet loss as distinct from 'network unreachable' when there is no default route in there.
I deleted that bad route and added back a new default with 192.168.1.1 and that works fine.
In both these test cases the route table shows router.asus.com in the output, whether it's .1.1 or .1.2 that's really being used.
 
You are running across a quirk in the implementation, where the local hosts file on each router is populated with that router's IP and a name of router.asus.com. So on the parent router, router.asus.com is 192.168.1.1; on the media bridge it's 192.168.1.2. The routing entry is really pointing to the correct parent router address.

I've thought about not populating that hosts entry in non-router modes, but am afraid there may be dependencies on that name elsewhere in the code.
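
Because the name in the Gateway column comes from a lookup done when the table is printed, `route -n` (numeric output) sidesteps the quirk and shows the address actually in use. A small sketch that pulls the default gateway out of `route -n` output; `default_gw` is an illustrative helper name, and the column layout is assumed to match the tables shown earlier:

```shell
# default_gw: read `route -n` output on stdin and print the gateway of the
# default route (destination 0.0.0.0 with the G flag set).
default_gw() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2; exit }'
}

# Example use on the router (not executed here):
#   route -n | default_gw
```

With numeric output the default route is listed with destination `0.0.0.0` rather than `default`, which is why the filter matches on that value.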
 
I use the following hosts.postconf script, which inserts the user-specified host name at the beginning of the hosts file. That way the nslookup output doesn't always respond with "router.asus.com".

Code:
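# cat hosts.postconf
#!/bin/sh
# $1 is the path of the generated hosts file
CONFIG=$1
source /usr/sbin/helper.sh

# Build the entry from the router's own LAN address, name and domain
addr="$(nvram get lan_ipaddr)"
name="$(nvram get computer_name)"
domain="$(nvram get lan_domain)"
# Place the user-defined name near the top of the hosts file, next to localhost
pc_insert "localhost" "${addr} ${name}.${domain} ${name}" "$CONFIG"
pc_insert "localhost" "# Following entry inserted by hosts.postconf" "$CONFIG"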
 
@ColinTaylor
No problems with the reorder? The user defined router name is already included, so just a re-order may be the right answer.
 
Not that I've noticed. I've only been running like that for about 6 weeks and TBH I'd forgotten about it.

Then again I can't say that I've "tested" it much as I usually log into the router using its IP address rather than "router.asus.com".
 
@ColinTaylor
I made the hosts order change on my fork and will give it a workout. Thanks for the idea.

@james_c
It seems test3 is working as expected. Please post back when you are comfortable with the changes and have also tried it on the parent router. Then I'll package up the changes to send to @RMerlin
Thanks for helping test.
 
@ColinTaylor - thanks for the script ... and your earlier advice.

@john9527 - I've been hammering on it tonight across various different test cases - I think this is going to solve a ton of problems for me so thanks very much for your efforts. I'll give it a few days to check stability and deploy to the main router in regular router mode. Will report back.
 
@john9527 Just out of interest is it possible for you to summarise (in very general terms) what the problem was? i.e. Just curious whether it was a fundamental design flaw or just a minor bug in the code.
 
The latter...

There are a couple of other state checks involved, but basically there was a test that was checking if the router was configured as an access point, AND as a repeater, AND as a media bridge (obviously not likely :) ) instead of just one of those modes.

It wasn't immediately obvious, because the test pieces were buried in a bunch of conditional compile statements.
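
In shell terms, the error had roughly the following shape. This is only an illustration of the logic described above, not the firmware source; the function names and the 0/1 flag arguments are hypothetical:

```shell
# Arguments for both functions: is_ap is_repeater is_media_bridge
# (1 = device is in that mode, 0 = it is not).

# Buggy form: succeeds only if the device is in all three modes at once,
# which never happens in practice.
non_router_mode_buggy() {
    [ "$1" -eq 1 ] && [ "$2" -eq 1 ] && [ "$3" -eq 1 ]
}

# Intended form: succeeds if the device is in any one of the three modes.
non_router_mode_fixed() {
    [ "$1" -eq 1 ] || [ "$2" -eq 1 ] || [ "$3" -eq 1 ]
}
```

For a media bridge (`0 0 1`), the buggy check fails while the intended one succeeds, which matches the symptom of the non-router handling never being applied.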
 
Sounds like something that should be fixed upstream by asus. :)
 
So I've had some stability issues ... but I doubt it's anything to do with @john9527's changes.

I use 3 different monitors on uptimerobot.com to connect every 5 minutes to the following:
  1. the ssh port (dropbear) on the main router
  2. the ssh port on the bridge which is forwarded through the main router
  3. an HTTP server that sits behind the bridge, forwarded through the main router
I've experienced outages where uptimerobot can't connect to the bridge or the webserver. Sometimes these are temporary and sometimes they need a reboot of the bridge.
During one of these outages, from machines wired directly to the bridge I can access the internet and get responses. I can also get connections from any machine behind the bridge to any other.
I just can't initiate anything from the main router side wirelessly into the bridge or the devices connected to it.
Below is what I see in the logs during one of these events.
In this case, according to uptimerobot, the bridge was unreachable for about 25 minutes between ~5:00AM and ~5:24AM
The dropbear lines in the log are uptimerobot successfully opening a connection to that port.
I don't have any issues on the main router .... that's running merlin 384.7_2 ... there is nothing interesting in the logs at the corresponding time, just successful connections from uptimerobot to the main router.

Code:
// Many lines like this next pair ...

Nov  1 04:56:45 dropbear[1608]: Child connection from 63.143.42.247:3769
Nov  1 04:56:45 dropbear[1608]: Exit before auth: Exited normally

// Here is where the problem starts ...

Nov  1 04:58:51 rc_service: psta_monitor 266:notify_rc restart_wlcmode 0
Nov  1 04:58:51 FTP_Server: daemon is stopped
Nov  1 04:58:51 Samba_Server: smb daemon is stopped
Nov  1 04:58:51 kernel: gro disabled
Nov  1 04:58:52 dropbear[497]: Exit (admin): Terminated by signal
Nov  1 04:58:52 dropbear[449]: Early exit: Terminated by signal
Nov  1 04:58:52 kernel: br0: port 3(eth2) entering forwarding state
Nov  1 04:58:52 kernel: br0: port 2(eth1) entering forwarding state
Nov  1 04:58:52 kernel: br0: port 1(vlan1) entering forwarding state
Nov  1 04:59:00 kernel: br0: port 3(eth2) entering forwarding state
Nov  1 04:59:00 kernel: br0: port 3(eth2) entering forwarding state
Nov  1 04:59:00 kernel: br0: port 2(eth1) entering forwarding state
Nov  1 04:59:00 kernel: br0: port 2(eth1) entering forwarding state
Nov  1 04:59:00 kernel: br0: port 1(vlan1) entering forwarding state
Nov  1 04:59:00 kernel: br0: port 1(vlan1) entering forwarding state
Nov  1 04:59:00 RT-AC68U: start httpd:80
Nov  1 04:59:00 dropbear[1637]: Running in background
Nov  1 04:59:16 rc_service: psta_monitor 266:notify_rc restart_wlcmode 1
Nov  1 04:59:16 FTP_Server: daemon is stopped
Nov  1 04:59:16 Samba_Server: smb daemon is stopped
Nov  1 04:59:16 kernel: gro disabled
Nov  1 04:59:16 dropbear[1637]: Early exit: Terminated by signal
Nov  1 04:59:16 kernel: br0: port 3(eth2) entering forwarding state
Nov  1 04:59:16 kernel: br0: port 2(eth1) entering forwarding state
Nov  1 04:59:16 kernel: br0: port 1(vlan1) entering forwarding state
Nov  1 04:59:20 watchdog: restart httpd
Nov  1 04:59:20 rc_service: watchdog 264:notify_rc stop_httpd
Nov  1 04:59:20 rc_service: waitting "restart_wlcmode 1" via psta_monitor ...
Nov  1 04:59:24 kernel: br0: port 3(eth2) entering forwarding state
Nov  1 04:59:24 kernel: br0: port 3(eth2) entering forwarding state
Nov  1 04:59:24 kernel: br0: port 2(eth1) entering forwarding state
Nov  1 04:59:24 kernel: br0: port 2(eth1) entering forwarding state
Nov  1 04:59:24 kernel: br0: port 1(vlan1) entering forwarding state
Nov  1 04:59:24 kernel: br0: port 1(vlan1) entering forwarding state
Nov  1 04:59:29 RT-AC68U: start httpd:80
Nov  1 04:59:29 dropbear[1667]: Running in background
Nov  1 04:59:31 rc_service: watchdog 264:notify_rc start_httpd
Nov  1 04:59:31 RT-AC68U: start httpd:80

// nothing in the logs for the next 20 mins ... until the first successful reconnect from uptimerobot ...

Nov  1 05:24:28 dropbear[1748]: Child connection from 63.143.42.247:3880
Nov  1 05:24:28 dropbear[1748]: Exit before auth: Exited normally
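
Outage windows like the one above can be pulled out of a saved log automatically. A rough sketch, assuming the syslog line format shown above, that all lines fall on the same day, and that each monitor probe produces a dropbear "Child connection" line; the function name `find_gaps` is made up for illustration:

```shell
# find_gaps MIN_SECONDS: read syslog lines on stdin and print every pair of
# consecutive dropbear "Child connection" lines that are at least
# MIN_SECONDS apart. Only the HH:MM:SS field is compared, so the input is
# assumed not to cross midnight.
find_gaps() {
    awk -v min="$1" '
        /dropbear\[[0-9]+\]: Child connection/ {
            split($3, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (seen && now - prev >= min)
                printf "%s -> %s (%d s)\n", prev_ts, $3, now - prev
            prev = now
            prev_ts = $3
            seen = 1
        }'
}

# Example use (not executed here), flagging gaps of 10 minutes or more:
#   find_gaps 600 < /tmp/syslog.log
```

With probes arriving every 5 minutes, a threshold of about twice that interval separates normal spacing from a real outage.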
 
I'm struggling with the same issues on Merlin 384.13. @john9527 and @RMerlin, any chance this conditional logic check can be corrected in your respective current builds?

Thank you very much for both of your efforts.
 
