Internet on Wireless Access Point stops working (Seniors' residence)

Jim_Lafleur

Occasional Visitor
Hi,

I've installed a Wi-Fi setup in a Seniors' residence.

We've had weird problems for a month or more. I'll explain all of them in detail later in this post (see *** below). For now, here is the current issue:

There's an Asus RT-AC3100 in building 3, configured as a router and serving as a Wireless Access Point (WAP). It works for a couple of days and then the internet is no longer available: clients can connect to the Wi-Fi and receive an IP, but have no internet. Yesterday I went there to troubleshoot. The RT-AC3100 is connected to a switch 225 feet away. On the switch, there was no activity on the port the RT-AC3100 connects to. It looked like the port was down. Rebooted the switch: same issue. It looked like the WAN port of the RT-AC3100 was down. Rebooted the RT-AC3100: internet came back... for 5 minutes. The CPU temp was 81°C (but those run at 75°C at idle right after a cold start). I had the Merlin 380.61 firmware installed. I installed the latest stock Asus firmware (2016/08/05, 3.0.0.4.380.3941). It has been working well since yesterday (knock on wood). I don't have much faith in the solution because of everything that happened before (see *** below). (I'm starting to go crazy :eek:, which is why I've decided to post here ;).)

Why did this happen? I mean, it worked for 10 days with the Merlin firmware without any problems.
Note: The RT-AC3100 in building 3 is located on the second floor of the building (the building has a ground floor and a second floor). It is attached to the ceiling upside down (would this matter?).

There are 3 buildings all connected together: the main one and 2 other buildings, constructed later, extending from the main building. There are many WAPs in the main building, but I'll spare the details since the problem is in building 3.

Here's the floor plan:
Building Diagram.jpg



Here's the Network Diagram :
Network Diagram (1).jpg



Here's the Network Diagram with more details :
Network Diagram - Details (2).jpg


*** The whole story: I'm explaining this because it is just too weird. It might be related to my current problem. It's like trying to find a needle in a haystack:

In March I installed a new switch in building 3 (a used Cisco Catalyst). There was already a used Cisco Catalyst in building 2. The 2 switches would talk to each other fine for an hour or so and then the connected ports would go down. I troubleshot it and didn't find the problem. Figured I had bought a bad switch. I replaced the Cisco switch in building 3 with a small, temporary, "dumb" Belkin switch. An Asus RT-AC3200 WAP (with Merlin) was connected to the Belkin switch. All worked well for months. Then the residents of building 3 lost connection to the internet. They could connect to the Wi-Fi and get an IP, but had no internet.

Troubleshooting:
(For troubleshooting purposes, I removed the Belkin switch in building 3 and connected the WAP in building 3 directly to the Cisco switch in building 2, with a patch cable joining the 2 ports on the patch panel of building 3.)

  • Unplugged the cable from the Cisco switch in building 2 : The internet came back... for 10 minutes or so.
  • Rebooted the Cisco switch in building 2 : The internet came back... for 10 minutes or so.
  • Unplugged the cable from the Asus WAP in building 3 : The internet came back... for 10 minutes or so.
  • Rebooted the Asus WAP : The internet came back... for 10 minutes or so.
  • Installed the latest Merlin firmware on the Asus WAP: The internet came back... for 1 day or so.
  • Replaced the RT-AC3200 with a brand new RT-AC3100 (with Merlin). : The internet came back... for 1 day or so.
  • Replaced the Cisco switch in building 2 with a TP-Link switch: The internet came back... for 10 days.

Then I did what I explained at the start of the post: changed the firmware from Merlin to the stock Asus firmware. We'll see whether it helps.

Right now there are only 3 things connected to the TP-Link switch:
  • The main router (internet),
  • the WAP in building 2 and
  • the WAP in building 3.
The RT-AC3200 WAP in building 2 has been running strong for 2 years (CPU at 63°C). Why is almost the same WAP in building 3 having so much trouble?
If you see this, it means you read the whole thing! (Or you skipped to the bottom of the post ;).) Thanks! Any help would be really appreciated.
 
If your diagrams are correct, then it seems like having the identical LAN IP for both WAP's is the issue?
 
It shouldn't matter. The WAPs are set up as routers. The main network feeding internet is 198.168.7.1. It is connected to the WAN port of each WAP. Each WAP then hands out its own local IPs via DHCP to its clients. Clients of the WAP in building 3 cannot "speak" with clients of the WAP in building 2. That's OK. They don't need to.
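
To illustrate the point with a rough sketch: when each WAP does its own NAT, two WAPs can reuse the same LAN subnet, and the thing to watch for is a LAN subnet that overlaps the upstream network on the WAN side. The snippet below checks for that with Python's standard `ipaddress` module. The subnet values and the names `nat_conflicts`, `wap_building_2` and `wap_building_3` are assumptions for illustration only; the real addressing is in the diagrams.

```python
import ipaddress


def nat_conflicts(wan_net: str, lan_nets: dict) -> list:
    """Return the names of routers whose LAN subnet overlaps the upstream (WAN-side) network."""
    wan = ipaddress.ip_network(wan_net)
    return [
        name
        for name, lan in lan_nets.items()
        if ipaddress.ip_network(lan).overlaps(wan)
    ]


# Assumed values: upstream network as written in the post, LAN subnets hypothetical.
upstream = "198.168.7.0/24"
waps = {
    "wap_building_2": "192.168.1.0/24",
    "wap_building_3": "192.168.1.0/24",  # same LAN subnet as building 2
}

# Identical LAN subnets behind separate NAT routers do not overlap the upstream network.
print(nat_conflicts(upstream, waps))  # []

# A LAN subnet equal to the upstream network would be flagged.
print(nat_conflicts(upstream, {"wap_x": "198.168.7.0/24"}))  # ['wap_x']
```

If a WAP's LAN subnet did match the upstream network, clients behind it could get an IP and still have no working route out, so it is worth ruling out even though it is probably not the cause here.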
 
Are you able to try different cables then? If not, can you switch the two WAP's around to verify that it is the WAP or the cable(s) to blame?
 
Trying different cables: not really. There's only one cable going from building 2 to building 3, and only one cable going to the WAP in building 3. But I can swap the WAPs. Good idea.
Do you think it matters that the WAP is upside down?
 
I don't think the orientation matters except if it interferes with the cooling somehow. Put a fan on it (either directly attached or a larger, floor standing model) to see if that (heat) makes any difference.
 
If your switches are managed, try disabling any "security" feature on the ports used by the WAP as well as Green Ethernet support. Occasionally, my own managed Netgear switch would disable the port on which my router is connected following a router reboot. The only way I can fix it is by rebooting the switch. I always thought it was a quirk in this low-end managed switch, but it could be something else triggering an advanced security feature in it.
 
Are you able to try different cables then? If not, can you switch the two WAP's around to verify that it is the WAP or the cable(s) to blame?
Could it really be cables? Could cables work fine for 10 days, then suddenly stop working, and then work fine again for x days? With cables, wouldn't it be more like "it just works or it just doesn't work"?
 
If your switches are managed, try disabling any "security" feature on the ports used by the WAP as well as Green Ethernet support. Occasionally, my own managed Netgear switch would disable the port on which my router is connected following a router reboot. The only way I can fix it is by rebooting the switch. I always thought it was a quirk in this low-end managed switch, but it could be something else triggering an advanced security feature in it.
Could be that. The switch is a TP-Link TL-SL2452. It's a smart switch. When I installed that switch I didn't fiddle with any security settings, though. It probably behaves much like a dumb switch right now. Here's an emulator which shows practically what I have to play with: http://static.tp-link.com/resources/simulator/TL-SG2424/Index.htm. The only difference is that my switch has 48 100Mbps ports, while the one in the emulator has 24 Gigabit ports.
 
I don't think the orientation matters except if it interferes with the cooling somehow. Put a fan on it (either directly attached or a larger, floor standing model) to see if that (heat) makes any difference.
Are the chips on the top or on the bottom of the RT-AC3100? Is there a way to check the temperature with the Asus firmware? I could with Merlin, but now that it's back to stock, I don't see any option for that. CLI maybe?
 
If your diagrams are correct, then it seems like having the identical LAN IP for both WAP's is the issue?

It's more than just that - time and materials, plus per diem for meals/lodging/incidentals and airfare/car rental, I can fix it...

Materials would include putting in some business grade AP's and a new Gateway - consumer gear is not up to this level of service.

Perhaps it's a good chance for OP to take stock, and hire a professional in the area - one that has experience with multiple dwelling units (lodging/hospitality)...
 
At the beginning I proposed that the owner install Ubiquiti UniFi APs. He didn't want to. He just said: "Install the most expensive router you can find. That's what I want". He just wants the residents to be able to go on the internet with their tablets. The employees don't need the Wi-Fi for their work.

Right now I'd just like for the system to be stable. All the WAPs are stable. It's just one WAP that's not. Did a lot of troubleshooting. Came here cause I was (almost) out of ideas.
 
I would counter with putting the UniFi APs in, and perhaps the EdgeRouter Lite - the existing switches are fine as long as they test out good... with the APs, you're likely good with AC1200/N900 class considering the user story...

Don't get me wrong - but the Asus Router/AP's are not the right application here...

Having a simple managed platform will lead to a better customer experience in a medium scale deployment like this one...
 
If you have to stay with the current equipment:

I would set up a scheduled reboot (a built-in feature in stock/Merlin FW) in the middle of the night (e.g. 3:30 AM) every seven days on all three ASUS router/WAPs, perhaps each on a different day. I bet your problem will disappear.

The stock FW and its derivatives aren't meant for this type of application.
 
It was, and still is, set to reboot every day at 4:00 AM.
 
Right move, and you were ahead of us! I guess it isn't suitable to run experiments in your production environment. Otherwise you could swap the two ASUS router/WAPs and see whether the problem stays at the same spot or follows the same router...

I would probably take a well-configured Asus/Netgear, tested on your workbench, and put it in place of the "bad" guy for a week or two. See if this helps isolate the issue to the router itself or to the infrastructure. At least there's no service interruption this way.
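
As a rough sketch of how to read the result of such a swap (the function name `interpret_swap` and the location labels are made up for illustration, not anything from the routers or the switch):

```python
from typing import Optional


def interpret_swap(original_bad_location: str, bad_location_after_swap: Optional[str]) -> str:
    """Classify a swap test.

    original_bad_location: where the outages happened before the two units were swapped.
    bad_location_after_swap: where outages happen after the swap, or None if
    nothing has failed yet during the observation period.
    """
    if bad_location_after_swap is None:
        # No failure observed yet: keep watching, the fault is intermittent.
        return "inconclusive"
    if bad_location_after_swap == original_bad_location:
        # The fault stayed put: suspect the cable run, switch port or environment.
        return "infrastructure"
    # The fault moved with the unit: suspect the router/WAP itself.
    return "device"


print(interpret_swap("building_3", "building_3"))  # infrastructure
print(interpret_swap("building_3", "building_2"))  # device
print(interpret_swap("building_3", None))          # inconclusive
```

Since the outages so far came anywhere from minutes to 10 days apart, the observation period after a swap should be at least as long as the longest good stretch before calling it.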
 
Just a quick tip - take a look at the switch stats - some of those runs are right at the limit of what one should be doing with CAT5/CAT6 cables - in theory one can go to 100 m on a single run, but that's assuming that the cable meets spec very closely (many don't, but on short runs, it's less of a problem).

Anyways, a managed switch should be able to get some decent port statistics and metrics, and chase it from there...
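
For a quick sanity check on the numbers, here is a minimal sketch that converts a run length in feet to meters and compares it with the 100 m limit mentioned above. The 225 ft figure is the switch-to-WAP distance given in the first post; other runs in the diagrams are not covered, and the name `run_margin_m` is just for illustration.

```python
FEET_TO_METERS = 0.3048
MAX_RUN_M = 100.0  # the 100 m single-run limit for twisted-pair Ethernet


def run_margin_m(length_ft: float) -> float:
    """Return how many meters of headroom a run has below the 100 m limit."""
    return MAX_RUN_M - length_ft * FEET_TO_METERS


# 225 ft is about 68.58 m, which leaves roughly 31.42 m of margin.
print(round(run_margin_m(225), 2))  # 31.42
```

That margin only holds if the 225 ft is the full run; patch cables and any pass-through at a patch panel add to the total, and a marginal cable can still misbehave well under the limit.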
 
And another hint/tip - the 87U (which in the drawings is the primary GW) - try to stay away from LAN1, as this is attached to the Quantenna chipset - LAN2-4 are on the Broadcom SoC...
 
Port 46 is where the router is connected. Port 48 is the other WAP (RT-AC3200), the one which is working well. Port 50 is the WAP we're having problems with.
2016-09-22 13_10_35-TL-SL2452.png


Here are the stats (For the last 2 days - 4 hours):

For port 46 (Router) :
2016-09-22 13_18_51-TL-SL2452.png


Port 48 (the other WAP; RT-AC3200; the one which is working well):
2016-09-22 13_17_46-TL-SL2452.png


Port 50 : (the WAP we're having problems with)
2016-09-22 13_18_04-TL-SL2452.png
 
The thing that leaps out at me from those pictures is the fact that the problem device is the only one plugged into a gigabit port. It also has the longest cable run. Everything else is 10/100. Try moving the cable from port 50 to port 44.
 