
ROG GT-BE19000_AI Router Five-Month Follow-Up Experience: The Suffocating Truth About Its "Local LLM"

Rabbit-Spec


It's been almost five months since I got my hands on the ROG GT-BE19000_AI, and today I bring you some good news and some extremely hardcore bad news. The good news is that ASUS has finally delivered on its NPU promises to some extent; the bad news is that after some low-level reverse engineering of the router's highly-touted "local generative AI model," I discovered a jaw-dropping truth.

🟢 Frigate Finally Gets NPU Support

In my previous post, I mentioned that even the built-in Frigate container on the ASUS AI Board couldn't utilize the NPU for object detection. However, with the latest Frigate v0.17.0 update, this issue has finally been fixed! Frigate has now successfully bridged the container with the SyNAP hardware layer. Object detection computations for video streams can finally be offloaded to this 7.9 TOPS NPU, resulting in a massive drop in CPU load. This proves that the SL1680's NPU isn't a dud, and ASUS engineers are indeed pushing forward with low-level driver integration.
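For anyone wanting to try it, the wiring looks like a standard Frigate detector block. To be clear, this is a sketch of what the config could look like, not pulled from the ASUS container — the `synap` detector key and the model path below are my assumptions:

```yaml
detectors:
  npu0:
    type: synap            # hypothetical detector key for the SyNAP bridge
model:
  path: /synap/models/object_detection.synap   # hypothetical path on the AI Board
```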

But this is strictly limited to the realm of Computer Vision (CV). When we shift our focus to Large Language Models (LLMs), things take a sharp nosedive.


🔴 The Cruel Reality of the Official "Local AI Assistant"

Many enthusiasts bought this router specifically to run local LLMs (like Qwen or Llama). Since it comes with an official AI assistant container named slm-cn-asus, does that mean ASUS has cracked the code for running LLMs on the NPU?


To figure this out, I dove straight into the official container for a thorough low-level inspection, and the findings are suffocating.

1. There is NO generative LLM running locally! I traversed the entire container's file system and found only one single model file under the /llm/ai_assistant/models directory: Qwen3-Embedding-0.6B-Q8_0.gguf. Anyone who knows their stuff will immediately spot the issue: this is an Embedding model. Its sole purpose is to convert text into vectors. It has absolutely zero capability for chat or text generation!

2. The so-called "Local AI" is actually a "Cloud Frankenstein." By analyzing the Python source code (ai_agent.py) and dependencies inside the container, the true workflow of the official AI assistant surfaced:

  • Local RAG Retrieval: When you ask a question, it first uses that local 0.6B Embedding model to match relevant paragraphs within the ASUS router manual's knowledge base (a .npy vector file).
  • Cloud-based Conversation Generation: Then, it packages the matched manual content along with your question and sends them via API to Alibaba Cloud's Qwen servers, letting the cloud generate the final response.
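Piecing those two steps together, the retrieval half of the agent can be sketched in a few lines of NumPy. This is my reconstruction of the workflow, not ASUS's code — the function name and the cloud call described in the comments are illustrative only:

```python
import numpy as np

def retrieve_top_k(query_vec, doc_matrix, k=3):
    """Cosine-similarity search over a precomputed embedding matrix (the .npy file)."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = d @ q                       # cosine similarity of query vs. every manual chunk
    return np.argsort(scores)[::-1][:k]  # indices of the k best-matching chunks

# In the real container, query_vec comes from the local 0.6B Embedding model,
# and the manual chunks at these indices are then sent to the cloud Qwen API
# together with the user's question — the "generation" never happens on-device.
```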

3. The biggest irony: Even this 0.6B model isn't using the NPU! A quick look at its requirements.txt reveals that the official container calls the model using the generic llama-cpp-python library. This means ASUS didn't even bother to write proprietary SyNAP drivers for this tiny 0.6B model. It's still relying purely on brute-force computation from the SL1680's quad-core A73 CPU!
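For scale, here's a quick back-of-envelope on why brute-forcing this particular model on the A73 cores is even viable (the 0.6B parameter count is from the filename; the Q8_0 layout is the standard GGUF format):

```python
# GGUF Q8_0 stores 8-bit weights plus one fp16 scale per 32-weight block,
# i.e. 8 + 16/32 = 8.5 bits per weight.
PARAMS = 0.6e9
BITS_PER_WEIGHT = 8 + 16 / 32
size_gb = PARAMS * BITS_PER_WEIGHT / 8 / 1e9
print(f"~{size_gb:.2f} GB of weights")  # ~0.64 GB: small enough to churn through on CPU
```

A ~0.64 GB embedding model fits comfortably in the 4 GB of RAM, which is exactly why ASUS could get away with plain CPU inference here.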

🧱 Is Even the Chipmaker (Synaptics) Powerless?

Why didn't ASUS use the NPU? Just to save time? To find the root cause, I dug deep into the chipmaker Synaptics' official AI demo repository on GitHub (synaptics-synap/examples).

In the official LLM deployment guide for the SL16xx platform, the standard solution provided by Synaptics is literally just installing a pre-compiled llama-cpp-python! This implies that even the chipmaker itself, when faced with generative LLMs, has to bite the bullet and hard-compute using the CPU, completely unable to tap into that 7.9 TOPS NPU.

Fundamentally, this is an inherent flaw in the hardware's low-level architecture: This NPU is intrinsically designed for traditional CNNs (Convolutional Neural Networks, e.g., image recognition, noise cancellation). Modern LLMs rely on the Transformer architecture, which involves complex KV Cache dynamic memory management and core operators like FlashAttention—things this NPU simply does not support at the circuit level. Add to that the limited 4GB LPDDR4x memory bandwidth, and it’s destined to be just a bystander in the era of generative AI.
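To put the bandwidth point in numbers: at decode time an LLM must stream essentially its entire weight set for every generated token, so memory bandwidth alone caps throughput before compute even enters the picture. The bandwidth and model-size figures below are my own illustrative assumptions, not measured values:

```python
# Hypothetical figures: ~17 GB/s theoretical LPDDR4x bandwidth,
# and a 4B-class chat model quantized down to ~2.3 GB of weights.
BANDWIDTH_GBPS = 17.0
MODEL_WEIGHTS_GB = 2.3
ceiling = BANDWIDTH_GBPS / MODEL_WEIGHTS_GB  # each token reads the full weight set
print(f"<= {ceiling:.1f} tokens/s, before any compute bottleneck")
```

Even under those generous assumptions the theoretical ceiling is single-digit tokens per second, and a quad-core A73 would land well below it in practice.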




🌅 Finally Facing Reality

What's really interesting is that while we early-adopter geeks were stubbornly battling the SL1680, Synaptics had already officially recognized this fatal flaw. In late 2025, Synaptics finally released the next-generation SL2610 processor.

Compared to the SL1680, which is severely "unbalanced" when handling LLMs, the new SL2610 features a targeted architectural redesign of its NPU. It natively patches the operator support required by the Transformer architecture and optimizes the memory subsystem.

This indirectly validates our previous low-level deductions: the SL1680's NPU truly cannot run generative AI. The manufacturer knew this full well, which is exactly why they rolled out the SL2610 to definitively fix the issue. But for gamers who already paid hard cash for the ROG GT-BE19000_AI, this is undoubtedly a backstab. Our flagship router just became a "martyr" in the AI hardware exploration phase.


🤔 Speculation Time: Why This Obscure Chip?

Did the product managers get kickbacks from Synaptics? I just couldn't understand why they would pass up the RK3588 with its rich, mature open-source ecosystem, and opt for such an obscure chip instead.

Well, it might be because of this...

Back in 2020, Synaptics completely acquired Broadcom's IoT wireless division and assets.

In an embedded Linux system like a router, where absolute stability is paramount, hooking up a third-party co-processor board (like the RK3588) to the main Broadcom CPU via PCIe/USB is essentially "driver hell." Broadcom's SDK is notoriously closed-off and a nightmare to work with.

It's highly likely that the SL1680's low-level bus interface definitions, memory mapping logic, and even the coding habits of its engineers still carry the heavy DNA of the "ex-Broadcom IoT division." ASUS engineers probably realized that pairing the SL1680 alongside the BCM4916 results in buttery-smooth API handshakes at the lowest level, requiring almost zero from-scratch bridge drivers. If they had forced an RK3588 in there, ASUS would probably have had to hire dozens of low-level driver devs just to fix daily Kernel Panics.

Any resemblance to actual corporate decisions is purely coincidental. 😉


📝 Conclusion

The network performance of the ROG GT-BE19000_AI is indeed T0 right now, and the Frigate NPU fix has salvaged some face for the "AI Board." For LLM enthusiasts, however, the closed chip ecosystem means it will never be a capable edge LLM host.
 