Overheating protection

ryzhov_al

Very Senior Member
The opened CFE source code gave me a chance to understand how thermal protection works on the RT-N66U:
$ cat ./asuswrt/release/src-rt-6.x/cfe/build/broadcom/bcm947xx/compressed/rt-ac66u_nvram.txt
...
# Chip temperature polling period, range 1-14, in units of seconds, 0 means driver decides the value, 15 is reserved
pci/1/1/temps_period=5
# Temperature threshold above which the chip switches to a single TX chain to prevent damage from overheating
pci/1/1/tempthresh=120
# Temperature hysteresis, when the chip temperature falls below (tempthresh - temps_hysteresis), 2-chain TX is re-enabled
# range 1-14, in units of degrees C. 0 means driver decides the value, 15 is reserved
pci/1/1/temps_hysteresis=5
...
In other words, when the Wi-Fi module heats up to tempthresh "degrees", Wi-Fi transmission degrades from 3Tx mode to 1Tx mode. When the temperature drops by temps_hysteresis "degrees", the full-speed 3-chain mode is turned on again. The temperature is checked every temps_period seconds.
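
The rule above can be sketched as a tiny state machine. This is only a hypothetical model of the behaviour described in the nvram comments, not the Broadcom driver code: the function name next_chains, the sample readings, and the choice of "at or above the threshold" as the trigger are my assumptions, and I assume the hysteresis is in the same raw units as the threshold.

```shell
#!/bin/sh
# Hypothetical model of the throttling rule; not actual driver code.
TEMPTHRESH=120   # raw threshold (factory value from the nvram file)
HYSTERESIS=5     # hysteresis, assumed to be in the same raw units
chains=3         # start in full 3-chain TX mode

# next_chains RAW: update $chains for one temperature sample.
next_chains() {
    raw=$1
    if [ "$raw" -ge "$TEMPTHRESH" ]; then
        chains=1    # assumed trigger: at or above the threshold
    elif [ "$raw" -lt $(( TEMPTHRESH - HYSTERESIS )) ]; then
        chains=3    # cooled below (tempthresh - temps_hysteresis)
    fi              # in between: keep the current mode
}

# Made-up sample readings, one per polling period.
for raw in 110 119 120 118 116 114 121; do
    next_chains "$raw"
    echo "raw=$raw chains=$chains"
done
```

With these sample readings the model stays in 1-chain mode at 118 and 116, and only returns to 3 chains at 114, which is the point of having a hysteresis band.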

The word "degrees" is in quotes because these are raw values used by the Wi-Fi module. You can check the current temperature with the wl -i eth1 phy_tempsense command for the 2.4 GHz module or wl -i eth2 phy_tempsense for the 5 GHz module. To convert these raw values into degrees Celsius, use this formula, taken from the web interface code:
temperature in °C = raw value / 2 + 20
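
As a quick worked example of the formula, here is a small sketch with two helper functions. The names raw_to_celsius and celsius_to_raw are hypothetical (they are not part of wl or the firmware); the inverse is simply the same formula solved for the raw value, using integer arithmetic.

```shell
#!/bin/sh
# raw_to_celsius RAW: apply the formula C = raw / 2 + 20 (integer division).
raw_to_celsius() {
    echo $(( $1 / 2 + 20 ))
}

# celsius_to_raw C: inverse of the formula, raw = (C - 20) * 2.
celsius_to_raw() {
    echo $(( ($1 - 20) * 2 ))
}

raw_to_celsius 120   # factory tempthresh, prints 80
celsius_to_raw 48    # raw value for a 48 degree threshold, prints 56
```

The second helper is handy when picking a tempthresh value for nvram from a target temperature in °C.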

To check that thermal throttling really works, I set my own thresholds (raw 56, which is 48°C; the example below monitors the 2.4 GHz module):
$ nvram set pci/2/1/tempthresh=56
$ nvram set pci/1/1/tempthresh=56
$ nvram set pci/1/1/temps_hysteresis=1
$ nvram set pci/2/1/temps_hysteresis=1
$ nvram commit && reboot
ran a script:
$ cat /opt/usr/sbin/wl_heartbeat.sh
#!/opt/bin/bash

# Print one sample: time, 2.4 GHz chip temperature in °C, per-chain TX power.
get_vals() {
    TEMP_24=$(( $(/usr/sbin/wl -i eth1 phy_tempsense | awk '{print $1}') / 2 + 20 ))
    PWR=$(wl -i eth1 curpower | grep "Last adjusted est. power" | awk '{print $6,$7,$8}')
    echo "time: $(date +%H:%M:%S), temp: ${TEMP_24}, power: ${PWR}"
}

# Sample once per second for five minutes.
for seconds in {1..300}
do
    get_vals
    sleep 1s
done
and took a hair dryer :) See what happens:
...
time: 23:20:56, temp: 48, power: 18.50 0.0 0.0
time: 23:20:58, temp: 48, power: 18.50 0.0 0.0
time: 23:20:59, temp: 48, power: 18.50 0.0 0.0
time: 23:21:00, temp: 47, power: 18.50 18.50 18.25
time: 23:21:01, temp: 47, power: 18.50 18.50 18.75
time: 23:21:02, temp: 47, power: 18.50 18.50 18.25
time: 23:21:04, temp: 47, power: 18.50 18.50 18.25
...
time: 23:21:37, temp: 47, power: 18.50 18.50 18.75
time: 23:21:38, temp: 47, power: 18.50 18.50 18.25
time: 23:21:39, temp: 47, power: 18.50 18.50 18.75
time: 23:21:41, temp: 48, power: 18.50 0.0 0.0
time: 23:21:42, temp: 48, power: 18.50 0.0 0.0
time: 23:21:43, temp: 48, power: 18.50 0.0 0.0
...
It works! 3Tx mode was active only while the temperature was below 48°C. To restore the factory settings, enter:
$ nvram set pci/2/1/tempthresh=120
$ nvram set pci/1/1/tempthresh=120
$ nvram set pci/1/1/temps_hysteresis=5
$ nvram set pci/2/1/temps_hysteresis=5
$ nvram commit && reboot

So the factory settings slow down Wi-Fi transmission when the chip temperature rises to 80°C (raw value 120). Is that too high or too low? You decide.
 
Asus is lying. The real temperature can be read with these commands:
wl -i eth1 phy_tempsense
wl -i eth2 phy_tempsense

I checked this... Here is my picture.
http://s18.postimg.org/us4ltqnt5/IMG_20130707_213901.jpg

These temperatures are in Celsius.

As you can see, my router (and many others) is "burning".

Do you have any documentation from Broadcom's SDK to confirm that these are *real* values?

Asus is manipulating the values read from the registers. This isn't unheard of - for instance, the temperature returned by an Intel CPU is not the real temperature, but the difference between Tjunc and the real temperature. So if your CPU has a Tjunc value of 100C, and the CPU register returns "40C", then the real CPU temperature is 100C-40C = 60C.

The real answer lies in Broadcom's SDK, which none of us has unfortunately.
 
Addendum: I hadn't seen the measurement photo (at first I thought the linked attachment was just a screenshot). This is definitely interesting, though it's still not alarming. Modern chips can have a very high temperature tolerance, depending on how they are designed. For example, I used to have an ATI video card whose GPU would exceed 100C (you read that right) under heavy load - and it was still perfectly stable, and within the specs for that specific GPU.

Also, the CFE does include thermal throttling protection in case of overheating.

There must be a reason for Asus to manipulate the values the way they do before displaying them. Only they (or Broadcom) would know the reasons.
 
Chips can have higher temperature tolerances, but the other components (capacitors, etc.) cannot. Also, the USB ports and the SD card slot on the PCB are very hot. So what about the temperature tolerances of USB thumb drives and SD cards?
 