[Release] FreshJR Adaptive QOS (Improvements / Custom Rules / and Inner workings)

Discussion in 'Asuswrt-Merlin' started by FreshJR, Jan 12, 2017.

  1. FreshJR

    FreshJR Regular Contributor

    Joined:
    Oct 8, 2016
    Messages:
    174
    Definitely different names than the webUI. Spaces in variables are a no-no. The valid names are the following (case sensitive):

    VOIP
    Gaming
    Others
    Web
    Downloads
    Default
     
  2. MarCoMLXXV

    MarCoMLXXV Guest

    Any updates on the issue I reported, @FreshJR?
     
  3. FreshJR

    FreshJR Regular Contributor

    Joined:
    Oct 8, 2016
    Messages:
    174
    I couldn't reproduce the issue. How fast is your internet connection? What were you doing?

    The bucket error doesn't seem to refer to the QOS scheduler buckets overflowing, but rather to too many TCP connections in the TIME WAIT state. There weren't enough resources to keep track of more connections.

    Code:
    Time WAIT =
    (1) Server wants to close the connection and sends you a FIN (terminate connection) message.
    (2) You receive that message and send an ACK (acknowledging that message).
    (3) You also send a FIN to say that you are terminating this connection.
    (4) You are now in TIME WAIT for the server to send you the last message, that it ACK'd your termination as well.
    
    This is done to prevent potential overlap when opening the new connection.
    
    
    QOS should never have dropped that last ACK packet since that is net control. Even if that packet was dropped, no new connection should have been established between you and speedtest since that time wait was not finished.

    Could it be that you were getting DDOS'd, or doing something on your network that created an insane number of connections? Something had to be opening a lot of connections during that time.

    I do not think QOS should be able to go rogue with its setup. Your changes were correctly implemented and differed only slightly from mine.

    You do not need App Analysis turned on for QOS to function. If you had it on, that imposed a much larger load on a router that was already strained by the high connection count. Not only does each packet get inspected and sorted by QOS, but with app analysis every connection also has to be looked up, referenced by name, and have its bandwidth tracked. If you were already pushing the limits of active open connections and added the burden of looking up each connection and tracking its bandwidth, it is no wonder the kernel dropped all services to deal with that amount of work.

    If you try again and hit this issue, check the open connections under Tools in the webUI.
    Connections 515 / 300000 - 28 active

    If the count is high, go to Network Tools in the webUI, select Netstat, choose netstat-nat, and sort by state.
    See what demon on your network is unleashing this hell. Note: turn app analysis off if you still have it running, to prevent additional unnecessary load.

    You may have been getting DDOS'd, or you may have an infected computer on your network that is part of a botnet. The kernel panic probably would have been avoided without the QOS overhead, but beyond this theory I have no idea what happened.

    If you regularly run that many connections, then you might be over the limits of this router.
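
    If you have shell access, a rough way to see whether TIME WAIT connections dominate is to tally connection states. This is only a sketch: `count_states` is a made-up helper name, and it assumes `netstat -tn` style output where the state is the 6th column. The webUI's netstat-nat view may be laid out differently.

```shell
#!/bin/sh
# count_states: read netstat-style lines on stdin and tally TCP states.
# Assumption: the state is the 6th whitespace-separated field, as in
# `netstat -tn` output. Header lines are skipped because their first
# field does not start with "tcp".
count_states() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Example on a router shell (not run here):
#   netstat -tn | count_states
```

    The busiest state is printed first, so a very large TIME_WAIT count at the top would fit the theory above.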
     
    Last edited: Aug 3, 2017
  4. MarCoMLXXV

    MarCoMLXXV Guest

    Thanks for checking it out, I appreciate it. I'll give it another try on a fresh install over the weekend.

    Regarding speed: I have a DOCSIS 3.0 cable connection with 150 Mbit downstream and 15 Mbit upstream, both of which usually measure slightly higher. I've never had any issues before, nor have I noticed any changes with the new fq_codel QoS discipline with the DOCSIS preset by @RMerlin in the current stable release.

    I wasn't doing anything special, just a bit of surfing. Only my laptop out of 21 devices was actively in use, besides a baby cam streaming live video on the LAN to my iPad and some low traffic from/to several IoT devices around the house. So nothing out of the ordinary. I haven't found any evidence that I was DDoS'ed, as my outgoing connection was as stable as usual. Could changing the variables while the script was in use have caused this? I rebooted after I noticed the errors, but they still filled the logs after the reboot.
     
  5. Vexira

    Vexira Very Senior Member

    Joined:
    Jan 20, 2017
    Messages:
    731
    Location:
    Australia
    On the topic of app analysis, I'm curious: how does it affect QoS? If it's tracking bandwidth, is it only there, like you said before, to show the bandwidth usage of devices and which apps are pulling data (e.g. the Facebook app on a tablet), or does it have another purpose? Does it tie into the traffic history logging page? I've always wondered about it.
     
  6. FreshJR

    FreshJR Regular Contributor

    Joined:
    Oct 8, 2016
    Messages:
    174
    Correct, it just tracks bandwidth per packet pattern match per client so it can display that information to you.
    I do not use that information, so I keep it OFF to avoid wasting processing power and RAM.

    The procedure is like this:

    Adaptive QOS (ON)
    -inspects packet
    -packet matches pattern in database
    -packet gets marked according to database entry
    -marked packet gets sorted into its traffic container

    Traffic Analyzer (ON)
    -on a packet pattern match, the packet is additionally cross-referenced to the pattern name identifier in the database
    -the packet size is added to that pattern identifier's counter to show bandwidth (each user has a separate counter per pattern ID)
    -this information is processed and shown in the webUI

    All in all, the traffic analyzer procedure is optional. Adaptive QOS will work without the traffic analyzer.

    Traffic history functions on the same principle but is also optional.
     
    Last edited: Aug 5, 2017
  7. Vexira

    Vexira Very Senior Member

    Joined:
    Jan 20, 2017
    Messages:
    731
    Location:
    Australia
    Thanks, I appreciate the explanation. It would be awesome if you wrote a guide to Adaptive QoS with explanations of the bells and whistles.
    Also, thanks for the tip. I noticed that after disabling it the RAM usage went down.
     
    Last edited: Aug 5, 2017
  8. ledan

    ledan Occasional Visitor

    Joined:
    Jan 7, 2010
    Messages:
    11
    Hi. I've been using your script for a couple of days with good results, but Google Photos traffic goes to net control even though I've included the HTTPS filters indicated on page 9. I've also tried filtering port 443 using custom rules, without much success. Any ideas? Thanks
     
  9. FreshJR

    FreshJR Regular Contributor

    Joined:
    Oct 8, 2016
    Messages:
    174
    If it is going to net control, that means it is matching some filter rule, but not your custom rule.

    Out of the box, these two filters point to net control:

    -0x80090000 (Management tools / protocols)
    -0x80140000 (Network protocol)

    You can run

    Code:
     tc filter show dev br0 | grep "1:10" -A 1
    
    To see which filter it is matching, watch which success counter increments between successive calls.

    So while you also added 0x8013, which also handles HTTPS traffic as shown on page 9, I am willing to bet that Google Photos is matching mark 0x8009 or 0x8014 instead. You will have to fix your custom rule so that it performs a match and directs the traffic to your desired container.
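
    Comparing the counters by eye gets tedious, so here is a rough sketch that does the subtraction for you. `success_counts` and `success_delta` are made-up helper names, the /tmp paths are placeholders, and the parsing assumes the `mark ... (success N)` line format that `tc filter show` prints on this firmware.

```shell
#!/bin/sh
# success_counts: reduce `tc filter show` output to "mark count" pairs.
# Assumes lines of the form: mark 0x80090000 0x803f0000 (success 1712)
success_counts() {
    awk '/success/ {
        gsub(/[()]/, "")
        for (i = 1; i < NF; i++) {
            if ($i == "mark")    m = $(i + 1)
            if ($i == "success") c = $(i + 1)
        }
        print m, c
    }'
}

# success_delta BEFORE AFTER: print how much each mark's counter grew
# between two snapshots saved by success_counts.
success_delta() {
    awk 'NR == FNR { before[$1] = $2; next }
         { print $1, $2 - before[$1] }' "$1" "$2"
}

# Example on the router (not run here):
#   tc filter show dev br0 | grep "1:10" -A 1 | success_counts > /tmp/before
#   ... upload a few photos ...
#   tc filter show dev br0 | grep "1:10" -A 1 | success_counts > /tmp/after
#   success_delta /tmp/before /tmp/after
```

    The mark whose delta jumps while the photos are uploading is the one the traffic is matching.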

    Note: the custom rules match EGRESS traffic, meaning traffic physically moving AWAY from the router.

    Download EGRESS traffic is all the packets pushed from your router towards the PC.
    -The ip dst port for download egress is your PC's receiving port. **Template rule
     (If your PC is not receiving the photos through 443, the rule will fail.)
    -The ip src port for download egress is your router's originating port. **Not included as script template
     (Due to port forwarding, the router's src port and the PC's receiving port may differ.)

    Upload EGRESS traffic is all the packets pushed from your router towards the Google server.
    -The ip dst port for upload egress is the server's receiving port. **Not included as script template
     (If your router is not sending photos to WAN port 443, the rule will fail.)
    -The ip src port for upload egress is the router's sending port. **Template rule
     (If your router is not sending the photos from 443, the rule will fail.) **Example rule

    First, double check your custom rule syntax. Next, see where the traffic is actually going using the router's netstat-nat command. Finally, if you really need ingress traffic matching, that has to be done within iptables. An example of that is included within the script, but for different types of rules.
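
    To make the src/dst distinction concrete, here is a sketch that only prints candidate rules instead of installing them. It assumes br0 carries download egress and eth0 carries upload egress (as in the rules quoted in this thread) and uses the standard u32 `match ip sport` / `match ip dport` selectors. `port_rule` is a made-up helper, and prio 2 and flowid 1:13 are placeholders, not values taken from the script.

```shell
#!/bin/sh
# Sketch only: prints tc commands instead of running them.
# Assumptions: br0 = download egress, eth0 = upload egress;
# prio 2 and flowid 1:13 are placeholders for your own values.

# port_rule DEV FIELD PORT FLOWID
#   FIELD is "sport" or "dport" (standard u32 ip selectors).
port_rule() {
    dev=$1 field=$2 port=$3 flowid=$4
    case "$field" in
        sport|dport) ;;
        *) echo "field must be sport or dport" >&2; return 1 ;;
    esac
    echo "tc filter add dev $dev protocol all prio 2 u32 match ip $field $port 0xffff flowid $flowid"
}

# HTTPS download: the server's port 443 is the *source* port of packets leaving br0.
port_rule br0 sport 443 1:13
# HTTPS upload: the server's port 443 is the *destination* port of packets leaving eth0.
port_rule eth0 dport 443 1:13
```

    Review the printed lines against netstat-nat before pasting anything into the script, since a rule on the wrong port field simply never matches.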
     
    Last edited: Aug 8, 2017
  10. MarCoMLXXV

    MarCoMLXXV Guest

    @FreshJR I've upgraded to @RMerlin's 380.68_alpha2-g5d6b6dd in the meantime. I did a fresh install. Can I use your script with 380.68 as well, given the number of changes under the hood between .67 and .68? Is anyone already running it successfully on 380.68_alpha2?
     
  11. Vexira

    Vexira Very Senior Member

    Joined:
    Jan 20, 2017
    Messages:
    731
    Location:
    Australia
    I'm running it. Since it's not a GUI-based script it should be fine, and it seems to work for me.
     
  12. FreshJR

    FreshJR Regular Contributor

    Joined:
    Oct 8, 2016
    Messages:
    174
    Most likely yes. Just try it.

    The script just changes the parameter values that ASUS originally passed into the traffic control engine. It doesn't really break between updates since it doesn't try to integrate anywhere or install anything.
     
  13. MarCoMLXXV

    MarCoMLXXV Guest

    Thanks for the quick reply, I'll give it another try!
     
  14. ledan

    ledan Occasional Visitor

    Joined:
    Jan 7, 2010
    Messages:
    11
    Results:
    filter parent 1: protocol all pref 12 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10
    mark 0x80090000 0x803f0000 (success 1712)
    --
    filter parent 1: protocol all pref 23 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1:10
    mark 0x80140000 0x803f0000 (success 100095)

    So, it seems to be
    -0x80140000 (Network protocol)

    Added rules:
    ${tc} filter add dev br0 protocol all prio 15 u32 match mark 0x80140000 0x803f0000 flowid ${Web}
    ${tc} filter add dev br0 protocol all prio 15 u32 match mark 0x80130000 0x803f0000 flowid ${Web}
    ${tc} filter add dev eth0 protocol all prio 15 u32 match mark 0x40140000 0x403f0000 flowid ${Web}
    ${tc} filter add dev eth0 protocol all prio 15 u32 match mark 0x40130000 0x403f0000 flowid ${Web}

    What I am wondering is: could I run into problems with real network control packets going into the wrong container?

    Thanks for your help
     
  15. FreshJR

    FreshJR Regular Contributor

    Joined:
    Oct 8, 2016
    Messages:
    174
    That's a terrible rule.

    You will be moving all your net control traffic into the web container. Net control packets are intended to be processed ASAP for a responsive internet.

    You should try to figure out a better rule that would only catch the photos instead.

    Maybe Google has a set IP range for their photo servers. Try whois/DNS lookups for those domains to see the ranges they resolve to.

    https://support.google.com/a/answer/2589954?visit_id=0-636379033563589379-2115053252&hl=en&rd=1

    If not, maybe dump the packets and see if they have a unique TOS/DSCP mark or anything else you can filter on.

    HTTPS traffic is rough. Trend Micro identifies a surprising amount of HTTPS traffic.

    Worst case, just filter 443 traffic for packets above a certain size. Get creative.

    Note: When replacing existing rules, watch the PRIO definition, since those rules should not both be at prio 15.
    8014 was an existing rule located at pref 23. To redefine its container destination you should have used pref 23 instead of 15. By placing it at pref 15 you now have a duplicate of the original at pref 23 (the one at pref 15 will get matched first).

    I updated the table on page 9 to make it easier to see which prefs the factory rules are located at. This should be kept in mind when changing factory rule destinations.

    Pref 15 was the first blank spot within ASUS's rules, so that's where I put 8013. That was arbitrary; more logically it should be at pref 22.
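
    To spot this kind of accidental duplicate, something like the following could help. It is a sketch, not part of the script: `duplicate_marks` is a made-up helper, and it assumes each `filter ... pref N ...` header line in the `tc filter show` output is followed by its `mark` line, as in the results posted above.

```shell
#!/bin/sh
# duplicate_marks: read `tc filter show dev <dev>` output on stdin and
# report any mark that appears under more than one pref.
# Assumption: "filter ... pref N ..." header lines precede each "mark" line.
duplicate_marks() {
    awk '
        $1 == "filter" {
            for (i = 1; i < NF; i++) if ($i == "pref") pref = $(i + 1)
        }
        $1 == "mark" {
            seen[$2]++
            where[$2] = where[$2] " " pref
        }
        END {
            for (m in seen) if (seen[m] > 1) print m " is matched at prefs:" where[m]
        }
    '
}

# Example on the router (not run here):
#   tc filter show dev br0 | duplicate_marks
```

    An empty result means no mark is defined at two prefs on that device.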
     
    Last edited: Aug 9, 2017
  16. ledan

    ledan Occasional Visitor

    Joined:
    Jan 7, 2010
    Messages:
    11
    Well. Not enough knowledge...

    I'll remove any added rules and will only use your examples.

    Thanks again
     
  17. Johnathon

    Johnathon Occasional Visitor

    Joined:
    Jun 21, 2017
    Messages:
    27
    Hi,

    Question, when I make any changes to the bandwidth allocations I am met with a string of errors saying "rate must be defined" or something along those lines. I am not sure what I am doing wrong here.

    Secondly, where should I place the script once it's up and running so that it starts at router boot?

    Thanks,
    J
     
  18. Johnathon

    Johnathon Occasional Visitor

    Joined:
    Jun 21, 2017
    Messages:
    27
    Ignore my question about where to place it.
     
  19. Uthall

    Uthall Occasional Visitor

    Joined:
    Jul 26, 2017
    Messages:
    39
    Is there any way to classify "Youtube" into its own container?
     
  20. FreshJR

    FreshJR Regular Contributor

    Joined:
    Oct 8, 2016
    Messages:
    174
    If your bandwidth allocation changes are failing, you may have introduced a space into the variable.

    There should be NO space before or after the equal sign. There should be NO decimals in the percentage either.
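
    A quick way to sanity-check an edited line before rerunning the script is sketched below. `valid_percent_line` is a made-up helper and `Web_Percent` is a hypothetical variable name used only for illustration; substitute the real names from the script. It assumes the values are plain unquoted integers.

```shell
#!/bin/sh
# valid_percent_line LINE: succeed only if LINE looks like NAME=INTEGER,
# with no spaces around "=" and no decimal point.
# Web_Percent below is a hypothetical variable name.
valid_percent_line() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z_][A-Za-z0-9_]*=[0-9]+$'
}

for line in 'Web_Percent=20' 'Web_Percent = 20' 'Web_Percent=12.5'; do
    if valid_percent_line "$line"; then
        echo "ok:  $line"
    else
        echo "bad: $line"
    fi
done
```

    The reason spaces matter is that the shell reads `Web_Percent = 20` as a command named Web_Percent with two arguments rather than as an assignment, so the variable ends up empty.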
     
