Entware-3x for new HND platform (GT-AC5300 and RT-AC86U) with asuswrt-merlin firmware

It's a hybrid/mutant I would think... broken paths with the old school stuff?

HND puts the iproute2 binaries under /bin, while previous SDKs installed iproute2 under /usr/sbin/. Unsure if the change was made by the iproute2 maintainers or by BCM.
 
@Odkrys has installed the 64-bit version.
BTW @Voxel, the 32-bit armv7 repo can also be used on the HND platform. It will be faster than entware-ng and probably faster than the standard 32-bit Entware-3x repo.

Hi zyxmon, I still have not released the Cortex-A15 version with hardware float built from your December snapshot. Permanent lack of time. It is even compiled already, but I have had no time to publish it... I hope to finish by the end of this week or the beginning of the next.

Voxel.
 
Thanks for your work on Entware-3x, zyxmon. I'm running it on my EdgeRouter X and like it very much.
 
Zyxmon BIG thx
Most things that I tested are working correctly on my AC86U
I only have one small problem when trying to run hdparm:
Code:
hdparm: error while loading shared libraries: libgcc_s.so.1: cannot open shared object file: No such file or directory
 
... It is now fixed. Run `opkg update; opkg upgrade`.
FYI... My RT-AC86U firmware has the required 64-bit library /lib/aarch64/libgcc_s.so.1, from the Asuswrt toolchain. And I was a bit surprised when I ran hdparm, and it worked just fine!! However, I understand you don't want to link Entware programs against Asuswrt libraries.
Code:
# ldd /opt/sbin/hdparm
linux-vdso.so.1 (0x0000007f9b173000)
libgcc_s.so.1 => /lib/aarch64/libgcc_s.so.1 (0x0000007f9b128000)
libc.so.6 => /lib/aarch64/libc.so.6 (0x0000007f9afde000)
/lib/ld-linux-aarch64.so.1 (0x0000007f9b148000)

After updating to hdparm 9.52-1a, it now correctly links against /opt/lib/. Thank you.
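
To repeat this check on other Entware binaries, the `ldd` output can be filtered for libraries that resolve outside /opt. This is only a sketch: `libs_outside_prefix` is a hypothetical helper name, and the sample input is the pre-fix output quoted above.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and prints every library
# whose resolved path does not start with the given prefix (default /opt/).
libs_outside_prefix() {
    prefix="${1:-/opt/}"
    awk -v p="$prefix" '$2 == "=>" && index($3, p) != 1 { print $1 " -> " $3 }'
}

# Demo with the pre-fix hdparm output from this thread. On a router one would
# run:  ldd /opt/sbin/hdparm | libs_outside_prefix
libs_outside_prefix <<'EOF'
linux-vdso.so.1 (0x0000007f9b173000)
libgcc_s.so.1 => /lib/aarch64/libgcc_s.so.1 (0x0000007f9b128000)
libc.so.6 => /lib/aarch64/libc.so.6 (0x0000007f9afde000)
/lib/ld-linux-aarch64.so.1 (0x0000007f9b148000)
EOF
```

With the sample input this lists libgcc_s.so.1 and libc.so.6, both resolved from /lib/aarch64. Empty output would mean every `=>` entry resolves under /opt.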
 
Once I'm done with my backlog of work, I will look into possibly making entware-setup.sh offer a choice between the repos at setup time.
 
If you could append one-liners to the existing user scripts instead of renaming them and writing new files, that would be the icing on the cake.
In AB4 I only use the post-mount and services-stop files.
rc.unslung is run when the opkg file is found on $1. I make sure there is only one entware* folder during installation:
Code:
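#!/bin/sh
# $1 is the mount point being checked (this file is sourced from post-mount).

# Locate opkg under any entware* folder, whether or not the device has a label.
OF="$(find "$1"/entware*/bin/opkg 2> /dev/null)"

if [ -n "$OF" ]; then
	# Point /tmp/opt at the Entware root (the opkg path minus /bin/opkg).
	ln -nsf "${OF%/bin/opkg}" /tmp/opt
	/opt/etc/init.d/rc.unslung start "$0"
	logger -t AB-Solution "started Entware services"
fi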
The reason I use find is because I no longer want to worry if a device has a label or not.
This code is in a separate file I source in post-mount.
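
The "append a one-liner instead of renaming" idea could look roughly like this. It is a sketch, not what entware-setup.sh or AB-Solution actually does: `append_once` and the `entware-start.sh` file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE only if an identical line is not
# already present. Creates FILE with a shebang if it does not exist yet.
append_once() {
    file="$1"
    line="$2"
    [ -f "$file" ] || printf '#!/bin/sh\n' > "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (the sourced file name is an assumption):
#   append_once /jffs/scripts/post-mount '. /jffs/scripts/entware-start.sh'
#   chmod +x /jffs/scripts/post-mount
```

Because the match is on the whole line, running the installer twice leaves a single entry, and anything the user already has in the script is left alone.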
 
... aarch64 repo works and is faster compared to armv7.
OpenSSL ranked from best to worst performance on RT-AC86U.
1. Asuswrt-Merlin (armv7)
2. Entware-ng-3x-armv7
3. Entware-ng-3x-armv8

RT-AC86U Entware-ng-3x-armv8
Code:
# /opt/bin/openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 7737774 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 1987817 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 505497 aes-256 cbc's in 3.02s
Doing aes-256 cbc for 3s on 1024 size blocks: 126280 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 15783 aes-256 cbc's in 3.00s
OpenSSL 1.0.2n  7 Dec 2017
built on: reproducible build, date unspecified
options:bn(64,64) rc4(ptr,char) des(idx,cisc,2,int) aes(partial) blowfish(ptr)
compiler: aarch64-openwrt-linux-gnu-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/media/ware4/Entware-3x.2017.12/staging_dir/target-aarch64_cortex-a53_glibc-2.25/opt/include -I/media/ware4/Entware-3x.2017.12/staging_dir/target-aarch64_cortex-a53_glibc-2.25/include -I/media/ware4/Entware-3x.2017.12/staging_dir/toolchain-aarch64_cortex-a53_gcc-6.3.0_glibc-2.25/include -I/media/ware4/Entware-3x.2017.12/staging_dir/toolchain-aarch64_cortex-a53_gcc-6.3.0_glibc-2.25/include -DOPENSSL_SMALL_FOOTPRINT -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_NO_ERR -DTERMIOS -O2 -pipe -mcpu=cortex-a53 -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result  -fpic -I/media/ware4/Entware-3x.2017.12/package/libs/openssl/include -ffunction-sections -fdata-sections -fomit-frame-pointer -Wall -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc      41268.13k    42406.76k    42850.08k    43103.57k    43098.11k
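
The summary row follows directly from the raw counts: block size times number of blocks, divided by the measured seconds and by 1000. A small sketch that recomputes it from the "Doing ..." lines (`speed_throughput` is a made-up name):

```shell
#!/bin/sh
# Hypothetical helper: reads `openssl speed` "Doing ..." lines on stdin and
# prints the throughput in 1000s of bytes per second for each block size.
speed_throughput() {
    awk '/^Doing / {
        for (i = 1; i < NF; i++) {
            if ($i == "on")      size  = $(i + 1)   # block size in bytes
            if ($i == "blocks:") count = $(i + 1)   # blocks processed
            if ($i == "in")      secs  = $(i + 1)   # e.g. "3.00s"
        }
        # secs + 0 drops the trailing "s" through awk numeric conversion
        printf "%d bytes: %.2fk\n", size, size * count / (secs + 0) / 1000
    }'
}

# Demo with the first line of the armv8 run above.
speed_throughput <<'EOF'
Doing aes-256 cbc for 3s on 16 size blocks: 7737774 aes-256 cbc's in 3.00s
EOF
```

For that line, 16 * 7737774 / 3.00 / 1000 gives 41268.13k, matching the 16-byte column.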

RT-AC86U Entware-ng-3x-armv7
Code:
# /opt/bin/openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 8740478 aes-256 cbc's in 2.99s
Doing aes-256 cbc for 3s on 64 size blocks: 2303619 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 588327 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 147700 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 18483 aes-256 cbc's in 3.00s
OpenSSL 1.0.2n  7 Dec 2017
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,2,long) aes(partial) blowfish(ptr)
compiler: arm-openwrt-linux-gnueabi-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/media/ware4/Entware-3x.2017.12/staging_dir/target-arm_cortex-a9_glibc-2.25_eabi/opt/include -I/media/ware4/Entware-3x.2017.12/staging_dir/target-arm_cortex-a9_glibc-2.25_eabi/include -I/media/ware4/Entware-3x.2017.12/staging_dir/toolchain-arm_cortex-a9_gcc-6.3.0_glibc-2.25_eabi/usr/include -I/media/ware4/Entware-3x.2017.12/staging_dir/toolchain-arm_cortex-a9_gcc-6.3.0_glibc-2.25_eabi/include -DOPENSSL_SMALL_FOOTPRINT -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_NO_ERR -DTERMIOS -O2 -pipe -march=armv7-a -mtune=cortex-a9 -fno-caller-saves -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=soft  -fpic -I/media/ware4/Entware-3x.2017.12/package/libs/openssl/include -ffunction-sections -fdata-sections -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc      46771.79k    49143.87k    50203.90k    50414.93k    50470.91k

RT-AC86U Asuswrt-Merlin (armv7)
Code:
# openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 9697053 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 2674358 aes-256 cbc's in 3.01s
Doing aes-256 cbc for 3s on 256 size blocks: 685991 aes-256 cbc's in 3.01s
Doing aes-256 cbc for 3s on 1024 size blocks: 172596 aes-256 cbc's in 3.01s
Doing aes-256 cbc for 3s on 8192 size blocks: 21598 aes-256 cbc's in 3.00s
OpenSSL 1.0.2n  7 Dec 2017
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: /opt/toolchains/crosstools-arm-gcc-5.3-linux-4.1-glibc-2.22-binutils-2.25/usr/bin/arm-buildroot-linux-gnueabi-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DOPENSSL_NO_HEARTBEATS -DL_ENDIAN -march=armv7-a -fomit-frame-pointer -mabi=aapcs-linux -marm -ffixed-r8 -msoft-float -D__ARM_ARCH_7A__ -ffunction-sections -fdata-sections -O3 -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc      51717.62k    56863.43k    58343.42k    58717.04k    58976.94k
 
OpenSSL ranked from best to worst performance on RT-AC86U.
I have run similar tests on a Realtek RTD1295, Entware-ng (armv7) vs Entware-3x (armv8). Entware-3x was ~17% faster. This means benchmarks are CPU specific.
Asuswrt openssl is compiled with -O3 optimizations. Entware is compiled with -O2 optimizations. Entware-ng uses soft float (-mfloat-abi=soft). This explains some of the test results.
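
To put a number on the gap, the 8192-byte figures from the three RT-AC86U runs can be compared directly. A minimal sketch (`pct_faster` is a hypothetical helper; the inputs are the values posted above):

```shell
#!/bin/sh
# Hypothetical helper: prints how much faster throughput A is than B, in percent.
pct_faster() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a / b - 1) * 100 }'
}

# 8192-byte aes-256-cbc results from this thread, in 1000s of bytes per second.
pct_faster 58976.94 50470.91   # Asuswrt-Merlin vs Entware armv7
pct_faster 58976.94 43098.11   # Asuswrt-Merlin vs Entware armv8
```

On those figures the firmware build is about 16.9% ahead of the armv7 repo and about 36.8% ahead of the armv8 repo.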
You can also test Vortex Entware-3x port that is armv7 -O3 hard float. It was 30% faster in a similar test compared with Entware-ng on IPQ4018 (Asus RT-58AC running lede).

The different benchmarks I have run show only a small difference between aarch64 and armv7. Some tests are better on armv7, some are better on aarch64. Using hard float generally gives aarch64 some advantage. I have made a sysbench Entware package, but the results are very strange - https://github.com/akopytov/sysbench/issues/209
That is why I prefer these benchmarks - http://forums.zyxmon.org/viewtopic.php?f=10&t=5382 (Russian): (1) bc to calculate pi (single-threaded), (2) the p7zip internal benchmark (multithreaded), and (3) the openssl test, which can use a hardware crypto engine if there is one.

Real life tests (like transmission download speed) are preferable IMHO.
 
You can also test Vortex Entware-3x port that is armv7 -O3 hard float
Where is the Entware-3x config to do "-O3 hard float"? I will try re-compiling the repo this way. I hope it goes well.


EDIT: is it this way?
Code:
CONFIG_TARGET_OPTIMIZATION="-O3 -pipe -march=armv7-a -mtune=cortex-a9 -mfloat-abi=hard"
# CONFIG_SOFT_FLOAT is not set
 
Like that. It is better to use `make menuconfig` to set up the options. The floating point subtype, like `CPU_SUBTYPE:=neon-vfpv4`, is set here - https://github.com/Entware-for-kernel-3x/Entware-ng-3x/blob/master/target/linux/armv7-3x/Makefile

We have started Entware-3x update today. It will take some time. Buildroot may be unusable now.

Just try the install_std.sh script from http://cortex-a15.zyxmon.org/binaries/cortex-a15-3x/installer/ - you will have the entware-3x fork installed with -O3 optimizations and neon fp. You can use it just for benchmarking.
 
OpenSSL ranked from best to worst performance on RT-AC86U.
1. Asuswrt-Merlin (armv7)
2. Entware-ng-3x-armv7
3. Entware-ng-3x-armv8

Add the elapsed flag... makes for better numbers...

Code:
sfx@raspy3:~ $ openssl speed -evp aes-256-cbc -elapsed
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc      25006.49k    32792.55k    35607.72k    36381.35k    36623.70k    36640.09k

Pi3 on armhf via Raspbian - openssl 1.1, which might skew results.

There's another thread where AC86U numbers might be of interest where we're looking at overall thruput.
 
Hi zyxmon, I still have not released Cortex-A15 version with hardware float using your December's snapshot. Permanent lack of time. It is even already compiled, but no time to publish... I hope to complete end of this week or beginning of the next.

Tinkering about - I don't have a Cortex-A15 chip handy, but I did recently get an A17, which is competitive...

Outside of entware, one can tweak a bit, but considering the effort - I've settled on armhf - tune the kernel and glibc, but I have to support A8/A9/A7/a17/a53 cores with various options...
 
@Fitz Mutch - I have compiled openssl with -O3 optimizations for armv7 and aarch64. It is just an option for openssl packages
just run
Code:
opkg install http://entware-3x.zyxmon.org/binaries/armv7/test/libopenssl_1.0.2n-1a_armv7-3x.ipk
opkg install http://entware-3x.zyxmon.org/binaries/armv7/test/openssl-util_1.0.2n-1a_armv7-3x.ipk
or
Code:
opkg install http://entware-3x.zyxmon.org/binaries/armv8/test/libopenssl_1.0.2n-1a_armv8-3x.ipk
opkg install http://entware-3x.zyxmon.org/binaries/armv8/test/openssl-util_1.0.2n-1a_armv8-3x.ipk
to install/upgrade.
As can be seen from openssl sources arm and aarch64 use different assembler optimizations. More source files are assembly optimized for arm (32 bit) now. This can explain benchmarks.
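
After installing the test packages, it is easy to confirm which optimization level the binary was built with by looking at its compiler line. A rough sketch (`opt_level` is a hypothetical helper; it assumes the flags are space-separated, as in the outputs posted in this thread):

```shell
#!/bin/sh
# Hypothetical helper: reads a compiler command line on stdin and prints the
# last -O flag found (with gcc, the last -O option is the one that applies).
opt_level() {
    tr ' ' '\n' | grep -E '^-O([0-3]|s|g|fast)?$' | tail -n 1
}

# Example on a router:
#   /opt/bin/openssl version -f | opt_level
```

`openssl version -f` prints only the compilation flags, so the pipe should print -O3 for the test packages and -O2 for the regular repo builds shown earlier.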
 
Here is the -O2 vs -O3 difference on Realtek (aarch64). The first run is the -O2 build, the second is the -O3 build:

Code:
# /opt/bin/openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 2579910 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 662011 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 167691 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 42078 aes-256 cbc's in 2.99s
Doing aes-256 cbc for 3s on 8192 size blocks: 5265 aes-256 cbc's in 2.99s
OpenSSL 1.0.2n  7 Dec 2017
built on: reproducible build, date unspecified
options:bn(64,64) rc4(ptr,char) des(idx,cisc,2,int) aes(partial) blowfish(ptr) 
compiler: aarch64-openwrt-linux-gnu-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/media/ware4/Entware-3x.2017.12/staging_dir/target-aarch64_cortex-a53_glibc-2.25/opt/include -I/media/ware4/Entware-3x.2017.12/staging_dir/target-aarch64_cortex-a53_glibc-2.25/include -I/media/ware4/Entware-3x.2017.12/staging_dir/toolchain-aarch64_cortex-a53_gcc-6.3.0_glibc-2.25/include -I/media/ware4/Entware-3x.2017.12/staging_dir/toolchain-aarch64_cortex-a53_gcc-6.3.0_glibc-2.25/include -DOPENSSL_SMALL_FOOTPRINT -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_NO_ERR -DTERMIOS -O2 -pipe -mcpu=cortex-a53 -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result  -fpic -I/media/ware4/Entware-3x.2017.12/package/libs/openssl/include -ffunction-sections -fdata-sections -fomit-frame-pointer -Wall -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc      13759.52k    14122.90k    14309.63k    14410.66k    14425.04k


Code:
# /opt/bin/openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 2962298 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 768632 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 195814 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 49159 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 6252 aes-256 cbc's in 3.00s
OpenSSL 1.0.2n  7 Dec 2017
built on: reproducible build, date unspecified
options:bn(64,64) rc4(ptr,char) des(idx,cisc,16,int) aes(partial) blowfish(ptr) 
compiler: aarch64-openwrt-linux-gnu-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/media/ware4/Entware-3x.2017.12/staging_dir/target-aarch64_cortex-a53_glibc-2.25/opt/include -I/media/ware4/Entware-3x.2017.12/staging_dir/target-aarch64_cortex-a53_glibc-2.25/include -I/media/ware4/Entware-3x.2017.12/staging_dir/toolchain-aarch64_cortex-a53_gcc-6.3.0_glibc-2.25/include -I/media/ware4/Entware-3x.2017.12/staging_dir/toolchain-aarch64_cortex-a53_gcc-6.3.0_glibc-2.25/include -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_NO_ERR -DTERMIOS -pipe -mcpu=cortex-a53 -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -O3 -fpic -I/media/ware4/Entware-3x.2017.12/package/libs/openssl/include -ffunction-sections -fdata-sections -fomit-frame-pointer -Wall -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc      15798.92k    16397.48k    16709.46k    16779.61k    17072.13k
 
@Fitz Mutch - I have compiled openssl with -O3 optimizations for armv7 and aarch64. ... More source files are assembly optimized for arm (32 bit) ...
Now it's fast like Asuswrt-Merlin.
Code:
# /opt/bin/openssl speed aes-256-cbc -elapsed
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256 cbc for 3s on 16 size blocks: 9758463 aes-256 cbc's in 3.02s
Doing aes-256 cbc for 3s on 64 size blocks: 2658233 aes-256 cbc's in 3.01s
Doing aes-256 cbc for 3s on 256 size blocks: 682584 aes-256 cbc's in 3.01s
Doing aes-256 cbc for 3s on 1024 size blocks: 171306 aes-256 cbc's in 3.01s
Doing aes-256 cbc for 3s on 8192 size blocks: 21491 aes-256 cbc's in 3.01s
OpenSSL 1.0.2n  7 Dec 2017
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) blowfish(ptr)
compiler: arm-openwrt-linux-gnueabi-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/media/ware4/Entware-3x.2017.12/staging_dir/target-arm_cortex-a9_glibc-2.25_eabi/opt/include -I/media/ware4/Entware-3x.2017.12/staging_dir/target-arm_cortex-a9_glibc-2.25_eabi/include -I/media/ware4/Entware-3x.2017.12/staging_dir/toolchain-arm_cortex-a9_gcc-6.3.0_glibc-2.25_eabi/include -I/media/ware4/Entware-3x.2017.12/staging_dir/toolchain-arm_cortex-a9_gcc-6.3.0_glibc-2.25_eabi/include -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_NO_ERR -DTERMIOS -pipe -march=armv7-a -mtune=cortex-a9 -fno-caller-saves -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=soft -O3 -fpic -I/media/ware4/Entware-3x.2017.12/package/libs/openssl/include -ffunction-sections -fdata-sections -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc      51700.47k    56520.57k    58053.66k    58278.19k    58489.79k
 
