-
sech1
Only 1356 h/s. It means the SG2042 in the X5 must have hardware AES and FP vector instructions
-
sech1
Because it does 11780 h/s
-
sech1
8-9 times faster
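A quick sanity check of that ratio (a sketch, not something posted in the chat), using only the two hashrates quoted above; the variable names are illustrative:

```python
# Hashrates quoted in the chat, in hashes per second.
baseline_hs = 1356.0   # the slower result
x5_hs = 11780.0        # the figure attributed to the X5

# Ratio between the two results.
speedup = x5_hs / baseline_hs
print(f"speedup: {speedup:.2f}x")
```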
-
hyc
yah just saw that
-
hyc
I wonder if the Zb extensions would have made much difference
-
sech1
Not much, they save only a few instructions
-
sech1
but vector extensions can make huge difference
-
hyc
sg2042 / C920 definitely has vector
-
hyc
that's the main difference to C910
-
hyc
"C920 adopts a state of the art 12-stage out-of-order multiple issue superscalar pipeline with high frequency, IPC, and power efficiency, with a 128-bit vector unit implementing the RISC-V V Extension 0.7.1."
-
hyc
of course the current spec is 1.1, and will be 2.0 by the time it's finalized
-
sech1
he did run with 64 threads
-
sech1
that explains such low hashrate
-
hyc
fp16 and fp32 vector
-
sech1
SG2042 also has L3 cache slicing (each group of 4 cores has "near" 4 MB of L3 cache)
-
sech1
I suspect the latency for these 4 MB will be much smaller
-
sech1
so scratchpad allocation will need to make OS API calls to arrange that
-
hyc
1MB L2 per 4-core cluster, 64MB L3 for entire chip
-
hyc
should support numactl etc
-
hyc
I thought only argon2 would take advantage of vector instructions?
-
hyc
it's the oldest release there...
-
pauliouk
you mean my poor little c910 beaglev-ahead is only going to scrape 100h/s :(
-
elucidator
ROI in 2e2 years
-
pauliouk
to be fair, the IO is going to come in useful, although it's a bit overkill for controlling my greenhouse :D got some Pi Zeros that'd do the job just as well :D
-
sech1
hyc vector instructions are used for regular RandomX instructions too, all RandomX FP registers are 128-bit (2x64)
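To illustrate the 2x64 layout (a rough sketch, not RandomX code): each FP register can be modelled as a pair of doubles, and an FP add touches both lanes. A 128-bit vector unit could do that in one instruction where scalar code needs two. The helper name `fadd` is made up for this example, and rounding-mode handling is ignored.

```python
# A 128-bit FP register modelled as two 64-bit lanes
# (Python floats are IEEE 754 doubles).
def fadd(dst, src):
    """Lane-wise add: one 128-bit vector op, or two scalar double adds."""
    return (dst[0] + src[0], dst[1] + src[1])

f0 = (1.0, 2.0)
a0 = (0.5, 0.25)
print(fadd(f0, a0))
```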
-
Inge
so everyone is going to test out the pi5 then?
-
hyc
hmm. so one of the older chips, C906 also had RVV 0.7.1. Some folks have suggested C910 has it too, they just didn't document it.
-
pauliouk
just got my Amazon delivery of the bits I needed for the beaglev, if you want me to check anything on it?
-
hyc
and this older doc also says it explicitly
ftp.libre-soc.org/466100a052.pdf
-
hyc
so it looks like it'll be worthwhile to implement RVV 0.7.1 and see what difference it makes
-
tevador
vector 0.7.1 is an incompatible draft version, I'm not planning to implement it
-
hyc
hmmm. I need a few more ports on my ethernet switch...
-
hyc
tevador: yeah I know it's incompatible but nobody has the later versions anywhere do they?
-
hyc
and these chips are on the market now. might as well see how much it helps
-
tevador
the first chips with V 1.0 are expected perhaps next year
-
hyc
I may take a stab at it. we can always #ifdef it away later.
-
tevador
It will be hard to do. Some asm mnemonics are the same for V0.7.1 and V1.0, but the binary code is different.
-
tevador
It could make sense for xmrig, but I don't think it's worth it for the RandomX library. And you can't expect competitive hashrates without hardware AES anyway.
-
hyc
good point
-
tevador
SG2042R - the "R" could mean RandomX and they could have some custom extensions for it. RISC-V has opcode space reserved for custom instructions.
-
sech1
Do you mean they could've implemented some RandomX instructions 1:1?
-
sech1
Not just AES?
-
hyc
heh then it really would be an ASIC after all
-
tevador
Yes, some helper instructions for RandomX. We can't rule that out without seeing the risc-v binaries.
-
hyc
well I've booted up my licheepi4a but they recommend I update the firmware image. it's got debian preloaded
-
Lyza
maybe we can live in a world where all major CPU manufacturers add RandomX helper instructions (:
-
sech1
A helper instruction to load data from scratchpad masked address (L1/L2 mask) would be very useful
-
sech1
It could save 2-3 instructions on every scratchpad reading instruction
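A rough model of what such a helper would fuse (a sketch, not taken from the RandomX sources): a scratchpad read adds a displacement to a register, masks the result down to the scratchpad level, then loads 8 bytes. The 16 KiB and 256 KiB level sizes and the 8-byte alignment are assumptions based on the default RandomX parameters; `scratchpad_load` is a made-up name.

```python
# Assumed scratchpad level sizes in bytes (default RandomX parameters).
L1_SIZE = 16 * 1024
L2_SIZE = 256 * 1024
ALIGN_MASK = ~7  # keep addresses 8-byte aligned

L1_MASK = (L1_SIZE - 1) & ALIGN_MASK
L2_MASK = (L2_SIZE - 1) & ALIGN_MASK

def scratchpad_load(scratchpad, src_reg, imm32, mask):
    """Emulate the add, mask, load sequence done for each scratchpad read."""
    addr = (src_reg + imm32) & mask   # add the displacement, then apply the level mask
    return int.from_bytes(scratchpad[addr:addr + 8], "little")  # 8-byte load
```

A single "load from masked address" instruction would replace the separate add and mask steps before the load, which is where the saving per scratchpad read would come from.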
-
sech1
tevador ^
-
tevador
Yes, that's one of the instructions I had in mind.
-
hyc
randomx-tests passes. skipped randomx_reciprocal_fast and cache init SSE/AVX. no surprise there
-
tevador
Yes, those are x86-only.
-
hyc
running the 10M benchmark now
-
hyc
mem init was 29.5805s with 4 threads
-
tevador
What is your hashrate?
-
tevador
10M might take 1 day or more
-
hyc
I should've tested 1M first
-
hyc
lemme kill this
-
tevador
for me 10M took about 38 hours.
-
hyc
yeah. I only got 46.77 H/s
-
hyc
on 1000 nonces
-
tevador
1 thread?
-
hyc
4 threads
-
hyc
and with hugepages
-
tevador
Should be more. It's a more powerful chip than the JH7110 I have.
-
hyc
23.933 H/s with 2 threads
-
hyc
yeah seems a bit too slow
-
tevador
Should be > 100 H/s without large pages based on the results from felixonmars.
-
hyc
maybe the newer firmware will improve that
-
hyc
I don't believe the chip is throttling, I've got the heatpad and fan mounted
-
hyc
but the results are consistent. 11.998H/s 1 thread
-
hyc
gcc (Debian 13.2.0-4revyos1) 13.2.0
-
hyc
I wonder if the compiler is just bad
-
hyc
11.28H/s without largepages
-
hyc
memory init 38.6s
-
tevador
Lichee Pi 4A is TH1520 quad core. The benchmarks in the PR show 35 H/s with 1T and 104 H/s with 4T.
-
tevador
My tests are with gcc (Debian 12.2.0-10) 12.2.0
-
hyc
yeah I wonder which board felixonmars used
-
hyc
looks like they use the same u-boot setup as arm android. flash mode comes up as an android fastboot usb device. same tool is used.
-
pauliouk
hmm time to get the beagle started up... turns out the random micro-USB cable I had in my drawer (don't remember buying it, so probably found it years and years ago) is urm, well poo. So let's try a new shiny one :D
-
hyc
booted up again with fresh firmware
-
hyc
got 34.89H/s this time, without largepages
-
hyc
it's a newer kernel build, september vs july.
-
hyc
43.35H/s with largepages. 1 thread.
-
hyc
78.71H/s 2 threads
-
hyc
132.54H/s 4 threads
-
hyc
that 34.89 matches felixonmars' result without largepages
-
hyc
I guess I'll run a 1M and 10M now
-
hyc
should be a little over 2 hours for 1M
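The estimate can be reproduced from the 4-thread hashrate quoted above (a sketch; it ignores dataset initialisation time, and `benchmark_hours` is just an illustrative helper):

```python
def benchmark_hours(nonces, hashrate_hs):
    """Hours needed to hash `nonces` nonces at a steady hashrate in H/s."""
    return nonces / hashrate_hs / 3600.0

rate_4t = 132.54  # H/s, 4 threads with largepages, as quoted in the chat

hours_1m = benchmark_hours(1_000_000, rate_4t)
hours_10m = benchmark_hours(10_000_000, rate_4t)
print(f"1M: {hours_1m:.2f} h, 10M: {hours_10m:.2f} h")
```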
-
pauliouk
hmm my android tv boxes got 1M in around an hour
-
hyc
sure but they have all the acceleration goodies
-
tevador
hyc: cool
-
tevador
hyc: is this the default build or native?
-
hyc
I just used "cmake .." no other options
-
tevador
OK, so the default one (rv64gc)
-
hyc
I guess we don't care about the 1M or 10M result on default build now? should I just stop this and rebuild with native?
-
tevador
The default build is more important because monerod will ship with it.
-
hyc
ok
-
hyc
I'll get the 1M and 10M results on default build then.
-
tevador
It seems that monero doesn't release binaries for risc-v yet. But it might be a good time to start.
github.com/monero-project/monero/releases/tag/v0.18.3.1
-
hyc
will have to see how stable the toolchains are. sipeed is still maintaining their own patches to gcc
-
hyc
I guess if we're doing generic rv64gc that shouldn't matter
-
tevador
yes, I think rv64gc has been stable for quite some time
-
hyc
about as appealing as raspberry pi ... bleah
-
pauliouk
meh, well beagle isn't auto connecting via ethernet, boots damn quick though. I just don't have a means of connecting to the damn thing. Got the screen hooked up, but can't think of a way offhand of 'forwarding' my keyboard to it through the usb-b micro cable :/
-
tevador
IMO it makes more sense than the armv7 build
-
pauliouk
and the uart->usb isn't connecting either by the looks of it
-
hyc
I have a bunch of wireless airmouse minikeyboards for my tvboxes. plugged one of those into usb
-
hyc
but the one I used right now, the mouse pointer isn't working. oh well
-
hyc
tevador yeah we should prob think about dropping all of the 32bit builds
-
pauliouk
beagle doesn't come with usb2/3 :| well it does, but I'm powering it from my usb3 on the 'host' machine
-
tevador
When this RandomX patch is included in monerod, I'm going to run a node on my risc-v board.
-
hyc
I'm prob gonna check how well it streams movies :P
-
hyc
I wonder if termux has risc-v binaries yet. if I decide to install android
-
pauliouk
finally! damn thing decided to wake up
-
pauliouk
Linux BeagleV 5.10.113-g52fbe8443ea1-dirty #1 SMP PREEMPT Tue Jul 11 17:16:44 UTC 2023 riscv64 riscv64 riscv64 GNU/Linux
-
tevador
cool
-
tevador
hyc: does your kernel list any isa extensions in /proc/cpuinfo?
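One way to answer that question programmatically (a sketch, not from the chat): RISC-V Linux kernels usually print an `isa` line such as `rv64imafdc`, sometimes followed by underscore-separated multi-letter extensions. The sample text used for testing is hypothetical, and `parse_isa` is an illustrative name.

```python
def parse_isa(cpuinfo_text):
    """Return (base, extensions) from the first 'isa' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "isa":
            parts = value.strip().lower().split("_")
            base = parts[0][:4]            # e.g. "rv64"
            single = list(parts[0][4:])    # single-letter extensions: i, m, a, ...
            return base, single + parts[1:]
    return None, []
```

On a real board this would be called as `parse_isa(open("/proc/cpuinfo").read())`. An extension that is absent from the list may still exist in hardware if the kernel is too old to report it.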
-
pauliouk
It still astounds me that this little circuit board, powered by a USB cable, runs faster than and does way more things than the $1600 PC from 25 years ago sitting in the cupboard next to me could ever dream of doing
-
pauliouk
granted, takes a bit longer to build xmrig from src on this than it does on a 7950x... but heck, it's smaller than my work pass
-
selsta
tevador: our depends build system allows for cross compiling to risc-v, we just don't have reproducible builds set up yet so no release
github.com/monero-project/monero/ac…ons/runs/6384731213/job/17327984104
-
tevador
What needs to be done to get reproducible builds for riscv64?
-
pauliouk
hyc, quick question - what flags should I use for compiling xmrig on this C910?
-
selsta
Would this be a Linux risc-v release? or which OS do most people use?
-
pauliouk
just grabbed the git and built libuv, openssl and hwloc as static libraries
-
tevador
selsta: yes, I would assume that linux-riscv64 would be the most useful release.
-
selsta
I think we would have to add g++-riscv64-linux-gnu here, and a couple more steps that hyc knows best
github.com/monero-project/monero/bl…ter/contrib/gitian/gitian-linux.yml
-
selsta
and something to HOSTS
-
tevador
selsta: I found a related bitcoin PR
bitcoin/bitcoin #13665
-
selsta
nice, shouldn't be too much work to adapt for our codebase. we also don't need changes for qt and related packages.
-
hyc
it's a 5.10 kernel
-
hyc
selsta yeah I think mostly it should be a simple dropin to gitian-linux.yml
-
hyc
I tried cmake -DARCH=native and it still just did -march=rv64gc ... hmm
-
hyc
ah yeah both zba and zbb give illegal instruction
-
hyc
so default and native are identical here
-
hyc
looking at that bitcoin PR, a lot of it is just updating autoconf to detect rv64 machine
-
hyc
heh. so we already have a 64core chip here. that 64bit thread affinity mask is going to need increasing soon ;)
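To make the limit concrete (a sketch, not the actual affinity code of any project): a fixed 64-bit mask has one bit per CPU, so it covers CPUs 0 to 63 and nothing beyond. The helper `affinity_mask` is hypothetical.

```python
MASK_BITS = 64  # width of a fixed 64-bit affinity mask

def affinity_mask(cpus, bits=MASK_BITS):
    """Build a bitmask with one bit per CPU index; reject CPUs that do not fit."""
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < bits:
            raise ValueError(f"CPU {cpu} does not fit in a {bits}-bit mask")
        mask |= 1 << cpu
    return mask

# All 64 cores of a 64-core chip use every bit of the mask.
print(hex(affinity_mask(range(64))))
```

A 64-core chip fills the mask completely; the first part with more cores would need a wider mask or a list of CPU indices instead.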
-
hyc
anyway, my 1M run matched
-
selsta
I'll try in the next few days to add riscv to gitian