09:00:36 @hyc someone tested SG2042 with tevador's code: https://user-images.githubusercontent.com/1006477/275177467-cc78752e-28d5-4293-b18e-787566b246ba.png
09:01:09 Only 1356 h/s. It means SG2042 in X5 must have hardware AES and FP vector instructions
09:01:25 Because it does 11780 h/s
09:01:35 8-9 times faster
09:05:34 yah just saw that
09:06:28 I wonder if the Zb extensions would have made much difference
09:06:42 Not much, they save only a few instructions
09:06:53 but vector extensions can make a huge difference
09:08:13 sg2042 / C920 definitely has vector
09:09:07 that's the main difference to C910
09:12:38 "C920 adopts a state of the art 12-stage out-of-order multiple issue superscalar pipeline with high frequency, IPC, and power efficiency, with a 128-bit vector unit implementing the RISC-V V Extension 0.7.1."
09:13:56 of course the current spec is 1.1, and will be 2.0 by the time it's finalized
09:16:18 he did run with 64 threads
09:16:30 that explains such a low hashrate
09:17:57 https://xrvm.com/product/xuantie/4224888731980599296
09:18:07 fp16 and fp32 vector
09:18:48 SG2042 also has L3 cache slicing (each group of 4 cores has "near" 4 MB of L3 cache)
09:19:01 I suspect the latency for these 4 MB will be much smaller
09:19:27 so scratchpad allocation will need to make OS API calls to arrange that
09:23:34 1MB L2 per 4-core cluster, 64MB L3 for entire chip
09:25:43 should support numactl etc
09:30:48 I thought only argon2 would take advantage of vector instructions?
09:32:49 https://github.com/riscv/riscv-v-spec/releases/tag/0.7.1
09:32:58 it's the oldest release there...
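As a quick check of the "8-9 times faster" figure, the ratio of the two hashrates quoted above can be computed directly. This is only a sketch: the variable names are made up for illustration, and the per-thread number simply divides the 1356 h/s result by the 64 threads mentioned, which says nothing about how the X5 figure was obtained.

```python
# Hashrates quoted in the discussion (h/s).
sg2042_generic = 1356.0   # SG2042 running tevador's code, 64 threads
x5_claimed = 11780.0      # figure attributed to the X5

# How many times faster the X5 figure is than the generic-code result.
speedup = x5_claimed / sg2042_generic

# Average contribution of each of the 64 threads in the generic run.
per_thread = sg2042_generic / 64

print(f"speedup: {speedup:.2f}x")
print(f"per-thread: {per_thread:.2f} h/s")
```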
10:28:40 you mean my poor little c910 beaglev-ahead is only going to scrape 100h/s :(
11:11:20 ROI in 2e2 years
11:22:15 to be fair, the IO is going to come in useful, although it's a bit overkill for controlling my greenhouse :D got some pi-zeros that'd do the job just as well :D
12:15:54 hyc vector instructions are used for regular RandomX instructions too, all RandomX FP registers are 128-bit (2x64)
12:46:23 so everyone is going to test out the pi5 then?
14:42:46 hmm. so one of the older chips, C906 also had RVV 0.7.1. Some folks have suggested C910 has it too, they just didn't document it.
14:43:54 just got my amazon delivery of the bits I needed for the beaglev if you want to check anything on it?
14:46:34 and this older doc also says it explicitly https://ftp.libre-soc.org/466100a052.pdf
14:48:23 ah yes, and this review says it too https://linuxgizmos.com/dev-kit-debuts-risc-v-xuantie-c910-soc-with-a-3d-gpu-and-android-and-linux-support/
14:48:50 so it looks like it'll be worthwhile to implement RVV 0.7.1 and see what difference it makes
14:49:32 vector 0.7.1 is an incompatible draft version, I'm not planning to implement it
14:49:44 hmmm. I need a few more ports on my ethernet switch...
14:50:17 tevador: yeah I know it's incompatible but nobody has the later versions anywhere do they?
14:50:35 and these chips are on the market now. might as well see how much it helps
14:50:55 the first chips with V 1.0 are expected perhaps next year
14:51:39 I may take a stab at it. we can always #ifdef it away later.
14:54:23 It will be hard to do. Some asm mnemonics are the same for V0.7.1 and V1.0, but the binary code is different.
14:56:41 It could make sense for xmrig, but I don't think it's worth it for the RandomX library. And you can't expect competitive hashrates without hardware AES anyways.
14:57:15 good point
14:58:14 SG2042R - the "R" could mean RandomX and they could have some custom extensions for it. RISC-V has opcode space reserved for custom instructions.
15:13:16 Do you mean they could've implemented some RandomX instructions 1:1?
15:13:23 Not just AES?
15:31:09 heh then it really would be an ASIC after all
15:31:19 Yes, some helper instructions for RandomX. We can't rule that out without seeing the risc-v binaries.
15:32:02 well I've booted up my licheepi4a but they recommend I update the firmware image. it's got debian preloaded
15:33:36 maybe we can live in a world where all major CPU manufacturers add RandomX helper instructions (:
15:34:44 A helper instruction to load data from a masked scratchpad address (L1/L2 mask) would be very useful
15:34:56 It could save 2-3 instructions on every scratchpad reading instruction
15:35:01 tevador ^
15:35:16 Yes, that's one of the instructions I had in mind.
15:40:21 randomx-tests passes. skipped randomx_reciprocal_fast and cache init sse /avx. no surprise there
15:40:57 Yes, those are x86-only.
15:41:32 running the 10M benchmark now
15:43:57 mem init was 29.5805s with 4 threads
15:44:00 What is your hashrate?
15:44:11 10M might take 1 day or more
15:44:17 I should've tested 1M first
15:44:19 lemme kill this
15:44:37 for me 10M took about 38 hours.
15:45:34 yeah. I only got 46.77 H/s
15:45:40 on 1000 nonces
15:46:37 1 thread?
15:46:44 4 threads
15:47:16 and with hugepages
15:47:41 Should be more. It's a more powerful chip than the JH7110 I have.
15:47:49 23.933 H/s with 2 threads
15:48:15 yeah seems a bit too slow
15:48:31 Should be > 100 H/s without large pages based on the results from felixonmars.
15:48:51 maybe the newer firmware will improve that
15:49:20 I don't believe the chip is throttling, I've got the heatpad and fan mounted
15:51:45 but the results are consistent. 11.998H/s 1 thread
15:53:15 gcc (Debian 13.2.0-4revyos1) 13.2.0
15:53:25 I wonder if the compiler is just bad
16:04:29 11.28H/s without largepages
16:05:05 memory init 38.6s
16:13:32 Lichee Pi 4A is TH1520 quad core. The benchmarks in the PR show 35 H/s with 1T and 104 H/s with 4T.
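On the helper-instruction idea from 15:34: in software, a scratchpad read first adds an immediate to a register, then masks the result so it lands inside the L1 or L2 region on an 8-byte boundary, and only then loads. A rough Python sketch of those steps follows, assuming the default RandomX scratchpad sizes (16 KiB L1, 256 KiB L2, 2 MiB total). The function and constant names are illustrative and are not taken from the RandomX source.

```python
import struct

SCRATCHPAD_L1 = 16 * 1024        # assumed default L1 region size in bytes
SCRATCHPAD_L2 = 256 * 1024       # assumed default L2 region size in bytes
SCRATCHPAD_L3 = 2 * 1024 * 1024  # assumed full scratchpad size in bytes

# Keep the address inside the region and aligned to 8 bytes.
MASK_L1 = (SCRATCHPAD_L1 - 1) & ~7
MASK_L2 = (SCRATCHPAD_L2 - 1) & ~7


def masked_load(scratchpad: bytearray, reg: int, imm: int, mask: int) -> int:
    """Add, mask, then load a little-endian 64-bit word.

    These are the separate steps that a single hypothetical helper
    instruction could fuse.
    """
    addr = (reg + imm) & mask                             # add + and
    return struct.unpack_from("<Q", scratchpad, addr)[0]  # load


scratchpad = bytearray(SCRATCHPAD_L3)
struct.pack_into("<Q", scratchpad, 0x10, 0xDEADBEEF)

# 0x4013 is outside L1 and unaligned; the mask folds it back to 0x10.
value = masked_load(scratchpad, reg=0x4013, imm=0, mask=MASK_L1)
print(hex(value))
```

On a plain rv64gc build each of the commented steps costs at least one instruction, which is where the estimated saving of 2-3 instructions per scratchpad read would come from.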
16:14:05 My tests are with gcc (Debian 12.2.0-10) 12.2.0
16:26:33 yeah I wonder which board felixonmars used
16:27:29 looks like they use the same u-boot setup as arm android. flash mode comes up as an android fastboot usb device. same tool is used.
16:35:53 hmm time to get the beagle started up... turns out the random micro-usb cable I had in my drawer (don't remember buying it, so probably found it years and years ago) is urm, well poo. So let's try a new shiny one :D
16:37:03 booted up again with fresh firmware
16:38:01 got 34.89H/s this time, without largepages
16:38:40 it's a newer kernel build, september vs july.
16:39:29 43.35H/s with largepages. 1 thread.
16:40:31 78.71H/s 2 threads
16:41:26 132.54H/s 4 threads
16:42:12 that 34.89 matches felixonmars' result without largepages
16:45:13 I guess I'll run a 1M and 10M now
16:46:16 should be a little over 2 hours for 1M
16:48:46 hmm my android tv boxes got 1M in around an hour
16:49:19 sure but they have all the acceleration goodies
16:49:39 hyc: cool
16:54:07 hyc: is this the default build or native?
16:54:26 I just used "cmake .." no other options
16:54:37 OK, so the default one (rv64gc)
16:56:33 I guess we don't care about the 1M or 10M result on default build now? should I just stop this and rebuild with native?
16:59:39 The default build is more important because monerod will ship with it.
16:59:48 ok
17:00:14 I'll get the 1M and 10M results on default build then.
17:03:03 It seems that monero doesn't release binaries for risc-v yet. But it might be a good time to start. https://github.com/monero-project/monero/releases/tag/v0.18.3.1
17:04:01 will have to see how stable the toolchains are. sipeed is still maintaining their own patches to gcc
17:05:01 I guess if we're doing generic rv64gc that shouldn't matter
17:05:16 yes, I think rv64gc has been stable for quite some time
17:06:15 about as appealing as raspberry pi ...
bleah
17:07:39 meh, well beagle isn't auto connecting via ethernet, boots damn quick though. I just don't have a means of connecting to the damn thing. Got the screen hooked up, but can't think of a way offhand of 'forwarding' my keyboard to it through the usb-b micro cable :/
17:07:43 IMO it makes more sense than the armv7 build
17:08:20 and the uart->usb isn't connecting either by the looks of it
17:11:35 I have a bunch of wireless airmouse minikeyboards for my tvboxes. plugged one of those into usb
17:11:49 but the one I used right now, the mouse pointer isn't working. oh well
17:13:09 tevador yeah we should prob think about dropping all of the 32bit builds
17:14:16 beagle doesn't come with usb2/3 :| well it has, but I'm powering it from my usb3 on the 'host' machine
17:16:59 When this RandomX patch is included in monerod, I'm going to run a node on my risc-v board.
17:19:04 I'm prob gonna check how well it streams movies :P
17:20:00 I wonder if termux has risc-v binaries yet. if I decide to install android
17:25:45 finally! damn thing decided to wake up
17:26:56 Linux BeagleV 5.10.113-g52fbe8443ea1-dirty #1 SMP PREEMPT Tue Jul 11 17:16:44 UTC 2023 riscv64 riscv64 riscv64 GNU/Linux
17:30:14 cool
17:31:34 hyc: does your kernel list any isa extensions in /proc/cpuinfo?
17:53:51 It still astounds me that this little circuit board, powered by a USB cable, runs faster than and does way more things than the $1600 PC from 25 years ago sitting in the cupboard next to me could ever dream of doing
17:59:09 granted, takes a bit longer to build xmrig from src on this than it does on a 7950x... but heck, it's smaller than my work pass
18:00:57 tevador: our depends build system allows for cross compiling to risc-v, we just don't have reproducible builds set up yet so no release https://github.com/monero-project/monero/actions/runs/6384731213/job/17327984104
18:02:52 What needs to be done to get reproducible builds for riscv64?
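For the /proc/cpuinfo question at 17:31, a minimal sketch of pulling the ISA string out of cpuinfo-style text and splitting it into the base, the single-letter extensions, and any underscore-separated multi-letter extensions. The `parse_isa` helper and the sample text are hypothetical; the sample is not the output of any board mentioned here.

```python
def parse_isa(cpuinfo_text):
    """Collect ISA strings from cpuinfo-style text.

    Returns one (base, single_letter_exts, multi_letter_exts) tuple
    per "isa" line found.
    """
    results = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or key.strip() != "isa":
            continue
        isa = value.strip().lower()
        head, *multi = isa.split("_")       # multi-letter names follow "_"
        base, letters = head[:4], list(head[4:])  # e.g. "rv64" + "imafdc"
        results.append((base, letters, multi))
    return results


# Hypothetical sample, not the contents of any paste linked in the chat.
sample = (
    "processor\t: 0\n"
    "hart\t\t: 0\n"
    "isa\t\t: rv64imafdc_zicsr_zifencei\n"
    "mmu\t\t: sv39\n"
)
parsed = parse_isa(sample)
print(parsed)
```

On a real board the same function could be fed the text read from /proc/cpuinfo. Whether a listed extension actually executes is a separate question that only a test run answers.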
18:03:10 hyc, quick question - what flags should I use for compiling xmrig on this C910?
18:06:28 Would this be a Linux risc-v release? or which OS do most people use?
18:07:36 just grabbed the git and built libuv, openssl and hwloc as static libraries
18:08:44 selsta: yes, I would assume that linux-riscv64 would be the most useful release.
18:09:30 I think we would have to add g++-riscv64-linux-gnu here, and a couple more steps that hyc knows best https://github.com/monero-project/monero/blob/master/contrib/gitian/gitian-linux.yml
18:10:25 and something to HOSTS
18:18:39 selsta: I found a related bitcoin PR https://github.com/bitcoin/bitcoin/pull/13665
18:24:52 nice, shouldn't be too much work to adapt for our codebase. we also don't need changes for qt and related packages.
20:49:42 tevador: /proc/cpuinfo https://paste.debian.net/hidden/5f97a3f0/
20:49:49 it's a 5.10 kernel
20:51:03 selsta yeah I think mostly it should be a simple drop-in to gitian-linux.yml
20:52:22 I tried cmake -DARCH=native and it still just did -march=rv64gc ... hmm
20:55:42 ah yeah both zba and zbb give illegal instruction
21:01:57 so default and native are identical here
21:07:42 looking at that bitcoin PR, a lot of it is just updating autoconf to detect rv64 machine
21:17:36 heh. so we already have a 64core chip here. that 64bit thread affinity mask is going to need increasing soon ;)
21:22:02 anyway, my 1M run matched
21:37:58 I'll try in the next few days to add riscv to gitian
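On the 21:17 remark about the 64-bit thread affinity mask: a fixed 64-bit mask has one bit per CPU, so it can name CPUs 0 through 63 and nothing beyond. A small Python sketch of that limit follows; `affinity_mask` is a hypothetical helper written for illustration, not xmrig's actual code.

```python
MASK_BITS = 64  # width of a fixed-size affinity mask


def affinity_mask(cpus, bits=MASK_BITS):
    """Return an integer with one bit set per CPU index.

    Raises ValueError when an index does not fit in the mask width.
    """
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < bits:
            raise ValueError(f"CPU {cpu} does not fit in a {bits}-bit mask")
        mask |= 1 << cpu
    return mask


all_64 = affinity_mask(range(64))   # every bit set: 64 cores just fit

try:
    affinity_mask([64])             # a 65th core has no bit available
    overflowed = False
except ValueError:
    overflowed = True

print(hex(all_64), overflowed)
```

A 64-core SG2042 uses every bit of such a mask, so any larger part would need a wider or variable-length representation.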