12:46:41 <gingeropolous> anyone getting one of these RISCV miners?
13:43:24 <plowsof> They "sold out"(tm) in a few seconds
13:49:43 <kico> you mean in bitmains website?
13:50:05 <kico> https://mineshop.eu/antminer-x5/
13:50:43 <kico> sigh 3,299.87
13:58:35 <nioc> They don't say how loud it is :D
13:59:28 <nioc> Also needs 220v service
13:59:46 <nioc> 200-240v
13:59:53 <m-relay> <hbs:matrix.org> I am surprised by the weight
14:02:54 <sech1> plowsof so GPU scalpers jumped on this too?
14:38:26 <kico> iBet these "resellers" just bought the whole batch
14:39:55 <hyc> that'd be most likely
14:40:14 <hyc> hell they may have been preordered, which is why it "sold out in seconds"
14:46:32 <sech1> so Bitmain's price has never been the real price
14:53:49 <tevador> When are they shipping? Would be interesting if someone could post some pics of the board without heatsinks.
15:41:53 <kico> word wish I could get one just to tear it apart :P
15:42:22 <kico> tevador, mineshop.eu says "In stock shipping in 7-10 days"
15:42:31 <kico> "Available Sept 10 – 30th sept 2023"
15:42:52 <kico> but god knows they might be taking pre orders to then order or smth
15:46:20 <sech1> "naked board without heatsinks" lol, sounds like porn
15:46:46 <sech1> someone will eventually publish the pictures, with all part numbers
15:58:21 <tevador> It's probably something based on XuanTie C910.
16:05:17 <sech1> probably, but it has to be heavily customized for RandomX
16:05:24 <sech1> XuanTie C910 doesn't have crypto instructions (?)
16:10:27 <tevador> I don't think it would be worth it to design a custom chip just for Monero. It's most likely a stock C910 with 4 cores/8 MB of cache per cluster.
16:10:54 <tevador> At most they could have added a coprocessor for the scratchpad expansion, but most likely they just dedicated a few cores for that.
16:59:17 <sech1> They probably have multiple chips on the board. My nonce analysis showed 52 mining threads in each of the 10 horizontal lines, so around 520 mining cores per device
16:59:33 <sech1> so probably 16x4 core chips, but only 13 are active
17:05:53 <sech1> These chips are TSMC 12nm by the way
17:13:37 <hyc> 12nm? how do you know? I think even my car radio's ARM chip is on 8nm
17:16:21 <sech1> It's on the last page https://img.102.alibaba.com/1627958419409/49652c9412c41cb6f39b36fed1244e6e.pdf
17:18:12 <hyc> cool thanks
17:18:24 <m-relay> <xfedex:matrix.org> 12 nm is really a lot. I guess that if RISC-V CPUs are underdeveloped.
17:20:27 <hyc> configs up to 8MB L2 cache. that'd be what they'd need for 4 cores
17:22:40 <hyc> any idea what the XIE instruction extensions do?
17:26:23 <hyc> there is also a C920 64-core chip, same instruction features
17:28:32 <tevador> C910 is open source: https://github.com/T-head-Semi/openc910
17:29:56 <hyc> yes, I already saw that https://github.com/T-head-Semi/openc910/issues/17
17:30:01 <hyc> while looking for the XIE docs.
17:30:20 <hyc> but doesn't seem like XIE has anything to help crypto
17:34:27 <sech1> If they really use it "as is", the only way to nerf it is to have moooore AES :D
17:35:01 <sech1> but AES is already like 5-8% of hash time on Ryzen
17:35:13 <tevador> AES is too ASIC friendly
17:35:52 <tevador> AFAIK there are no chips that support the crypto extension yet
17:37:08 <sech1> It's possible to squeeze some AES in step 11 in https://github.com/tevador/RandomX/blob/master/doc/specs.md#462-loop-execution
17:37:26 <sech1> because next come steps 1-2 with 30-50 clock cycles wait for read from L3
17:37:30 <sech1> which is not used by anything yet
17:37:54 <hyc> C series instruction manual:
17:37:58 <hyc> https://github.com/rjiejie/XuanTie-doc/blob/807dca46f8f67df9a65192d7aa35e9965d6272af/XuanTieCseriesinstructionmanual.pdf
17:38:11 <sech1> for example, registers f0-f3 could go through a few AES round before writing them back to scratchpad
17:38:16 <sech1> just generating ideas now
17:40:13 <sech1> 30-50 clock cycles window means we could have 8 AES rounds done on 4 registers (32 instructions in total) on each loop iteration
17:40:22 <sech1> Assuming AES latency is 4 cycles
17:41:30 <sech1> hyc yep, I don't see any crypto instructions
17:45:00 <sech1> 32 AES instructions per loop => 524288 additional AES instructions per hash
17:45:19 <sech1> Right now one RandomX hash does 262144 AES instructions
17:45:27 <sech1> so it's 3x increase on AES side, almost for free
17:45:43 <sech1> I like it
17:45:46 <tevador> We'd have to benchmark the impact on light mode perf.
17:46:06 <sech1> Why light mode? Step 11 is the same in light mode
17:46:18 <sech1> "The values of registers f0-f3 are written to the Scratchpad (L3) at address spAddr0 (64-byte aligned)."
17:46:24 <sech1> This is where I suggest to add AES
17:46:48 <sech1> and the L3 read delay in steps 1-2 will also be the same
17:46:56 <sech1> AES will interleave with this L3 read delay
17:48:09 <tevador> If it fits in L3 cache delay, it's probably fine.
17:48:21 <sech1> also, it will improve scratchpad entropy
17:48:35 <sech1> because f0-f3 are not totally random when we store them
17:50:11 <tevador> The only complication is the soft AES mode, which would probably need to call a subroutine there. Currently we don't have any AES in VM code.
17:51:45 <sech1> yes, but this subroutine would be just a bunch of AES calls, easy to write
17:52:15 <sech1> soft AES becomes less and less relevant (for x64 and ARM)
17:52:25 <sech1> newer RISC-V CPUs will also have it
17:54:12 <tevador> For example, Raspberry Pi 4 doesn't have AES, so it would be impacted. But if the verification time doesn't increase by more than a few percent, it's probably fine.
17:55:15 <sech1> IIRC soft AES mode spends around 30% of time calculating AES
17:55:44 <sech1> so we could add 4 AES rounds per loop (16 instructions), doubling the total AES instruction count
17:55:58 <sech1> which would mean ~50% of time for AES in soft AES mode
17:56:36 <sech1> or 30% slowdown for CPUs without AES, if my math is right
17:57:01 <sech1> "CPUs without AES" = "Bitmain professional miners without AES"
17:58:00 <sech1> Raspberry Pi, I hate them. They cheaped out on crypto extensions. Literally all other SBCs have them.
17:59:53 <sech1> also, RPi4 was released in 2019, and we're discussing tweaks for future CPUs, tweaks that will go live in 2024 (RPi4 will be 5 years old by then).
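[Note: the AES figures quoted above can be rechecked with a short calculation. This is only a sketch under stated assumptions, not part of RandomX or of any proposal. It assumes the default RandomX parameters of 8 programs per hash and 2048 loop iterations per program (which is what 524288 / 32 = 16384 implies), takes the 262144 current AES instructions, the 4-cycle AES latency and the 30% soft-AES share directly from the chat, and treats the four registers as independent AES chains. All function names are hypothetical.]

```python
# Assumed RandomX defaults: 8 programs per hash, 2048 iterations per program.
PROGRAM_COUNT = 8
PROGRAM_ITERATIONS = 2048
# Figure quoted in the chat for the current design.
CURRENT_AES_PER_HASH = 262144


def aes_totals(rounds_per_loop, registers=4):
    """Return (AES instructions per loop, added per hash, total/current ratio)."""
    per_loop = rounds_per_loop * registers
    added = per_loop * PROGRAM_COUNT * PROGRAM_ITERATIONS
    ratio = (CURRENT_AES_PER_HASH + added) / CURRENT_AES_PER_HASH
    return per_loop, added, ratio


def chain_latency_cycles(rounds_per_loop, aes_latency=4):
    """Latency of one register's dependent AES chain.

    Assumes the chains for the other registers are independent and
    overlap with this one, so the longest chain sets the window needed.
    """
    return rounds_per_loop * aes_latency


def soft_aes_estimate(aes_share, aes_multiplier):
    """Return (relative slowdown, new AES share of hash time) for soft AES.

    aes_share: fraction of hash time currently spent in software AES.
    aes_multiplier: factor by which the AES instruction count grows.
    """
    new_time = (1 - aes_share) + aes_share * aes_multiplier
    new_share = aes_share * aes_multiplier / new_time
    return new_time - 1, new_share


if __name__ == "__main__":
    print(aes_totals(8))              # 8 rounds on f0-f3
    print(aes_totals(4))              # 4 rounds on f0-f3
    print(chain_latency_cycles(8))    # compare against the 30-50 cycle window
    print(soft_aes_estimate(0.30, 2.0))
```

[With 8 rounds the total is three times the current count, and with 4 rounds it is doubled, matching the numbers in the chat. Doubling a 30% soft-AES share gives a 30% slowdown and an AES share of roughly 46%, which is close to the "~50%" quoted.]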
18:03:20 <elucidator> they left it out for cost reasons tbh, not everybody gets to use free instruction sets :P
18:06:10 <sech1> Yes, but RPi4 is more like an exception, not a rule
18:06:30 <sech1> I like this AES tweak also because it improves scratchpad entropy
18:06:40 <tevador> What is relevant is the impact on light mode verification time for low end hardware that might be running a Monero node. We don't want to increase the hardware requirements for running a node.
18:07:50 <sech1> I have RPi 3b+ for testing
18:08:07 <sech1> It already takes almost a second to verify a hash (light mode + soft AES mode)
18:08:17 <sech1> It doesn't have enough RAM to run full dataset mode
18:08:39 <sech1> Or my memory betrays me
18:08:46 <sech1> I remember it could do something around 20 h/s when mining
18:08:53 <sech1> so 5 h/s per core, or 200 ms to verify a hash
18:09:24 <sech1> so with this AES tweak, it would be 30% more - 260 ms
18:12:44 <elucidator> 18:06:10 <@sech1> Yes, but RPi4 is more like an exception, not a rule => yeah ofc. i got all generations of rpis i can test with, rpi4 is my home node. can test it and report
18:44:43 <elucidator> welp we don't have aarch xmrig handy, got it and testing now
18:47:20 <elucidator> > miner speed 10s/60s/15m 102.5 n/a n/a H/s max 103.0 H/s
18:49:56 <elucidator> https://qu.ax/bDmY.png well now it's worse than that, i think i should shutdown monerod
18:55:18 <elucidator> yeah barely 90h/s https://qu.ax/Dko.png
19:00:49 <sech1> You need to enable huge pages
19:06:21 <elucidator> tried with sudo, let me check other options
19:09:35 <elucidator> i think this needs a kernel recompile
19:15:45 <sech1> it should be more than 100 h/s with huge pages
23:41:32 <gingeropolous> so riscv CPUs are unwanted in the monero mining ecosystem?
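[Note: the RPi 3b+ verification-time estimate from the 18:08-18:09 messages can be reproduced as follows. This is a rough sketch, not a benchmark. It assumes the recalled 20 h/s is spread evenly over 4 cores, that the per-core mining rate equals the single-hash verification rate, and that the 30% soft-AES slowdown discussed earlier applies unchanged. The function names are hypothetical.]

```python
def verify_time_ms(total_hashrate, cores):
    """Time to verify one hash on one core, in milliseconds.

    Assumes the total hashrate divides evenly across cores and that
    verification runs at the same speed as mining on a single core.
    """
    per_core = total_hashrate / cores
    return 1000.0 / per_core


def with_slowdown(time_ms, slowdown):
    """Apply a relative slowdown (0.30 means 30% slower) to a time."""
    return time_ms * (1 + slowdown)


if __name__ == "__main__":
    base = verify_time_ms(20, 4)       # 20 h/s recalled in the chat, 4 cores
    tweaked = with_slowdown(base, 0.30)
    print(round(base), round(tweaked))
```

[Under these assumptions the result matches the 200 ms and 260 ms figures given in the chat. The earlier "almost a second" recollection would give a different baseline, so the absolute numbers are only as good as the recalled hashrate.]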