17:16:19 just got a linkedin request from a sales manager at Bitmain.
17:16:35 anyone else?
17:16:51 from Etsuka Tomonaga
18:20:45 hyc nope
18:21:18 On RandomX v2 topic: increasing program size does increase the amount of +inf FP values, but the main culprit is not FMUL, but FDIV_M instruction
18:22:20 reducing FDIV_M from 4 to 3 and increasing FSQRT_R from 6 to 7 brings the amount of +inf values to only 1.5x higher than v1 levels (it was 6.5x higher without the fix)
18:22:35 I'm testing with program size 384
18:23:15 With these parameters, v2 program still has ~4.5 FDIV_M instructions per program, which is more than in v1
18:24:27 So program size = 384, RANDOMX_FREQ_FDIV_M = 3, RANDOMX_FREQ_FSQRT_R = 7 are the tentative values for RandomX v2
18:32:45 "About 2% of programs produce at least one infinity value."
18:32:59 For v2 with the above parameters, it's 2.8%
18:36:27 Also, changing RANDOMX_FREQ_FDIV_M from 4 to 3 and RANDOMX_FREQ_FSQRT_R from 6 to 7 is very convenient to implement - need to change just 2 neighboring values in the instruction table
18:37:47 Actually, just one value
18:39:29 Did one more test without changing any frequencies - got 6.85% of programs with at least one infinity value
18:39:34 I think it's acceptable too?
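The "~4.5 FDIV_M instructions per program" figure can be checked with a little arithmetic. This is only a sketch: it assumes the RANDOMX_FREQ_* values sum to 256 and that v1 uses a program size of 256 (neither is stated in the log), and `expected_count` is a hypothetical helper, not a function from the RandomX code.

```python
FREQ_TOTAL = 256  # assumption: all RANDOMX_FREQ_* values add up to 256


def expected_count(program_size, freq, freq_total=FREQ_TOTAL):
    """Average number of instructions of one type in a random program."""
    return program_size * freq / freq_total


# v1: assumed program size 256, FDIV_M frequency 4 (the "from 4" in the log)
v1_fdiv_m = expected_count(256, 4)
# tentative v2 values from the log: program size 384, FDIV_M = 3, FSQRT_R = 7
v2_fdiv_m = expected_count(384, 3)
v2_fsqrt_r = expected_count(384, 7)

print(v1_fdiv_m, v2_fdiv_m, v2_fsqrt_r)
```

Under those assumptions v2 averages 4.5 FDIV_M per program against 4 in v1, which is consistent with "more than in v1" even after lowering the frequency.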
18:40:16 Made preliminary changes and pushed test vectors https://git.gammaspectra.live/P2Pool/go-randomx/commit/7f9393533a90e89be97344f93a8e8359bcb957e0
18:41:42 For v1, 85% of all hashes never have any +inf value during execution
18:42:09 For v2 without instruction frequency changes, it's 56.7%
18:42:18 I think it's better to have +inf values more often
18:42:45 So an ASIC must implement support for them, or have almost every second hash invalid
18:43:27 in the semifloat code I saw, the inf path was done slow as it was very unlikely to be hit, indeed
18:43:57 Even without frequency changes, 0.12% of individual group E values are +inf after a main loop iteration
18:44:57 (1-0.0012)^8=0.99
18:45:08 so 99% of program iterations don't have +inf in group E registers
18:45:11 I think it's fine
18:45:31 Unsure whether it's been pointed out, but if a hypothetical ASIC does not implement infinities, it can early out at the first infinity it encounters, so N% of programs yielding an infinity anywhere means less than N% hash rate loss.
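The per-hash and per-iteration percentages quoted above follow from the per-program and per-value rates if the events are treated as independent. A minimal sketch, assuming 8 programs per hash and 8 group E values per iteration (4 registers holding 2 doubles each); both counts are assumptions about RandomX rather than something stated in the log, and `none_in_n` is a hypothetical helper.

```python
def none_in_n(p, n):
    """Probability that none of n independent trials hits, each with chance p."""
    return (1.0 - p) ** n


PROGRAMS_PER_HASH = 8  # assumption: 8 programs are executed per hash
GROUP_E_VALUES = 8     # assumption: 4 group E registers x 2 doubles each

# ~2% of v1 programs and 6.85% of v2 programs produce at least one +inf
v1_clean_hashes = none_in_n(0.02, PROGRAMS_PER_HASH)
v2_clean_hashes = none_in_n(0.0685, PROGRAMS_PER_HASH)
# 0.12% of individual group E values are +inf after a main loop iteration
clean_iterations = none_in_n(0.0012, GROUP_E_VALUES)

print(f"{v1_clean_hashes:.1%} {v2_clean_hashes:.1%} {clean_iterations:.1%}")
```

With these inputs the results round to the 85%, 56.7% and 99% figures in the log, so the measured per-program rates and per-hash rates agree with each other.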
18:45:33 It won't hurt scratchpad entropy, because it does AES now anyway
18:46:47 moneromooo true
18:46:58 but I think RandomX v1 doesn't have enough +inf values
18:47:19 v2 has a bit more, but not too much - so it's good
18:47:31 I don't think we need to change instruction frequencies at all
18:48:50 reaching inf also allows short path operations afterward, though
18:49:02 inf sticks
18:49:08 "99% of program iterations don't have +inf in group E registers"
18:49:16 It can't give more than 1% speedup
18:49:56 I need to add some more counters to check this number
18:50:24 yeah, I'll add some metrics to mine, lemme see
18:50:33 also, when there is an infinity, it's usually just one of 4 group E registers
18:54:05 https://paste.debian.net/hidden/e6aa1a28
18:54:43 This is randomx-benchmark binary, running in interpreter mode with added counters
18:55:34 So 0.77%, and when it does happen, it's almost always just one group E register
18:55:54 so the theoretical speedup of "sticky +inf" optimization will be something like 0.2%
18:57:10 DataHoarder you can revert instruction frequencies to the old values :)
18:57:17 yeah will do :)
18:57:22 it's on the bench/testing branch
19:02:00 you are only checking after the interpreter exits the loops right?
19:02:13 lemme check all total operations
19:04:30 Yes, after loop exit
19:04:43 Because as you said, infinity sticks
19:12:48 Of the tasks in https://github.com/tevador/RandomX/pull/274 , almost all is done. Only the "New PowerPC intrinsics" and "Update documentation" left
19:13:00 I mean, done in https://github.com/SChernykh/RandomX/tree/v2 branch
19:13:19 I don't have any big endian PPC for testing though
19:13:56 I can of course copy over the fallback intrinsic code there and call it a day :)
19:14:14 because the fallback code passes the tests on s390x which is big endian
20:12:16 Updated the documentation. That's basically it, we can start testing v2 on different systems.
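The "something like 0.2%" estimate for the sticky +inf optimization can be reproduced with a back-of-envelope model. This is one assumed reading of the numbers, not the method used in the log: it treats the work saved as proportional to the share of group E registers holding +inf, and `sticky_inf_bound` is a hypothetical name.

```python
GROUP_E_REGISTERS = 4  # from the log: "one of 4 group E registers"


def sticky_inf_bound(iterations_with_inf, inf_registers=1,
                     total_registers=GROUP_E_REGISTERS):
    """Rough ceiling on the speedup from skipping work on +inf registers.

    Assumed model: in an affected iteration, only the fraction of
    group E registers that hold +inf can take the short path.
    """
    return iterations_with_inf * inf_registers / total_registers


# 0.77% of iterations end with +inf, almost always in a single register
bound = sticky_inf_bound(0.0077)
print(f"{bound:.2%}")
```

Under this model the ceiling is a little under 0.2%, in line with the figure quoted in the log.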
22:12:02 What kind of targeted testing is being looked for V2?
23:17:53 V1 vs V2 hashrate and power at the wall on different mining rigs. I'll dig up my old PCs (3700X and 5600X) from the basement tomorrow
23:18:49 Also, v2 hashrate vs program size graph. Since it's not in xmrig yet, randomx-benchmark with 100K nonces will do (but only with large pages + MSR enabled)
23:32:19 I guess we can also use the brand specific cpu monitoring