00:00:27 Been doing that today, it takes a while to run. Collecting more samples from both before and after
00:09:44 cool thanks
00:54:11 * jberman[m] uploaded an image: (32KiB) < https://libera.ems.host/_matrix/media/r0/download/matrix.org/rMhVIJPdfaxcfSbUjrhiILJF/get_outs%20(pre%20vs.%20post-fix).png >
00:54:56 Pausing my node from syncing enabled deterministic results to compare, and I could reduce simulation size to 10k, so it wouldn't take as long
00:55:31 So this means the pre-fix just has a longer tail. The final piece to the puzzle I think is comparing to observed on-chain data
00:59:39 My first thought is that the difference between pre- and post-fix here suggests it would be ok to release the fix. Because the difference is so subtle, I don't believe you would be able to single out which transactions have the fix and which don't
01:01:46 * jberman[m] posted a file: (318KiB) < https://libera.ems.host/_matrix/media/r0/download/matrix.org/jdnlviZqCZtErrjDIVKmbmgc/get_outs%20(old%20vs%20new).ods >
01:01:47 That's my data
01:03:51 *pre-fix has a fatter tail, not just longer tail
01:29:23 By deterministic results do you mean starting from the same entropy as well?
01:33:05 Because the algorithm factors in block density from that average output time calculation, while a node syncs the distribution of density shifts as well, so it throws off a way to get a similar distribution after many simulations
01:33:05 Deterministic wasn’t the right word, not sure what is
01:33:56 It also means the distribution produced by the client is sort of a moving target, which makes analysis a bit more challenging here
01:35:13 when secparam[m] said he expected a peak at 1300, I wonder if that is actually a peak at '1300 outputs ago'. Currently blocks have 24 txs on average, so ~48-55 outputs on average. At 20 blocks that is ~1000 outputs.
01:36:44 which maybe means the algorithm's bug is thinking about block density in the first place
01:47:06 This is my suspicion as well
01:57:26 It would seem that factoring in density doesn’t mesh well with the expected spend time distribution, and that’s been nagging me. Like if we know that most outputs are spent after 10 blocks - 20 blocks, but there is a super dense block 25 blocks out, I’m thinking that would cause the algorithm to bias toward selecting an output from that 25 block out block, rather than a more recent block
01:57:26 Comparing to on-chain data and observing dual peaks in the on-chain data would provide the answer we’re looking for I think. Won’t be by my computer to continue on this for a while though
04:10:10 For what it's worth, I'd cross-check our data analysis first. See if we haven't screwed something up parsing the chain or the like.
12:19:51 At some level, there's going to be a tradeoff between density and time in the selection process. This is what led to discussions of trying to account for density in the first place
12:21:08 If block density variance means anonymity sets overselect coinbase outputs, that's problematic. If you try to account for it and shift away from a desired time distribution, that's also problematic
14:49:54 Ya makes sense. The question I’m trying to get at is how problematic if at all it has been in practice, also taking into account that bug. Could very well be not problematic at all in practice, and the on chain data would provide some clarity I think. Going to continue analysis this weekend
18:30:53 (thank you for putting all this effort on it, jberman , awesome stuff mate)
19:43:21 Yeah @jberman[m] this is fascinating, really impressive work. You’ve got a good eye
19:46:32 Do we have a back of the envelope calculation for what kind of transaction volume would trigger the divide by 0 error?
19:46:42 Actually, output volume is probably more important
19:47:07 If I wanted to trigger intentionally I would be making 16-output txns to speed the process along
19:48:57 Roughly >100 outputs per block I think, over a 1 year period.
19:50:45 Lol no sorry, 121 outputs per block.
19:55:52 120.00001 outputs per block will be enough
19:56:02 each transaction has at least 2 outputs, so it's 60 transactions per block
19:56:19 or ~43200 tx per day which is 2 times more than we have today
20:08:03 Interesting
20:08:14 Or 8 txns per block with 16 outputs each
20:10:25 So at current txn volume this is a nontrivial attack, but as organic output volume increases the barrier to trigger actually comes closer
20:11:06 That’s a funny quirk
20:24:54 Credit to @sech1 on recognizing this as an attack vector, I was too focused on the privacy implications to see it
20:55:40 I am looking at onchain data and I don't see a peak at 20 but rather at 11 (= 10 block lock + 1 block)
20:55:45 Here is the distribution since April 2019 https://paste.debian.net/hidden/ba302d8d/
20:56:07 and here is the distribution since June 2021 (last month) https://paste.debian.net/hidden/94ffaf80/
21:16:16 well, April 2019-today dist peak is 11, June 2021-today dist peak is 12
21:42:48 jberman sech1 what was the attack? I think some messages didn't come through
21:51:29 chad https://github.com/monero-project/monero/pull/7798#issuecomment-881570800
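
[editor's note] The back-of-the-envelope numbers quoted between 19:48 and 20:08 can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the chat: the 120 outputs-per-block threshold is taken from the log as stated and is not derived here, and `BLOCKS_PER_DAY = 720` is an assumption based on a 2-minute block target (it is the value that makes the quoted ~43200 tx/day figure come out). The function and constant names are made up for illustration.

```python
import math

# Threshold quoted in the log: anything above ~120 outputs per block,
# sustained over about a year, is said to trigger the divide-by-zero.
THRESHOLD_OUTPUTS_PER_BLOCK = 120

# Assumption: 2-minute block target, i.e. 30 blocks/hour * 24 hours.
BLOCKS_PER_DAY = 720


def txs_needed_per_block(outputs_per_block: int, outputs_per_tx: int) -> int:
    """Transactions per block needed to reach a given output count per block."""
    return math.ceil(outputs_per_block / outputs_per_tx)


# Typical case from the log: every transaction has at least 2 outputs.
txs_min_outputs = txs_needed_per_block(THRESHOLD_OUTPUTS_PER_BLOCK, 2)

# Attacker case from the log: 16-output transactions.
txs_max_outputs = txs_needed_per_block(THRESHOLD_OUTPUTS_PER_BLOCK, 16)

# Daily transaction count in the 2-output case.
txs_per_day = txs_min_outputs * BLOCKS_PER_DAY

print(txs_min_outputs, txs_max_outputs, txs_per_day)
```

With these inputs the three values match the figures given in the chat: 60 transactions per block at 2 outputs each, 8 per block at 16 outputs each, and 43200 transactions per day.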