-
jberman[m]
Been doing that today, it takes a while to run. Collecting more samples from both before and after
-
UkoeHB
cool thanks
-
-
jberman[m]
Pausing my node from syncing enabled deterministic results to compare, and I could reduce simulation size to 10k, so it wouldn't take as long
-
jberman[m]
So this means the pre-fix just has a longer tail. The final piece to the puzzle I think is comparing to observed on-chain data
-
jberman[m]
My first thought is that the difference between pre- and post-fix here suggests it would be ok to release the fix. Because the difference is so subtle, I don't believe you would be able to single out which transactions have the fix and which don't
-
-
jberman[m]
That's my data
-
jberman[m]
*pre-fix has a fatter tail, not just longer tail
-
UkoeHB
By deterministic results do you mean starting from the same entropy as well?
-
jberman[m]
Because the algorithm factors in block density through that average output time calculation, the density distribution shifts while a node syncs, which makes it hard to reproduce a similar distribution across runs, even after many simulations
-
jberman[m]
Deterministic wasn’t the right word, not sure what is
-
jberman[m]
It also means the distribution produced by the client is sort of a moving target, which makes analysis a bit more challenging here
-
UkoeHB
when secparam[m] said he expected a peak at 1300, I wonder if that is actually a peak at '1300 outputs ago'. Currently blocks have 24 txs on average, so ~48-55 outputs on average. At 20 blocks that is ~1000 outputs.
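
A rough check of that arithmetic, as a minimal sketch. The 24 txs per block figure is the one quoted above; the outputs-per-tx values (2.0 and 2.3) are assumptions chosen only to reproduce the ~48-55 range, and `outputs_in_window` is a hypothetical helper.

```python
# Back-of-the-envelope check: how many outputs fit in the last 20 blocks?
TXS_PER_BLOCK = 24           # average quoted in the chat
OUTPUTS_PER_TX_LOW = 2.0     # assumption: minimum of 2 outputs per tx
OUTPUTS_PER_TX_HIGH = 2.3    # assumption: picked to match the ~55 upper figure


def outputs_in_window(blocks, txs_per_block, outputs_per_tx):
    """Approximate number of outputs created in the last `blocks` blocks."""
    return blocks * txs_per_block * outputs_per_tx


low = outputs_in_window(20, TXS_PER_BLOCK, OUTPUTS_PER_TX_LOW)
high = outputs_in_window(20, TXS_PER_BLOCK, OUTPUTS_PER_TX_HIGH)
print(f"20 blocks back is roughly {low:.0f} to {high:.0f} outputs ago")
```

Under these assumptions, 20 blocks corresponds to roughly a thousand outputs, the same order of magnitude as the 1300 figure.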
-
UkoeHB
which maybe means the algorithm's bug is thinking about block density in the first place
-
jberman[m]
<UkoeHB "which maybe means the algorithm'"> This is my suspicion as well
-
jberman[m]
It would seem that factoring in density doesn’t mesh well with the expected spend time distribution, and that’s been nagging me. Like if we know that most outputs are spent after 10-20 blocks, but there is a super dense block 25 blocks out, I’m thinking that would cause the algorithm to bias toward selecting an output from that block 25 blocks out, rather than from a more recent block
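
A toy illustration of that concern, not the wallet's actual selection code. It assumes a density-aware scheme in which every output carries its block's age weight, so a block's share is proportional to weight times output count. The chain layout, the `1/age` weight, and the helper `block_selection_probs` are all hypothetical.

```python
def block_selection_probs(outputs_per_block, age_weight):
    """Probability that a pick lands in each block when every output
    is weighted by its block's age weight (density-aware selection)."""
    mass = {age: age_weight(age) * n for age, n in outputs_per_block.items()}
    total = sum(mass.values())
    return {age: m / total for age, m in mass.items()}


# Hypothetical toy chain: 50 outputs per block for ages 10..30,
# except one very dense block 25 blocks back.
outputs_per_block = {age: 50 for age in range(10, 31)}
outputs_per_block[25] = 500


def age_weight(age):
    # Hypothetical weight favoring recent blocks; NOT the wallet's gamma distribution.
    return 1.0 / age


probs = block_selection_probs(outputs_per_block, age_weight)
print(f"block 12 blocks back: {probs[12]:.3f}")
print(f"block 25 blocks back: {probs[25]:.3f}")
```

In this toy setup the dense block 25 blocks back ends up with a larger share than a normal block 12 blocks back, even though the age weight alone favors the more recent one.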
-
jberman[m]
Comparing to on-chain data and observing dual peaks in the on-chain data would provide the answer we’re looking for I think. Won’t be by my computer to continue on this for a while though
-
secparam[m]
For what it's worth, I'd cross-check our data analysis first. See if we haven't screwed something up parsing the chain or the like.
-
sarang
At some level, there's going to be a tradeoff between density and time in the selection process. This is what led to discussions of trying to account for density in the first place
-
sarang
If block density variance means anonymity sets overselect coinbase outputs, that's problematic. If you try to account for it and shift away from a desired time distribution, that's also problematic
-
jberman[m]
Ya makes sense. The question I’m trying to get at is how problematic, if at all, it has been in practice, also taking into account that bug. Could very well be not problematic at all in practice, and the on-chain data would provide some clarity I think. Going to continue analysis this weekend
-
utxobr[m]
(thank you for putting all this effort on it, jberman , awesome stuff mate)
-
isthmus
Yeah @jberman[m] this is fascinating, really impressive work. You’ve got a good eye
-
isthmus
Do we have a back of the envelope calculation for what kind of transaction volume would trigger the divide by 0 error?
-
isthmus
Actually, output volume is probably more important
-
isthmus
If I wanted to trigger intentionally I would be making 16-output txns to speed the process along
-
UkoeHB
Roughly >100 outputs per block I think, over a 1 year period.
-
UkoeHB
Lol no sorry, 121 outputs per block.
-
sech1
120.00001 outputs per block will be enough
-
sech1
each transaction has at least 2 outputs, so it's 60 transactions per block
-
sech1
or ~43200 tx per day which is 2 times more than we have today
-
isthmus
Interesting
-
isthmus
Or 8 txns per block with 16 outputs each
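
The figures above can be reproduced with a short sketch. It assumes the quantity that hits zero is an average output time computed as block target times blocks divided by outputs, truncated to an integer. The chat does not spell out that formula, so `average_output_time` here is a hypothetical stand-in. The 120-second block target is Monero's 2-minute block time.

```python
BLOCK_TARGET_SECONDS = 120   # Monero's 2-minute block target
SECONDS_PER_DAY = 86_400


def average_output_time(blocks, outputs, target=BLOCK_TARGET_SECONDS):
    # Assumption: integer division, so the result truncates to 0 once
    # outputs exceed target * blocks, and a later division by it would fail.
    return target * blocks // outputs


blocks_per_day = SECONDS_PER_DAY // BLOCK_TARGET_SECONDS   # 720 blocks
blocks_per_year = blocks_per_day * 365

# Exactly 120 outputs per block over the year is still safe under this assumption.
at_threshold = average_output_time(blocks_per_year, 120 * blocks_per_year)
# One more output over the whole year tips the truncated result to 0.
just_above = average_output_time(blocks_per_year, 120 * blocks_per_year + 1)

txs_per_day_needed = 60 * blocks_per_day   # 60 two-output txs per block
outputs_from_8_txs = 8 * 16                # 8 txs with 16 outputs each

print(at_threshold, just_above, txs_per_day_needed, outputs_from_8_txs)
```

Under that assumption, anything above 120 outputs per block averaged over the year is enough, which matches 60 two-output transactions per block (43200 per day) or 8 transactions with 16 outputs each (128 outputs).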
-
isthmus
So at current txn volume this is a nontrivial attack, but as organic output volume increases the barrier to trigger actually comes closer
-
isthmus
That’s a funny quirk
-
jberman[m]
Credit to @sech1 for recognizing this as an attack vector, I was too focused on the privacy implications to see it
-
neptune[m]
I am looking at onchain data and I don't see a peak at 20 but rather at 11 (= 10 block lock + 1 block)
-
neptune[m]
Here is the distribution since April 2019
paste.debian.net/hidden/ba302d8d
-
neptune[m]
and here is the distribution since June 2021 (last month)
paste.debian.net/hidden/94ffaf80
-
neptune[m]
well, April 2019-today dist peak is 11, June 2021-today dist peak is 12
-
chad[m]
jberman sech1 what was the attack? I think some messages didn't come through
-
sech1