-
jberman[m]
secparam: I ran multiple simulations of get_outs in my client making sure to isolate the gamma outputs. Here were my results attempting get_outs 100k times
-
-
-
sethsimmons
Where did you report that bug, jberman?
-
sethsimmons
I don't see an issue or mention in #monero-dev:monero.social and want to make sure its appropriately shared and reviewed.
-
jberman[m]
DM'd moneromooo, luigi and fluffy. Talked about it with moneromooo, and ran it by him before sharing it here. He suggested sharing in -dev too. Should I take it there also?
-
sgp_[m]
dev is a good place if you're making these PRs, yup :)
-
sethsimmons
<jberman[m] "DM'd moneromooo, luigi and fluff"> Great! Yeah, that would be a good idea if moneromooo and co are game with it being public.
-
jberman[m]
Sounds good
-
sech1
jberman that line in the code, "average_output_time = DIFFICULTY_TARGET_V2 * blocks_to_consider / outputs_to_consider;", calculates time in seconds so it should be ok with integer division
-
sech1
actually it depends on what range "outputs_to_consider" covers... Need to step through it in the debugger to find out
-
UkoeHB
sech1 it is supposed to be computing the average rate that outputs were added to the chain over up to the most recent 1 year period
-
sech1
"DIFFICULTY_TARGET_V2 * blocks_to_consider" is definitely in seconds. The questionn is, what's the resulting value there (average_output_time)
-
UkoeHB
It does feel a bit embarrassing that I didn't notice this integer division issue either
-
sech1
if it's 1 or 2 then it must be computed as double
-
sech1
but if it's ~164000 (1.9 days in seconds) then it's ok
-
UkoeHB
outputs_to_consider should be the total number of outputs in the last year's worth of blocks
-
sech1
then it should be computed as double
-
jberman[m]
How UkoeHB is reading it is how I read it too^
-
jberman[m]
Using integer division, you end up rounding the true average of ~1.9 seconds per output over the past year down to 1 second, which is a significant difference from the true average, that then flows into the output selection in pick(). I can’t see why it would be expected to have such a large difference from the true average
-
jberman[m]
Plus, just taking a step back, it wouldn’t make sense for average_output_time to be a double if the intent was to do integer division there
-
UkoeHB
I think this algorithm was built when tx volume was way lower, so it must not have affected test runs
-
jberman[m]
Ya the more outputs, the worse the impact on the calculation downstream
-
sech1
*the more often outputs appear
-
sech1
once it got sub 5 seconds per output on average, it got bad
-
sech1
hmm, won't we get 0 there eventually, if we have a lot of transactions for more than a year?
-
sech1
more than 120 outputs per block on average
-
jberman[m]
Yep, probably would’ve been discovered naturally soon
-
jberman[m]
Through normal use
-
sech1
division by zero on line 1031, classic
-
jberman[m]
Ya seems clients would start failing to construct tx’s
-
selsta
should be easy to fix, right? was planning a new release anyway
-
jberman[m]
-
jberman[m]
Waiting on repo rights to submit a PR, but I don’t care if it’s not me and you wanna release in the meantime
-
sech1
this will however change output selection algorithm and we'll have two different versions for quite a while. There will be probably privacy implications
-
sech1
two anonymity "puddles" instead of one
-
sech1
for quite a while = until next hardfork
-
jberman[m]
That’s true, maybe it does makes sense to hold off
-
sech1
wrong output selection hurts privacy already now though
-
sech1
I think the benefits of the fix outweigh it
-
sech1
pools (miner payouts) and exchanges (deposits/withdrawals) will update quickly and then we'll have big enough anonymity set for the new output selection
-
sech1
besides, division by zero bug won't wait
-
sech1
so if I'm reading the code right, this bug makes all decoy outputs older than they should've been (1.9x older), which makes "newest output = real spend" heuristic work better
-
sech1
in this case, it should be fixed asap
-
UkoeHB
hmm the odd thing is secparam[m] 's results which showed most decoys at 20 blocks, which jberman[m] seemed to corroborate
-
sech1
hmm, I read the code wrong. "uint64_t output_index = x / average_output_time;" -> here the bug makes it bigger, but...
-
sech1
"output_index = num_rct_outputs - 1 - output_index;" -> here it becomes smaller than it should've been, so it skews selection to newer outputs?
-
sech1
looks like it
-
jberman[m]
average_output_time is underestimated -> x / average_output_time is overestimated -> num_rct_outputs - 1 - output_index lands you with an output index further back in the chain
-
sech1
are rct_offsets ordered from newer to older?
-
jberman[m]
<UkoeHB "hmm the odd thing is secparam 's"> ya, I think more rigorous analysis taking this into account and comparing to observed distributions over block intervals since this was introduced could provide more clarity. was introduced ~April 2019
-
sech1
the final value of output_index is smaller because of the bug, so it chooses smaller index from rct_offsets
-
UkoeHB
`average_output_time` is smaller (rounded down), which means `output_index` is larger on initial value (division), which means its final value is lower, so it causes the algorithm to select a lower block
-
jberman[m]
that^
-
sech1
UkoeHB how are rct_offsets ordered? From lower (older) block to higher (newer)?
-
jberman[m]
yea that's how they're ordered
-
sech1
-
sech1
which suggests it's in increasing height order
-
sech1
so after untangling all that, the bug skews to older outputs
-
jberman[m]
if we were to get an on-chain distribution since this was introduced, and we observe something like dual peaks, one at ~10 blocks and one further right, then that would suggest the newest heuristic would work well and the left-most peak is likely from real outputs. But that's not the only cause of twin peaks. The less bad cause would be the point at which the average output time shifted from 2 to 1, that would cause twin peaks as
-
jberman[m]
well
-
sech1
the fix is in PR 7798 and 7799
-
UkoeHB
jberman[m]: can you redo your get_outs simulation post-fix?