#monero-research-lab

02:01

jberman[m]

secparam: I ran multiple simulations of get_outs in my client making sure to isolate the gamma outputs. Here were my results attempting get_outs 100k times
02:01

» jberman[m] uploaded an image: (33KiB) <libera.ems.host/_matrix/media/r0/do…%20100k%20simulations%20results.png>
02:02

» jberman[m] <libera.ems.host/_matrix/media/r0/do…786e882cadd487d753a2107/message.txt>
02:40

sethsimmons

Where did you report that bug, jberman?
02:40

sethsimmons

I don't see an issue or mention in #monero-dev:monero.social and want to make sure it's appropriately shared and reviewed.
02:41

jberman[m]

DM'd moneromooo, luigi and fluffy. Talked about it with moneromooo, and ran it by him before sharing it here. He suggested sharing in -dev too. Should I take it there also?
02:42

sgp_[m]

dev is a good place if you're making these PRs, yup :)
02:42

sethsimmons

<jberman[m] "DM'd moneromooo, luigi and fluff"> Great! Yeah, that would be a good idea if moneromooo and co are game with it being public.
02:44

jberman[m]

Sounds good
06:13

sech1

jberman that line in the code, "average_output_time = DIFFICULTY_TARGET_V2 * blocks_to_consider / outputs_to_consider;", calculates time in seconds so it should be ok with integer division
06:15

sech1

actually it depends on what range "outputs_to_consider" covers... Need to step through it in the debugger to find out
06:22

UkoeHB

sech1 it is supposed to be computing the average rate that outputs were added to the chain over up to the most recent 1 year period
06:24

sech1

"DIFFICULTY_TARGET_V2 * blocks_to_consider" is definitely in seconds. The question is, what's the resulting value there (average_output_time)
06:24

UkoeHB

It does feel a bit embarrassing that I didn't notice this integer division issue either
06:24

sech1

if it's 1 or 2 then it must be computed as double
06:24

sech1

but if it's ~164000 (1.9 days in seconds) then it's ok
06:25

UkoeHB

outputs_to_consider should be the total number of outputs in the last year's worth of blocks
06:26

sech1

then it should be computed as double
06:27

jberman[m]

How UkoeHB is reading it is how I read it too^
06:27

jberman[m]

Using integer division, you end up rounding the true average of ~1.9 seconds per output over the past year down to 1 second, which is a significant difference from the true average, that then flows into the output selection in pick(). I can’t see why it would be expected to have such a large difference from the true average
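A minimal sketch of the truncation described above, not the wallet's actual code. The value 120 for DIFFICULTY_TARGET_V2 is Monero's two-minute block target in seconds; the helper names, the integer types, and the sample counts (19 blocks, 1200 outputs) are assumptions chosen only so the true average comes out at 1.9 seconds per output.

```cpp
#include <cstdint>

// Target block time in seconds (DIFFICULTY_TARGET_V2 in the discussion).
constexpr std::uint64_t kDifficultyTargetV2 = 120;

// Hypothetical helper with the shape of the quoted expression: every operand
// is an integer, so the quotient is truncated before it is stored in a double.
constexpr double average_output_time_truncated(std::uint64_t blocks,
                                               std::uint64_t outputs) {
    return static_cast<double>(kDifficultyTargetV2 * blocks / outputs);
}

// Hypothetical helper that converts to double before dividing, so the
// fractional part of the average survives.
constexpr double average_output_time_exact(std::uint64_t blocks,
                                           std::uint64_t outputs) {
    return static_cast<double>(kDifficultyTargetV2 * blocks) /
           static_cast<double>(outputs);
}
```

With the illustrative counts, `average_output_time_truncated(19, 1200)` yields 1 while `average_output_time_exact(19, 1200)` stays close to 1.9, which is the gap being discussed.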
06:27

jberman[m]

Plus, just taking a step back, it wouldn’t make sense for average_output_time to be a double if the intent was to do integer division there
06:28

UkoeHB

I think this algorithm was built when tx volume was way lower, so it must not have affected test runs
06:29

jberman[m]

Ya the more outputs, the worse the impact on the calculation downstream
06:30

sech1

*the more often outputs appear
06:30

sech1

once it got sub 5 seconds per output on average, it got bad
06:31

sech1

hmm, won't we get 0 there eventually, if we have a lot of transactions for more than a year?
06:31

sech1

more than 120 outputs per block on average
06:31

jberman[m]

Yep, probably would’ve been discovered naturally soon
06:32

jberman[m]

Through normal use
06:32

sech1

division by zero on line 1031, classic
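The zero case can be sketched the same way. This is an illustration under assumptions, not the code at the line sech1 cites: the helper names are hypothetical, and the block and output counts are made up to sit just at and just above 120 outputs per block.

```cpp
#include <cstdint>

// Target block time in seconds (DIFFICULTY_TARGET_V2 in the discussion).
constexpr std::uint64_t kDifficultyTargetV2 = 120;

// Hypothetical helper reproducing the integer quotient from the discussion.
constexpr std::uint64_t truncated_average_output_time(std::uint64_t blocks,
                                                      std::uint64_t outputs) {
    return kDifficultyTargetV2 * blocks / outputs;
}

// Hypothetical guard: a caller could check the average before using it as a
// divisor in the output selection step.
constexpr bool safe_to_divide_by(std::uint64_t average_output_time) {
    return average_output_time != 0;
}
```

At exactly 120 outputs per block (1000 blocks, 120000 outputs) the truncated average is still 1; at 121 outputs per block (1000 blocks, 121000 outputs) it drops to 0, and any later division by it is no longer meaningful.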
06:33

jberman[m]

Ya seems clients would start failing to construct tx’s
06:37

selsta

should be easy to fix, right? was planning a new release anyway
06:39

jberman[m]

Super easy: j-berman/monero #3/files
06:40

jberman[m]

Waiting on repo rights to submit a PR, but I don’t care if it’s not me and you wanna release in the meantime
06:47

sech1

this will however change output selection algorithm and we'll have two different versions for quite a while. There will probably be privacy implications
06:48

sech1

two anonymity "puddles" instead of one
06:48

sech1

for quite a while = until next hardfork
06:59

jberman[m]

That’s true, maybe it does make sense to hold off
07:03

sech1

wrong output selection hurts privacy already now though
07:04

sech1

I think the benefits of the fix outweigh it
07:04

sech1

pools (miner payouts) and exchanges (deposits/withdrawals) will update quickly and then we'll have big enough anonymity set for the new output selection
07:05

sech1

besides, division by zero bug won't wait
07:44

sech1

so if I'm reading the code right, this bug makes all decoy outputs older than they should've been (1.9x older), which makes "newest output = real spend" heuristic work better
07:44

sech1

in this case, it should be fixed asap
07:49

UkoeHB

hmm the odd thing is secparam[m] 's results which showed most decoys at 20 blocks, which jberman[m] seemed to corroborate
07:53

sech1

hmm, I read the code wrong. "uint64_t output_index = x / average_output_time;" -> here the bug makes it bigger, but...
07:54

sech1

"output_index = num_rct_outputs - 1 - output_index;" -> here it becomes smaller than it should've been, so it skews selection to newer outputs?
07:55

sech1

looks like it
07:55

jberman[m]

average_output_time is underestimated -> x / average_output_time is overestimated -> num_rct_outputs - 1 - output_index lands you with an output index further back in the chain
07:59

sech1

are rct_offsets ordered from newer to older?
07:59

jberman[m]

<UkoeHB "hmm the odd thing is secparam 's"> ya, I think more rigorous analysis taking this into account and comparing to observed distributions over block intervals since this was introduced could provide more clarity. was introduced ~April 2019
08:00

sech1

the final value of output_index is smaller because of the bug, so it chooses smaller index from rct_offsets
08:02

UkoeHB

`average_output_time` is smaller (rounded down), which means `output_index` is larger on initial value (division), which means its final value is lower, so it causes the algorithm to select a lower block
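The chain of reasoning above can be checked with a small sketch built from the two lines sech1 quoted. The helper name is hypothetical, `x` is assumed to be a sampled output age in seconds, and the sample values (x = 2000, 10000 outputs) are invented for illustration. The sketch also assumes the caller has already ensured the first quotient is below `num_rct_outputs`.

```cpp
#include <cstdint>

// Hypothetical helper following the two quoted lines:
//   uint64_t output_index = x / average_output_time;
//   output_index = num_rct_outputs - 1 - output_index;
// A smaller divisor gives a larger first quotient and therefore a smaller
// final index, which points further back in the chain.
constexpr std::uint64_t picked_output_index(double x,
                                            double average_output_time,
                                            std::uint64_t num_rct_outputs) {
    return num_rct_outputs - 1 -
           static_cast<std::uint64_t>(x / average_output_time);
}
```

For x = 2000 and 10000 outputs, an average of 1.9 gives index 8947, while the truncated average of 1 gives index 7999. Since the offsets are in increasing height order, the lower index is the older output.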
08:04

jberman[m]

that^
08:06

sech1

UkoeHB how are rct_offsets ordered? From lower (older) block to higher (newer)?
08:07

jberman[m]

yea that's how they're ordered
08:07

sech1

I only found this code: github.com/monero-project/monero/bl…ryptonote_core/blockchain.cpp#L2304
08:08

sech1

which suggests it's in increasing height order
08:08

sech1

so after untangling all that, the bug skews to older outputs
08:21

jberman[m]

if we were to get an on-chain distribution since this was introduced, and we observe something like dual peaks, one at ~10 blocks and one further right, then that would suggest the newest heuristic would work well and the left-most peak is likely from real outputs. But that's not the only cause of twin peaks. The less bad cause would be the point at which the average output time shifted from 2 to 1, that would cause twin peaks as well
09:17

sech1

the fix is in PR 7798 and 7799
23:13

UkoeHB

jberman[m]: can you redo your get_outs simulation post-fix?
