-
secparam[m]
Hi all, my name is Ian Miers, I'm a professor at the University of Maryland, Zcash was my PhD thesis. A couple of students of mine have been looking into Monero, we're trying to work out some formal models for how private it is. We found something a little surprising, though it appears others have seen it too. If you plot the age of ring members, you get a peak at about 20 blocks. As we understand it, the decoys are sampled from an exp-gamma with shape 19.29, rate 1.61, and the plot of that peaks around 1300 instead.
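-
For reference, a minimal sketch of where a figure near 1300 blocks can come from. It assumes the gamma is drawn over the natural log of the output age in seconds and that the target block time is 120 seconds; both are assumptions here, and the function name is hypothetical. It computes exp(shape / rate), the geometric-mean age, and converts it to blocks.

```python
import math

# Parameters as quoted in this discussion (the wallet source quoted later uses shape 19.28).
GAMMA_SHAPE = 19.28
GAMMA_RATE = 1.61
BLOCK_TIME_SECONDS = 120  # assumed target block time


def typical_decoy_age_blocks(shape=GAMMA_SHAPE, rate=GAMMA_RATE):
    """Return exp(E[ln age]) expressed in blocks (the geometric-mean decoy age)."""
    mean_log_age_seconds = shape / rate  # mean of a gamma distribution is shape / rate
    return math.exp(mean_log_age_seconds) / BLOCK_TIME_SECONDS


if __name__ == "__main__":
    blocks = typical_decoy_age_blocks()
    days = blocks * BLOCK_TIME_SECONDS / 86400
    print(f"{blocks:.0f} blocks, {days:.2f} days")
```

With these parameters the result is a little over 1300 blocks, or roughly 1.8 days.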
-
secparam[m]
The data we observed aligns with some graphs from isthmus's talk at the Monero conference in 2019.
-
isthmus
Hey @secparam[m] 👋
-
isthmus
Very intriguing find, glad your eyes are on it.
-
isthmus
Can you tell whether this signature is common among typical transactions, or is there a subset of transactions exhibiting the feature (perhaps across multiple rings, which would be the smoking gun for a custom decoy selection algorithm)?
-
secparam[m]
We're still digging. Tentatively, we've eliminated a couple obvious causes (which is good, those would be very damaging for privacy), but beyond that, we're still exploring
-
tobtoht
50% of decoys are sampled from the last 1.8 days, or was that taken into account in your analysis?
-
isthmus
Hmm, initially my assumption was that somebody’s decoy selection algorithm was weighted wrong.
-
isthmus
But an alternative hypothesis is that a large number of true spends occur at exactly 20 blocks, and this is simply the ground truth showing up through the noise. :thinking:
-
gingeropolous
exactly 20 blocks though?
-
isthmus
Could be the case if a large exchange or mining pool or something moves like clockwork when 20 blocks pass...
-
tobtoht
I would expect the largest number of true spends occurs at 10 blocks (immediately following standard locktime)
-
gingeropolous
hrm, i guess "a peak" could mean anything.
-
gingeropolous
whats the delta
-
gingeropolous
ah the 1300 probably means that
-
secparam[m]
The analysis is a plot of mixin ages, so how it's sampled shouldn't matter as far as I can tell. And yes, we expected to see a peak at about 1300 blocks (roughly 1.8 days). We didn't. It's possible that's a data analysis mistake on our part, which is part of why I'm asking.
-
gingeropolous
yeah i guess what im trying to discern is how far does the peak deviate from the average.
-
secparam[m]
But it also lines up at least partially with isthmus 's talk on juvenile rings
youtu.be/XIrqyxU3k5Q?t=1019
-
gingeropolous
because if its substantial but not much, could be a large player that has clockwork activity
-
gingeropolous
but if its like 100-fold then thats something else
-
secparam[m]
Yeah, this is possible.
-
UkoeHB
> from an exp-gamma with shape 19.29, rate 1.61, and the plot of that peaks around 1300 instead.
-
UkoeHB
Have you tried running the decoy selection algorithm itself to see if it behaves as expected (matches the theoretical plot)?
-
» gingeropolous scrolls wheel trying to spread x axis
-
secparam[m]
UkoeHB: no, we haven't yet. It wasn't clear how one would easily go about isolating the actual decoy selection algorithm and instrumenting it. It's something we're considering doing. But we have both a simulation and a direct plot of the exp-gamma. I guess my first question is: is the ring selection algorithm still supposed to be an exp-gamma with shape 19.29 and rate 1.61?
-
isthmus
Hmm, yea. A 10 block wait is for people in a rush. A 20 block wait is a logical choice for developers who want simple backend code that doesn’t have to worry about reorgs breaking anything. (e.g. if an exchange waits for 20 blocks on each deposit before sweeping it to cold storage)
-
gingeropolous
are you using a particular subset of the blockchain data? i.e., which consensus era are you analyzing
-
secparam[m]
Blocks 1.8 million through 2 million is the plot I just gave you. The student is running ones on other segments, which should have completed.
-
secparam[m]
But I don't have them at the moment. If it's just 1.8 through 2, great. Though ... man, that's strange.
-
gingeropolous
eh, it could be someone flood attacking or whatever its called
-
gingeropolous
though even im not that lazy to write an automated churning script that doesn't randomize the sleep interval
-
isthmus
@secparam[m] do you have a version of that plot with log x-axis?
-
isthmus
log y might be good too
-
isthmus
lol @gingeropolous
-
secparam[m]
No, but I can get one when the student is back at their terminal. But just to confirm, the sampling distribution is supposed to be exp-gamma with shape 19.29 and rate 1.61? I.e. if our data analysis tools are correct, what we are seeing isn't expected?
-
secparam[m]
Someone didn't, for example, move the mean target to be much much sooner because people were spending UTXOs faster?
-
gingeropolous
#define GAMMA_SHAPE 19.28
-
gingeropolous
#define GAMMA_SCALE (1/1.61)
-
secparam[m]
Right, we went through that code. Both my and my graduate student's conclusion was that it's still the same distribution as specified in this pull request
monero-project/monero #3528
-
gingeropolous
yeah, i figured i was pasting obvious things
-
secparam[m]
Well, i wasn't sure it was obvious, so good to double check.
-
gingeropolous
well im definitely no authority on this matter.
-
secparam[m]
Who would be?
-
isthmus
Hmm maybe one of the devs can confirm whether that’s still the spec. I don’t remember any changes since then but sometimes I’m AWOL for a month or two when meatspace is hectic, so I can’t say for sure (might have just missed it)
-
sarang
The selection now accounts for non-uniform block density (with respect to output count) by computing an "average output time" and redrawing within the resulting block
-
sarang
Otherwise, for example, outputs in less dense blocks can tend to be overselected
-
sarang
It's by no means an ideal approach
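-
A rough sketch of the selection step as described above: sample an age from the exp-gamma, convert it to an output position using an average time per output, locate the block that holds that position, then redraw uniformly within that block. This is an illustration under stated assumptions, not the wallet code: the function name, the input layout, the 120-second block time, and averaging over the whole chain instead of a recent window are all assumptions.

```python
import bisect
import math
import random

GAMMA_SHAPE = 19.28
GAMMA_SCALE = 1 / 1.61
BLOCK_TIME_SECONDS = 120  # assumed target block time


def pick_decoy(cumulative_outputs, rng):
    """Pick one global output index, or None if the sampled age predates the chain.

    cumulative_outputs[h] is the total number of outputs in blocks 0..h.
    """
    total_outputs = cumulative_outputs[-1]
    chain_seconds = len(cumulative_outputs) * BLOCK_TIME_SECONDS
    average_output_time = chain_seconds / total_outputs  # seconds per output

    # Exp-gamma draw: the gamma variate is ln(age in seconds).
    age_seconds = math.exp(rng.gammavariate(GAMMA_SHAPE, GAMMA_SCALE))
    outputs_back = int(age_seconds / average_output_time)
    if outputs_back >= total_outputs:
        return None  # older than the chain; a caller would redraw
    target = total_outputs - 1 - outputs_back

    # Find the block holding the target output, then redraw uniformly inside it.
    height = bisect.bisect_right(cumulative_outputs, target)
    first = cumulative_outputs[height - 1] if height > 0 else 0
    count = cumulative_outputs[height] - first
    return first + rng.randrange(count)
```

Calling `pick_decoy` repeatedly on real per-block output counts and plotting the resulting ages is one way to compare the algorithm's behaviour against the theoretical curve.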
-
secparam[m]
Hrm, that might explain why we get a slightly more spiky plot if we plot by minute vs block number. But I'd still expect mostly the same results, is that about right? Like it wouldn't explain the difference between a major peak at about 1300 blocks (the expected exp-gamma) and say 20 or 30, which is the observed.
-
isthmus
I'm curious whether those transactions have anything in common (# of inputs, # of outputs, absence/presence of encrypted/unencrypted payment IDs, unlock time, etc)
-
isthmus
Give or take a few false positives by unrelated transactions that just happen to have a decoy at 20 blocks
-
gingeropolous
did any of the components from your PCA have this as a significant enrichment?
-
gingeropolous
or however that would be phrased. covariate?
-
isthmus
Any signal at multiples of 20? (using 30, 50, 70, ... to correct if there's any signal from multiples of 10)
-
isthmus
Due to chain output linking, that would strongly suggest that it's the true spends rather than a quirky decoy selection algorithm
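-
One way to run the multiples-of-20 check, sketched under assumptions: `ages` is a flat list of ring-member ages in blocks, and the function name and the 200-block cutoff are hypothetical. It compares the mean count at ages 40, 60, 80, ... with the mean count at ages 30, 50, 70, ..., so that any signal shared by all multiples of 10 cancels out.

```python
from collections import Counter


def multiples_of_20_excess(ages, max_age=200):
    """Ratio of mean counts at ages 40, 60, ... to mean counts at ages 30, 50, ...

    Returns None when the baseline ages have no ring members at all.
    """
    counts = Counter(ages)
    even = [counts[a] for a in range(40, max_age + 1, 20)]  # multiples of 20
    odd = [counts[a] for a in range(30, max_age + 1, 20)]   # odd multiples of 10
    mean_even = sum(even) / len(even)
    mean_odd = sum(odd) / len(odd)
    if mean_odd == 0:
        return None
    return mean_even / mean_odd
```

A ratio well above 1 would hint at a repeating 20-block pattern. Because the background density changes with age, this is only a coarse first look.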
-
gingeropolous
well, i dunno what the PCA would really tell us. I mean, one interpretation of the PCA *not* identifying such (what would be a) strong signal. Well, was PCA run on all possible knobs on a transaction?
-
gingeropolous
sorry. fragmented thought there. one interpretation would be that because it didn't manifest in PCA, that the other knobs must be so different as to bury the signal?
-
gingeropolous
oh hah. i think its pool payouts.
-
isthmus
Found the culprit?
-
isthmus
We haven't gotten around to PCA/SVD on the numeric values, but it's been on my to-do list since that's a key step in the heuristic generator sketched out in '19
-
secparam[m]
We haven't looked at that yet. We're tossing around a few ideas, but i figured I wanted to check 1) this wasn't what we'd expect to observe 2) we were actually observing it and it wasn't some error in our analysis code
-
secparam[m]
So far, we've probably eliminated it being a small number of fast moving UTXOs constantly churning.
-
isthmus
Interesting
-
isthmus
Curious what it'll turn out to be. If it is pool payouts, it would be possible to strongly confirm based on number of hops to a coinbase in the transaction tree
-
isthmus
It's cool that you've got somebody working on this
-
Inge
do many exchanges operate with 20 block confirmation times? Kraken is currently at 16
-
Inge
15*
-
moneromooo
git grep 'TEST(select_outputs, gamma)' to get a sample use of the output selection.
-
moneromooo
20 sounds like maybe the 10 block "can't spend" offset is counted twice.
-
moneromooo
Though theoretically it should just offset everything a smidgen, not peak.
-
merope
Maybe some of the non-official wallets don't follow this distribution?
-
merope
And/or some exchange/entity using custom implementations
-
merope
We've already seen custom implementations in regards to the fee/kB
-
moneromooo
That's a good point.
-
chad[m]
In response to the idea that these might be real spends (e.g. pool payouts or something), is it possible for a transaction to have multiple real outputs in a ring?
-
chad[m]
My understanding was that it's only possible to have 1 real output and the other 10 must be decoys. If that is the case, then even if there is a mountain of real spends at 20 blocks, there should be at least 10 decoys for each of those real spends in other block heights.
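-
The arithmetic behind that argument can be sketched as follows, assuming a ring size of 11 (1 true spend plus 10 decoys, as stated here). The function and parameter names are hypothetical. If a fraction of true spends happen at a given age and each decoy lands on that age with some probability, the expected share of all ring members at that age is the weighted mix of the two.

```python
RING_SIZE = 11  # 1 true spend + 10 decoys, as stated in the discussion


def expected_share_at_age(true_spend_fraction, decoy_probability, ring_size=RING_SIZE):
    """Expected fraction of all ring members that sit at a given age."""
    decoys = ring_size - 1
    return (true_spend_fraction + decoys * decoy_probability) / ring_size
```

For example, `expected_share_at_age(1.0, 0.0)` gives 1/11: even if every true spend happened at exactly 20 blocks and no decoy ever did, ring members at that age would make up only about 9% of the total.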
-
chad[m]
An interesting question to answer IMO:
-
chad[m]
For these TXs using a 20-block-old output in a ring: are there other outputs from 20 blocks back in the same ring too? That might help distinguish whether these are real spends or decoys from an unexpected sample distribution.
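-
A small sketch of that check, assuming each ring is already available as a list of member ages in blocks (the function name, input layout, and tolerance parameter are hypothetical). Since a ring has only one true spend, a ring with several members at the target age must contain decoys at that age.

```python
def rings_with_repeated_age(rings, target_age=20, tolerance=0):
    """Count rings with exactly one member near target_age and rings with several."""
    single = multiple = 0
    for ages in rings:
        hits = sum(1 for a in ages if abs(a - target_age) <= tolerance)
        if hits == 1:
            single += 1
        elif hits > 1:
            multiple += 1
    return single, multiple
```

Many rings in the second bucket would point toward decoy selection, while a dominance of single hits would be more consistent with true spends.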
-
sethsimmons
> <@chad:monero.social> In response to the idea that these might be real spends (e.g. pool payouts or something), is it possible for a transaction to have multiple real outputs in a ring?
-
sethsimmons
No, each input TXO gets its own ring with 10 decoys, so there is only one true spend per ring.
-
sethsimmons
The only explanations I can see for this odd distribution are:
-
sethsimmons
a) Exchanges moving funds automatically at 20 confs to avoid re-orgs/move after confirmation in depositor's account
-
sethsimmons
b) An attacker skewing/changing the decoy selection (as it's not enforced by protocol rules) to reduce valid decoys/reduce fungibility
-
sethsimmons
c) An incompetent wallet dev changing the output selection algo for some unknown reason
-
on-looker[m]
Would someone be able to explain atomic swaps to me or point me towards a good resource explaining them? I've been listening to some podcasts with the Samourai guys and it sounds super interesting. I understand the high level overview of a trustless swap between chains, but I am curious on a very technical level how you can arrange a swap between two chains with no middle man.
-
sethsimmons
<on-looker[m] "Would someone be able to explain"> This isn't the best place to chat about that, but here are some good resources:
-
on-looker[m]
sethsimmons: Thanks! I'll move over there if I have any more questions
-
» isthmus nods
-
isthmus
fwiw it's possible that the deposit lock that an exchange shows the user (10 blocks, 15 blocks) is different from when the backend sweeps deposits from hot to cold or whatever
-
isthmus
On the front end we want shorter unlocks for good UX, on the back end we want slow and stable.
-
gingeropolous
ugh all these relays. sethsimmons , to me your last message ended like this "are some good resources:"
-
secparam[m]
sarang: just to check, you're sure Monero still uses the same decoy selection mechanism as in
monero-project/monero #3528, which is, roughly, exp-gamma but with some adjustments for block times?
-
zkao
on-looker[m]: have a look at these atomic swaps RFCs here
github.com/farcaster-project/RFCs
-
zkao
and join #monero-swap if you'd like to discuss more
-
UkoeHB
secparam[m]: it's supposed to be using that mechanism, although your findings make me suspect there is a bug
-
sgp_[m]
Maybe this could be mining pool behavior?
-
sgp_[m]
secparam: here's some context on the adjustments, I'm still searching for the PR
monero-project/meta #307
-
sgp_[m]
maybe moneromooo would have it handy
-
moneromooo
A PR for what ?
-
moneromooo
Your changes to output selection for pools ? That was never done.
-
neptune[m]
FYI I am looking at this channel on both IRC and Matrix, and on Matrix, I cannot see IRC messages since "on-looker 'Thanks! I'll move...'", that were sent by isthmus, gingeropolous, zkao, UkoeHB, moneromooo. The IRC->Matrix side of the relay seems to have stopped at that point.
-
neptune[m]
here is that backlog since I am looking at it
-
secparam[m]
isthmus: I think you were asking if the spending anomaly we saw was only for the one block interval we saw. It's not. We see it going back to at least block 1,250,000.
-
UkoeHB
secparam[m]: is there a chance you are plotting offsets directly, rather than [height of block that contains tx - reference index]? Output references are stored as a sequence of offsets.
-
UkoeHB
The first offset is absolute within the blockchain history, and each subsequent offset is relative to the previous. For example, with absolute indices {7,11,15,20}, the transaction records {7,4,4,5}.
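-
A minimal sketch of that decoding step, assuming the stored offsets are plain integers. The function names are hypothetical, and `output_height` stands in for whatever lookup maps a global output index to the height of the block that contains it. Here age is taken as the spending transaction's block height minus the referenced output's block height, which is an assumption about how the plot is meant to be built.

```python
from itertools import accumulate


def decode_ring_offsets(stored_offsets):
    """Convert stored offsets (first absolute, rest relative) to absolute output indices."""
    return list(accumulate(stored_offsets))


def ring_member_ages(stored_offsets, tx_height, output_height):
    """Age in blocks of each ring member.

    output_height is a hypothetical callable: global output index -> block height.
    """
    return [tx_height - output_height(i) for i in decode_ring_offsets(stored_offsets)]
```

For the example above, `decode_ring_offsets([7, 4, 4, 5])` returns `[7, 11, 15, 20]`. Plotting the raw stored values instead would treat the small relative gaps as if they were positions or ages.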