01:42:45 Hi all, my name is Ian Miers, I'm a professor at the University of Maryland, Zcash was my PhD thesis. A couple of students of mine have been looking into Monero, we're trying to work out some formal models for how private it is. We found something a little surprising, though it appears others have seen it too. If you plot the age of ring members, you get a peak at about 20 blocks. As we understand it, the decoys are sampled from an exp-gamma with shape 19.29, rate 1.61, and the plot of that peaks around 1300 instead.
01:44:14 The data we observed aligns with some graphs from isthmus's talk at the Monero conference in 2019.
01:56:31 Hey @secparam[m] 👋
01:56:32 Very intriguing find, glad your eyes are on it.
01:56:50 Can you tell whether this signature is common among typical transactions, or is there a subset of transactions exhibiting the feature (perhaps across multiple rings, which would be the smoking gun for a custom decoy selection algorithm)?
01:57:56 We're still digging. Tentatively, we've eliminated a couple obvious causes (which is good, those would be very damaging for privacy), but beyond that, we're still exploring
01:58:24 50% of decoys are sampled from the last 1.8 days, or was that taken into account in your analysis?
02:01:11 Hmm, initially my assumption was that somebody’s decoy selection algorithm was weighted wrong.
02:01:16 But an alternative hypothesis is that a large number of true spends occur at exactly 20 blocks, and this is simply the ground truth showing up through the noise. :thinking:
02:01:49 exactly 20 blocks though?
02:01:53 Could be the case if a large exchange or mining pool or something moves like clockwork when 20 blocks pass...
02:02:27 I would expect the largest number of true spends occurs at 10 blocks (immediately following standard locktime)
02:02:39 hrm, i guess "a peak" could mean anything.
02:02:45 whats the delta
02:03:15 ah the 1300 probably means that
02:03:34 The analysis is a plot of mixin ages, so how it's sampled shouldn't matter as far as I can tell. And yes, we expected to see a peak at about 1300 blocks (roughly 1.8 days). We didn't. It's possible that's a data analysis mistake on our part, which is part of why i'm asking.
02:04:33 yeah i guess what im trying to discern is how far does the peak deviate from the average.
02:04:39 But it also lines up at least partially with isthmus's talk on juvenile rings https://youtu.be/XIrqyxU3k5Q?t=1019
02:04:55 because if its substantial but not much, could be a large player that has clockwork activity
02:05:20 but if its like 100-fold then thats something else
02:05:22 Yeah, this is possible.
02:05:26 * secparam[m] uploaded an image: (957KiB) < https://libera.ems.host/_matrix/media/r0/download/matrix.org/YAhrAUMqdKtjnJadWIVMgscB/image.png >
02:05:52 > from an exp-gamma with shape 19.29, rate 1.61, and the plot of that peaks around 1300 instead.
02:05:52 Have you tried running the decoy selection algorithm itself to see if it behaves as expected (matches the theoretical plot)?
02:05:54 * gingeropolous scrolls wheel trying to spread x axis
02:07:26 ^ the function that manages decoy selection is over here: https://github.com/monero-project/monero/blob/master/src/wallet/wallet2.cpp#L8157
02:07:53 UkoeHB: no we haven't yet. Wasn't clear how one would easily go about isolating the actual decoy selection algorithm and instrumenting it. It's something we're considering doing. But we have both a simulation and a direct plot of the exp-gamma. I guess my first question: is the ring selection algorithm supposed to be an exp-gamma with shape 19.29 and rate 1.61 still?
02:07:53 Hmm, yea. A 10 block wait is for people in a rush. A 20 block wait is a logical choice for developers who want simple backend code that doesn’t have to worry about reorgs breaking anything. (e.g. if an exchange waits for 20 blocks on each deposit before sweeping it to cold storage)
02:08:08 are you using a particular subset of the blockchain data? i.e., which consensus era are you analyzing
02:09:03 Blocks 1.8 million through 2 million is the plot I just gave you. The student is running ones on other segments which should have completed.
02:09:17 But i don't have them at the moment. If it's just 1.8 through 2, great. Though ... man that's strange.
02:09:53 eh, it could be someone flood attacking or whatever its called
02:10:27 though even im not that lazy to write an automated churning script that doesn't randomize the sleep interval
02:10:33 @secparam[m] do you have a version of that plot with log x-axis?
02:13:09 log y might be good too
02:13:58 lol @gingeropolous
02:14:26 No, but i can get one when the student is back at their terminal. But just to confirm, the sampling distribution is supposed to be exp-gamma with shape 19.29 and rate 1.61? I.e. if our data analysis tools are correct, what we are seeing isn't expected?
02:15:03 Someone didn't, for example, move the mean target to be much much sooner because people were spending UTXOs faster?
02:18:39 talk of gamma starts here: https://github.com/monero-project/monero/blob/master/src/wallet/wallet2.cpp#L8474
02:20:37 https://github.com/monero-project/monero/blob/master/src/wallet/wallet2.cpp#L137
02:20:47 #define GAMMA_SHAPE 19.28
02:20:47 #define GAMMA_SCALE (1/1.61)
02:20:49 Right, we went through that code. Both my and my graduate student's conclusion was that it's still the same distribution as specified in this pull request https://github.com/monero-project/monero/pull/3528
02:21:10 yeah, i figured i was pasting obvious things
02:22:19 Well, i wasn't sure it was obvious, so good to double check.
02:23:13 well im definitely no authority on this matter.
02:32:41 Who would be?
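For reference, a minimal sketch of what the quoted GAMMA_SHAPE and GAMMA_SCALE values imply. It assumes the gamma draw is the natural log of an output's age in seconds and that blocks arrive every 120 seconds; the function name, seed, and sample size are illustrative only. It ignores any adjustment for block density, so it is a sanity check on the "roughly 1.8 days" figure, not a reproduction of the wallet's selection code.

```python
import math
import random

GAMMA_SHAPE = 19.28      # from the #define quoted above
GAMMA_SCALE = 1 / 1.61   # from the #define quoted above
BLOCK_TIME_S = 120       # assumption: 2-minute target block time


def sample_decoy_ages_blocks(n, seed=0):
    """Draw n decoy ages in blocks, assuming age_seconds = exp(Gamma(shape, scale))."""
    rng = random.Random(seed)
    return [
        math.exp(rng.gammavariate(GAMMA_SHAPE, GAMMA_SCALE)) / BLOCK_TIME_S
        for _ in range(n)
    ]


ages = sample_decoy_ages_blocks(100_000)

# Centre of the distribution on a log axis (geometric mean of the ages).
log_centre_blocks = math.exp(sum(math.log(a) for a in ages) / len(ages))

# Fraction of draws that land within the most recent 30 blocks.
young_fraction = sum(a <= 30 for a in ages) / len(ages)

print(round(log_centre_blocks), round(young_fraction, 3))
```

Analytically, exp(shape × scale) / 120 is about 1320 blocks, or roughly 1.8 days, which is where the sampled log-axis centre should sit. Comparing `young_fraction` against the observed share of ring members near 20 blocks is one rough way to judge how far the data departs from the nominal distribution.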
02:37:42 Hmm maybe one of the devs can confirm whether that’s still the spec. I don’t remember any changes since then but sometimes I’m AWOL for a month or two when meatspace is hectic, so I can’t say for sure (might have just missed it)
02:40:54 The selection now accounts for non-uniform block density (with respect to output count) by computing an "average output time" and redrawing within the resulting block
02:41:52 Otherwise, for example, outputs in less dense blocks can tend to be overselected
02:42:08 It's by no means an ideal approach
02:44:05 Hrm, that might explain why we get a slightly more spiky plot if we plot by minute vs block number. But i'd still expect mostly the same results, is that about right? Like it wouldn't explain the difference between a 1300-block major peak (the expected exp-gamma) and say 20 or 30, which is the observed
02:49:23 I'm curious whether those transactions have anything in common (# of inputs, # of outputs, absence/presence of encrypted/unencrypted payment IDs, unlock time, etc)
02:50:46 Give or take a few false positives by unrelated transactions that just happen to have a decoy at 20 blocks
02:53:46 did any of the components from your PCA have this as a significant enrichment?
02:54:26 or however that would be phrased. covariate?
02:56:57 Any signal at multiples of 20? (using 30, 50, 70, ... to correct if there's any signal from multiples of 10)
02:58:08 Due to chain output linking, that would strongly suggest that it's the true spends rather than a quirky decoy selection algorithm
02:58:48 well, i dunno what the PCA would really tell us. I mean, one interpretation of the PCA *not* identifying such (what would be a) strong signal. Well, was PCA run on all possible knobs on a transaction?
02:59:56 sorry. fragmented thought there. one interpretation would be that because it didn't manifest in PCA, that the other knobs must be so different as to bury the signal?
03:01:42 oh hah. i think its pool payouts.
03:04:26 Found the culprit?
03:06:11 We haven't gotten around to PCA/SVD on the numeric values, but it's been on my to-do list since that's a key step in the heuristic generator sketched out in '19
03:11:05 We haven't looked at that yet. We're tossing around a few ideas, but i figured I wanted to check 1) this wasn't what we'd expect to observe 2) we were actually observing it and it wasn't some error in our analysis code
03:11:47 So far, we've probably eliminated it being a small number of fast moving UTXOs constantly churning.
03:22:18 Interesting
03:23:37 Curious what it'll turn out to be. If it is pool payouts, it would be possible to strongly confirm based on number of hops to a coinbase in the transaction tree
03:24:59 It's cool that you've got somebody working on this
08:59:07 do many exchanges operate with 20 block confirmation times? Kraken is currently at 16
08:59:10 15*
09:34:25 git grep 'TEST(select_outputs, gamma)' to get a sample use of the output selection.
09:34:55 20 sounds like maybe the 10 block "can't spend" offset is counted twice.
09:35:16 Though theoretically it should just offset everything a smidgen, not peak.
09:36:04 Maybe some of the non-official wallets don't follow this distribution?
09:36:26 And/or some exchange/entity using custom implementations
09:37:17 We've already seen custom implementations in regards to the fee/kB
09:41:00 That's a good point.
14:07:43 In response to the idea that these might be real spends (e.g. pool payouts or something), is it possible for a transaction to have multiple real outputs in a ring?
14:07:43 My understanding was that it's only possible to have 1 real output and the other 10 must be decoys. If that is the case, then even if there is a mountain of real spends at 20 blocks, there should be at least 10 decoys for each of those real spends in other block heights.
14:09:01 An interesting question to answer IMO:
14:09:01 For these TXs using block 20 in a ring.. Are there other TXs from block 20 in the ring too? That might help distinguish if these are real spends, or decoys from an unexpected sample distribution
14:09:09 > <@chad:monero.social> In response to the idea that these might be real spends (e.g. pool payouts or something), is it possible for a transaction to have multiple real outputs in a ring?
14:09:09 No, each input TXO gets its own ring with 10 decoys, so there is only one true spend per ring.
14:12:32 The only reasoning I could see for this odd distribution is:
14:12:33 a) Exchanges moving funds automatically at 20 confs to avoid re-orgs/move after confirmation in depositors account
14:12:33 b) An attacker skewing/changing the decoy selection (as it's not enforced by protocol rules) to reduce valid decoys/reduce fungibility
14:12:33 c) An incompetent wallet dev changing the output selection algo for some unknown reason
14:14:41 * sethsimmons < https://libera.ems.host/_matrix/media/r0/download/libera.chat/c5be103f16c573463da5d2a269ef79afebb44516/message.txt >
14:15:23 * sethsimmons < https://libera.ems.host/_matrix/media/r0/download/libera.chat/12c261a1b7ed003bd60d13a016fa6d371c1d9d58/message.txt >
14:15:39 * sethsimmons < https://libera.ems.host/_matrix/media/r0/download/libera.chat/fd691a009e18b5232a86aab5e7ff2bfe7707af48/message.txt >
14:56:35 Would someone be able to explain atomic swaps to me or point me towards a good resource explaining it? I've been listening to some podcasts with the Samourai guys and it sounds super interesting. I understand the high level overview of a trustless swap between chains but I am curious on a very technical level how you can arrange a swap between two chains with no middle man.
15:03:37 This isn't the best place to chat about that, but here are some good resources:
15:04:43 sethsimmons: Thanks! I'll move over there if I have any more questions
15:34:23 * isthmus nods
15:35:21 fwiw it's possible that the deposit lock that an exchange shows the user (10 blocks, 15 blocks) is different from when the backend sweeps deposits from hot to cold or whatever
15:36:17 On the front end we want shorter unlocks for good UX, on the back end we want slow and stable.
15:44:06 ugh all these relays. sethsimmons, to me your last message ended like this "are some good resources:"
16:55:08 sarang: just to check, you're sure Monero still uses the same decoy select mechanism in https://github.com/monero-project/monero/pull/3528 which is, roughly, exp-gamma but with some adjustments for block times?
17:06:53 on-looker[m]: have a look at these atomic swaps RFCs here https://github.com/farcaster-project/RFCs
17:07:19 and join #monero-swap if you'd like to discuss more
17:11:46 secparam[m]: it's supposed to be using that mechanism, although your findings make me suspect there is a bug
17:30:06 Maybe this could be mining pool behavior?
17:32:12 secparam: here's some context on the adjustments, I'm still searching for the PR https://github.com/monero-project/meta/issues/307
17:33:34 maybe moneromooo would have it handy
17:35:39 https://github.com/SarangNoether/skunkworks/tree/outputs/outputs
18:01:29 A PR for what ?
18:02:01 Your changes to output selection for pools ? That was never done.
18:02:16 FYI I am looking at this channel on both IRC and Matrix, and on Matrix, I cannot see IRC messages since "on-looker 'Thanks! I'll move...'", that were sent by isthmus, gingeropolous, zkao, UkoeHB, moneromooo. The IRC->Matrix side of the relay seems to have stopped at that point.
18:10:42 here is that backlog since I am looking at it
18:10:43 * neptune[m] < https://libera.ems.host/_matrix/media/r0/download/libera.chat/9c4b3cadd9169936867ab705d292b119af882059/message.txt >
23:25:58 isthmus: I think you were asking if the spending anomaly we saw was only for the one block interval we saw. It's not. We see it going back to at least block 1,250,000.
23:31:39 * secparam[m] uploaded an image: (382KiB) < https://libera.ems.host/_matrix/media/r0/download/matrix.org/vMQUQNcXZCtqurEtHhxtxSAF/image.png >
23:31:58 * secparam[m] uploaded an image: (382KiB) < https://libera.ems.host/_matrix/media/r0/download/matrix.org/MTySeFoDVDfAOgQQhGCRzPdQ/image.png >
23:32:58 * secparam[m] uploaded an image: (104KiB) < https://libera.ems.host/_matrix/media/r0/download/matrix.org/pHHdZdREOFjTVOutKQCvVeoD/image.png >
23:46:28 secparam[m]: is there a chance you are plotting offsets directly, rather than [height of block that contains tx - reference index]? Output references are stored as a sequence of offsets.
23:46:28 The first offset is absolute within the blockchain history, and each subsequent offset is relative to the previous. For example, with real offsets {7,11,15,20}, the transaction records {7,4,4,5}.
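The offset encoding described in the last message can be sketched as follows. This is only an illustration of the {7,11,15,20} versus {7,4,4,5} example; the function names and the `height_of_output` lookup are hypothetical and not taken from the Monero codebase.

```python
from itertools import accumulate


def to_absolute(recorded_offsets):
    """Turn recorded offsets (first absolute, rest relative) into absolute indices."""
    return list(accumulate(recorded_offsets))


def to_recorded(absolute_indices):
    """Inverse: keep the first index, store each later one as a delta from its predecessor."""
    return [
        idx if i == 0 else idx - absolute_indices[i - 1]
        for i, idx in enumerate(absolute_indices)
    ]


def ring_member_ages(tx_height, recorded_offsets, height_of_output):
    """Ages in blocks, given a hypothetical map from output index to block height."""
    return [tx_height - height_of_output[idx] for idx in to_absolute(recorded_offsets)]


print(to_absolute([7, 4, 4, 5]))
print(to_recorded([7, 11, 15, 20]))
```

Plotting the recorded values directly would treat the deltas 4, 4, 5 as if they were positions or ages, which is the mistake the message is asking about. The age histogram needs the accumulated indices mapped to block heights first.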