14:48:24 https://rucknium.me/html/spent-output-age-btc-bch-ltc-doge.html
14:49:28 ^ Here is an analysis of the evolution of the distribution of spent output age on BTC, BCH, LTC, and DOGE.
14:49:35 Source code here: https://github.com/Rucknium/OSPEAD/tree/main/General-Blockchain-Age-of-Spent-Outputs
14:51:00 Thanks to xmrack for early feedback, gingeropolous for server administration, and CCS donors for the storage required for the intermediate data processing on the research computing server.
14:52:12 The analysis of the transparent blockchains helps inform statistical modeling choices for improving the Monero decoy selection algorithm.
14:53:16 The main surprises in the analysis are:
14:56:12 1) The age distribution is quite unstable from week to week. In general I expected a sort of gradual evolutionary trend over time. The instability probably is caused by exchange rate volatility. Moser et al. (2018) claimed that the Monero real spend age distribution was stable over time. I didn't believe it when I read it and I definitely don't believe it now after analyzing the major transparent UTXO means-of-payment blockchains.
14:57:21 Therefore, we (I) need to pay close attention to dynamic risk. OSPEAD is static in nature -- that is what the "S" in the acronym stands for.
14:59:35 2) There was not a strong correlation across blockchains in the mean or standard deviation of the age distributions. However, there was a substantial correlation for the skewness and kurtosis, which correspond to the 3rd and 4th statistical moments of distributions. This makes some sense since skewness and kurtosis describe what is happening in the tails of distributions.
15:01:06 When there is cryptosphere-wide volatility in the cryptocurrency exchange rates, it is likely for old coins of all blockchains to "wake up" and participate in speculative activity in exchanges. I expected to see the correlations in the means, too, however.
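The cross-chain comparison of moments described above can be sketched roughly as follows. This is a minimal illustration, not the code from the linked repository: the function names `moments` and `pearson` and the toy weekly age lists are hypothetical, and the moments use simple population (1/n) formulas.

```python
import math

def moments(ages):
    """Return (mean, sd, skewness, kurtosis) using population (1/n) formulas."""
    n = len(ages)
    mean = sum(ages) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((a - mean) ** 2 for a in ages) / n
    m3 = sum((a - mean) ** 3 for a in ages) / n
    m4 = sum((a - mean) ** 4 for a in ages) / n
    sd = math.sqrt(m2)
    # Skewness and (non-excess) kurtosis are the standardized 3rd and 4th moments.
    return mean, sd, m3 / sd ** 3, m4 / m2 ** 2

def pearson(xs, ys):
    """Pearson correlation between two equally long series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical toy data: spent-output ages (in blocks) for three weeks on two chains.
weekly_ages_a = [[1, 2, 2, 3, 50], [1, 1, 2, 4, 400], [2, 3, 3, 5, 90]]
weekly_ages_b = [[1, 3, 4, 6, 70], [1, 2, 2, 3, 900], [1, 2, 5, 8, 60]]

# One skewness value per week and chain, then correlate the two weekly series.
skew_a = [moments(week)[2] for week in weekly_ages_a]
skew_b = [moments(week)[2] for week in weekly_ages_b]
print("correlation of weekly skewness:", pearson(skew_a, skew_b))
```

Because skewness and kurtosis weight the tails heavily, a few very old outputs being spent in the same week on several chains would move these series together even when the means do not.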
15:02:15 Some things that were not surprising to me:
15:07:28 1) There is a daily cycle in the data. This probably corresponds to the sleep-wake cycle. Ronge et al. (2021) "Foundations of Ring Sampling" also noticed this in BTC. See their Figure 4.
15:08:10 2) I chose to test just two fitting distributions for simplicity at this stage: Log-gamma (lgamma) and Right-Pareto Log-normal (rpln). lgamma is what Moser et al. chose to fit. lgamma has just two parameters. rpln has three. Therefore, rpln is inherently more flexible than lgamma. In general, rpln fit the empirical distributions better.
15:10:34 3) In the forecasting step, taking the "interval" of 8 weeks as the forecasted distribution tended to perform better than just taking the most recent (i.e. last) week or doing more sophisticated forecasting. I think the sophisticated forecasting can be improved. I just took a standard forecasting technique with default options at this stage for demonstration purposes.
15:12:23 If changes in the age distribution are mostly caused by exchange rate volatility, then forecasting will have high inherent difficulty since forecasting the age distribution would be almost equivalent to forecasting exchange rate volatility, which is obviously difficult (see efficient market hypothesis).
15:14:14 My guess is that Monero transactions are slightly less likely to respond to exchange rate volatility compared to BTC/BCH/LTC/DOGE given that Monero is listed on fewer exchanges, but I think Monero still responds somewhat strongly to exchange rate volatility.
15:15:44 Any comments or questions are welcome.
15:15:52 Excellent. Is the data about the mean/median/std etc. available in a raw format? Some of the charts are too tiny to be readable.
15:17:49 tevador: I could post it. Right now it is in R data format. Would you like CSV format?
15:18:31 Any human-readable format is fine. Thanks!
15:18:58 For example, the BTC median is a line close to 0 with a single spike in 2018. I'd like to see a zoomed-in version.
15:20:50 Ok. Just a few moments...
15:21:34 Overall, the median appears to be the most stable marker, apart from short periods of instability. But it may be just due to different relative ranges of the y-axis.
16:02:21 tevador: CSV files are in
16:02:22 https://github.com/Rucknium/OSPEAD/tree/main/General-Blockchain-Age-of-Spent-Outputs/data/summary-stats
16:02:36 Let me know if the format is OK
16:06:22 Perfect, thanks.
16:22:44 Meeting in 40 minutes
16:30:31 "^ Here is an analysis of the..." <- 🚀
16:34:10 "https://rucknium.me/html/spent-..." <- Woah
17:00:24 meeting time https://github.com/monero-project/meta/issues/725
17:00:24 1. greetings
17:00:24 hello
17:00:55 Hello
17:00:59 Hi
17:01:10 Hi
17:01:55 Hi
17:03:38 hello
17:04:08 2. updates, what's everyone working on?
17:04:50 Published my X25519 code, now I'm back to updating the Jamtis specs.
17:04:54 me: finished legacy balance recovery for my seraphis library, started unit testing it
17:05:10 As mentioned above, an analysis of the evolution of the distribution of spent output age on BTC, BCH, LTC, and DOGE:
17:05:11 https://rucknium.me/html/spent-output-age-btc-bch-ltc-doge.html
17:05:24 Source code here: https://github.com/Rucknium/OSPEAD/tree/main/General-Blockchain-Age-of-Spent-Outputs
17:05:53 I have just been reading a lot about the different zero knowledge schemes and I started scanning the blockchain (BP and so forth) in Rust now.
17:07:40 Nothing research related on my end, looking out for things that need doing to make sure the hard fork goes smoothly (pool update, PR review, etc)
17:09:06 jberman[m]: btw, it should be possible to hook up the seraphis lib to do balance recovery with the current chain, with a bit of work; might be an interesting project (which would be a good proof of concept for a future real wallet)
17:09:44 3. discussion
17:10:08 noted :)
17:12:32 Did anything jump out already from that spent output age evolution that would give hints how to change something for Monero?
17:12:50 Or is it still early days?
17:13:16 Looks interesting, in any case.
17:13:18 I had a look at Rucknium[m]'s excellent write-up. I'd try to fit some simpler PDF function as opposed to a very complex one. A shifted Pareto is one option. It's basically a power function. Might be enough just to fit the general trend.
17:13:25 So basically it's outlining the final step, which is forecasting.
17:14:12 We want to get a sense of the variability of other blockchains so we can know what sort of risk profile we may be facing from the forecast step.
17:15:31 What was the trend? Was, in your gut feeling, variability high or low?
17:16:57 tevador: Thanks for the feedback. Right now the general direction I'm heading in is to test out some mixture distributions to have some flexibility. Basically, combine two or more parametric distributions. And then _maybe_ add something to account for the 24-hour cycle, like a periodic Laplace distribution.
17:17:19 But the final decision is going to be determined by the performance, taking care not to overfit
17:18:12 Unfortunately I do not have the "documentation" published yet, but here is an image of some distributions fit according to the loss function criteria in what I published above: https://github.com/Rucknium/OSPEAD/blob/main/images/dry-run/estimate-div-target/estimate-div-target-L_FGT-flavor-1.png
17:18:37 ^ This is a "dry run" based on the old Moser et al. (2018) data for demonstration purposes
17:19:09 And it displays the ratio so that it's clearer how well the distributions fit
17:19:38 I am working on the "documentation" for the chart
17:20:08 rbrunner: Variability in the spent output age distribution for those 4 blockchains was higher than I expected.
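For reference, the shifted Pareto suggested in the discussion above can be written down in a few lines. This is only a sketch under assumptions: it uses the Lomax (Pareto type II) parameterization with a hypothetical shape `alpha` and scale `lam`, the parameter values are arbitrary, and nothing here is fitted to real spent-output data.

```python
import random

def lomax_pdf(x, alpha, lam):
    """Density of the shifted (Lomax) Pareto for x >= 0: a pure power-law decay."""
    return (alpha / lam) * (1.0 + x / lam) ** (-(alpha + 1.0))

def lomax_cdf(x, alpha, lam):
    """Probability that an output is spent at age <= x."""
    return 1.0 - (1.0 + x / lam) ** (-alpha)

def lomax_sample(rng, alpha, lam):
    """Draw one age by inverting the CDF at a uniform random number."""
    u = rng.random()  # in [0, 1), so 1 - u is never zero
    return lam * ((1.0 - u) ** (-1.0 / alpha) - 1.0)

# Arbitrary illustrative parameters (assumptions, not estimates).
ALPHA, LAM = 1.2, 30.0
rng = random.Random(1)
ages = [lomax_sample(rng, ALPHA, LAM) for _ in range(20000)]

# The theoretical median follows from solving lomax_cdf(x) = 0.5.
median = LAM * (2.0 ** (1.0 / ALPHA) - 1.0)
share_below = sum(a <= median for a in ages) / len(ages)
print("theoretical median:", median, "empirical share below it:", share_below)
```

With only two parameters this has the same parameter count as the log-gamma; whether it follows the empirical trend well enough would have to be judged by the fitting criteria discussed in the meeting.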
17:20:32 Ok, interesting
17:21:18 Which means that the "dynamic risk" for a static Monero decoy selection algorithm would be higher if Monero's distribution is similarly unstable
17:21:44 The "S" in OSPEAD stands for Static, so a fully dynamic DSA is pushed off for further research. But it is important to know the risk rather than ignore it.
17:22:58 All my ducks are in a row for OSPEAD. "Just" need to write it up, present it to the review panel, and then do the necessary estimations.
17:28:15 hmm, any other questions/comments/topics people want to bring up?
17:28:28 By the way, a while ago I paused the project to estimate the effect of minexmr's increased pool fee on their share of hashpower, but I want to go back to it eventually. I have mostly settled on a model and I was waiting on more data, i.e. more time to pass. The preliminary results suggested very little effect, if any.
17:28:50 And now they will close :)
17:29:53 Yes, but if nanopool takes their role....we may want to put more effort into trying to resolve the challenges of tevador's anti-pool proposal.
17:30:10 So far it doesn't look good: https://miningpoolstats.stream/monero
17:30:28 Or, anti-centralized pool. Pro-p2pool :)
17:30:48 UkoeHB: Yes, I would like to help developing the wallet (or what is needed) for Seraphis so we could have a working full prototype asap. How is the overall project being coordinated and how exactly could I help?
17:31:46 Any question concerning Seraphis with the word "exactly" in it is still quite risky ...
17:32:05 not too much coordination so far, I'm just trying to finish the library so I can hand it off
17:32:05 here are some proposed changes to the Jamtis key hierarchy: https://gist.github.com/tevador/50160d160d24cfc6c52ae02eb3d17024#gistcomment-4259423
17:32:27 yes I still need to look at that (on vacation and focusing on legacy integration)
17:32:43 Yeah, I see it is a bit vague now but someday maybe it will be reality :p
17:34:05 Also I'd like to remove "certified addresses" from the main specs. It's better to move it to the future invoice specs. I hope UkoeHB hasn't implemented it yet :P
17:34:12 a good starting point is to look at unit tests and get an understanding how the library is put together
17:34:15 tevador: Nice. I will start participating in this discussion soon.
17:34:16 Becomes a bit more complex with X25519
17:34:29 tevador: I only implemented core features
17:34:46 UkoeHB: Ok!
17:35:05 this is what I'm using for unit tests https://github.com/UkoeHB/monero/blob/6272be0845a07c5c7d9613af4d70f695678efba5/src/seraphis/jamtis_core_utils.h#L60
17:35:48 btw, my proposed change has one extra key
17:36:18 yeah
17:37:42 UkoeHB: are you using Blake2b or Keccak?
17:38:20 blake2b https://github.com/UkoeHB/monero/blob/6272be0845a07c5c7d9613af4d70f695678efba5/src/seraphis/sp_hash_functions.cpp#L60
17:38:26 cool
17:44:42 ok I think we can wrap it up here, thanks for attending everyone
17:47:59 given the existence of multiple implementations of the decoy selection algorithm, can you unambiguously identify which algorithm produced an arbitrary ring?
17:48:15 Thank you for the work guys :)
17:59:12 hyc: Any such identification, if possible, would be probabilistic, not deterministic.
17:59:33 if not, then seems to me you can't make any deterministic conclusions about any particular ring. and, in that case it's a strength to have multiple implementations, not a single uniform implementation in all wallets.
18:01:39 A bit more precise: if a certain DSA always "skipped" certain ranges of blocks and you observed that a ring chose more than one ring member from that interval of skipped blocks (must be more than one since one could just be the real spend), then you could deterministically rule out a particular DSA.
18:02:00 However, all the good DSAs do not skip blocks.
18:02:41 anyway, this seems to me a strong argument against enforcing a single algorithm at consensus layer
18:02:59 it depends how different the algos are. openmonero's was so different from wallet2's that txs could be tied to that implementation with what seemed to me a solid degree of accuracy when I eyeballed it searching for txs, though I never ran any sort of statistical tests. I could also make a solid guess as to an openmonero user spending change from a prior tx (tx A -> tx B, both use clearly non-standard implementation) since
18:03:00 so few people used that algo
18:43:30 I’d argue for a single DSA at consensus level. As jberman described, giving wallet designers choice can lead to anonymity puddles. hyc I’ve been thinking about creating a small dataset of 3-4 different DSAs to see how well an unsupervised learning algo could cluster different wallets.
18:51:48 I believe you shouldn't be able to use txextra or rings to gather metadata (aside from perhaps coinbase tx)
18:51:48 2c
19:05:51 To add: probabilistically distinguishing DSAs sounds weak, but in certain cases we could be literally talking about 99.999% probability. I think with ring size even as low as 11 you could reject the null hypothesis that a uniformly-selected ring was selected from the specified Log-gamma distribution with 1 - machine_precision probability with a Kolmogorov-Smirnov test.
19:06:04 I will have to find the bit of code I wrote.
19:07:38 By the way, the K-S test is somewhat weak since it is very general. A while ago I found a few gamma-specific tests that had a statistical power (i.e. capability of distinguishing distributions) 50% greater than the K-S test.
19:08:18 You can check the references here: https://cran.r-project.org/package=gofgamma
19:10:26 Monero's current DSA is a slight modification of the gamma distribution, so I'm not sure if the tests would be valid.
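The K-S idea from the 19:05:51 message can be sketched as follows. This is not the code mentioned in the log; it is an independent illustration under stated assumptions: the log-gamma shape and rate (19.28 and 1.61) are assumed to be the Moser et al. (2018) values, the chain age of 8 years is hypothetical, ages are measured in seconds, the modifications in Monero's actual DSA are ignored, and the p-value uses the standard asymptotic Kolmogorov series with a small-sample correction.

```python
import math
import random

GAMMA_SHAPE = 19.28  # assumed shape of the gamma on ln(age in seconds)
GAMMA_RATE = 1.61    # assumed rate of the same gamma

def reg_lower_gamma(a, x, terms=500):
    """Regularized lower incomplete gamma P(a, x), via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for k in range(1, terms):
        term *= x / (a + k)
        total += term
        if term < total * 1e-16:
            break
    return min(1.0, total * math.exp(-x + a * math.log(x) - math.lgamma(a)))

def log_gamma_cdf(age_seconds):
    """CDF of the age when ln(age) follows Gamma(GAMMA_SHAPE, rate GAMMA_RATE)."""
    if age_seconds <= 1.0:
        return 0.0
    return reg_lower_gamma(GAMMA_SHAPE, GAMMA_RATE * math.log(age_seconds))

def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of the sample and the given CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_pvalue(d, n):
    """Approximate p-value from the Kolmogorov series (small-sample corrected)."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the series converges poorly here; the p-value is ~1
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, 101))
    return max(0.0, min(1.0, 2.0 * s))

# Hypothetical ring: 11 members picked uniformly over an 8-year-old chain.
CHAIN_AGE_SECONDS = 8 * 365 * 24 * 3600
rng = random.Random(0)
ring = [rng.uniform(1.0, CHAIN_AGE_SECONDS) for _ in range(11)]

d = ks_statistic(ring, log_gamma_cdf)
print("K-S statistic:", d, "approximate p-value:", ks_pvalue(d, len(ring)))
```

A uniformly selected ring puts almost all of its members far into the right tail of the log-gamma, so the statistic comes out large even with 11 members. As the last message notes, a test against the plain log-gamma may not be valid for Monero's modified DSA.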