16:15:14 Meeting in less than one hour
16:17:58 Roger that
17:00:04 meeting time https://github.com/monero-project/meta/issues/726
17:00:04 1. greetings
17:00:04 hello
17:00:55 hi
17:01:02 Hello to everyone.
17:02:43 Hi
17:02:55 Hi
17:04:38 2. updates, what's everyone working on? looks like low turnout so might be a short meeting
17:05:57 me: OSPEAD work
17:06:42 In my CCS proposal I said "The upcoming hard fork, which does not yet have a fixed date, will include an increase in the ring size. The discontinuity that the hard fork creates can be leveraged to better understand how ring signatures work in pratcice [sic] on the Monero blockchain. Therefore, some of the research work will occur after the hard fork."
17:06:49 me: finished unit testing legacy balance recovery with my seraphis lib, still on vacation until sept so working slowly, will probably make a new CCS next week
17:07:49 Judging by the crash in on-chain transaction volume and the number of wallets that were not ready for the hard fork, we can safely say that the expected discontinuity has been achieved! 🎉🎉🎉🎉🎉
17:09:10 I discovered that legacy balance recovery is 4-10x more work than seraphis balance recovery due to A) the key image import workflow for view-only wallets, and B) the possibility of duplicate onetime addresses that needs to be handled.
17:10:06 I'm waiting for feedback on some Jamtis changes. In the meantime, I'm planning to implement a deterministic binning strategy as an alternative to UkoeHB's current code.
17:10:19 Some recent discussion happened here: https://github.com/monero-project/research-lab/issues/84 It's also tangentially related to time locks.
17:10:59 been working on hard fork monitoring/cold wallet stuff. I think the crash in volume is primarily from the light wallet ecosystem (MyMonero, Exodus, Guarda, Edge, not sure of others), though it seems some people are still having sporadic issues in other wallets
17:11:31 tevador: yeah I'll get there... tbh pretty worn out on making changes after a year of continuous development
17:12:19 If anyone wants the empirical probability mass functions of the BTC, BCH, LTC, and DOGE spent output age for each week since 2015, it is here: https://rucknium.me/data/weekly-spent-output-age-empirical-pmfs.tar.xz
17:12:57 ^ I'm planning to use this.
17:13:33 tevador: Could you explain more?
17:16:28 From the datasets you collected, predicting the real spend-age distribution seems impossible. So I want to try a different empirical strategy.
17:16:39 Not fully fleshed out yet.
17:16:59 Ok sounds good.
17:18:29 I thought the hard fork went very well. The issues seemed to be minor and came from those that did not update the software beforehand.
17:20:13 Do we want to discuss possible research priorities for the next hard fork, whenever it is? In other words, research on things that could only be realistically implemented in a hard fork?
17:22:01 sure we can move on
17:22:03 3. discussion
17:23:53 idk what priorities there are for the next hardfork, maybe improving the default decoy selection algorithm?
17:25:28 Does that need a hardfork, however?
17:25:29 Priorities could be: (1) Seraphis, obviously (2) nlocktime removal (or not) (3) 10 block lock removal (4) making p2pool much more attractive than centralized pool mining (5) Fee discretization and its implications (6) Possible enforcement of a decoy selection algorithm (7) tx_extra issues
17:25:43 perhaps BP++ will gain enough steam to be implemented
17:26:05 No, decoy selection does not need a hard fork. It is "best" to be implemented in a hard fork, but not necessary
17:26:25 jberman updated the decoy selection algorithm twice last year without a hard fork
17:27:04 Interesting, wasn't even aware
17:27:26 "Best" as in it is 25% better to implement a new DSA at a hard fork ;)
17:27:28 a large-scale overhaul should probably coincide with a hard fork for better adoption (which has privacy impacts)
17:28:04 But much better to implement one ASAP once we have a well-supported improved one.
17:29:50 I think the privacy impacts of a poor DSA occur as we speak. Yes, there will be some issues with transaction non-uniformity as people update their wallet software, but that was also the case for jberman's fixes last year.
17:30:59 btw, the removal of the current lock time field would also simplify decoy selection
17:32:59 tevador: really? you still need to keep track of historical locked outputs
17:33:14 it only has an impact for seraphis where we get a clean slate
17:36:04 How much of a demand is there to get rid of the 10 block lock?
17:37:39 Pretty high demand. Haveno wants it gone or reduced. So does LocalMonero. Users also probably want it gone
17:38:41 I would assume it would simplify things for Serai. kayabanerve, is that right?
17:38:46 yeah it's probably the biggest hit to Monero's UX
17:42:00 as far as I understood, we also never got close to the current limit of 10, and the reasoning behind choosing exactly 10 is unclear. At least to me
17:42:29 good to know a hard fork is not needed btw
17:43:28 A hard fork is needed to change the 10 block lock AFAIK. A HF is not needed for decoy selection changes.
17:44:02 Historically all known reorgs have been 2 or 3 blocks deep. Making the lock 10 blocks means there is a good safety factor in case of network instability that enables deeper reorgs.
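The safety-factor reasoning in the last message can be sketched with a toy model. Everything here is an illustration: the geometric decay of reorg depth and the parameter `p = 0.5` are assumptions for the sake of the sketch, not measurements from any chain.

```python
# Toy model of reorg exposure (illustrative only: the geometric decay
# parameter p is an assumption, not measured from any chain).
def reorg_tail_probability(n_confirmations, p=0.5, max_depth=50):
    """P(reorg depth > n_confirmations) if P(depth = d) is proportional to p**d."""
    weights = [p ** d for d in range(1, max_depth + 1)]
    total = sum(weights)
    tail = sum(p ** d for d in range(n_confirmations + 1, max_depth + 1))
    return tail / total

# With the historical 2-3 block reorgs in mind, a 10-block lock leaves a
# much smaller tail of exposure than a 3-block lock would under this model.
risk_after_3 = reorg_tail_probability(3)
risk_after_10 = reorg_tail_probability(10)
```

Under any model with rapidly decaying reorg depth, lowering the lock from 10 trades a large multiplicative increase in reorg exposure for better UX, which is why the "Chesterton's Fence" framing keeps coming up.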
17:46:51 Fwiw the changes I made to the algo wouldn't result in definitively identifiable pools of decoy selection algos, except for the one where if the change wasn't made, 99% of rings would be compromised (the integer truncation one)
17:47:14 I think a change to the algo that would result in identifiable pools without a HF needs a very high bar to pass
17:47:47 would *not* result, I guess?
17:48:09 We need to define "identifiable" precisely at some point
17:48:14 In terms of probabilities
17:48:39 Well, both need a very high bar to pass. But I did mean what was written
17:48:59 Ah, yes, now I understand
17:49:14 We should not do something like that lightly
17:49:28 Also sorry to derail, but I think the most critical next significant step we need to take post hard fork is getting security proofs for multisig/a more comprehensive audit completed, with the aim of moving it out of experimental. I think it's worth reaching out to veorq for that
17:51:11 that would be nice to have
17:51:35 UkoeHB: The 10 block lockout has served Monero well and I don't see how you can lower or eliminate it without risking major problems.
17:54:03 Haha it is kind of a "Chesterton's Fence" situation
17:54:25 one-horse-wagon[: isthmus this is the rationale https://github.com/monero-project/research-lab/issues/104#issuecomment-1186552665
17:54:38 So what would be the next concrete steps for multisig? The MAGIC Monero Fund could put in some funds, possibly.
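One way to make "identifiable, in terms of probabilities" concrete is a two-sample test on ring-member ages: two wallet populations running different decoy selection algorithms become "identifiable pools" when their age distributions separate statistically. A minimal sketch with toy samplers -- neither of these is Monero's actual DSA, and the distributions, sample sizes, and seed are assumptions chosen purely for illustration:

```python
import bisect
import random

def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples."""
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    for point in xs + ys:
        fx = bisect.bisect_right(xs, point) / len(xs)
        fy = bisect.bisect_right(ys, point) / len(ys)
        d = max(d, abs(fx - fy))
    return d

# Toy "decoy age" samplers -- NOT Monero's real DSA, just stand-ins to
# show how two different selection rules separate statistically.
rng = random.Random(42)
old_dsa = [rng.expovariate(1 / 100) for _ in range(2000)]   # exponential-ish ages
new_dsa = [rng.uniform(0, 1000) for _ in range(2000)]       # uniform ages
same_dsa = [rng.expovariate(1 / 100) for _ in range(2000)]  # same rule, new draws

d_different = ks_statistic(old_dsa, new_dsa)  # large gap: pools are identifiable
d_same = ks_statistic(old_dsa, same_dsa)      # small gap: just sampling noise
```

A precise "identifiability" definition could then be phrased as a threshold on a statistic like this (or on a likelihood ratio) at a chosen sample size, which is the kind of bar the discussion suggests a non-hard-fork DSA change would need to clear.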
17:55:07 A while back I tried to formulate a framework for approaching a potential reduction of the lock time, I dunno if it is helpful
17:55:08 https://raw.githubusercontent.com/noncesense-research-lab/lock_time_framework/master/writeup/lock_time_framework.pdf
17:55:41 I do not believe #2 to be correct anymore, based on subsequent work by hasu showing that lock time is not a strong mechanism against 51% attacks
17:57:14 Specifically this work: https://uncommoncore.co/wp-content/uploads/2019/10/A-model-for-Bitcoins-security-and-the-declining-block-subsidy-v1.02.pdf
17:58:08 Not all aspects apply to Monero (because the paper treats BTC mining as specialized-purpose equipment and RandomX is for general-purpose equipment, meaning that the switching costs are not the same). But much of it is very applicable
17:58:10 * isthmus ends ramble
17:58:10 isthmus: Is it OK if I link that here?:
17:58:11 https://github.com/monero-project/research-lab/issues/94
17:58:15 Yep
18:02:22 ok we are at the end of the hour so I'll call it here, thanks for attending everyone
18:06:13 "So what would be the next..." <- I'll be back with a plan next week :)
18:24:53 Hey Ruck do you have viz for these PMFs or should I bake some up?
18:26:32 isthmus: Yes, I have some beautiful gifs if I do say so myself:
18:26:32 https://libera.monerologs.net/monero-research-lab/20220810
18:27:04 Oh hell yea there's a whole stats lecture in here
18:27:21 These are quite beautiful
18:30:01 I have some higher-resolution versions of the gifs if you want to see them.
18:37:13 Yea for sure, I could stare at these things for a while
18:37:14 https://usercontent.irccloud-cdn.com/file/C3COPill/image.png
18:37:28 What's the "x" column in the data you linked to earlier?
18:39:11 x is age. The unit is blocks. Or, rather, the target block time interval according to each coin
18:39:40 Ah great, that makes sense, thanks
18:41:25 Note that "1" is actually "0" since I prepared these for the log plots. Zero as in confirmed in the same block, a "merlin" block, or less than half of the target block time, e.g. 5 minutes for bitcoin, since I rounded.
18:41:47 Isn't merlin the term for a block with out-of-order time stamps? Can't find a good reference now...
18:42:04 Yea, that's the terminology I use
18:42:23 First ones in Monero were documented a few years ago in some obscure corner of our wiki
18:42:29 Ok. Citation: isthmus
18:42:33 :D
18:43:10 I didn't try to do any correction of those out-of-order blocks, in other words. The overall analysis was complicated enough as it is.
18:43:43 Yea, I think that's an edge case that can be ignored for the purposes of what these analyses are getting at
18:44:00 How were the input data to generate these PMFs structured? (Just out of curiosity, to see if I might have some other uses for it)
18:45:47 Can you be more specific? Do you mean the process or something else?
18:46:53 The intermediate data is on the Research Computing Server, by the way
18:47:12 Ah, let me just explain directly what I'm after instead of asking vague questions
18:49:07 It is possible that NRL will soon move towards implementation of the unified heuristic framework for analyzing Monero's transaction tree topology. We have sophisticated enough infra and a long enough (too long…) list of heuristics to work with. https://www.overleaf.com/read/bxbpmwxgs
18:49:16 At first this chain length analysis might not seem like a big deal (actually at first I thought it would not be very powerful) due to low sender/recipient fingerprint correlation. And then I realized, quite sadly, that change output chains are going to be where most of the information will be statistically loud enough to scream at you.
18:49:27 The thing I'm curious about is a heuristic for change output chain thread count and length. For example, a hot wallet that makes 1000 transactions will probably not have 1 chain that is 1000 long, except in edge cases like an initial deposit with a big enough balance and strictly sequential transactions.
18:49:37 What would we expect to see? 1000 transactions with 10 change chains that are an average of 100 blocks deep? Or 100 change chains that are an average of 10 blocks deep?
18:49:43 I was wondering if your input data (which clearly had some input/output structure) contains the information to learn about the statistical distribution of various characteristics of change chains (also maybe convergence and some higher-order stuff)
18:50:30 If it's low-hanging fruit from the available data, it could be cool to get a read from other chains. But I wouldn't put much time into it, because I believe this is probably relatively dependent on many factors (perhaps most importantly the input selection algorithm)
18:53:19 Here are the high-resolution gifs. The text of the document is slightly out of date compared to the doc with the low-resolution gifs. Also I can't get the log-log scale graphs to load on Firefox, but for some reason they load OK on Tor Browser. But of course it takes a long time to load the page with Tor
18:53:24 https://rucknium.me/html/spent-output-age-btc-bch-ltc-doge-HIGH-RESOLUTION.html
18:56:22 isthmus: The intermediate data has the full transaction graph. I think that's what you want, right?
18:56:50 Yeppers
18:56:54 In fact a large part of the processing code was re-purposed code from this project, which directly analyzed the BCH transaction graph:
18:56:58 https://rucknium.me/posts/cashfusion-descendants/
18:58:16 So I have the full tx graph edgelist for each coin somewhere
19:00:45 It doesn't have any of the tx characteristics that are usually used to perform heuristic chain analysis to get the probable change outputs, though. Just output -> tx -> output and amounts. Not even addresses. Just outputs defined by their position in the tx output field.
19:01:43 👀
19:04:12 Oh yea, I'll probably skip it then. I was already on the fence given the expected mediocre cross-chain representativeness, and I think that messing with attribution labels on a plaintext chain would just be a distraction from executing it on Monero
19:10:14 isthmus: Did you see Egger et al. (2022) "On Defeating Graph Analysis of Anonymous Transactions"? I think it may have implications for the "heuristics framework".
19:10:40 I discussed it and related papers here: https://libera.monerologs.net/monero-research-lab/20220706#c117336
19:11:09 I'm not very strong with graph theory though
19:18:47 Yea, they're related to a degree. I don't think I came right out and said it in the doc, but one of the main motivators for the unified heuristics framework was to leverage all of the information available from fungibility defects to seed initial weights for bipartite graph analysis
19:20:16 (which is how Egger et al formulate parts of the problem)
19:21:22 The reason I had that in mind is because of some 2019-ish MRL research along that vein, see https://www.youtube.com/watch?v=xicn4rdUj_Q
19:22:05 If you wanted to start stacking 5 years of MRL work into a monster graph matching machine, I'd use the heuristics framework as preprocessing for bipartite graph analysis (i.e. initial seeds)
19:24:30 My take-away from Egger et al. (2022) was that graph analysis was less powerful than I had thought, at least as a "global" attack. The million XMR question is whether probability weights can strengthen graph analysis enough to make it substantially dangerous.
19:25:46 a silly countermeasure to the change chain might be something that tells the user "your output has been used by n txs"... though the 10 block lock sorta.....
19:28:22 Sorry, I just realized that my overleaf link above requires making an account. Here's a copy with no login: https://raw.githubusercontent.com/Mitchellpkt/heuristics_framework_doc/main/heuristic_framework.pdf
19:28:51 I started this doc in late 2020 and haven't updated it in quite a while, so it might be missing some new heuristics or maybe we fixed some
19:29:02 @gingeropolous +1
19:29:12 Did somebody do that recently? Maybe I saw something on reddit... Can't quite recall
19:30:26 isthmus: Yes: https://github.com/pokkst/monero-decoy-scanner
19:30:33 🔥
19:31:54 IMHO, users deliberately waiting until their outputs have been used as decoys in X transactions is not an optimal strategy, if used systematically, since it means an adversary could just rule out those first X transactions.
19:33:18 Agree, the decoy scanner is cool to watch but should not drive transacting habits
19:34:50 anyways, unfortunately Egger et al's optimistic takeaway does not translate to the heuristics framework described in the doc, since NRL's work uses transaction metadata to establish priors for partitioning the graph by fingerprint, whereas the partitioning samplers in their work are oblivious to that information
19:35:11 "partitioning" just being used in two very different contexts
19:36:42 I gotta get back to work, will swing by later
19:37:12 The MAGIC Monero Fund recently reached out specifically to one of the authors of the Egger et al. (2022) paper to see if they want to be funded for more Monero research. We'll see.
20:54:25 yeah
22:58:42 I discovered something unpleasant. On my machine it costs ~2ms per commitment to initialize generator caches for bulletproofs ( https://github.com/UkoeHB/monero/blob/a4ce3d1a4fb5cd6d5e9af9188ec6f411934c5118/src/seraphis/bulletproofs_plus2.cpp#L123 it's 64 generators per commitment, each of them needing one blake2b hash and one cryptonote hash-to-point). To get 128 commitments, that's a fixed cost of ~250ms the first time you touch BP code. Right now we only need 16 commitments because max outputs is 16, but with seraphis in the squashed model you also need to range proof the inputs. I am working with 128 commitments: 16 max outputs + 112 max inputs.
22:58:42 kayabaNerve if elligator2 is significantly faster than cryptonote hash-to-point, this may be a good reason to use that.
23:04:46 sorry, 128 generators per commitment* since you need 2 at each index
23:56:30 or maybe we just need a giant hash table baked into the source...
23:57:05 524kB
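The figures quoted in these last messages are internally consistent and easy to sanity-check. A quick back-of-envelope sketch; the one assumption not stated in the log is that each cached generator is stored as a standard 32-byte compressed Ed25519 point:

```python
# Sanity check of the numbers in the discussion above. Assumption (not from
# the log): each cached generator is one 32-byte compressed Ed25519 point.
commitments = 16 + 112              # seraphis squashed model: 16 max outputs + 112 max inputs
generators_per_commitment = 2 * 64  # "2 at each index", 64 indices (the corrected figure)
point_bytes = 32                    # compressed Ed25519 point encoding

table_bytes = commitments * generators_per_commitment * point_bytes
first_use_ms = 2 * commitments      # ~2 ms init cost per commitment
```

This gives `table_bytes = 524288`, matching the 524kB quoted for a baked-in table, and `first_use_ms = 256`, matching the ~250ms fixed first-use cost of the BP code.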