15:43:54 meeting ~1hr: https://github.com/monero-project/meta/issues/668
16:51:48 has this been discussed? https://eprint.iacr.org/2021/089
16:55:01 they seem to add fuzziness to the view tag, not the FindReceived key
17:00:03 Meeting time
17:00:09 1. Greetings
17:00:10 Hello
17:00:37 Hi
17:00:58 hello
17:01:23 Hi
17:03:01 Hiya
17:03:20 Hi
17:04:36 sgp_: view tags are already fuzzy
17:05:02 UkoeHB: oh yeah, of course duh. Sorry :)
17:05:12 however if you can compute a view tag then you can also compute a nominal spend key, which reveals more info
17:05:59 but we need that for efficiency on the client side
17:07:52 2. let's do updates, what is everyone working on these days?
17:09:04 The MAGIC Monero Fund has its first research grant application, by xmr-ack:
17:09:04 https://github.com/MAGICGrants/Monero-Fund/issues/15
17:09:52 me: I finished my CCS ( https://repo.getmonero.org/monero-project/ccs-proposals/-/merge_requests/256#note_15087 ) and plan to make a new one today.
17:10:29 The general idea is to see how accurately machine learning techniques can identify the real spend in a ring, using a synthetic dataset.
17:11:09 The MAGIC Monero Fund is asking MRL for feedback on the grant application. Of course, the final decision rests with the committee.
17:11:34 Maybe isthmus would have some input given his experience with machine learning.
17:12:04 Is that the background of the currently very high tx traffic on testnet?
17:12:20 rbrunner: Yes.
17:12:36 Sounds like an interesting project.
17:12:49 For a layman like me, at least
17:14:15 I'm wondering how you translate results/models obtained for a synthetic data set to the real data set.
17:14:31 oh jesus he's using the public testnet?
17:14:53 You did not notice that huuuuuge amount of traffic there over the last 3 weeks? :)
17:15:09 ugh thats unnecessary
17:18:09 Not sure. As soon as somebody wants to confirm results, a public blockchain may be very handy
17:18:36 Also, plowsof and I set up an instance of WIKINDX at https://moneroresearch.info/ . It's a place to collect Monero-related papers and annotate them. My hope is that it can help onboard new researchers and help us establish a workflow for reviewing new papers that are written about Monero.
17:19:55 I've disabled public user registrations to avoid vandalism, but if anyone wants to create a user to be able to add, edit, and add annotations to papers, just message me and I will create one for you.
17:20:22 also, stagenet might potentially be better. testnet could get hella ugly if/when we actually test the new release. the randomx testnet was brutal
17:21:06 but yeah rbrunner re: public blockchain reproducibility considerations.
17:22:08 Hopefully that WIKINDX thing does not need too much babysitting and does not surprise with new security holes every fortnight :)
17:23:46 rbrunner: WIKINDX has been around since 2003 apparently. It was hard to set up, but hopefully it is "mature" by now.
17:23:51 3. I guess we can move to discussion. Any items to discuss? Perhaps from the agenda
17:23:54 and UkoeHB re: synthetic vs. real data. I share the same curiosity, and i'd propose to use the bitcoin blockchain with ring sigs superimposed somehow
17:24:24 but, its good to do things in multiple ways i guess
17:25:02 I have a question that may be of wider interest and was brought up by a recent video about Seraphis:
17:25:10 I refer to the following article: https://www.getmonero.org/2021/12/22/what-is-seraphis.html
17:25:15 gingeropolous: yeah you could probably generate the bitcoin blockchain with ring sigs all offline.
17:25:21 Under "membership proof delegation" it mentions that this may open up the following possibility:
17:25:28 Ignore 10-block lock time when transacting with a *trusted* party (i.e. allow them to make your tx's membership proofs and submit the tx to the network on your behalf).
17:25:35 Is that still current? And if yes can you sketch what that means and how that could work in practice?
17:26:34 rbrunner: You would send a `PartialTx` to your friend, and then later they can make membership proofs for the tx and submit it.
17:27:04 this guy: https://github.com/UkoeHB/monero/blob/bd46a0f92079080a3abde92041cd81160b8cb91d/src/seraphis/txtype_squashed_v1.cpp#L183
17:27:45 err actually would be this one in practice lol: https://github.com/UkoeHB/monero/blob/bd46a0f92079080a3abde92041cd81160b8cb91d/src/seraphis/txtype_squashed_v1.cpp#L201
17:27:59 And it's trusted because by building such a partial tx and sending it to my friend, I still could spend faster and cheat?
17:29:10 that and your friend will know the real spends
17:29:43 But I could send such a partial tx very early, within the 10-block spend limit?
17:29:50 "I'm wondering how you translate..." <- This is the reason I chose to collect it on the test-net, I need the data to resemble main-net as closely as possible. With machine learning, small discrepancies in the training dataset compared to the testing dataset could result in large inaccuracies. I understand a large amount of traffic on the test-net is not ideal, so I'll soon be delaying transactions based on real-user spending patterns.
17:30:35 but where are u getting those spending patterns?
17:30:46 rbrunner: yes
17:31:16 "also, stagenet might potentially..." <- This might be a good common ground.
17:31:38 yeah. testnet has the potential to get restarted, or rolled back, during dev testing etc
17:31:46 could really muck up your work
17:31:47 Alright, so that's only a "reduction" or "circumvention" of the 10-block limit in quite special circumstances. And I guess final submit has to wait then?
17:32:11 xmr-ack: One good reason to do it on testnet or stagenet would be if you were using network data as features for the machine learning algorithm.
17:32:33 rbrunner: right
17:32:40 Ok, thanks!
17:33:31 stagenet shouldn't be fiddled with in that way. stagenet is meant to mimic mainnet, just not have any value. testnet is meant to test consensus rules etc. at least thats the thought. but testnet only really got mucked with during the randomx testing as far as i recall.
17:33:37 delegation is more useful for multisig, tx chaining, collaborative funding
17:34:31 gingeropolous: The gamma distribution proposed in Moser et al. Additionally, I'm going to run an experiment soon where I crawl the last 1,000,000 transactions on main-net using the onion block explorer and calculate the distribution of transaction fees to simulate that as well.
17:35:08 gingeropolous: I didn't know this. That could be a problem
17:36:20 xmr-ack[m]: is there an advantage to generating real txs? your database just reduces to {block height, {reference heights}}
17:36:22 Well, testnet was quite stable for a long time now.
17:36:33 we haven't had to test new consensus rules :)
17:36:51 True.
17:37:14 i mean, the new ones shouldn't be that fiddly, but who knows
17:37:34 "xmr-ack: One good reason to do..." <- I have thought about incorporating network features and even ran some experiments in the past where a 1D conv-net was able to differentiate between remote node network traffic with a > 90% accuracy. But that was a quite small dataset and only preliminary results.
17:40:25 UkoeHB: I have some code here that simulates different wallet version strategies to arrive at this FWIW: https://github.com/j-berman/monero/commit/4baf4c99b002583905b4389402d9a5081d168059
17:40:54 i dunno about network features. the permanent thing is the blockchain
17:42:18 "xmr-ack: is there an advantage..." <- My reasons so far include:... (full message at https://libera.ems.host/_matrix/media/r0/download/libera.chat/8816854db00368e85005a07d2021036b621cd66d)
17:42:19 Monero adversaries almost certainly are collecting more than just blockchain data.
17:42:23 gingeropolous: I agree
17:43:35 Rucknium: They are but I think specifically for this research project that is out of scope. I may continue my research down the road and look into fingerprinting network patterns between: peers, remote nodes, etc...
17:44:12 Fingerprinting network patterns is really cool because you can pretty much bypass all encryption.
17:44:59 Uh, you may elaborate a bit, otherwise people will freak out reading the log of this meeting ...
17:45:02 * all encryption. Granted you can only classify high level actions ( ex. user spent monero vs user received blocks)
17:45:30 Yea I just edited hahah I realized that needed more context
17:45:57 Let me find a good paper for anyone thats interested
17:46:03 "Monero researcher finally admits: Monero *is* traceable" :)
17:46:26 ~never~
17:46:40 Dandelion++ is supposed to reduce the efficacy of de-anonymization efforts based on monitoring network data, I believe.
17:46:56 Had the same thought, yes
17:49:33 I think we can call the meeting here, unless anyone has any last minute comments/questions.
17:50:34 ok thanks for attending everyone
17:50:50 Rucknium[m]: Yea good point. To clarify, the research scenario where traffic patterns could be fingerprinted was only tested in a highly privileged network location. ( ex. a local adversary that could view encrypted packet patterns )
17:51:21 * packet patterns before the traffic reached the remote node)
17:52:27 I don't know much about ML, but doesn't training usually occur with real data, rather than simulated data? i.e. if you're training some model based on simulated data, or data that you control/are creating, doesn't that bias the model toward whatever data you generate, and therefore defeat the purpose of trying to match "main-net conditions", since it trivially is not those conditions, but the conditions of what you created? idk if that makes sense
17:54:35 With the small problem how you test for correct results on mainnet, maybe?
17:56:38 Maybe a quite special case with Monero where your data actively resist analysis :)
17:59:13 jberman[m]: you're totally correct.
17:59:27 "I don't know much about ML..." <- Correct, but using test-net or stage-net are the best conditions we have. Collecting the dataset on main-net is not financially feasible and publicly disclosing the true spend of a large number of transactions would have serious privacy implications for other users.
18:00:30 * other users. There are still likely many features within test-net that can be applied to main net.
18:00:56 xmr-ack: any ... thoughts on using bitcoin data and superimposing ringsigs on top of it?
18:03:23 gingeropolous: I mean I could try it but then I can't imagine that dataset working better than true monero transactions.
18:03:57 i think both approaches are worth it
18:04:02 I think it would lose a lot of the value from test-net
18:04:03 honestly, a ML model should be able to do both
18:04:31 Let me post my feature-set ideas to the github proposal
18:04:39 "Correct, but using test-net or..." <- ok, but it would seem like you'd need to exclude your own data from the training set, no? cuz including your own data kinda defeats the point of matching any real world conditions?
18:04:46 i.e., one trained on fake data should be able to deconvolute one created from the bitcoin superimposed one
18:07:43 jberman[m]: Potentially, it depends on if the model overfits the training data or not. There are ways to combat this, such as hyperparameter tuning and reducing the number of parameters in the neural net.
18:09:47 Okay I just added the feature-set ideas to the proposal https://github.com/MAGICGrants/Monero-Fund/issues/15
18:10:26 These are just ideas not set in stone. If anyone finds an error or has a suggestion, please let me know!
18:10:28 Another idea, maybe related to gingeropolous 's idea, is to leverage the real spend data from the Moser et al. (2018) paper. That data is outdated, however.
18:11:45 In other words, reproduce the spending patterns from that data, but with current-version Monero transaction types. That's a lot of coding though....
18:11:49 Rucknium: I don't think I ever found their actual dataset, I know their github repo has the code to make your own dataset though. If you have it could you send it my way?
18:12:35 jberman reproduced some parts of the data somehow
18:12:43 Rucknium: I think re-assessing the spending patterns could be very interesting. I'll look into that if I have time
18:14:00 https://github.com/maltemoeser/moneropaper
18:17:16 Adding this to the jupyter notebook after step 46 will download a CSV of the spend times they used to calculate the gamma: https://paste.debian.net/1232030/
18:18:20 have that result somewhere too will send it over pm
18:18:27 Do they identify which transactions spend which outputs? Or only have the spend times available?
18:19:36 log_spend_times pretty sure is just spend times, but they also identify which transactions spend which outputs
18:21:11 think that query right above log_spend_times is basically just querying "what is the age of the output this transaction is spending" and aggregates the answer to that question
18:21:26 for all tx's they know real spends for
18:27:19 "Let me find a good paper for..." <- https://home.cse.ust.hk/~taow/wf/
18:27:19 https://www.cypherpunks.ca/~iang/pubs/webfingerprint-wpes.pdf
18:27:19 https://nymity.ch/tor-dns/pdf/Panchenko2016a.pdf
19:41:27 hello
19:44:15 is this branch good to analyze, to understand ringCT ? https://github.com/SarangNoether/skunkworks/tree/rct3
19:46:09 slave_blocker2: probably not, rct3 probably implies this paper https://eprint.iacr.org/2019/508
21:12:13 new ccs for seraphis wallet poc ( rbrunner ): https://repo.getmonero.org/monero-project/ccs-proposals/-/merge_requests/290
21:15:28 Thanks for the pointer. Let's hope this will get filled as fast as your first one once published.
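
Note on the log: a minimal Python sketch of the data reduction mentioned at 17:36:20 ({block height, {reference heights}}) and of the output-age aggregation described at 18:21:11. This is an illustration under stated assumptions, not code from the meeting, from the Moser et al. notebook, or from xmr-ack's proposal. The names RingRecord, member_ages, log_spend_ages and newest_guess_accuracy are hypothetical, and real_index is only available when the true spend is known (synthetic or otherwise labeled data). The 120-second figure is Monero's target block time; the "guess the newest ring member" rule is just a simple baseline for comparison.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class RingRecord:
    """One ring, reduced to heights only (hypothetical layout)."""
    block_height: int                   # block that contains the spending tx
    reference_heights: Tuple[int, ...]  # block heights of the ring members
    real_index: int                     # position of the true spend (label)


def member_ages(rec: RingRecord) -> List[int]:
    """Age, in blocks, of every ring member at the time of spending."""
    return [rec.block_height - h for h in rec.reference_heights]


def log_spend_ages(records: Sequence[RingRecord],
                   block_time_s: int = 120) -> List[float]:
    """Natural log of the real spend's age in seconds, one value per ring.

    Ages are clamped to at least one block so that log() is defined.
    """
    result = []
    for rec in records:
        age_blocks = max(member_ages(rec)[rec.real_index], 1)
        result.append(math.log(age_blocks * block_time_s))
    return result


def newest_guess_accuracy(records: Sequence[RingRecord]) -> float:
    """Fraction of rings where the newest member is the real spend."""
    hits = 0
    for rec in records:
        newest = rec.reference_heights.index(max(rec.reference_heights))
        if newest == rec.real_index:
            hits += 1
    return hits / len(records)
```

A list of such records is all the inputs the two aggregate functions need: log_spend_ages gives the values one would fit a spend-time distribution to, and newest_guess_accuracy gives a baseline that any learned classifier would have to beat on the same labeled rings.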