01:05:47 tevador: re this: https://github.com/monero-project/monero/issues/7889#issuecomment-1002803963
01:06:12 we need some way to pass along a message with thorchain, and we are currently planning to use the encrypted pid for this
01:09:36 https://gitlab.com/thorchain/thornode/-/issues/919
01:10:00 it says 16 bytes but it's actually 16 characters (8 bytes)
01:11:41 we need a non-interactive way to pass along a small string at least
02:32:58 that sounds like (not encrypted) payment ids
02:37:14 tevador: U = x126582dfc357b10ecb0ce0f12c26359f53c64d4900b7696c2c4b3f7dcab7f730 X = x4017a126181c34b0774d590523a08346be4f42348eddd50eb7a441b571b2b613
03:19:11 Hey everyone! I have a big favor to ask. I'm constructing a deanonymized dataset of transactions on Testnet to research the plausibility of deep learning attacks against the Monero blockchain. I currently have an automated system in place to collect new transactions but I really need older transactions using inputs over a long period of time. If you have a Monero Testnet wallet that has made transactions in the past, please PM me the seed! You can transfer any funds out of the wallet before giving it to me since I only need the transaction metadata.
03:19:11 Because my automated system performs transactions as fast as possible (once every 20 mins), the output in the ring signature is most often the most recent UTXO. To construct a balanced dataset and to prevent the neural networks from overfitting, you want an even number of transactional outputs from each position in the ring signature (1 - 11).
03:19:11 Thanks :)
05:30:38 My impression reading this, xmr-ack, is that you need to make sure your model is effective against arbitrary transactions, for which the spend behavior isn't necessarily known. If you're testing by sending transactions ASAP, obviously a guess-newest heuristic is going to be effective against those. But not everyone does that in practice, hence why it's potentially unreliable in practice
05:31:07 the largest set of monero transactions for which ground truth exists is for pre-RingCT outputs
05:31:24 https://github.com/monero-blackball/monero-blackball-site
05:32:24 you can look at average numbers when comparing the real distribution against the expected average decoy distribution, but this has limited specific impact because there isn't ground truth to verify
11:02:18 "My impression reading this xmr-..." <- Yea I agree, having my script wait hours or days before transacting will take forever, that's why I’m looking to crowdsource a lot of the transactions. Supervised learning problems need a ton of data and I’d really like to try and avoid oversampling/undersampling a small incomplete dataset. This is only one use case for the dataset as well, other models not classifying the true spend will not care if they are all back to back.
12:11:42 So, Jamtis saves 9 bytes from 2/out TX size and 33 bytes from 3+/out tx size: https://gist.github.com/tevador/50160d160d24cfc6c52ae02eb3d17024#45-transaction-size
12:26:32 xmr-ack: why not run your own custom testnet, with a single node and a 5 second block time
12:27:11 That way you can set up several wallets and mine all your transactions whenever you want
12:30:27 Your model doesn't actually depend on timestamps, does it?
It should only depend on block height
13:12:42 tevador: "These public keys can use the existing TX_EXTRA_TAG_ADDITIONAL_PUBKEYS field." Seraphis will completely re-do most of the existing tx structure. Txo pubkeys will go in a `tx_supplement` field.
13:32:28 UkoeHB: is this documented somewhere?
13:33:19 my PoC models the end result I am aiming for
13:33:51 here is the lib I am working on: https://github.com/UkoeHB/monero/tree/seraphis_lib
13:34:04 Btw, it might be worth considering the removal of tx_extra while we're at it...
13:34:51 https://github.com/monero-project/monero/issues/6668
13:36:17 my favorite quote: "An arbitrary plaintext data payload in a system whose privacy relies on indistinguishably is like a screen door on a submarine."
13:36:38 but leave tx_extra for coinbase transactions
13:38:43 I still think removing tx_extra is a mistake. For example, thorchain is planning to use tx_extra. What other use-cases would we prevent by removing the field? It is unknowable. It also creates an unhealthy dependency on core development, if core needs to evaluate and implement every new thing the ecosystem wants to do.
14:07:08 UkoeHB: They are using encrypted payment IDs afaik
14:07:23 we need some way to pass along a message with thorchain, and we are currently planning to use the encrypted pid for this
14:07:24 ^ UkoeHB
14:16:38 sure, the epids are in the tx extra field, and people want to deprecate epids anyway - so thorchain would have to add their own tx extra field to get the result they want
14:33:10 the main point is that we should not add a dummy field to all transactions just because of a thorchain feature
14:47:28 AMEN my favorite quote: "An arbitrary plaintext data payload in a system whose privacy relies on indistinguishably is like a screen door on a submarine."
15:21:16 "the main point is that we should..." <- you shouldn't think of this as just a "thorchain feature"; you would potentially be removing the ability to express transaction intent in an encrypted way
15:21:59 subaddresses work when you can have the recipient cooperate, but they don't seem to work nearly as well if the sender needs to express intent non-cooperatively
15:22:37 you can still express intent and encrypt it with the shared secret using a custom tx_extra field
15:23:08 isn't that worse for privacy? I explicitly tried to push them away from using a custom tx_extra implementation
15:23:13 in any case, the thorchain feature will need more than 8 bytes
15:23:36 tevador: did you see the gitlab issue?
15:24:01 the intent is NOT to store the whole command in tx_extra
15:24:26 I read it, it's supposed to be a hash value
15:24:42 yes a hash value
15:24:47 but a 64-bit hash is not sufficient
15:25:15 that gives you a mere 32 bits of collision resistance
15:25:36 I can find collisions on my laptop in a few minutes
15:26:19 why was 8 bytes chosen for payment IDs initially?
15:26:40 the difference is that a payment ID is not a hash value
15:27:15 the original 32-byte PID could be
15:28:10 if you want real security, you probably need a 32-byte hash, stored in tx_extra in a custom field
15:30:53 I specifically pushed thorchain away from using tx_extra because it seemed people wanted to remove tx_extra (that risk seems to remain), and I felt it was best to work within the existing framework if possible for transaction indistinguishability (at least on the monero side)
15:35:12 Are people actually that concerned about having a small memo for each transaction? Zcash has a huge 512-byte one, and Firo will probably go with 32 bytes
15:36:41 If you want to throw indistinguishability out the door, thorchain can just chuck the whole message in tx_extra and they'll be happier.
I had figured for the sake of privacy people wouldn't want it done this way
15:42:23 we have this PR: https://github.com/monero-project/monero/pull/6410
15:43:53 I think a fixed memo of 32 bytes would be possible if we wanted to go this route
15:45:39 tevador: did you get the U/X hex strings working?
15:55:41 tevador: I generally agree with Sarang here: "For optimal uniformity, the field should be required and of a fixed size. But at that point, you effectively have a larger encrypted payment ID field, and the functionality overlaps. Allowing even a quantized variable size opens the door to multiple anonymity pools and fingerprinting."
15:56:18 We will happily use a memo instead of a payment ID if it's large enough (which 32 bytes definitely is)
15:59:57 I just need a decision to be made on what you want thorchain to use
16:02:30 tevador: while I definitely accept your point about collisions (you know a lot more than me), keep in mind that for one to be effective, you would need to pass a RUNE transaction with a request string that matches the expected format. So if you were trying to, say, receive the traded funds in another address, you would need to pass your malicious destination address properly in this string in the correct format such that it hashed to the same value
16:12:09 UkoeHB: yes, thanks. I still need to make some modifications to Polyseed and then I'll add the test vectors.
16:13:28 sgp_: the collisions can be in arbitrary format, e.g. I can just generate random addresses until the hash matches
16:14:31 yeah I see, you're right
16:15:25 in any case, the tx_extra memo field is orthogonal to the addressing scheme, so for now I'm just assuming that encrypted PIDs will be removed
16:16:23 32 bit collisions means collisions within the generated pool in 2^32 steps. If you want to collide with a preexisting set of 2^N size, it'd take 2^(64-N) steps or so.
16:17:02 ie, you might get collisions in your brute force set first, but they don't "count".
16:17:27 the thorchain issue doesn't really explain how the hash will be used, so I can't comment on specific attack scenarios
16:17:30 (just being pedantic, I don't know which one applies to that particular use)
16:53:30 Btw, since we are talking about generously increasing the tx size with encrypted memos, it would also be possible for the sender to include the encrypted 64-bit index of the account+subaddress. That would eliminate hashtable lookups and issues like "subaddress lookahead".
16:54:08 The tuple (i,j) would be encrypted twice: once in the address (using a block cipher), and a second time by the sender of the tx (using a stream cipher). The recipient would decrypt (i,j) and check if K_s+(q+H(k_vb,i,j))G matches the tx key (the same number of EC ops). No need to cache any addresses in advance.
17:09:11 that would add 64 bits?
17:12:59 tevador: I think it can be done without loss of privacy for Tier 1s. Tier 1s can already link outputs sent to the same subaddress, so it is fine if the Tier 1 can open the stream cipher. Then, Tier 2 can open the block cipher. If the block cipher has a 2-byte MAC, then Tier 2s won't have to compute test spend keys for most unowned outputs.
17:17:07 Change and self-spends would have a dummy address ID
17:19:30 sgp_: 8 bytes per output and 8 bytes per address
17:19:47 UkoeHB: the MAC could be just 1 byte, it's basically a level 2 view tag
17:21:07 hmm ok it might be fine; for every 10 million outputs and a 1-byte MAC, you'd only need to compute a test spend key for 151 unowned outputs
17:23:52 > No need to cache any addresses in advance.
17:23:53 I think caching would still be useful for high-volume users.
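The two collision costs discussed above (2^32 steps for any collision on a 64-bit hash, versus 2^(64-N) steps to hit a preexisting set of 2^N values) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the thorchain issue does not say which hash is used, so SHA-256 truncated to a few bytes stands in for it, and `truncated_hash`, `find_collision` and the `request-%d` inputs are hypothetical names. The search runs on a 24-bit truncation so it finishes quickly; the same birthday search on 64 bits is what "collisions on my laptop" refers to.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, n_bytes: int) -> bytes:
    """First n_bytes of SHA-256(data); a stand-in for a short memo hash."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int):
    """Birthday search: hash distinct inputs until two share a truncated hash.

    Returns (first_input, second_input, number_of_inputs_hashed).
    """
    seen = {}
    for i in count():
        msg = b"request-%d" % i  # hypothetical request strings, all distinct
        h = truncated_hash(msg, n_bytes)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


def birthday_steps_log2(hash_bits: int) -> float:
    """log2 of the rough work to find any collision within your own pool."""
    return hash_bits / 2


def targeted_steps_log2(hash_bits: int, set_size_log2: int) -> int:
    """log2 of the rough work to collide with a preexisting set of 2^N values."""
    return hash_bits - set_size_log2


a, b, tries = find_collision(3)  # 24-bit truncation, on the order of 2^12 tries
print(a, b, tries)
print(birthday_steps_log2(64))      # any collision on a 64-bit hash
print(targeted_steps_log2(64, 20))  # collide with a set of 2^20 existing values
```

Which of the two costs matters depends on how the hash is actually used, which is the open question raised at 16:17:27.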
17:24:58 yes, but this time it would be really just a cache and not a database
17:26:02 Blowfish seems like the best candidate block cipher, the permutation conveniently accepts two 32-bit integers: https://en.wikipedia.org/wiki/Blowfish_(cipher)#Blowfish_in_pseudocode
17:26:19 "tevador: I think it can be..." <- "tier 1s can already link outputs sent to the same subaddress" This was actually unknown to me, that definitely sucks
17:27:06 Yes, we talked about this and settled on hiding change outputs from Tier 1.
17:28:55 It's still an obvious privacy improvement over the status quo but in practice a lot of users will reveal what outputs they receive then
17:33:08 IMO it's the fundamental cost of offloading most computations to a third party.
17:35:42 even if subaddresses are reused, Tier 1 still won't learn amounts and when exactly outputs are spent
17:36:23 for the MAC we could use Siphash :p
18:21:10 it can work, but the question is if it's worth the 8 bytes per output
19:09:24 tevador: what are the values for X and U in jamtis?
19:27:44 tevador: nevermind, i just saw the hex in the above conversation
23:17:08 https://www.irccloud.com/pastebin/fdGSWiae/
23:20:00 My goal with the interface -> impl relationships is to allow flexible composition. A MasterKey could be a local C++ object, or an interface for a hardware/cold wallet. An ENoteFinder could be a local C++ object, or an interface for a third-party scanning service. An AddressBook could be shared between multiple wallets. And so on
23:30:12 I also really want normal wallets and multisig wallets to be separate, hence most of FullWallet is encapsulated by modular sub-components.
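Going back to the MAC estimate at 17:21:07, the arithmetic can be sketched as below. The assumptions are mine, not stated in the log: the existing 1-byte view tag is checked first (suggested by the phrase "level 2 view tag"), the 1-byte MAC second, and both behave like uniformly random bytes on unowned outputs. `expected_false_positives` is a hypothetical helper name.

```python
def expected_false_positives(n_outputs: int, filter_bytes: int) -> float:
    """Expected number of unowned outputs that pass filter_bytes of tag checks.

    Assumes each tag byte matches by chance with probability 1/256.
    """
    return n_outputs / (256 ** filter_bytes)


N = 10_000_000  # unowned outputs scanned

# View tag alone: roughly 1 in 256 unowned outputs still needs more work.
view_tag_only = expected_false_positives(N, 1)

# View tag plus a 1-byte MAC: roughly 1 in 65536 needs a test spend key.
view_tag_and_mac = expected_false_positives(N, 2)

print(view_tag_only)
print(view_tag_and_mac)
```

Under these assumptions the second figure comes out at roughly 150 per 10 million outputs, the same ballpark as the 151 quoted in the chat.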