-
FAA45
Hi there
-
FAA45
Will the meeting be here?
-
Rucknium[m]
FAA45: Yes, in about one hour. Welcome.
-
tevador
Possible meeting agenda item: Jamtis address tag size. Discussion starts here:
gist.github.com/tevador/50160d160d2…52ae02eb3d17024#gistcomment-4237034
-
UkoeHB
1. greetings
-
UkoeHB
hello
-
tevador
Hi
-
hyc
hi
-
dangerousfreedom
Hello
-
Rucknium[m]
Hi
-
gingeropolous
hi
-
FAA45
Hi
-
jberman[m]
hi
-
UkoeHB
2. updates, what's everyone working on?
-
tevador
Working on an X25519 implementation for Jamtis to replace ed25519 in shared secret calculations.
-
UkoeHB
me: did some planning and initial work for integrating legacy cryptonote outputs into my seraphis lib. My goal is to provide enough utilities that a standalone wallet can be built, completely independent of wallet2. So, the new lib should be able to do balance recovery on the entire legacy chain plus spend any of those old outputs.
-
jberman[m]
not research related, but continuing review on 8076 (hopefully will be complete today), moving on to 7999/vtnerd's alternative next, and opened a CCS
-
Rucknium[m]
Working on OSPEAD. I reduced an algorithm whose naive implementation costs O(N^2) to approximately O(N).
-
dangerousfreedom
I have been trying to solve the mystery of a wrong signature validation using my Python tools and found out that Monero performs a wrong operation in certain cases, which leads to a malleability issue. Not exploitable or exploited, as far as I know so far. I will publish the details on my website and on GitHub soon (tonight or tomorrow morning). :)
-
gingeropolous
got the storage up and running (14T on SSD, 76T on HDD). reminder that if anyone needs compute resources, feel free to contact me.
-
UkoeHB
3. discussion; today we should start by returning to the jamtis address index/tag question (whether to use 64 bits for the tag -> 56 bit address index + 8 bit MAC, or increase to something like 144 bits -> 128 bit index + 16 bit MAC); this was previously discussed in this meeting
monero-project/meta #697 and also here
gist.github.com/tevador/50160d160d24cfc6c52ae02eb3d17024
-
tevador
it's a costs vs benefits question
-
UkoeHB
personally, some extra bytes in outputs and addresses is worth it for peace of mind
-
UkoeHB
my more technical arguments are in the jamtis gist
-
UkoeHB
I guess, does anyone have any questions on that topic? Or is everyone reading? lol
-
jberman[m]
I think the benefit of secure client-side random address generation compatible with a ubiquitous simple global standard like UUIDs (128 bit index achieves this) + reducing/removing the incentive for a 3rd party scanner to collect full balance keys to offer better UX (16 bit MAC achieves this) = significant
-
dangerousfreedom
UkoeHB: I'm not familiar with it yet, sorry :p
-
Rucknium[m]
We need to get ISO and NIST in on this ;)
-
Rucknium[m]
More seriously, I do not feel that I have the necessary knowledge base or skills to have much of an opinion on it.
-
tevador
jberman[m]: can you elaborate about the 16-bit MAC argument?
-
tevador
you mean people would give up their view-balance key to speed up their wallet sync?
-
jberman[m]
"Does saving 5 seconds per wallet sync for a subset of Monero users justify the costs?" > I think the answer to this question is yes. I think a MyMonero-like service would emerge that takes view balance keys and users wouldn't care
-
jberman[m]
users of that service
-
jberman[m]
ideally the UX is as close to 0 time spent syncing the wallet as possible so that kind of service has no incentive to do it
-
Rucknium[m]
An implementation of fee fingerprinting for MyMonero transactions could help us estimate the current on-chain transaction volume (if not proportion of users) of lws-like services.
-
jberman[m]
I also think the "light wallet scanning" use case where you run a server for family and friends, without having access to amounts nor definitive knowledge of spends/receives is highly attractive. I think this use case *could* end up a larger subset of Monero users and is worth optimizing for
-
tevador
Fair points.
-
moneromooo
5 seconds on how much ? Half an hour ? I've not synced a wallet in ages.
-
tevador
it would be something like 5 seconds vs 0.1 seconds with the larger MAC
-
moneromooo
On top of how much ?
-
moneromooo
I mean, you can optimize a step from 5 seconds to 0 seconds, but if the other steps take half an hour, it's de minimis.
-
tevador
This is the whole sync time from the moment you log in to the remote service.
-
UkoeHB
moneromooo: it's for view scanning with a remote helper like MyMonero (lws)
-
UkoeHB
local full-scanning wouldn't be materially impacted
-
moneromooo
I assume you're assuming new wallets, created after jamtis would be added then ?
-
UkoeHB
yes this is only for scanning seraphis enotes
-
tevador
I think this assumed ~250M seraphis outputs. The view tag will filter that to 1M and the 8-bit MAC to 3.9K, the 16-bit MAC to a handful.
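-
[Editor's note]
The filtering estimate above can be reproduced with simple arithmetic. This is only a sketch: it assumes an 8-bit view tag (consistent with the 250M -> ~1M reduction quoted) and that non-owned outputs pass each filter uniformly at random. The function name is hypothetical.

```python
def expected_survivors(candidates: float, filter_bits: int) -> float:
    """Expected number of non-owned outputs that pass a random filter of the given width."""
    return candidates / 2 ** filter_bits

total_outputs = 250_000_000                               # figure quoted in the discussion
after_view_tag = expected_survivors(total_outputs, 8)     # assumed 8-bit view tag
after_mac8 = expected_survivors(after_view_tag, 8)        # 8-bit MAC
after_mac16 = expected_survivors(after_view_tag, 16)      # 16-bit MAC

print(f"after view tag:   {after_view_tag:,.0f}")
print(f"after 8-bit MAC:  {after_mac8:,.0f}")
print(f"after 16-bit MAC: {after_mac16:,.1f}")
```

The 8-bit MAC leaves a few thousand false positives for the client to check by computing the output key, while the 16-bit MAC leaves roughly fifteen.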
-
UkoeHB
it would also impact old wallets that get migrated and then get set up with a remote scanner
-
tevador
We could also do 48+16 if we really wanted a 16-bit MAC, but decided to keep the address tag at 8 bytes.
-
moneromooo
Does the client scan the 3.9k ? If so, it provides the client with some layer of privacy vs winnowing really really well.
-
UkoeHB
winnowing?
-
moneromooo
pruning ?
-
moneromooo
Filtering
-
tevador
The client scans the 1M outputs.
-
tevador
3.9K outputs would pass the 8-bit MAC, so the client would have to calculate the output key and check it
-
tevador
with the 16-bit MAC, there would be almost no false positives
-
rbrunner
There is a separate key that I could hand over to somebody to do MAC scanning for me, right?
-
tevador
I guess the actual sync time would be determined by the time to download the several MB of candidate outputs
-
moneromooo
So in addition to the 5 seconds or 0.1 seconds, you have to also add the download time for those million outputs too.
-
tevador
rbrunner: that's the full view key
-
tevador
you would not normally give it away
-
rbrunner
Ah, ok, that's too powerful then. But what is then the least powerful key I can give to any kind of third-party service?
-
tevador
the key to calculate the shared secret and the view tag
-
rbrunner
I see, thanks. So that will filter down to 1M in your example.
-
jberman[m]
the "find-received key"
-
tevador
it would be roughly 100 MB
-
UkoeHB
tevador: this CBC ciphertext stealing is an ugly mess for our use-case, can I just stick with my original idea? It's way easier to understand, read, and validate.
-
tevador
what is the problem with it? I think it just swaps the order of the last two blocks
-
UkoeHB
yes swapping the data around makes it an ugly mess, because with my version you can do operations on the ciphertext/plaintext directly
-
tevador
so that the block that contains the MAC is the first 16 bytes of the ciphertext
-
UkoeHB
but with this CBC cts you have to move data around in buffers...
-
tevador
It seemed like a good idea to use an existing standard for the encryption. Not rolling our own.
-
UkoeHB
in practice everyone who looks at this does the same set of steps: understand why the method works, read the code to see if it does that
-
UkoeHB
CBC ciphertext stealing is more complicated and harder to read, but does the same thing
-
UkoeHB
I'd rather say "equivalent to CBC ciphertext stealing" and then anyone who cares about the standard can go look at it
-
tevador
OK
-
UkoeHB
sweet thank you! and thanks for pointing out this standard, I had no idea it existed
-
rbrunner
Does something get slightly longer now with this decision? The address, because some more bytes?
-
UkoeHB
tevador: "rbrunner: that's the full view key" -> actually you can do this with the generate address key
-
UkoeHB
but there is not much advantage there, because MAC scanning is so cheap
-
UkoeHB
I guess data transmission*
-
jberman[m]
> My point was that MAC addresses work with a global 48-bit space.
-
jberman[m]
AFAIU MAC addresses need only be unique within a local network, and it's not generally expected that local networks will have millions of devices connected in the next 100 years
-
tevador
there is an advantage for the download size
-
jberman[m]
On the contrary, it's easy to envision internet businesses generating millions of addresses. The 3 bullets here make me feel uncomfortable even with a 56-bit address space:
gist.github.com/tevador/50160d160d2…ment_id=4238074#gistcomment-4238074
-
jberman[m]
And a collision is harmful to privacy, it doesn't only have the propensity to bork a distributed system that doesn't properly account for it. The downsides seem more significant in our case
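-
[Editor's note]
The collision concern can be made concrete with the standard birthday-bound approximation. This is a sketch under the assumption that address indices are drawn uniformly at random by a single wallet; the function name and the sample address counts are illustrative, not taken from the gist.

```python
import math

def collision_probability(n_addresses: int, index_bits: int) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -n_addresses * (n_addresses - 1) / 2 ** (index_bits + 1)
    return -math.expm1(exponent)  # expm1 keeps precision for tiny probabilities

for bits in (56, 120, 128):
    for n in (10**6, 10**9):
        p = collision_probability(n, bits)
        print(f"{bits:3d}-bit index, {n:.0e} random addresses: p = {p:.3g}")
```

Under these assumptions a 56-bit index is comfortable at a million random addresses but almost certain to collide at a billion, while 120 or 128 bits keep the probability negligible in both cases.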
-
UkoeHB
tevador: ah, actually download size isn't helped due to the self-send optimization
-
UkoeHB
you need all view tag matches to find selfsends
-
rbrunner
I am a bit lost: The proposal that is now "on the table", how many bits does that have in this regard? More than those 56 bits?
-
UkoeHB
rbrunner: I proposed 128-bit address indices
-
tevador
rbrunner: with the 64-bit index, addresses are 180 characters and with the 144-bit index, they are slightly longer at 196 characters
-
tevador
output size is also slightly larger with the longer tag
-
rbrunner
That clears it up, thank you.
-
rbrunner
Seems to me if we accept 180 characters we can also accept 196 if it has tangible benefits. This is anyway already beyond bad :)
-
rbrunner
A bit unfortunate for QR codes however
-
tevador
I think there still isn't a good argument for the 16-bit MAC. Lazy users might still want to give up their view keys to avoid the 100MB data download.
-
tevador
the collision resistance of the longer tag is a clear benefit, so perhaps something like a 120+8 setup might be better even if that wastes 1 bit in the base32 encoding
-
rbrunner
128+8?
-
tevador
120+8, so we can use one block of the cipher
-
rbrunner
Ah, ok
-
tevador
UUIDs are 122 bits anyways
-
rbrunner
Your argument is basically "Who spends the time to download 100MB does not care too much whether scanning through that is 5s or 0.1s". Did I get this right?
-
UkoeHB
I prefer a full 128 bits for the index to avoid imposing a 'magic number' on users.
-
tevador
rbrunner: exactly
-
tevador
UkoeHB: the index is hidden from users
-
UkoeHB
16 bytes is a universal size, 15 bytes would be a special magic size
-
UkoeHB
like I said in the gist, users as in whoever is trying to use the index in their application
-
tevador
they will not use the index, the API will return some other identifier
-
UkoeHB
byte field, whatever
-
tevador
just like they will not be applying the Twofish cipher
-
tevador
the returned byte field can be 16 bytes
-
rbrunner
So it's down to API implementers who have to know about those 120 bits?
-
tevador
you can stuff the 120 bits into a UUID, for example
-
tevador
and set two bits from the network flag
-
UkoeHB
I'm saying the API user has to know how many bytes they get to define...
-
UkoeHB
16 is a universal amount, 15 is a magic number
-
rbrunner
Maybe you two talk about different APIs?
-
tevador
The API user doesn't define. They ask for an address.
-
UkoeHB
I'm talking about the deeper API of the protocol itself, not any higher layer
-
moneromooo
15 is the max number you can represent on a nybble in 2's complement. A very universal number.
-
rbrunner
:)
-
UkoeHB
moneromooo: lol
-
moneromooo
Actually, scratch the 2's complement part.
-
rbrunner
Might be that in the real world pretty few people will ever step down so deep that they will see this 120. A handful of library implementers maybe.
-
Rucknium[m]
IMHO, if 15 is advantageous compared to 16, my low-info preference is to go with 15. Satoshi was the first to use base58 after all, since it fit a specific need.
-
UkoeHB
the only advantage here is saving 2 bytes
-
tevador
120+8 avoids the block cipher gymnastics
-
UkoeHB
and a slightly less complicated cipher
-
rbrunner
Sounds good?
-
UkoeHB
not to me lol
-
rbrunner
Yeah, because you are one of those unlucky few library implementers who get to see the sheer ugliness of 120 :)
-
UkoeHB
well so far I have avoided making ugly decisions, after 35k+ lines of code, and I certainly don't want to start...
-
tevador
I'm still not convinced we are not leaking information by using ECB on 2 overlapping blocks.
-
tevador
even if the first block covers the whole secret index
-
UkoeHB
tevador: it's not ECB, it's an overlapping CBC
-
UkoeHB
ECB doesn't have any XORing
-
tevador
it's CBC with an IV of all zeroes
-
tevador
btw 120 bits is not ugly, I'm using it for the hash identifier here since it's divisible by 5 bits for easy base32 encoding:
github.com/tevador/id32
-
rbrunner
Maybe a stupid question, but what happens if somebody cracks that encryption and gets to see those bits in clear? How bad would that be?
-
UkoeHB
it's ugly for a generic bytefield that anyone can freely define
-
Rucknium[m]
Seraphis is currently 35k+ lines of code? Impressive, koe
-
UkoeHB
Rucknium[m]: well that's the diff on my branch vs master
-
UkoeHB
rbrunner: you learn the address index that's all - it depends on how that index was defined, whether it means something
-
tevador
how about a UUID that has 122 bits you can define? :P
-
rbrunner
Ok. But of course still bad if somebody would catch to leak some info here. And the obligatory "Monero is finally cracked" articles ...
-
rbrunner
*would catch us
-
UkoeHB
tevador: the overlapping cipher is similar to ciphering the same block twice in a row; if that could leak information, then I don't see how ciphering would be secure
-
jberman[m]
The ease to develop: generate UUID -> address index, sounds very nice to me as well fwiw
-
rbrunner
But for that we would need the full 128 bits, right?
-
rbrunner
because 122
-
jberman[m]
I should say, the ease to develop: generate UUID v4 -> address index...
-
tevador
128 bits would have the advantage of supporting all versions of UUIDs
-
rbrunner
Hmm, maybe a bit overkill if you could as well just randomly choose those bits?
-
rbrunner
Or I don't get fully get your idea
-
rbrunner
*yet
-
jberman[m]
"128 bits would have the advantage of supporting all versions of UUIDs" -> true
-
jberman[m]
Compatibility with a UUID system enables e.g. a merchant system to use UUIDs to identify orders in their system and then use them as input to generate addresses. It's more of a compatibility/standardized integration thing. Sure, people could just use random bytes instead of UUIDs in all cases, but UUIDs are a ubiquitous identifier standard people opt for instead
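-
[Editor's note]
A minimal sketch of the merchant workflow described above, assuming a full 128-bit (16-byte) address index. The helper name is hypothetical and no Jamtis API is implied; only Python's standard `uuid` module is used.

```python
import uuid

def address_index_from_uuid(order_id: uuid.UUID) -> bytes:
    """Use the 16 raw bytes of an order UUID directly as a 128-bit address index."""
    return order_id.bytes

order_id = uuid.uuid4()                  # random (version 4) UUID identifying an order
index = address_index_from_uuid(order_id)

print(order_id, "->", index.hex())
# Only 122 of the 128 bits are random in a v4 UUID: 4 version bits and
# 2 variant bits are fixed, which is the "122 bits" figure mentioned earlier.
print("version:", order_id.version)
```

With a 120-bit index the same mapping would need an extra truncation or packing step, which is the trade-off debated in the meeting.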
-
rbrunner
I see
-
rbrunner
Also has trust on its side
-
rbrunner
As some sort of psychological benefit
-
rbrunner
And the DB would already complain if there was a collision, right?
-
rbrunner
But well, it would do that also with the corresponding number of random bytes ...
-
UkoeHB
We are well past the hour on the meeting, so I think I'll call it here. The subject of jamtis address tags hasn't found a conclusion, but at least it's quite easy to change the implementation if needed.
-
UkoeHB
Thanks for attending everyone
-
jberman[m]
<tevador> "I think there still isn't a good..." <- 1M view tag matched outputs = 256M global outputs. After 5.5 years of RingCT, we have ~58M global outputs
-
jberman[m]
Bandwidth will likely improve by the time users will need to sit and wait to scan 1M outputs locally with any sort of frequency, at which point the benefit of a larger MAC would presumably be more significant. But, worth looking deeper at bandwidth trends. Perhaps this line of reasoning is moot and we should always expect bandwidth to be the bottleneck by a wide margin
-
Rucknium[m]
UkoeHB: When you post the meeting logs to GitHub, is there an easy way to make the text wrap? It's hard to read otherwise. I know I can copy-paste etc., but I also post links to logs sometimes to answer community questions.
-
moneromooo
You should probably get the average outputs per unit of time within the last month or so, then scale, instead of using outputs since the start of the chain, which includes the desert at the start.
-
jberman[m]
True
-
moneromooo
Wait. 5.5 years of ringct. You did not include the desert. I'm kinda amazed rct is already 5.5 years...
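-
[Editor's note]
The growth estimate can be sketched as follows, using only the figures quoted above (about 58M outputs in 5.5 years, 1M view-tag matches implying 256M outputs). The doubled rate is a purely hypothetical stand-in for a "recent month" rate, which was not given in the meeting.

```python
def years_to_reach(target_outputs: float, outputs_per_year: float) -> float:
    """Years needed to accumulate target_outputs at a constant yearly rate."""
    return target_outputs / outputs_per_year

target = 1_000_000 * 256                  # outputs needed for 1M view-tag matches
average_rate = 58_000_000 / 5.5           # long-run RingCT average, outputs per year
hypothetical_recent_rate = 2 * average_rate  # assumption for illustration only

print(f"at the long-run average:       {years_to_reach(target, average_rate):.1f} years")
print(f"at a doubled (assumed) rate:   {years_to_reach(target, hypothetical_recent_rate):.1f} years")
```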
-
UkoeHB
Rucknium[m]: uh doesn't seem like it, your best bet would be plowsof's script
github.com/plowsof/post-libera-meeting-logs (I haven't been using because the logs I post only take me 30s or so)
-
FAA23
what's the link to the chat archive?
-
tevador
"the overlapping cipher is similar to ciphering the same block twice in a row" > yes, and that leaks information. If we had a 16-byte MAC, you would leak a "plaintext" and the corresponding ciphertext: it's called a known plaintext attack.
-
tevador
it's probably now enough for a meaningful attack, but the leak is there
-
tevador
not enough*
-
UkoeHB
sorry, what information is leaked exactly?
-
tevador
You leak x and enc(x). Given enough of these, some attacks on ciphers are possible.
-
tevador
Even if x itself is not meaningful
-
UkoeHB
you leak the MAC? I'm not following
-
tevador
if we have a 16-byte index and 16-byte mac, the tag would be enc(index), enc(enc(index))
-
tevador
x = enc(index), you leak x and enc(x)
-
tevador
with the overlapping tag, the amount of information is obviously much less than 1 block
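-
[Editor's note]
The structure tevador describes can be illustrated with a toy cipher. This is a sketch only: the block cipher below is a hypothetical 4-round Feistel construction built from SHA-256, not Twofish, and it models the hypothetical 16-byte index + 16-byte all-zero MAC case, not the overlapping-block scheme actually proposed.

```python
import hashlib

BLOCK = 16  # block size in bytes (same size as a Twofish block)

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round_fn(key: bytes, i: int, half: bytes) -> bytes:
    # Toy round function; stands in for a real cipher's internals.
    return hashlib.sha256(key + bytes([i]) + half).digest()[: BLOCK // 2]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round_fn(key, i, right))
    return left + right

def cbc_zero_iv(key: bytes, index: bytes, mac: bytes):
    c1 = toy_encrypt_block(key, index)            # IV is all zeroes, so no XOR needed
    c2 = toy_encrypt_block(key, _xor(mac, c1))    # CBC chaining with the previous block
    return c1, c2

key = b"hypothetical key"
index = bytes(range(BLOCK))
c1, c2 = cbc_zero_iv(key, index, bytes(BLOCK))    # MAC = 16 zero bytes

# An observer of the tag sees x = c1 and enc(x) = c2 without knowing the key.
print("x      =", c1.hex())
print("enc(x) =", c2.hex())
print("pair exposed:", c2 == toy_encrypt_block(key, c1))
```

Whether such pairs are usable against a real cipher is exactly the point under dispute in the messages that follow; the sketch only shows that the pair exists.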
-
UkoeHB
are you claiming enc(enc(x)) == dec(enc(x))?
-
UkoeHB
looking at the implementation, that doesn't seem to be right
-
tevador
no, I'm claiming that if an attacker has access to many pairs of [plaintext,ciphertext] with the same key, some attacks are possible
-
UkoeHB
en.wikipedia.org/wiki/Twofish "The paper claims that the probability of truncated differentials is 2^-57.3 per block and that it will take roughly 2^51 chosen plaintexts (32 petabytes worth of data) to find a good pair of truncated differentials."
-
tevador
I think this could be fixed if the MAC was not simply zeroes, but some hash value calculated over the index.
-
tevador
then you would not leak anything
-
tevador
maybe siphash?
-
UkoeHB
the information leak sounds like speculation - is there an actual known attack on twofish that can do this with a reasonable number of known plaintexts?
-
UkoeHB
iirc siphash is much more expensive than twofish
-
tevador
No, I'm talking about a general attack. No idea if there are known attacks against a particular cipher.
-
tevador
in crypto, you generally take the cautious approach
-
UkoeHB
I mean, in the wiki article you link it says "The AES non-linear function has a maximum differential probability of 4/256 (most entries however are either 0 or 2). Meaning that in theory one could determine the key with half as much work as brute force, however, the high branch of AES prevents any high probability trails from existing over multiple rounds". So differential analysis of AES is only 2x as good as brute force...
-
UkoeHB
Reading that (and supposing we were using AES instead of twofish here), would you then say 'well, we'd better be cautious and include some random data'?
-
tevador
btw siphash-24 takes about 45 cycles for 16 bytes
-
tevador
so what is the worst that can happen if we make the MAC siphash13(index)?
-
tevador
that's about an extra 10ns per output
-
UkoeHB
in that case you have to decipher both blocks to check the mac
-
tevador
then siphash just 8 bytes of the index
-
UkoeHB
no can do, the first block decipher only gives you the MAC plus 14 bytes of the first block's ciphertext
-
tevador
actually, there is no reason we couldn't siphash 8 bytes of the encrypted index
-
UkoeHB
although I guess you could siphash some of those bytes
-
UkoeHB
but if you're going that far, why not just duplicate some of the encrypted index bytes?
-
UkoeHB
what's the difference?
-
tevador
then you have a plaintext with a known relation or possibly a partially known plaintext
-
tevador
Anyways, it's just an idea. I'm just not feeling comfortable about the encryption process for the tag. The 64-bit tag was straightforward.
-
UkoeHB
To be clear then: you want to avoid a hypothetical differential analysis on a decades-old cipher that can use a 2-out-of-16 bytes known plaintext to extract bits of the cipher key (presumably more than 1 or 2 bits, and using on the order of only 2^20 unique ciphertexts)? If such an attack were possible, then twofish would be incredibly broken for 16-byte known plaintexts, which seems very unlikely.
-
tevador
We had a similar discussion about information leakage in PR 8061 and in the end, the most overkill solution was selected.
-
tevador
I think this will definitely come up in a future review.
-
UkoeHB
doesn't mean the solution was right
-
dangerousfreedom
I have updated my findings about the malleability issue
-
dangerousfreedom
It was pretty shocking to me at the beginning of the week. I was kind of panicking but now I am getting more used to the idea that it is just a huge bug IMHO :p
-
dangerousfreedom
I would like to thank moneromooo, koe and luigi for their support in trying to figure this out together.
-
dangerousfreedom
I think I importuned them quite a lot on Monday :p
-
dangerousfreedom
And let me know your crypto thoughts about that ;)
-
tevador
can we prove that no coins were created in those outputs?
-
dangerousfreedom
That's the question :)
-
dangerousfreedom
What do you think? I would love to hear.
-
tevador
the attacker's wiggle room would be just 1 bit, so I think it's safe
-
dangerousfreedom
It is kinda my feeling also now
-
dangerousfreedom
But living with the doubt is pretty bad :p
-
UkoeHB
dangerousfreedom: you should include the code I sent which demonstrates the real scalar can be reconstructed
-
UkoeHB
the issue is effectively a serialization mis-representation of proof scalars that are non-reduced, which is definitely nothing close to an exploit
-
dangerousfreedom
UkoeHB: Can you put a detailed answer on github? I would be happy to check exactly the part of the wrong code. And I also think it would be useful for educational purposes and for the blockchain sanity.
-
kayabanerve[m]
Can I once again advocate for mandated canonical encodings of all ECC values at a fundamental level?
-
kayabanerve[m]
I do hear this is solely a serialization issue and I'm fine with that. I'm now frustrated we now have 3 different scalar deserialize behaviors
-
kayabanerve[m]
Strict, reduce, and whatever algorithm this did end up applying
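-
[Editor's note]
The first two behaviours named above can be sketched in a few lines. This assumes 32-byte little-endian scalars and the ed25519 group order l; the function names are hypothetical, and the third (legacy Borromean) behaviour is deliberately not modelled because its details are not given in this log.

```python
# ed25519 prime-order subgroup size
L = 2**252 + 27742317777372353535851937790883648493

def deserialize_strict(data: bytes) -> int:
    """Accept only the canonical encoding (value already reduced mod l)."""
    x = int.from_bytes(data, "little")
    if len(data) != 32 or x >= L:
        raise ValueError("non-canonical scalar encoding")
    return x

def deserialize_reduce(data: bytes) -> int:
    """Accept any 32-byte value and reduce it mod l."""
    return int.from_bytes(data, "little") % L

non_canonical = (L + 5).to_bytes(32, "little")   # encodes the scalar 5, unreduced

print(deserialize_reduce(non_canonical))         # reduces to 5
try:
    deserialize_strict(non_canonical)
except ValueError as err:
    print("strict:", err)
```

Two encodings of the same scalar are what make a proof malleable at the serialization level, which is why a single mandated canonical form is being advocated.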
-
kayabanerve[m]
*I also ack this is scalar, not point, and my general ristretto advocacy is point related.
-
UkoeHB
kayabanerve[m]: scalar reduction is already enforced for everything post-2016
-
UkoeHB
it's just this Borromean proof that was implemented very naively...
-
kayabanerve[m]
but I'm still using this as ristretto propaganda :p
-
kayabanerve[m]
I more meant to demonstrate we wouldn't leave the opportunity to have this done naively. While we're not naive anymore, my advocacy remains removing the opportunity, especially due to performance benefits
-
kayabanerve[m]
*but yes, at this point, it's my job to submit the pr, I know*
-
dangerousfreedom
I totally agree with kayabanerve. Using LibSodium or Ristretto would not even allow me to get these errors, as it is encoded at the fundamental level. I feel like my dirty Python tools are now more reliable than Monero (it is a joke :p)
-
kayabanerve[m]
dangerousfreedom: I believe the comment is that instead of x, or x % l, you should be able to define an invalidly_assume_normalized which transforms the scalar to how it was interpreted. If you have the link to koe's code, I may be able to take a stab
-
kayabanerve[m]
I literally have an issue in my work saying I need three different transaction types
-
kayabanerve[m]
One for canonical transactions, one for verifiable transactions, and one for wallet transactions
-
kayabanerve[m]
When building wallet code, I decided not to deal with 'malformed' txs and just reject them. The issue is monero is full of them :/
-
kayabanerve[m]
So I could merge canonical + verifiable, but then I'm applying a bunch of checks and deserializations on proof data I don't need just for wallet functionality
-
dangerousfreedom
I totally see.
-
kayabanerve[m]
And I need to be sure I'm right if I move from bytes to actual ecc types. As shown here, we have three different scalar deser algorithms .-.
-
kayabanerve[m]
So I understand this is historical, we haven't been hacked, and we'll survive. I'm trying to highlight the value in rigidity and simplicity
-
dangerousfreedom
kayabanerve[m]: Can't agree more.
-
kayabanerve[m]
dangerousfreedom: want to move your python to rust
-
kayabanerve[m]
:p I'll try to poach you later. PM me a link to koe's snippet and I'll try to help you when I get back home in ~25, if you need it ;)
-
kayabanerve[m]
But sounds like you'll be able to handle it fine tbh :)
-
dangerousfreedom
kayabanerve[m]: I was actually talking to a very close friend that is much better in programming than me and he was very excited about it and I think it would be good for us if we can do it :) I will message you later
-
UkoeHB
tevador: on my machine it is about 105ns to do one twofish block decipher, and 40ns to do one halfsiphash hash
-
UkoeHB
ah actually more like 95ns for the twofish decipher, there seems to be a 20ns overhead from other stuff during the address tag decipher