-
jberman[m]
ok, first showing the speed-up of removing the extra ops in `ge25519_frombytes_negate_vartime` using what I believe is what UkoeHB intended in that snippet:
paste.debian.net/1229132
-
jberman[m]
that speeds up the conversion from ed25519->curve25519 by ~85%
-
jberman[m]
will share the view tag included results in a sec
-
jberman[m]
actually not sure if that MSB zero'ing out is right. shouldn't it be `ed25519_pk_copy[0] &= UCHAR_MAX >> 1;` and not `ed25519_pk_copy[31] &= UCHAR_MAX >> 1;`?
-
UkoeHB
most significant bit means the last bit, i.e. the top bit of the last byte
-
UkoeHB
little endian
-
UkoeHB
unless I am mistaken about the endianness..
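-
[editor's note]
A minimal sketch of the masking step under discussion, assuming the 32-byte key is stored little-endian so that the most significant bit of the whole value is the top bit of byte 31. The function name `copy_and_clear_msb` is hypothetical and is not taken from the pastes.

```c
#include <limits.h>
#include <string.h>

/* Copy a 32-byte little-endian key and clear its most significant bit.
 * In little-endian order the highest-order byte is the last one (index 31),
 * so the top bit of the 256-bit value is bit 7 of byte 31. */
static void copy_and_clear_msb(unsigned char out[32], const unsigned char *in)
{
    memcpy(out, in, 32);        /* `out` and `in` are already pointers: no `&` needed */
    out[31] &= UCHAR_MAX >> 1;  /* mask is 0x7F when CHAR_BIT is 8 */
}
```

Masking index 0 instead would clear the top bit of the least significant byte, which is a different bit of the encoded value.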
-
UkoeHB
> memcpy(&ed25519_pk_copy, &ed25519_pk, 32);
-
UkoeHB
address-of an array pointer? don't think this is right
-
jberman[m]
ah true, going too fast
-
jberman[m]
I see how I can check this against libsodium too. i'll figure it out
-
jberman[m]
you were right on both, fixed:
paste.debian.net/1229133
-
jberman[m]
speed-up result is same
-
bigslim[m]
hi guys
-
bigslim[m]
Question for hyc or anyone that could answer regarding
monero-project/monero #4694
-
bigslim[m]
I am unable to find any usage information for the tool. I see outputs everywhere online but cannot seem to find any information on actually running it, on the flags needed to run specific outputs only, or on whether it exports full data from block 1, etc.
-
bigslim[m]
nm, think I figured it out. was able to export data from utility to excel
-
hyc
that's what --help is for
-
bigslim[m]
lol I did not even think about that
-
bigslim[m]
you smart man you
-
jberman[m]
knaccc: When including the view tag check, I observed a 55% speed-up going from `ed25519 scalar mult -> view tag check` to the faster `ed25519->x25519 -> view tag check -> if match, ed25519 scalar mult`:
paste.debian.net/1229134
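-
[editor's note]
A rough sketch of the "check the view tag first, do the expensive work only on a match" flow described above. Everything here is an illustrative assumption: `toy_view_tag` is a hypothetical FNV-1a stand-in for the real view tag hash, and a single shared secret is reused across output indices purely to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real view tag hash: FNV-1a over the shared
 * secret and the output index, truncated to one byte. */
static uint8_t toy_view_tag(const uint8_t secret[32], uint32_t output_index)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < 32; i++) {
        h = (h ^ secret[i]) * 16777619u;
    }
    for (int i = 0; i < 4; i++) {
        h = (h ^ ((output_index >> (8 * i)) & 0xFFu)) * 16777619u;
    }
    return (uint8_t)(h & 0xFFu);
}

/* Count the outputs that survive the 1-byte filter. Only these would go on
 * to the expensive follow-up step (the ed25519 scalar mult in the chat). */
static size_t count_tag_matches(const uint8_t secret[32],
                                const uint8_t *on_chain_tags, size_t n)
{
    size_t matches = 0;
    for (size_t i = 0; i < n; i++) {
        if (toy_view_tag(secret, (uint32_t)i) != on_chain_tags[i]) {
            continue;  /* cheap reject: skip the expensive work */
        }
        matches++;     /* the expensive work would happen here */
    }
    return matches;
}
```

For outputs that do not belong to the wallet, a 1-byte tag should match by chance about 1 time in 256, which is where the savings come from.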
-
jberman[m]
But there's one aspect to the results that doesn't make sense to me. It seems the view tag check slows the test down by ~5% across both types, rather than a fixed ~10ms as I would have expected. Something interesting going on in there
-
jberman[m]
Also, here's time to hash view tag per 10,000 view tags (keccak, siphash 2-4, blake2, blake3):
github.com/j-berman/monero/blob/b48…e_tests/derive_view_tag.h#L217-L236
-
jberman[m]
here's a slightly cleaner view tag check 55% speed-up paste:
paste.debian.net/1229139
-
jberman[m]
also realized I left out `curve25519 scalar mult -> view tag check`. Here it is on github:
github.com/j-berman/monero/blob/cur…ance_tests/curve25519.h#L1171-L1198
-
jberman[m]
tevador reporting that the speed-up in assembly implementations on the conversion step is not as significant:
gist.github.com/tevador/50160d160d2…ment_id=4048951#gistcomment-4048951
-
knaccc
jberman great work, thanks! it'll be a huge disappointment if it turns out that the fastest ed25519 lib available is only 12% slower than x25519. that doesn't sound like it might be a compelling-enough improvement
-
knaccc
jberman looks like sandy2x isn't all that much faster than amd64-64 :(
bench.cr.yp.to/impl-scalarmult/curve25519.html
-
knaccc
jberman so then i guess the problem is that libsodium does not have the fastest ed25519 implementation?
-
jberman[m]
ya I'm finishing up swapping out for Monero functions now and the results seem in line with tevador's conclusion :(
-
knaccc
jberman :'(
-
jberman[m]
it was a fun day
-
knaccc
jberman i wonder then if the idea of using two different curves depending on whether you're signing or doing ecdh is just rooted in historical speed differences, and if everyone would have just stuck to ed25519 if they could have foreseen these performance improvements...
-
jberman[m]
hmm, you're saying as in the people who in the past decided to implement the ed25519->curve conversion in their systems may not have done so had they foreseen these improvements in ed25519? are these new improvements in the grand scheme of things?
-
knaccc
well i wonder if they would have seen a 12% improvement as worth the complexity of dealing with two different curves
-
knaccc
which are not interchangeable
-
knaccc
not fully interchangeable i mean
-
knaccc
i think the argument can still be made that the world's best efforts will always be on improving varbasescalarmults for x25519 and not ed25519
-
knaccc
so we can make a strategic choice to switch to x25519 in expectation of taking advantage of those advancements
-
knaccc
e.g. the person that started writing GPU code to speed up scalarmults did it for x25519 and not ed25519
-
jberman[m]
here's an article from someone briefly talking about the decision to use both:
words.filippo.io/using-ed25519-keys-for-encryption
-
jberman[m]
doesn't really say much though haha, they just wanted to reuse their x25519 code and needed ed25519 for signing
-
knaccc
that's an example of all of the optimization effort going into x25519
-
knaccc
and it'd be great to not have to constantly figure out ourselves how to backport those improvements into the ed25519 version
-
knaccc
with the possibility of error that would bring
-
jberman[m]
knaccc: seems reasonable to relatively novice me
-
knaccc
ooh
-
knaccc
here is an interesting point:
-
knaccc
i'm not sure, but i think x25519 scalarmults are constant time, and ed25519 scalar mults aren't
-
knaccc
i'm not sure about the latter
-
knaccc
but that's an interesting security consideration to avoid leaking the private view key
-
knaccc
maybe it's possible to make ed25519 scalarmults happen in constant time
-
knaccc
but that would be another example of us constantly putting the ed25519 round peg into the x25519 square hole
-
knaccc
so maybe the strategic move would be to use x25519 even if it were 12% *slower* than ed25519!
-
knaccc
just for ecdh stuff of course. not elsewhere
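-
[editor's note]
For context on the constant-time point: the Montgomery ladder described for X25519 in RFC 7748 chooses between its two working points with a conditional swap instead of a secret-dependent branch. The sketch below shows that one building block only, with a hypothetical name (`ct_cswap`) and limb layout. It says nothing about whether any particular ed25519 or x25519 library is constant time.

```c
#include <stddef.h>
#include <stdint.h>

/* Conditional swap without branching on the secret bit:
 * swaps a and b when swap == 1, leaves them untouched when swap == 0. */
static void ct_cswap(uint32_t *a, uint32_t *b, size_t n, uint32_t swap)
{
    uint32_t mask = (uint32_t)0 - (swap & 1u);  /* 0x00000000 or 0xFFFFFFFF */
    for (size_t i = 0; i < n; i++) {
        uint32_t t = mask & (a[i] ^ b[i]);
        a[i] ^= t;
        b[i] ^= t;
    }
}
```

The same memory is read and written whichever way the bit goes, though a compiler is still free to transform code like this, so real libraries take extra care.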
-
jberman[m]
what do you think the logic was in choosing to use ed25519 from the start? x25519 was around back then too and it seems like it's all the rage all over the place for key exchange
-
knaccc
jberman ed25519 is faster for signature verification, which speeds other things up
-
knaccc
schnorr signatures involve fixed base scalarmult added to a variable base scalar mult, and ed25519 does that all at once faster than curve25519
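-
[editor's note]
To illustrate the shape of that check, here is a toy Schnorr verification in a tiny multiplicative group (p = 23, q = 11, g = 2) rather than on ed25519. The parameters, the function names, and passing the challenge `c` in directly (instead of hashing) are all assumptions made for brevity. The point is only that verification combines a fixed-base term (`g^s`) with a variable-base term (`y^(-c)`).

```c
#include <stdint.h>

enum { TOY_P = 23, TOY_Q = 11, TOY_G = 2 };  /* g = 2 has order 11 mod 23 */

/* Modular exponentiation by square-and-multiply (toy sizes only). */
static uint32_t powmod(uint32_t base, uint32_t exp, uint32_t mod)
{
    uint64_t result = 1, b = base % mod;
    while (exp > 0) {
        if (exp & 1u) {
            result = (result * b) % mod;
        }
        b = (b * b) % mod;
        exp >>= 1;
    }
    return (uint32_t)result;
}

/* Accept if g^s * y^(-c) mod p equals the commitment r.
 * y is the public key g^x; s = k + c*x mod q for nonce k. */
static int toy_schnorr_verify(uint32_t y, uint32_t r, uint32_t c, uint32_t s)
{
    uint32_t fixed = powmod(TOY_G, s % TOY_Q, TOY_P);                  /* fixed base */
    uint32_t variable = powmod(y, (TOY_Q - c % TOY_Q) % TOY_Q, TOY_P); /* variable base */
    return (fixed * variable) % TOY_P == r % TOY_P;
}
```

With secret x = 3 and nonce k = 5, the public key is y = 8, the commitment is r = 9, and a challenge c = 4 gives s = 6, which this check accepts.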
-
jberman[m]
so essentially the tradeoff in the decision was nodes can verify faster but wallets could take longer to scan
-
knaccc
jberman yes it looks to me that way. the ring signatures are schnorr-based, and there are a ton of ec ops going on there
-
jberman[m]
seems perfectly sensible to me. gonna head to bed. I think it's a funny coincidence the article ends saying 10% improvements are worth the implementation effort lol. from what I can gather it seems there's a solid case brewing for x25519
-
jberman[m]
this file's the latest I've got and seems to generally line up with tevador's conclusion:
github.com/j-berman/monero/blob/cur…ests/performance_tests/curve25519.h
-
knaccc
jberman i'm a little confused - ed25519 mult including view tag check is twice as fast as just an ed25519 mult?
-
jberman[m]
In this version I added the equivalent check to get the output's public key in the final step. In the view tag check, it only needs to do that 1/256 times. That check is an extra ed25519 scalar mult base and addition
-
jberman[m]
Now that I think about it doesn't make sense to include that in the curve25519 version of the test. 1 sec I'll just remove that check
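-
[editor's note]
The "twice as fast" observation follows from simple expected-cost arithmetic. A sketch, assuming view tags on unowned outputs are uniformly distributed so the follow-up check runs about 1 time in 256. The function names and the arbitrary time units are illustrative, not measurements from the benchmark.

```c
/* Per-output cost when the output public key check always runs. */
static double cost_without_tag(double t_derive, double t_check)
{
    return t_derive + t_check;
}

/* Expected per-output cost when the check only runs on a view tag match. */
static double expected_cost_with_tag(double t_derive, double t_tag, double t_check)
{
    return t_derive + t_tag + t_check / 256.0;
}
```

If the check costs roughly as much as the derivation itself, the first function returns about twice the derivation cost while the second stays close to it.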
-
jberman[m]
UkoeHB: hm, I see what you're saying. `ed25519_pk_copy` is already an address location pointing to the start of the char array, so no need to pass the address of the address. Interesting the result on my machine is the same either way there, but adding the ref to the pointer `ed25519_pk` param changes the result
-
UkoeHB
did some reading, c_arr decays to &c_arr[0] when passed to a function, which is the same address as &c_arr
-
UkoeHB
so it is equivalent (just unneeded syntax)
-
jberman[m]
ah, got it
-
UkoeHB
the ed25519_pk is already an address, it isn't a c array
-
UkoeHB
a pointer to the first element *
-
jberman[m]
right
-
knaccc
btw on stackexchange someone pointed out that libsodium uses crypto_core_hchacha20 on the ecdh shared secret for the famous libsodium secretbox
-
knaccc
and that's lightning fast apparently
-
knaccc
although i'm not sure about the security guarantees where you have the possibility of two IKMs that differ by only a byte (i.e. the concatenated output index varint)
-
knaccc
so i'd not sleep well at night unless someone really fully understands the implications of using hchacha20
-
» moneromooo makes bank by selling knaccc bulk sleeping pills
-
knaccc
lol let's not tempt fate that a bottle of sleeping pills may become the MRL mascot :)
-
jberman[m]
lmao ok the latest file removes the output pub key check and the unneeded syntax, back to sleep for me
-
jberman[m]
(i could prolly use some of those too)
-
knaccc
:)
-
jberman[m]
UkoeHB: when I try to do something like this `&ed25519_pk_copy == ed25519_pk`, I get this compiler error: `error: comparison between distinct pointer types 'unsigned char (*)[32]' and 'unsigned char*' lacks a cast`. which seems to suggest to me that both actually have the same value (the address of the start of the array), but the former is just a weird and unnecessary way of referencing that value. I don't see what advantage the former type has over the latter
-
UkoeHB
just a weird C thing
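-
[editor's note]
A small sketch of the array/pointer distinction discussed above, with hypothetical function names. For a real array, `arr` (after decay) and `&arr` hold the same address but have different types. For a pointer parameter, `&pk` is the address of the pointer variable itself, which would explain why adding `&` to the `ed25519_pk` parameter changed the result.

```c
#include <stddef.h>

/* For a true array, the decayed pointer and &arr share an address;
 * only the pointed-to type (1 byte vs. the whole 32-byte array) differs. */
static int array_addresses_match(void)
{
    unsigned char arr[32] = {0};
    unsigned char *first = arr;         /* decays to &arr[0] */
    unsigned char (*whole)[32] = &arr;  /* pointer to the entire array */
    return (void *)first == (void *)whole
        && sizeof(*first) == 1
        && sizeof(*whole) == 32;
}

/* A parameter declared as a pointer is a separate variable: &pk is where
 * the pointer lives, not where the key bytes live. */
static int param_address_differs(const unsigned char *pk)
{
    return (const void *)&pk != (const void *)pk;
}
```

Passing `&arr` to `memcpy` works because it takes `void *`, but comparing the two pointer types directly is what triggers the "distinct pointer types" error.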
-
jberman[m]
here are results of the most apple-to-apples comparison I think is possible (ref10 implementations):
paste.debian.net/1229227
-
knaccc
whoa this nano vanity address generator does 2 million curve25519 scalarmultbases per second on a GPU!
github.com/PlasmaPower/nano-vanity
-
jberman[m]
> Intel GPUs are not supported, as in most cases running the code on the integrated GPU is no faster than running it on the CPU.
-
jberman[m]
related note: this would seem to suggest integrated GPUs on commodity hardware may not be particularly useful for verifying the chain