#monero-dev

00:56

jberman[m]

Just got caught up with above. It was a mistake on my end to use a db read txn guard in `on_getblocktemplate` which I take responsibility for. In 7936/37, I specifically sought to only use a db read txn guards in RPC functions where it would guarantee thread safety (i.e. RPC functions which strictly read state from the db in its lower levels), which should have excluded `on_getblocktemplate` since `Blockchain::create_block_template` reads
00:56

jberman[m]

state from memory. I didn't grab the locks blindly. I was careful to try and place them as late as needed to achieve thread safe RPC calls, but made a mistake adding it to this function
00:56

jberman[m]

AFAICT the proposed fix doesn't guarantee a thread safe response to the RPC caller because the db can update in between these 2 lines: github.com/monero-project/monero/bl…rpc/core_rpc_server.cpp#L1756-L1776
00:57

jberman[m]

Personally I just would have removed the db read txn guard from `on_getblocktemplate` and left it at that for now until a better locking approach was found. But I don't see an issue with adding a read txn guard in 8383; seems like it can only help achieve a higher level of thread safety in the response to `create_block_template`
00:57

jberman[m]

A larger structural change to enable thread safe RPC calls is probably warranted I think. But I don't think the argument to revert 7936/37 is particularly sensible given what's known at this point -- I don't see how the cost definitively outweighs the benefit, but I'm open to more thoughts on it
05:44

sech1

jberman[m] line 1776 gets blocks id for the next seed height, so if can change only if a reorg deep enough happened. The field itself changes only once every 2048 blocks. Yes, it can happen, but it's very rare. Also, next_seed_hash is not used by anything at the moment. It was supposed to be used to precompute the next RandomX dataset and all miner
05:44

sech1

implementation just update the current dataset when seed_hash changes.
05:46

sech1

Yes, it can return incosistent data under rare conditions (at most every 2048 blocks, and only if a reorg happens at this exact moment), but next_seed_height changes 64 blocks before it's actually used, so worst case when it's wrong, miner who uses it (no one does) will calculate RandomX dataset one more time
05:46

sech1

So I'd give it a very low priority to fix. But the fix would be moving it all inside create_block_template
05:47

sech1

and adding next_seed_height, next_seed_hash parameters to it
05:54

sech1

add a comment to the code there in 8383
05:56

sech1

*added a comment
11:32

ooo123ooo1234567

expected: notification -> rpc -> new data with correct order of events; before 7937: notification -> rpc -> new data, but with race conditions; after 7937: notification -> rpc -> old data, but without race conditions
11:35

ooo123ooo1234567

<jberman[m]> "Personally I just would have..." <- it isn't only about thread safety, many things at once are required
11:36

ooo123ooo1234567

<jberman[m]> "A larger structural change to..." <- 7936/7937 claimed to achieve it, but it didn't and it should be reverted and/or replaced with better one
11:37

ooo123ooo1234567

the problem is that negative side effect from these changes is visible only for those who tries to rely on this api
11:38

ooo123ooo1234567

in the worst case it would be like: emergency notification -> rpc -> everything is ok -> very important notification was ignored by external api user
11:38

ooo123ooo1234567

it isn't ok
11:38

ooo123ooo1234567

* api user due old data returned in response
12:13

ooo123ooo1234567

<sech1> "add a comment to the code..." <- "// TODO: move next_height, next_seed_hash under the protection of locks in Blockchain::create_block_template to guarantee consistent data" TODO: revert 7937, close 8383, optionally add another PR
12:13

jberman[m]

I do see that argument. It would be better to receive inconsistent data in a response on some occasions (for now), than to receive all stale data after notification that data has updated, which still seems possible in other RPC endpoints that have the db read txn guard aside from this one
12:15

jberman[m]

The former seems pretty bad too. Hard to rationalize which is "better". But I think a good argument can be made that it's best to leave behavior as it's been than to introduce a separate issue
12:15

ooo123ooo1234567

jberman[m]: it's possible to write PR that adds more methods within core in order to achieve both: correct order of events without race conditions
12:16

ooo123ooo1234567

I wouldn't approve 7936/7937/8383 during review, but it was approved and now can't be even reverted
12:16

jberman[m]

Ya I see that, that will just take a bit of time
12:16

ooo123ooo1234567

jberman[m]: no time - revert, enough time - revert and/or another PR
12:19

jberman[m]

I'll submit a PR that reverts the changes of 7937 on master (7936 was never merged). I don't see an issue with 8383 regardless, do you?
12:20

jberman[m]

ooo123ooo1234567: And a complete PR for this in the future would be nice
12:20

ooo123ooo1234567

jberman[m]: It's useless placebo: placing db lock after txpool and blockchain locks
12:22

ooo123ooo1234567

if there is some inconsistency within Blockchain class, then identify it and fix with internal locks before they are removed, but not add lmdb read txn blindly
12:23

ooo123ooo1234567

* internal locks (std::mutex, not lmdb read txn) before they
12:25

jberman[m]

Eh, ya fair enough. sech1 I think he makes good arguments
12:25

jberman[m]

worth reading in the logs
12:29

sech1

I only care about get_block_template working properly.
12:30

jberman[m]

reverting 7937 should do that. If you remove the db read txn guard entirely, it works as you expect, right?
12:34

sech1

yes
12:35

sech1

m_tx_pool lock protects it from running in parallel with reorg or a new block
12:37

moneromooo

BTW, if anyone didn't see it already, the comment near the top of prepare_handle_incoming_blocks is very relevant.
12:41

sech1

comment about the order of locks? BTW, do we ever run monerod under thread sanitizer?
12:44

moneromooo

Yes. And I did once and I was none the wiser.
12:45

moneromooo

Lots and lots of messages which didn't seem to be obviously wrong, so I decided to not waste my time with it.
12:45

moneromooo

Someone who's familiar with tsan may have more luck working with it on monero.
12:46

sech1

I managed to make sense of tsan messages in p2pool (with a bit of filtering), and fixed them
12:46

sech1

But I'm scared of millions of warnings it will spit out when tested in monerod :D
13:06

MeowingCat

wallet2 is starting to get blocks only when i do wallet->createTransaction()
13:06

MeowingCat

what does this mean??
13:16

ooo123ooo1234567

<moneromooo> "Someone who's familiar with tsan..." <- uwsg.indiana.edu/hypermail/linux/kernel/0106.2/0405.html, "A Computer is a state machine. Threads are for people who can't program state machines." still correct: taking one more dependency into consideration vs just use one more thread (works until problems with synchronization) vs just use new more memory (works until problems with out-of-memory), not correct:
13:16

ooo123ooo1234567

welcome vulnerabilities
13:17

MeowingCat

ohh i found wallet->startRefresh()
13:25

MeowingCat

can anyone say me does wallet2 saves current state of blockchain synchronization?????
13:25

MeowingCat

im doing wallet->store() and it is storing something in a .key file buttt i need to know does it save current state???
14:04

MeowingCat

[1655126920] libunbound[41972:0] error: can't bind socket: Permission denied. for 0.0.0.0
14:05

MeowingCat

i got this while sending transaction
14:05

MeowingCat

this doesn't sound gooood if wallet2 needs to bind sockets for mobile networks
14:06

MeowingCat

because damn mobile networks are working behind damn NATs
14:07

hyc

note that LMDB read txns do nothing for thread safety. LMDB reads are lockless and never block.
14:08

hyc

a read txn's only purpose is for data consistency: all reads performed within the same read txn see data consistent with the moment the read txn began.
14:08

moneromooo

You can disable dns for the wallet. It's really unused now, it was a failsafe at a time it was needed.
14:09

hyc

if you had an unsafe lock ordering before, sprinkling LMDB readtxns into the code will not affect it, it will still be unsafe.
14:12

hyc

you could also consider using valgrind/helgrind to look for data races and such. of course valgrind runs like 100x slower than realtime
14:22

ooo123ooo1234567

<jberman[m]> "I'll submit a PR that reverts..." <- `git revert refs/pull/7937/head` no conflicts, pretty straightforward
14:26

UkoeHB

hyc: if a read txn is started while a different read is still active, will the second one have a newer view on the db?
14:27

ooo123ooo1234567

UkoeHB: yes, since events are: read txn -> write txn -> read txn
14:27

ooo123ooo1234567

otherwise db isn't consistent
14:38

hyc

a readtxn has the view of the DB that was in effect when the readtxn began
14:38

hyc

it doesn't matter what any other readtxn is present
14:38

hyc

or doing or not
14:45

ooo123ooo1234567

s/refs/pull/7937/head/50410d1f7d04bf60053f2263410c39e81d3ddad1/
14:46

hyc

as far as consistency goes, there's nothing wrong with readtxn -> writetxn [readtxn continues to return old data]
14:46

hyc

from a correctness standpoint, every op within the readtxn occurred before the writetxn ever happened.
14:47

ooo123ooo1234567

s/refs/pull/7937/head/-m 1 50410d1f7d04bf60053f2263410c39e81d3ddad1/
14:48

UkoeHB

I see, interesting
14:48

hyc

the only thing that matters as far as clinet-visible correctness is that any single request from a client invokes only a single DB txn, total.
14:49

hyc

if you're opening and closing many DB txns to process a single client request, then all consistency is lost
15:56

ooo123ooo1234567

teddit.net/r/Monero/comments/vb7fmh…_is_the_time_to_decentralize_monero, right place to discuss changes in monero without reading code and critical thinking; the most upvoted comment suggests to use private key signing to force p2pool; facepalm
17:30

ooo123ooo1234567

monero-project/monero #8384, approval msg doesn't contain any conclusion about PR correctness and revert should be merge commit 50410d1f7d04bf60053f2263410c39e81d3ddad1, since whole PR isn't ok
17:31

ooo123ooo1234567

* should be for merge commit
17:33

ooo123ooo1234567

"A stronger long term fix to ensure consistent db reads for RPC functions that will still return the latest data is preferred. See one such proposed solution here." the purpose is to have consistent api (rpc / zmq / ... / notifications), not just db reads, since currently db doesn't store full state of daemon
17:34

ooo123ooo1234567

actually understanding of the right goal would help to not do that mistake
17:50

jberman[m]

sure, updated
17:52

ooo123ooo1234567

jberman[m]: wow
17:54

jberman[m]

what
17:55

ooo123ooo1234567

jberman[m]: I mean unexpected, no questions
18:04

ooo123ooo1234567

monero-project/monero #8384#issuecomment-1154219150, conclusion should be placed into approval; if in doubt what to choose for conclusion, here are some suggestions: correct / may be correct / might be correct / certainly improvement / useful change / better than nothing / worse than nothing
18:07

ooo123ooo1234567

In theory, there could 1 million of random comments, but the ones in approval should be non-empty and not random
18:08

jeffro256[m]

I wasn't specifically approving though, just confirming that the PR does what it says it does
18:09

jeffro256[m]

Because deciding whether reverting 7937 is the best course of action is entirely a different issue
18:15

ooo123ooo1234567

<jeffro256[m]> "Because deciding whether..." <- then such comment isn't very useful since it doesn't take any responsibility
18:19

ooo123ooo1234567

In theory, sech1 could approve it and it would be enough to add it into merge queue
18:22

ooo123ooo1234567

<jeffro256[m]> "Because deciding whether..." <- how much time would it require to make this decision ? any arguments against it ?
18:23

selsta

there is already support to revert it from people involved
18:23

ooo123ooo1234567

s/it/be/, s/require/needed/
18:24

hyc

so we know for certain that 7937 is worse than the existing behavior?
18:25

ooo123ooo1234567

selsta: it's about review process in general, not about that specific commit. Just a study case for problems with bad review process
18:25

selsta

but if anyone has arguments against it they should be added to the PR, or here in the chat
18:28

jeffro256[m]

> > <@jeffro256:monero.social> Because deciding whether reverting 7937 is the best course of action is entirely a different issue
18:28

jeffro256[m]

>
18:28

jeffro256[m]

> then such comment isn't very useful since it doesn't take any responsibility
18:28

jeffro256[m]

Okay? I wasn't meaning to take responsibility, just stating that the PR reverts 7937. I'm also not opposing it, because I'm still looking at it
18:28

jeffro256[m]

It's useful to have multiple independent parties review PRs, even seemingly straightforward ones such as these
18:31

ooo123ooo1234567

jeffro256[m]: straightforward - approve it (with conclusion), not straightforward - add some arguments against
18:38

ooo123ooo1234567

<jeffro256[m]> "It's useful to have multiple..." <- many partial reviews can't replace 1 end-to-end, especially in cryptocurrency project where all details are important
18:40

jeffro256[m]

The changes themselves are straightforward (reverting 7937), the implications and long-term goal is less so (keeping it vs not keeping it, working on preserving order + integrity with regards to data, approving/introducing competing changes, etc). I was stating the fact the the changes themselves are correct (i.e. that PR 8384 reverts PR 7937) but was not yet making any statement as whether or not I generally approve of the concept
18:40

jeffro256[m]

(hint: I do thus far but I am taking a slightly longer look at it). Reviews often take place over multiple steps. I never said I was done reviewing but you seemed determined to get me to push a "end-to-end" review through as fast as possible, which is usually not the correct course of action
18:42

ooo123ooo1234567

jeffro256[m]: try to disconnect personality from technical problem; I've just noticed the message that repeats conclusion of another approval and suggested to put it into approval.
18:45

jeffro256[m]

"... and suggested to put it into approval." That's the thing: ***I don't approve yet***
18:45

jeffro256[m]

***yet***. That comment was meant to be taking at face value: it reverts the PR, nothing more, nothing less
18:54

rbrunner

I wonder how to interpret your remark "I second that this PR correctly reverts #7937" but no approval from you on that PR right now differently.
19:00

jeffro256[m]

Thank you. I did rephrase it to hopefully make it more clear
19:23

jeffro256[m]

monero-project/monero #8317#issuecomment-1120474475
19:24

jeffro256[m]

Would anyone be willing to run this PR and see if you also get the same deadlock problem?
19:28

MeowingCat

when i pass a filename to path in Monero::WalletManager::recoveryWallet(path, ...) public address is being a different and wrong address with 111111111111s
19:28

MeowingCat

when i pass empty string it is true
19:28

MeowingCat

as wallet path
19:34

selsta

MeowingCat: github.com/monero-project/monero-gu…alletqt/WalletManager.cpp#L145-L155

3 years ago

« a day earlier

a day later »

today »