-
selsta
hyc: i found the issue, randomx tests are failing
-
selsta
[84] Hash test 2a (compiler) ... Assertion failed: (equalsHex(hash, "639183aae1bf4c9a35884cb46b09cad9175f04efd7684e7262a0ac1c2f0b4e3f")), function operator(), file tests.cpp, line 966.
-
selsta
sech1: ^^
-
hyc
selsta I've got macos 12.0.1
-
selsta
i installed 13.0 so that i can test if everything monero related works, and this randomx issue showed up
-
selsta
is there an env var to disable randomx jit?
-
hyc
yes
-
hyc
MONERO_RANDOMX_UMASK
-
selsta
MONERO_RANDOMX_UMASK=8
-
selsta
that's what i found
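
The variable name suggests umask-style semantics: bits set in it are cleared from whatever flags were auto-detected. Below is a minimal sketch under that assumption. The flag value 8 for JIT is taken from the messages above; the function names are hypothetical and this is not Monero's actual code.

```cpp
#include <climits>
#include <cstdlib>

// Assumed from the chat: the bit with value 8 selects the JIT compiler.
constexpr int kFlagJit = 8;

// Parse the mask text; a missing or malformed value masks nothing.
// Base 0 lets strtol accept decimal ("8") as well as hex ("0x8").
inline int parse_umask(const char* text) {
    if (text == nullptr) return 0;
    char* end = nullptr;
    const long value = std::strtol(text, &end, 0);
    if (end == text || value < 0 || value >= INT_MAX) return 0;
    return static_cast<int>(value);
}

// Clear every bit named in the mask from the detected flags.
inline int apply_umask(int detected_flags, const char* umask_text) {
    return detected_flags & ~parse_umask(umask_text);
}

// Hypothetical convenience wrapper that reads the environment variable.
inline int effective_flags(int detected_flags) {
    return apply_umask(detected_flags, std::getenv("MONERO_RANDOMX_UMASK"));
}
```

With a mask of 8, a detected flag set of 15 becomes 7, i.e. everything except the JIT bit stays enabled.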
-
hyc
so jit is broken on macos 13?
-
selsta
yep
-
selsta
it's beta so maybe macOS itself is broken but so far everything else works
-
hyc
try doing make test in the randomx source tree?
-
selsta
that's where i got test 84 failing
-
selsta
i get a different hash every single time i run ./randomx-tests
-
hyc
that's not good....
-
selsta
2a sometimes passes but then it fails at 2b, other times it fails at 2a directly, it's inconsistent
-
hyc
try editing src/virtual_memory.cpp and make sure USE_PTHREAD_JIT_WP stays undefined
-
selsta
same issue
-
hyc
I wonder if CPU cache invalidation isn't happening.
-
hyc
might try explicitly calling sys_icache_invalidate() instead of __builtin___clear_cache
-
hyc
I feel like we've had this conversation before...
-
selsta
checking
-
hyc
then again, gcc should automatically be emitting that when __builtin___clear_cache is used
-
hyc
but still, worth a try
-
hyc
yeah I looked up this same question november 2020
-
hyc
nov 29 2020
-
hyc
xmrig calls it explicitly instead of the __builtin so it's worth a try anyway
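
The two cache-flush calls being compared can be put side by side in a small wrapper. This is a sketch only: the helper names are hypothetical and it is not the code used by RandomX or xmrig. It assumes sys_icache_invalidate() is available from <libkern/OSCacheControl.h> on macOS and falls back to the compiler builtin elsewhere.

```cpp
#include <cstddef>
#include <cstring>

#if defined(__APPLE__)
#include <libkern/OSCacheControl.h>  // declares sys_icache_invalidate
#endif

// Hypothetical helper: make freshly written machine code visible to
// instruction fetch before jumping into it.
inline void flush_instruction_cache(void* start, std::size_t len) {
#if defined(__APPLE__)
    // Explicit OS call, the variant suggested in the chat.
    sys_icache_invalidate(start, len);
#elif defined(__GNUC__) || defined(__clang__)
    // Compiler builtin; it takes the begin and end address of the range.
    char* begin = static_cast<char*>(start);
    __builtin___clear_cache(begin, begin + len);
#else
    (void)start;
    (void)len;
#endif
}

// Hypothetical usage pattern: write code bytes, then flush the same range.
inline std::size_t write_code_and_flush(unsigned char* buf,
                                        const unsigned char* code,
                                        std::size_t len) {
    std::memcpy(buf, code, len);
    flush_instruction_cache(buf, len);
    return len;
}
```

The flush has to cover exactly the bytes that were rewritten and has to happen after the last write and before the first jump into the buffer.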
-
selsta
didn't help
-
hyc
probably will need sech1 to get onto a Mac with OS13 to debug. sounds like an OS bug tho
-
selsta
I'll try to get it set up
-
selsta
i'll also check if xmrig has the same issue
-
hyc
does xmrig --bench=1M return the right hash?
-
hyc
you might need to also specify a --seed
-
hyc
"right hash" == same hash sum each time
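
The "same hash each time" criterion can be expressed as a generic repeat-and-compare check. This is a sketch with hypothetical names; the callable stands in for whatever produces the hash (for example, a wrapper around a benchmark run) and is not part of xmrig or RandomX.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Returns true if `compute` yields the same string on every one of `runs`
// consecutive calls. Zero runs is treated as trivially deterministic.
inline bool is_deterministic(const std::function<std::string()>& compute,
                             std::size_t runs) {
    if (runs == 0) return true;
    const std::string first = compute();
    for (std::size_t i = 1; i < runs; ++i) {
        if (compute() != first) return false;  // output changed between runs
    }
    return true;
}
```

A check like this only shows that the output is stable, not that it is correct; correctness still needs a known reference value such as the one asserted in the RandomX test above.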
-
selsta
fish: Job 1, './xmrig --bench=1M --seed "test"' terminated by signal SIGSEGV (Address boundary error)
-
hyc
got enough RAM?
-
selsta
32GB
-
hyc
no idea what tripped there
-
hyc
hm, bench defaults to seed of all zero, so try without --seed
-
selsta
same
-
selsta
I'm starting to wonder if I should just install the old OS again
-
hyc
and never had this problem on os12?
-
selsta
no
-
hyc
sounds like it
-
selsta
monero wallet sync also crashes
-
selsta
use after free or address boundary error
-
hyc
can't believe a use after free wouldn't be caught on every other os too
-
selsta
monero-wallet-cli(29852,0x16eaaf000) malloc: Incorrect checksum for freed object 0x137464610: probably modified after being freed.
-
selsta
Corrupt value: 0x0
-
hyc
... or a heap overrun
-
hyc
could run wallet sync on x86-64 with valgrind
-
hyc
if it always dies in same place, can compare stack traces
-
selsta
doesn't always die in the same place
-
selsta
it's... super weird
-
hyc
but par for the course... along with the random connection drops and other crap
-
hyc
hard to imagine how they can screw up a BSD based OS so badly
-
selsta
the weird thing is apart from monero and randomx i didn't find any issues yet, web browsers also work fine (i assume they use jit)
-
hyc
hm, probably, yeah
-
selsta
trying to use the binaries generated with depends now
-
selsta
in case it's some compiler error, no idea
-
hyc
hm, the depends build will use the OSX 11.0 SDK and its compiler
-
selsta
ok same
-
hyc
and that binary works on OS 12
-
hyc
so I don't see a compiler bug being likely
-
selsta
so i'll rollback my laptop and try to install macOS 13 on that macmini we have
-
sech1
selsta it really looks like cache invalidation problem. Could you run xmrig under gdb and get a callstack?
-
hyc
i don't think gdb really works on macos. prob need to use lldb
-
tobtoht[m]
<selsta> "we don't use lmdb for the wallet..." <- only ringdb uses lmdb
-
TrasherDK[m]
Is it normal for zmq-pub to miss publishing mem-pool tx ?
-
sech1
That's unusual
-
sech1
It shouldn't miss publishing, but keep in mind that it doesn't dump everything in mempool when you first connect. It only publishes new transactions
-
TrasherDK[m]
On stagenet, sender show txid:6226f5d745bfaa158a70ba7b3e0d8b41e71e40b3cafb9449a0e839a45d3981eb
-
TrasherDK[m]
receiver show both mem-pool: 2022-07-07 15:03:42 0.500000000000 6226f5d745bfaa158a70ba7b3e0d8b41e71e40b3cafb9449a0e839a45d3981eb 0000000000000000 0.000000000000 57Es8x:0.500000000000 0
-
TrasherDK[m]
and receiving: Height 1130735, txid <6226f5d745bfaa158a70ba7b3e0d8b41e71e40b3cafb9449a0e839a45d3981eb>, 0.500000000000, idx 0/0
-
TrasherDK[m]
But nothing received from zmq-pub
-
sech1
did you submit this tx to your node? zmq-pub might skip publishing transactions which are in dandelion stem phase
-
TrasherDK[m]
Yes. My node and both wallets are on the same host.
-
TrasherDK[m]
The Node script is on a different host.
-
sech1
relay_category::legacy, not sure which transactions it is
-
sech1
yes, it matches only for dandelion fluff transactions
-
sech1
so if the node adds a transaction to the mempool during stem phase, it's not sent via zmq-pub
-
sech1
which makes sense because this is an interface for miners (dandelion transactions should be mined only after stem phase has finished)
-
sech1
but it should still publish them after they switch to fluff
-
sech1
but it doesn't and it can be considered a bug
-
TrasherDK[m]
Nah. It never arrived on the sub end. How can I ensure I get all tx's ?
-
sech1
you can remove "matches_category(tx_relay, relay_category::legacy)" from that line if you don't mind rebuilding monerod
-
selsta
sech1: run xmrig benchmark under lldb?
-
TrasherDK[m]
The wallet confirm prompt was probably more than a minute after transfer command
-
sech1
a minute sounds like Dandelion++ delay
-
sech1
selsta if you can, yes
-
selsta
TrasherDK[m]: did you disable dns?
-
TrasherDK[m]
disable-dns-checkpoints=1
-
TrasherDK[m]
enable-dns-blocklist=1
-
selsta
I meant wallet --no-dns
-
TrasherDK[m]
Probably not. Checking...
-
selsta
that was one thing that caused slow tx generation but not sure if it's related to what you are writing
-
TrasherDK[m]
Okay, dns disabled and sessions restarted. Let's see how that goes.
-
TrasherDK[m]
This one never came on zmq-sub: 8fa62730c2c82cc45703471b085c46e771ebc244c98287497857b4bf82685f75
-
TrasherDK[m]
No show: cbd114c486f60f8ad7532a9e0327c9a86a16217799d6e5bb4c9d002131f25d00
-
TrasherDK[m]
No show: 6226f5d745bfaa158a70ba7b3e0d8b41e71e40b3cafb9449a0e839a45d3981eb
-
selsta
does not seem useful
-
sech1
quite useful actually
-
sech1
or not
-
sech1
one thread is in hashAndFillAes1Rx4 but it's not the thread that crashed
-
sech1
it's probably the same cache flushing problem
-
sech1
code cache flushing
-
sech1
it definitely crashed in JIT generated code, in one of RandomX FP instructions
-
kayabanerve[m]
`libcryptonote_basic.a(cryptonote_format_utils.cpp.o): in function cryptonote::get_pruned_transaction_hash: undefined reference to cryptonote::get_transaction_prefix_hash(cryptonote::transaction_prefix const&, crypto::hash&)`
-
kayabanerve[m]
How did I get an undefined reference in a lib to something defined in the same lib?
-
kayabaNerve
I just need bp_prove and hash_to_curve and I've now linked ~11 libs I don't want just trying to get this to compile. I think this is my last error
-
sech1
it can happen if that function is inlined by compiler
-
kayabaNerve
... so how do I get it to *not* be inlined?
-
kayabaNerve
I just ran make. Didn't do any CMAKE config
-
kayabaNerve
Though I am also looking for a CMAKE to disable all optional depends. I have hidapi rn and I do not want it
-
sech1
hmm, actually there's no such function
-
sech1
"cryptonote::transaction_prefix const&, crypto::hash&" with this specific signature
-
sech1
there is one with "(const transaction_prefix& tx, crypto::hash& h, hw::device &hwdev)" signature
-
sech1
ah, it's in cryptonote_format_utils_basic.cpp
-
kayabaNerve
Yep. Which is still in cn_basic AFAICT
-
kayabaNerve
I also have undefined reference *from* device
-
sech1
it's in cryptonote_format_utils_basic library
-
kayabaNerve
... ah
-
sech1
see src/cryptonote_basic/CMakeLists.txt
-
kayabanerve[m]
Thanks :)
-
jberman[m]
I can repro TrasherDK's issue when my daemon is in the fluff epoch, and when stem isn't working. Trasher do you see this error in your daemon: `Unable to send transaction(s) via Dandelion++ stem`?
-
jberman[m]
In fluff: my tx passes through my node with `tx_relay::local` so it doesn't get pushed out over zmq, then `on_transactions_relayed` in `dandelionpp_notify` will default upgrade it to "fluff" without pushing it out over zmq
-
jberman[m]
When stem is working: my tx passes through my node with `tx_relay::stem`, then once my node sees it in the network from another node, the tx gets re-added to my node's tx pool and pushed over zmq because `already_have` in `handle_incoming_txs` is false
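
The two paths described above can be modelled with a toy node. This is a deliberately simplified sketch, not monerod code: the type and function names are hypothetical, and the only rule it encodes is the one stated in the chat, namely that a tx is published when it is added in the fluff state, while an in-place upgrade to fluff publishes nothing.

```cpp
#include <map>
#include <string>
#include <vector>

// Simplified relay states, named after the ones mentioned in the chat.
enum class Relay { local, stem, fluff };

// Toy model of a node with a tx pool and a publish log (stand-in for zmq-pub).
struct ToyNode {
    std::map<std::string, Relay> pool;
    std::vector<std::string> published;

    // Adding a tx publishes it only if it arrives in the fluff state.
    void add_tx(const std::string& id, Relay relay) {
        pool[id] = relay;
        if (relay == Relay::fluff) published.push_back(id);
    }

    // In-place upgrade after relaying: the state changes but nothing is
    // published. This is the suspected gap.
    void mark_relayed_as_fluff(const std::string& id) {
        pool[id] = Relay::fluff;
    }
};
```

In this model a tx added as local and then upgraded never reaches the publish log, while a tx added as stem and later re-added as fluff is published once, which matches the intermittent pattern reported.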
-
TrasherDK[m]
jberman: I have only 2 different log messages:
-
TrasherDK[m]
I background mining is enabled,
-
TrasherDK[m]
I Found block <6a9e00293490fa5d
-
jberman[m]
every time you've ever tried submitting a tx through your own node, you don't see the tx in zmq? It's not a sporadic thing?
-
TrasherDK[m]
It's like 2-3 transfers is OK, 1 is not, then 2-3 OK and so on.
-
TrasherDK[m]
I'm at default log_level, maybe a higher would help?
-
jberman[m]
can you try compiling with `const bool fluffing = false;` and seeing if they come through 100% of the time?
github.com/monero-project/monero/bl…note_protocol/levin_notify.cpp#L701
-
jberman[m]
`const bool fluffing = false;` the txs should always show up in zmq.. `const bool fluffing = true;` the txs should never show up
-
TrasherDK[m]
I'll give it a try.