- 
selsta
hyc: i found the issue, randomx tests are failing
 
- 
selsta
[84] Hash test 2a (compiler)                  ... Assertion failed: (equalsHex(hash, "639183aae1bf4c9a35884cb46b09cad9175f04efd7684e7262a0ac1c2f0b4e3f")), function operator(), file tests.cpp, line 966.
 
- 
selsta
sech1: ^^
 
- 
hyc
selsta I've got macos 12.0.1
 
- 
selsta
i installed 13.0 so that i can test if everything monero related works, and this randomx issue showed up
 
- 
selsta
is there an env var to disable randomx jit?
 
- 
hyc
yes
 
- 
hyc
MONERO_RANDOMX_UMASK
 
- 
selsta
MONERO_RANDOMX_UMASK=8
 
- 
selsta
that's what i found
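
A minimal sketch of how such a mask could be applied, assuming the JIT flag is bit value 8 as the setting above implies. `apply_umask` and `FLAG_HARD_AES` are hypothetical names for illustration, and the parsing shown here is an assumption, not Monero's actual code:

```cpp
#include <cstdlib>

// Assumed flag bits: 8 for JIT matches the value quoted above;
// FLAG_HARD_AES is only here to show that other bits survive the mask.
enum : unsigned {
    FLAG_HARD_AES = 2,
    FLAG_JIT      = 8,
};

// Hypothetical helper: clear every bit named in mask_text from flags.
// A missing or non-numeric mask leaves the flags untouched.
inline unsigned apply_umask(unsigned flags, const char* mask_text) {
    if (mask_text == nullptr) return flags;
    char* end = nullptr;
    const unsigned long mask = std::strtoul(mask_text, &end, 0);  // base 0: accepts 8 or 0x8
    if (end == mask_text) return flags;  // nothing parsed
    return flags & ~static_cast<unsigned>(mask);
}
```

Calling `apply_umask(FLAG_HARD_AES | FLAG_JIT, std::getenv("MONERO_RANDOMX_UMASK"))` with the variable set to 8 would leave only the AES bit set, so the interpreter path is used instead of JIT.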
 
- 
hyc
so jit is broken on macos 13?
 
- 
selsta
yep
 
- 
selsta
it's beta so maybe macOS itself is broken but so far everything else works
 
- 
hyc
try doing make test in the randomx source tree?
 
- 
selsta
that's where i got test 84 failing
 
- 
selsta
i get a different hash every single time i run ./randomx-tests
 
- 
hyc
that's not good....
 
- 
selsta
2a sometimes passes but then it fails at 2b, other times it fails at 2a directly, it's inconsistent
 
- 
hyc
try editing src/virtual_memory.cpp and make sure USE_PTHREAD_JIT_WP stays undefined
 
- 
selsta
same issue
 
- 
hyc
I wonder if CPU cache invalidation isn't happening.
 
- 
hyc
might try explicitly calling sys_icache_invalidate() instead of __builtin___clear_cache
 
- 
hyc
I feel like we've had this conversation before...
 
- 
selsta
checking
 
- 
hyc
then again, gcc should automatically be emitting that when __builtin___clear_cache is used
 
- 
hyc
but still, worth a try
 
- 
hyc
yeah I looked up this same question november 2020
 
- 
hyc
nov 29 2020
 
- 
hyc
xmrig calls it explicitly instead of the __builtin so it's worth a try anyway
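
A sketch of the explicit call being suggested, assuming a wrapper of our own (`flush_instruction_cache` is a hypothetical name, not a RandomX or xmrig function). On Apple platforms it calls `sys_icache_invalidate()` directly; elsewhere it falls back to the compiler builtin:

```cpp
#include <cstddef>
#if defined(__APPLE__)
#include <libkern/OSCacheControl.h>  // declares sys_icache_invalidate()
#endif

// Hypothetical wrapper: make freshly written machine code visible to the
// instruction fetch path before jumping into it.
inline void flush_instruction_cache(void* start, std::size_t size) {
#if defined(__APPLE__)
    sys_icache_invalidate(start, size);
#elif defined(__GNUC__)
    char* begin = static_cast<char*>(start);
    __builtin___clear_cache(begin, begin + size);
#else
    (void)start;  // no known flush primitive on this toolchain
    (void)size;
#endif
}
```

The wrapper would be called on the JIT buffer after the last write and before the first execution. As noted in the chat, the builtin is expected to do the same thing on macOS, so this mainly rules out one variable.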
 
- 
selsta
didn't help
 
- 
hyc
probably will need sech1 to get onto a Mac with OS13 to debug. sounds like an OS bug tho
 
- 
selsta
I'll try to get it set up
 
- 
selsta
i'll also check if xmrig has the same issue
 
- 
hyc
does xmrig --bench=1M return the right hash?
 
- 
hyc
you might need to also specify a --seed
 
- 
hyc
"right hash" == same hash sum each time
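
That definition of "right" can be written as a small reproducibility check. This is an illustrative sketch only: `is_reproducible` is a hypothetical helper, and `compute` stands in for whatever produces the hash (a benchmark run, a test vector, and so on):

```cpp
#include <functional>
#include <string>

// Hypothetical helper: true when `compute` returns the same value on
// every one of `runs` invocations.
inline bool is_reproducible(const std::function<std::string()>& compute, int runs) {
    if (runs <= 0) return true;
    const std::string first = compute();
    for (int i = 1; i < runs; ++i) {
        if (compute() != first) return false;  // any mismatch means non-deterministic output
    }
    return true;
}
```

A hash that changes between runs with identical input, as with ./randomx-tests above, fails this check regardless of what the expected value is.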
 
- 
selsta
fish: Job 1, './xmrig --bench=1M --seed "test"' terminated by signal SIGSEGV (Address boundary error)
 
- 
hyc
got enough RAM?
 
- 
selsta
32GB
 
- 
hyc
no idea what tripped there
 
- 
hyc
hm, bench defaults to seed of all zero, so try without --seed
 
- 
selsta
same
 
- 
selsta
I'm starting to wonder if I should just install the old OS again
 
- 
hyc
and never had this problem on os12?
 
- 
selsta
no
 
- 
hyc
sounds like it
 
- 
selsta
monero wallet sync also crashes
 
- 
selsta
use after free or address boundary error
 
- 
hyc
can't believe a use after free wouldn't be caught on every other os too
 
- 
selsta
monero-wallet-cli(29852,0x16eaaf000) malloc: Incorrect checksum for freed object 0x137464610: probably modified after being freed.
 
- 
selsta
Corrupt value: 0x0
 
- 
hyc
... or a heap overrun
 
- 
hyc
could run wallet sync on x86-64 with valgrind
 
- 
hyc
if it always dies in same place, can compare stack traces
 
- 
selsta
doesn't always die in the same place
 
- 
selsta
it's... super weird
 
- 
hyc
but par for the course... along with the random connection drops and other crap
 
- 
hyc
hard to imagine how they can screw up a BSD based OS so badly
 
- 
selsta
the weird thing is apart from monero and randomx i didn't find any issues yet, web browsers also work fine (i assume they use jit)
 
- 
hyc
hm, probably, yeah
 
- 
selsta
trying to use the binaries generated with depends now
 
- 
selsta
in case it's some compiler error, no idea
 
- 
hyc
hm, the depends build will use the OSX 11.0 SDK and its compiler 
 
- 
selsta
ok same
 
- 
hyc
and that binary works on OS 12
 
- 
hyc
so I don't see a compiler bug being likely
 
- 
selsta
so i'll rollback my laptop and try to install macOS 13 on that macmini we have
 
- 
sech1
selsta it really looks like cache invalidation problem. Could you run xmrig under gdb and get a callstack?
 
- 
hyc
i don't think gdb really works on macos. prob need to use lldb
 
- 
tobtoht[m]
<selsta> "we don't use lmdb for the wallet..." <- only ringdb uses lmdb
 
- 
TrasherDK[m]
Is it normal for zmq-pub to miss publishing mem-pool tx ?
 
- 
sech1
That's unusual
 
- 
sech1
It shouldn't miss publishing, but keep in mind that it doesn't dump everything in mempool when you first connect. It only publishes new transactions
 
- 
TrasherDK[m]
On stagenet, sender show txid:6226f5d745bfaa158a70ba7b3e0d8b41e71e40b3cafb9449a0e839a45d3981eb
 
- 
TrasherDK[m]
receiver show both mem-pool: 2022-07-07 15:03:42       0.500000000000 6226f5d745bfaa158a70ba7b3e0d8b41e71e40b3cafb9449a0e839a45d3981eb 0000000000000000 0.000000000000 57Es8x:0.500000000000 0
 
- 
TrasherDK[m]
and receiving: Height 1130735, txid <6226f5d745bfaa158a70ba7b3e0d8b41e71e40b3cafb9449a0e839a45d3981eb>, 0.500000000000, idx 0/0
 
- 
TrasherDK[m]
But nothing received from zmq-pub
 
- 
sech1
did you submit this tx to your node? zmq-pub might skip publishing transactions which are in dandelion stem phase
 
- 
TrasherDK[m]
Yes. My node and both wallets are on the same host.
 
- 
TrasherDK[m]
The Node script is on a different host.
 
- 
sech1
relay_category::legacy, not sure which transactions it is
 
- 
sech1
yes, it matches only for dandelion fluff transactions
 
- 
sech1
so if the node adds a transaction to the mempool during stem phase, it's not sent via zmq-pub
 
- 
sech1
which makes sense because this is an interface for miners (dandelion transactions should be mined only after stem phase has finished)
 
- 
sech1
but it should still publish them after they switch to fluff
 
- 
sech1
but it doesn't and it can be considered a bug
 
- 
TrasherDK[m]
Nah. It never arrived on the sub end. How can I ensure I get all tx's ?
 
- 
sech1
you can remove "matches_category(tx_relay, relay_category::legacy)" from that line if you don't mind rebuilding monerod
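
A toy model of what removing that condition would change, based only on the behaviour described in this chat (the filter matches fluff transactions and nothing else). `TxRelay` and `should_publish` are hypothetical names, not monerod's types:

```cpp
// Relay states named in this discussion; a simplified stand-in.
enum class TxRelay { local, stem, fluff };

// Hypothetical publish condition evaluated when a tx is added to the pool.
inline bool should_publish(TxRelay method, bool category_filter_enabled) {
    if (!category_filter_enabled) return true;  // filter removed: every pool add is published
    return method == TxRelay::fluff;            // filter kept: only fluff txs match
}
```

With the filter removed, subscribers would also see stem-phase transactions, which is the trade-off mentioned above for an interface meant for miners.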
 
- 
selsta
sech1: run xmrig benchmark under lldb?
 
- 
TrasherDK[m]
The wallet confirm prompt was probably more than a minute after transfer command 
 
- 
sech1
a minute sounds like Dandelion++ delay
 
- 
sech1
selsta if you can, yes
 
- 
selsta
TrasherDK[m]: did you disable dns?
 
- 
TrasherDK[m]
disable-dns-checkpoints=1
 
- 
TrasherDK[m]
enable-dns-blocklist=1
 
- 
selsta
I meant wallet --no-dns
 
- 
TrasherDK[m]
Probably not. Checking...
 
- 
selsta
that was one thing that caused slow tx generation but not sure if it's related to what you are writing
 
- 
TrasherDK[m]
Okay, dns disabled and sessions restarted. Let's see how that goes.
 
- 
TrasherDK[m]
This one never came on zmq-sub: 8fa62730c2c82cc45703471b085c46e771ebc244c98287497857b4bf82685f75
 
- 
TrasherDK[m]
No show: cbd114c486f60f8ad7532a9e0327c9a86a16217799d6e5bb4c9d002131f25d00
 
- 
TrasherDK[m]
No show: 6226f5d745bfaa158a70ba7b3e0d8b41e71e40b3cafb9449a0e839a45d3981eb
 
- 
selsta
does not seem useful
 
- 
sech1
quite useful actually
 
- 
sech1
or not
 
- 
sech1
one thread is in hashAndFillAes1Rx4 but it's not the thread that crashed
 
- 
sech1
it's probably the same cache flushing problem
 
- 
sech1
code cache flushing
 
- 
sech1
it definitely crashed in JIT generated code, in one of RandomX FP instructions
 
- 
kayabanerve[m]
`libcryptonote_basic.a(cryptonote_format_utils.cpp.o): in function cryptonote::get_pruned_transaction_hash: undefined reference to cryptonote::get_transaction_prefix_hash(cryptonote::transaction_prefix const&, crypto::hash&)`
 
- 
kayabanerve[m]
How did I get an undefined reference in a lib to something defined in the same lib?
 
- 
kayabaNerve
I just need bp_prove and hash_to_curve and I've now linked ~11 libs I don't want just trying to get this to compile. I think this is my last error
 
- 
sech1
it can happen if that function is inlined by compiler
 
- 
kayabaNerve
... so how do I get it to *not* be inlined?
 
- 
kayabaNerve
I just ran make. Didn't do any CMake config
 
- 
kayabaNerve
Though I am also looking for a CMake option to disable all optional depends. I have hidapi rn and I do not want it
 
- 
sech1
hmm, actually there's no such function
 
- 
sech1
"cryptonote::transaction_prefix const&, crypto::hash&" with this specific signature
 
- 
sech1
there is one with "(const transaction_prefix& tx, crypto::hash& h, hw::device &hwdev)" signature
 
- 
sech1
ah, it's in cryptonote_format_utils_basic.cpp
 
- 
kayabaNerve
Yep. Which is still in cn_basic AFAICT
 
- 
kayabaNerve
I also have undefined reference *from* device
 
- 
sech1
it's in cryptonote_format_utils_basic library
 
- 
kayabaNerve
... ah
 
- 
sech1
see src/cryptonote_basic/CMakeLists.txt
 
- 
kayabanerve[m]
Thanks :)
 
- 
jberman[m]
I can repro TrasherDK's issue when my daemon is in the fluff epoch, and when stem isn't working. Trasher do you see this error in your daemon: `Unable to send transaction(s) via Dandelion++ stem`?
 
- 
jberman[m]
In fluff: my tx passes through my node with `tx_relay::local` so it doesn't get pushed out over zmq, then `on_transactions_relayed` in `dandelionpp_notify` will default upgrade it to "fluff" without pushing it out over zmq
 
- 
jberman[m]
When stem is working: my tx passes through my node with `tx_relay::stem`, then once my node sees it in the network from another node, the tx gets re-added to my node's tx pool and pushed over zmq because `already_have` in `handle_incoming_txs` is false
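
The two paths described above can be sketched as a toy simulation. Everything here (`ToyNode`, `Relay`, `submit`, `incoming`) is a hypothetical simplification of the described behaviour, not the daemon's actual control flow:

```cpp
#include <map>
#include <string>
#include <vector>

enum class Relay { local, stem, fluff };

struct ToyNode {
    std::map<std::string, Relay> pool;   // txid -> current relay state
    std::vector<std::string> published;  // txids pushed to zmq subscribers

    // A wallet submits a tx to this node. Neither branch publishes.
    void submit(const std::string& txid, bool fluff_epoch) {
        if (fluff_epoch) {
            pool[txid] = Relay::local;
            pool[txid] = Relay::fluff;   // silent upgrade after relay, no zmq push
        } else {
            pool[txid] = Relay::stem;
        }
    }

    // The same tx later arrives from a peer.
    void incoming(const std::string& txid) {
        const auto it = pool.find(txid);
        const bool already_have = it != pool.end() && it->second == Relay::fluff;
        if (already_have) return;        // nothing re-added, nothing published
        pool[txid] = Relay::fluff;       // re-added as a public tx
        published.push_back(txid);       // this is the only place a push happens
    }
};
```

In this model a tx submitted during a fluff epoch is never published, while a tx submitted during a stem epoch is published once it comes back from the network, which matches the intermittent pattern reported.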
 
- 
TrasherDK[m]
jberman: I have only 2 different log messages:
 
- 
TrasherDK[m]
 I background mining is enabled,
 
- 
TrasherDK[m]
 I Found block <6a9e00293490fa5d
 
- 
jberman[m]
every time you've ever tried submitting a tx through your own node, you don't see the tx in zmq? It's not a sporadic thing?
 
- 
TrasherDK[m]
It's like 2-3 transfers is OK, 1 is not, then 2-3 OK and so on.
 
- 
TrasherDK[m]
I'm at default log_level, maybe a higher level would help?
 
- 
jberman[m]
can you try compiling with `const bool fluffing = false;` and seeing if they come through 100% of the time? 
github.com/monero-project/monero/bl…note_protocol/levin_notify.cpp#L701 
 
- 
jberman[m]
`const bool fluffing = false;` the txs should always show up in zmq.. `const bool fluffing = true;` the txs should never show up
 
- 
TrasherDK[m]
I'll give it a try.