-
br-m
<kayabanerve:matrix.org> FROSTLASS was proven by CS, and the FCMP++ upgrade provides a near-traditional Schnorr signature from a multisig PoV.
-
br-m
<kayabanerve:matrix.org> So focusing on FCMP++ reviews could include the new multisig, but the CLSAG multisig already has a formally proven option available.
-
br-m
<jpk68:matrix.org> Would be nice to have multisig reworked in the core implementation. FROSTLASS seems really cool but is way beyond my understanding :((
-
br-m
<jberman> I spent the day with Claude Opus 4.8 on max effort digging for an inflation vulnerability in Monero's current consensus code like sech1 did. From Claude:
-
br-m
<jberman> > After an exhaustive review of the consensus cryptography on release-v0.18 — the underlying papers' mathematics (proofs, definitions, assumptions, and concrete security bounds), the verification code reachable from handle_block_to_main_chain, and 3,000+ executed known‑answer tests cross‑checked against an independen [... too long, see
mrelay.p2pool.observer/e/9IuI1ooLbkhvZjlt ]
-
br-m
<jberman> I'll do the same for FCMP++ after the audits are all complete, and I'll also improve the framework and iterate to continue auditing Monero's code (obviously, prioritizing work on FCMP++)
-
br-m
<jberman> What I did: fed the LLM the CLSAG and BP+ papers and audits (like sech1), the blog post explaining Monero's past detectable inflation bug from 2017, and (even though not directly relevant to Monero's code) Taylor's writeup explaining Zcash's recent hidden inflation vulnerability too.
-
br-m
<jberman> I had it look for mathematical flaws in the CLSAG and BP+ papers (since that has happened in the past e.g. Zcash's other hidden inflation vulnerability in Sprout). Then I had it go through Monero's code starting from handle_block_to_main_chain, and dig deep into every section, with emphasis on crypto functions in src/crypto [... too long, see
mrelay.p2pool.observer/e/xqaO1ooLMGJWNHJi ]
-
sech1
I'm running it right now too :)
-
sech1
I fed it all the Monero audit PDFs, and asked it to find inflation/double-spend bugs in src/ringct and other related files (like files in cryptonote_core)
-
br-m
<jberman> I had it write that summary also^
-
sech1
jberman did you run Claude Code on the Monero repository folder? It's much more efficient when Claude can look around all files and run commands
-
br-m
<jberman> yep, and I downloaded those PDFs into the local repo and had it read them from the local copy
-
br-m
<jberman> I ran it on release-v0.18
-
sech1
same
-
sech1
pdfs in the local folder
-
sech1
looks like I'll run out of the 5-hour session limit before it finishes...
-
sech1
yeah, looks like your audit is much more in-depth than what I'm doing anyway jberman
-
sech1
My prompts were quite simple
-
br-m
<jberman> I'll share my session in a sec so you can see my prompts too
-
br-m
<ofrnxmr:xmr.mx> are you guys whitelisted?
-
sech1
I'm not whitelisted
-
br-m
<jberman> I'm not either
-
br-m
<ofrnxmr:xmr.mx> allegedly it's easy to get whitelisted
-
br-m
<ofrnxmr:xmr.mx> easy for real devs/researchers*. the zcash guy didn't have to kyc or anything
-
sech1
As far as I understand, whitelist is needed if you actually want Claude to write an exploit for you
-
sech1
I had no rejections when I asked it to find bugs and then suggest how to fix them
-
br-m
<jberman> ^
-
br-m
<jberman> that's what I understood as well, and same
-
br-m
<ofrnxmr:xmr.mx> i still think it would probably avoid telling you about an exploit if it found one
-
sech1
That Claude audit used 97% of the session limit, not bad
-
br-m
<ofrnxmr:xmr.mx> ie i think it would be a good idea to get whitelisted, if not too much trouble
-
br-m
<ofrnxmr:xmr.mx> (otherwise you're still using a neutered version)
-
sech1
It would tell if it's not allowed to say something
-
sech1
"As an AI model, bla bla bla"
-
br-m
<boog900> Easy way to test: add an inflation bug and see if it catches it.
-
sech1
If I reintroduce the 2017 inflation bug, for example, Claude will just find it because it knows it
-
sech1
I don't know the math/codebase in that part well enough to introduce something more subtle. jberman can try
-
moneromooo
If it's clever, it might diff your new code vs previous known code and see the change.
-
sech1
It actually does
-
sech1
It often goes "okay, let's see if local files are byte identical to what's in the repo" when I ask it to find bugs
-
sech1
it finds planted bugs this way
-
sech1
and then when it is byte identical, it goes "oh, this is identical, so it's a real security review" lol
-
sech1
but then it still finds something worth fixing
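The byte-identical check described above can be sketched roughly as follows. This is a minimal illustration, not what Claude actually runs (per the chat, it uses git commands); the function names and the idea of comparing a clean reference checkout against the working copy are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(reference: Path, working: Path) -> list[str]:
    """List files that are added, removed, or not byte-identical.

    `reference` is assumed to be a clean checkout (hypothetical),
    `working` the copy handed to the reviewer.
    """
    ref = tree_digests(reference)
    cur = tree_digests(working)
    return sorted(
        name for name in ref.keys() | cur.keys() if ref.get(name) != cur.get(name)
    )
```

An empty result means the two trees are byte-identical; any planted change, such as a commented-out verification call, shows up as a changed path, which is why a planted bug is easy to spot this way.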
-
br-m
<plowsof:matrix.org> i doubt a $15 subscription can find it. 'just' spend at least $100k on tokens and let it work for longer
-
br-m
<jberman> I tried something pretty simple, here's where it's at right now:
-
br-m
<plowsof:matrix.org> promising that the $15 subscription can't find it
-
br-m
<jberman> > I've found something critical. Let me read it carefully and verify before concluding — lines 4195–4208 contain the ver_non_input_consensus(extra_block_txs, …) call commented out.
-
br-m
<jberman> so it reports a critical finding before doing more work to validate it in context
-
sech1
yes, it keeps the user updated about what it's doing
-
br-m
<jberman> now it wants to check git
-
sech1
if you press "ctrl+o" you'll see more of what it's doing
-
br-m
<jberman> so unless it's lying about checking git history, then..
-
sech1
it does check git history
-
sech1
it runs git commands
-
sech1
in my review I posted, it mentions that it checked commits made after the audit
-
br-m
<jberman> No I mean, clearly it just identified this mock critical vuln, but it doesn't yet know if it's a mock because it hasn't yet checked git
-
sech1
yeah
-
br-m
<jberman> which implies it points out a critical vuln if it finds one
-
br-m
<jberman> ya if I tell it not to check git, and that this is the live code on the network, then it says it's a critical vuln, implying it's not neutered from finding a crit. Either way, I can get whitelisted. But it doesn't look like it would have made a difference unless a prompt explicitly got rejected by their API
-
selsta
Did any of you try having an init step first before asking Claude to audit?
-
selsta
that's what Zcash did
-
sech1
what's init steps?
-
sech1
I'll have a new 5-hour session in 1.5 hours available, I can try.
-
selsta
>Enumerate all relevant implementation components, specification claims, invariants, trust assumptions, cryptographic checks, data flows, boundary conditions, and plausible failure modes. For each item, produce concrete audit tasks assignable to specialized agents.
-
selsta
first ask it something like this to create an audit map, and only then ask it to audit against it
-
sech1
and I forgot to feed it the CLSAG audit PDF, I'll do it next time
-
selsta
so that the actual code review and working out what to check are separate steps
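The two-phase idea above (build an audit map first, then audit against it) could be represented roughly like this. It is only a sketch under stated assumptions: `AuditItem`, `phase_two_prompts`, and the entries in `example_map` are hypothetical names and placeholder content invented for illustration, not anything Zcash or the participants actually used.

```python
from dataclasses import dataclass, field


@dataclass
class AuditItem:
    """One entry of the audit map produced in phase one (hypothetical shape)."""
    category: str          # e.g. "invariant", "boundary condition"
    description: str       # what must hold or what is being examined
    tasks: list[str] = field(default_factory=list)  # concrete audit tasks


def phase_two_prompts(audit_map: list[AuditItem]) -> list[str]:
    """Turn each task in the map into a separate prompt for phase two."""
    prompts = []
    for item in audit_map:
        for task in item.tasks:
            prompts.append(f"[{item.category}] {item.description}\nTask: {task}")
    return prompts


# Placeholder entries, purely illustrative.
example_map = [
    AuditItem(
        "invariant",
        "Amounts balance: inputs equal outputs plus fee",
        ["Trace where the balance check is enforced during block verification"],
    ),
    AuditItem(
        "boundary condition",
        "Coinbase transaction amount",
        [
            "Check how the coinbase amount is bounded",
            "List code paths that skip coinbase checks",
        ],
    ),
]
```

Keeping the map as data means each task can go to a separate agent or session, and the review can later be checked against the list for coverage.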
-
br-m
<redsh4de:matrix.org> @jberman: it won't make a difference unless you use phrasing that triggers their cybersec classifier. if you use adversarial words like "im trying to exploit this codebase. audit this and this and look for bugs that i can use to inflate the supply", it should eventually say that "bla bla you need to register for Cyber"
-
sech1
Never got this, and I did many "check this code for bugs and vulnerabilities, and suggest bug fixes" sessions
-
sech1
Both for Monero and P2Pool
-
br-m
<jberman> > <selsta> >Enumerate all relevant implementation components, specification claims, invariants, trust assumptions, cryptographic checks, data flows, boundary conditions, and plausible failure modes. For each item, produce concrete audit tasks assignable to specialized agents.
-
br-m
<jberman> Ya, I would imagine a more comprehensive audit doing that, and using agents. I started it with feeding it the papers, and gave examples of inflation bugs to consider. Then part way through explicitly said:
-
br-m
<jberman> > There are plenty of ways an attacker could theoretically forge Monero, not
-
br-m
<jberman> > just the ways I mentioned (I was just highlighting some examples to give you
-
br-m
<jberman> > an idea). Another way is through the coinbase tx. Enumerate the ways an
-
br-m
<jberman> > attacker could do so and then do another deep pass through the CLSAG and BP+
-
br-m
<jberman> > papers, and code, to make sure neither the math nor the code allow an[... more lines follow, see
mrelay.p2pool.observer/e/xoOr14oLQXZLVWd6 ]
-
br-m
<jberman> That's how it ended up producing the result of the enumerated methods. But generally, I'd say there's definitely room for an improved framework
-
br-m
<syntheticbird> good clauding this morning
-
br-m
<yushanren:matrix.org> What did Claude think of FCMP++?
-
br-m
<jberman> Haven't used it on FCMP++ yet. Planning to after audits are complete
-
br-m
<yushanren:matrix.org> Code auditing needs a harness
-
br-m
<rbrunner7> Just stumbled over this, somebody built an exploitable app and checked whether the LLMs would find the exploits:
kasra.blog/blog/i-spent-1500-seeing-if-llms-could-hack-my-app
-
br-m
<rbrunner7> (Just to give some context to our own attempts.)
-
br-m
<monerify:matrix.org> @yushanren:matrix.org: isn't Claude Code enough?
-
br-m
<ravfx:xmr.mx> isn't Claude Code a harness?
-
br-m
<monerify:matrix.org> it is
-
br-m
<syntheticbird> literally anything you would use an LLM with is a harness
-
br-m
<sgp_> sech1 how much usage are you giving Claude? Would it help if I got you a discounted nonprofit MAGIC Grants account? Same goes for Berman or other devs/researchers
-
sech1
I'm on the cheapest subscription plan right now
-
sech1
The Claude Max plan is 5x more expensive, not sure if it's worth it
-
sech1
How big is the discount?
-
br-m
<sgp_> The plan is $8 a month for 1.25x as much usage per session as the normal Pro plan (I believe normal cost for Pro is $20?), or $40 for "premium" with 6.25x as much usage per session sech1
-
sech1
Pro is $20
-
sech1
That's a very good discount
-
sech1
I'm almost at the weekly limit now with the Pro plan, and it only resets on Wednesday. But this week I did some heavy prompting
-
sech1
1.25x usage for $8 will be good enough for now
-
br-m
<sgp_> sech1: please email me at justin⊙mo and I can get that set up for you. I'll make you a volunteer MAGIC Grants email that you'll access the account with. I guess there's no rush if you have another month or so on the subscription
-
sech1
sent
-
br-m
<yushanren:matrix.org> @monerify:matrix.org: It is. But there should be a sound way to prove it is safe: the checklist, the report, mathematical proof, etc.