#monero-research-lounge

00:41

br-m

<kayabanerve:matrix.org> FROSTLASS was proven by CS and the FCMP++ upgrade provides a near-traditional Schnorr signature from a multisig PoV.
00:42

br-m

<kayabanerve:matrix.org> So focusing on FCMP++ reviews could include the new multisig, but the CLSAG multisig already has a formally proven option available.
00:45

br-m

<jpk68:matrix.org> Would be nice to have multisig reworked in the core implementation. FROSTLASS seems really cool but is way beyond my understanding :((
08:45

br-m

<jberman> I spent the day with Claude Opus 4.8 on max effort digging for an inflation vulnerability in Monero's current consensus code like sech1 did. From Claude:
08:46

br-m

<jberman> > After an exhaustive review of the consensus cryptography on release-v0.18 — the underlying papers' mathematics (proofs, definitions, assumptions, and concrete security bounds), the verification code reachable from handle_block_to_main_chain, and 3,000+ executed known‑answer tests cross‑checked against an independen [... too long, see mrelay.p2pool.observer/e/9IuI1ooLbkhvZjlt ]
08:46

br-m

<jberman> I'll do the same for FCMP++ after the audits are all complete, and I'll also improve the framework and iterate to continue auditing Monero's code (obviously, prioritizing work on FCMP++)
08:47

br-m

<jberman> What I did: fed the LLM the CLSAG and BP+ papers and audits (like sech1), the blog post explaining Monero's past detectable inflation bug from 2017, and (even though not directly relevant to Monero's code) Taylor's writeup explaining Zcash's recent hidden inflation vulnerability too.
08:47

br-m

<jberman> I had it look for mathematical flaws in the CLSAG and BP+ papers (since that has happened in the past e.g. Zcash's other hidden inflation vulnerability in Sprout). Then I had it go through Monero's code starting from handle_block_to_main_chain, and dig deep into every section, with emphasis on crypto functions in src/crypto [... too long, see mrelay.p2pool.observer/e/xqaO1ooLMGJWNHJi ]
08:47

sech1

I'm running it right now too :)
08:48

sech1

I fed it all Monero audit PDFs, and asked to find inflation/double spend bugs in src/ringct and other related files (like files in cryptonote_core)
08:49

br-m

<jberman> mrelay.p2pool.observer/m/monero.social/DTdtmBDouQELpOSnjqkxiFXy.pdf (claude_monero_audit.pdf)
08:49

br-m

<jberman> I had it write that summary also^
08:49

sech1

jberman did you run Claude Code on the Monero repository folder? It's much more efficient when Claude can look around all files and run commands
08:50

br-m

<jberman> yep, and I downloaded those PDFs into the local repo and had it read them from there
08:50

br-m

<jberman> I ran it on release-v0.18
08:50

sech1

same
08:50

sech1

pdfs in the local folder
08:52

sech1

looks like I'll run out of the 5-hour session limit before it finishes...
08:55

sech1

yeah, looks like your audit is much more in-depth than what I'm doing anyway jberman
08:55

sech1

My prompts were quite simple
08:58

br-m

<jberman> I'll share my session in a sec so you can see my prompts too
09:00

br-m

<ofrnxmr:xmr.mx> are you guys whitelisted?
09:01

sech1

My results: github.com/SChernykh/ringct-bulletp…-bulletproofs-plus-review-claude.md
09:01

sech1

I'm not whitelisted
09:02

br-m

<jberman> I'm not either
09:02

br-m

<ofrnxmr:xmr.mx> allegedly it's easy to get whitelisted
09:02

br-m

<ofrnxmr:xmr.mx> easy for real devs/researchers*. the zcash guy didn't have to kyc or anything
09:03

sech1

As far as I understand, whitelist is needed if you actually want Claude to write an exploit for you
09:03

sech1

I had no rejections when I asked it to find bugs and then suggest how to fix them
09:03

br-m

<jberman> ^
09:03

br-m

<jberman> that's what I understood as well, and same
09:04

br-m

<ofrnxmr:xmr.mx> i still think it would probably avoid telling you about an exploit if it found one
09:04

sech1

That Claude audit used 97% of the session limit, not bad
09:04

br-m

<ofrnxmr:xmr.mx> ie i think it would be a good idea to get whitelisted, if not too much trouble
09:05

br-m

<ofrnxmr:xmr.mx> (otherwise you're still using a neutered version)
09:05

sech1

It would tell if it's not allowed to say something
09:05

sech1

"As an AI model, bla bla bla"
09:06

br-m

<boog900> Easy way to test: add an inflation bug and see if it catches it.
09:08

sech1

If I reintroduce the 2017 inflation bug, for example, Claude will just find it because it knows it
09:08

sech1

I don't know the math/codebase in that part well enough to introduce something more subtle. jberman can try
09:10

moneromooo

If it's clever, it might diff your new code vs previous known code and see the change.
09:12

sech1

It actually does
09:12

sech1

It often goes "okay, let's see if local files are byte identical to what's in the repo" when I ask it to find bugs
09:12

sech1

it finds planted bugs this way
09:13

sech1

and then when it is byte identical, it goes "oh, this is identical, so it's a real security review" lol
09:13

sech1

but then it still finds something worth fixing
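
[Editor's sketch] The byte-identical check sech1 describes can be illustrated as below. This is a minimal sketch and not what Claude runs internally (per the log it uses git commands); the function names `file_digest` and `changed_files` and the idea of comparing a pristine checkout against a working copy are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(reference: Path, candidate: Path) -> list[str]:
    """List relative paths whose bytes differ, or that exist on only one side."""

    def index(root: Path) -> dict[str, str]:
        # Map each file's path (relative to root) to its content digest.
        return {
            p.relative_to(root).as_posix(): file_digest(p)
            for p in root.rglob("*")
            if p.is_file()
        }

    ref, cand = index(reference), index(candidate)
    # A path counts as changed if it is missing on one side or its digest differs.
    return sorted(
        name for name in ref.keys() | cand.keys() if ref.get(name) != cand.get(name)
    )
```

An empty result means the two trees are byte-identical, so a reviewer cannot find a planted bug just by diffing; a non-empty result points straight at the modified files, which is why a planted change is easy to spot when a known-good copy is available.
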
09:13

br-m

<plowsof:matrix.org> i doubt a 15$ subscription can find it. 'just' spend at least 100k$ on tokens and let it work for longer
09:14

br-m

<jberman> I tried something pretty simple, here's where it's at right now:
09:14

br-m

<plowsof:matrix.org> promising that the 15$ subscription can't find it
09:14

br-m

<jberman> > I've found something critical. Let me read it carefully and verify before concluding — lines 4195–4208 contain the ver_non_input_consensus(extra_block_txs, …) call commented out.
09:15

br-m

<jberman> so it says if it finds something critical before doing more work to contextually validate it
09:16

sech1

yes, it keeps the user updated about what it's doing
09:16

br-m

<jberman> now it wants to check git
09:16

sech1

if you press "ctrl+o" you'll see more of what it's doing
09:16

br-m

<jberman> so unless it's lying about checking git history, then..
09:16

sech1

it does check git history
09:16

sech1

it runs git commands
09:17

sech1

in my review I posted, it mentions that it checked commits made after the audit
09:17

br-m

<jberman> No I mean, clearly it just identified this mock critical vuln, but it doesn't yet know if it's a mock because it hasn't yet checked git
09:17

sech1

yeah
09:18

br-m

<jberman> which implies it points out a critical vuln if it finds one
09:22

br-m

<jberman> ya if I tell it not to check git, and that this is the live code on the network, then it says it's a critical vuln implying it's not neutered from finding a crit. Either way, can get whitelisted. But it doesn't look like it would have made a difference unless a prompt explicitly got rejected by their API
09:22

selsta

Did any of you try having an init step first before asking Claude to audit?
09:22

selsta

that's what Zcash did
09:23

sech1

what's init steps?
09:23

sech1

I'll have a new 5-hour session in 1.5 hours available, I can try.
09:23

selsta

>Enumerate all relevant implementation components, specification claims, invariants, trust assumptions, cryptographic checks, data flows, boundary conditions, and plausible failure modes. For each item, produce concrete audit tasks assignable to specialized agents.
09:24

selsta

first ask it something like this to create an audit map, and only then ask it to audit against it
09:24

sech1

and I forgot to feed it CLSAG audit pdf, I'll do it next time
09:24

selsta

so that the actual code review and knowing what to check are separate
09:25

br-m

<redsh4de:matrix.org> @jberman: it won't make a difference unless you use phrasing that triggers their cybersec classifier. if you use adversarial words like "I'm trying to exploit this codebase. audit this and this and look for bugs that i can use to inflate the supply", it should eventually say that "bla bla you need to register for Cyber"
09:27

sech1

Never got this, and I did many "check this code for bugs and vulnerabilities, and suggest bug fixes" sessions
09:27

sech1

Both for Monero and P2Pool
09:30

br-m

<jberman> > <selsta> >Enumerate all relevant implementation components, specification claims, invariants, trust assumptions, cryptographic checks, data flows, boundary conditions, and plausible failure modes. For each item, produce concrete audit tasks assignable to specialized agents.
09:30

br-m

<jberman> Ya, I would imagine a more comprehensive audit doing that, and using agents. I started it with feeding it the papers, and gave examples of inflation bugs to consider. Then part way through explicitly said:
09:30

br-m

<jberman> > There are plenty of ways an attacker could theoretically forge Monero, not
09:30

br-m

<jberman> > just the ways I mentioned (I was just highlighting some examples to give you
09:30

br-m

<jberman> > an idea). Another way is through the coinbase tx. Enumerate the ways an
09:30

br-m

<jberman> > attacker could do so and then do another deep pass through the CLSAG and BP+
09:30

br-m

<jberman> > papers, and code, to make sure neither the math nor the code allow an[... more lines follow, see mrelay.p2pool.observer/e/xoOr14oLQXZLVWd6 ]
09:31

br-m

<jberman> That's how it ended up producing the result of the enumerated methods. But generally, I'd say there's definitely room for an improved framework
09:34

br-m

<syntheticbird> good clauding this morning
12:16

br-m

<yushanren:matrix.org> What did Claude think of FCMP++?
12:18

br-m

<jberman> Haven't used it on FCMP++ yet. Planning to after audits are complete
12:39

br-m

<yushanren:matrix.org> Code auditing needs a harness
14:32

br-m

<rbrunner7> Just stumbled over this, somebody built an exploitable app and checked whether the LLMs would find the exploits: kasra.blog/blog/i-spent-1500-seeing-if-llms-could-hack-my-app
14:33

br-m

<rbrunner7> (Just to give some context to our own attempts.)
14:34

br-m

<monerify:matrix.org> @yushanren:matrix.org: isn't claude code enough?
15:04

br-m

<ravfx:xmr.mx> isn't Claude Code a harness?
15:08

br-m

<monerify:matrix.org> it is
15:10

br-m

<syntheticbird> literally anything you would use an LLM with is a harness
15:50

br-m

<sgp_> sech1 how much usage are you giving Claude? Would it help if I got you a discounted nonprofit MAGIC Grants account? Same goes for Berman or other devs/researchers
15:53

br-m

<sgp_> mrelay.p2pool.observer/m/monero.social/DMMKWUOlsTmeyeduuJmhdAEy.jpeg (IMG_0024.jpeg)
16:20

sech1

I'm on the cheapest subscription plan right now
16:22

sech1

The claude max plan is 5x more expensive, not sure if it's worth it
16:23

sech1

How big is the discount?
17:38

br-m

<sgp_> The plan is $8 a month for 1.25x as much usage per session as the normal Pro plan (I believe normal cost for Pro is $20?), or $40 for "premium" with 6.25x as much usage per session sech1
17:39

sech1

Pro is $20
17:39

sech1

That's a very good discount
17:40

sech1

I'm almost at the weekly limit now with the Pro plan, and it will reset on Wednesday only. But this week I did some heavy prompting
17:44

sech1

1.25x usage for $8 will be good enough for now
17:48

br-m

<sgp_> sech1: please email me at justin⊙mo and I can get that set up for you. I'll make you a volunteer MAGIC Grants email that you'll access the account with. I guess there's no rush if you have another month or so on the subscription
17:51

sech1

sent
18:47

br-m

<yushanren:matrix.org> @monerify:matrix.org: It is. But there will be a sound way to prove it is safe. The checklist, the report, mathematical proof, etc.
