Verifiable private execution of machine learning models with Risc0?

post by mako yass (MakoYass) · 2023-10-25T00:44:48.643Z · LW · GW · 2 comments

Contents

2 comments

Risc0 is a virtual machine that runs any riscv (C, C++, Rust, Go, and Nim can all compile to riscv) program, records a trace of the state of the computation over time, and then, no matter how long the execution ran for, it can fold that execution trace into a 1KB reed-solomon code, which can be transmitted as a proof that the computation was run correctly, that the outputs entail from the inputs, all without having to reveal unencrypted copies of the code or inputs.
The interrogator verifies it by sampling a bunch of features of the receipt code, and if any aspect of the computation was incorrect, each check has a 3/4 chance of exposing that. So, after a hundred or so checks, we can be effectively certain of the correctness of the computation, without having to run the computation ourselves. [1]

I was incredulous when I saw it suggested that this could be applied to ML models, as I'd expect that running a neural model in an environment like Risc0 would reduce its performance by tens of thousands of times. Risc0 does reduce the performance of conventional CPU computation by around that much[2] (though that is still fast enough to be practical for a lot).
And yeah, the post doesn't discuss performance, and it discusses a lot of non-deep learning (?) model types like boosted trees and random forests. So I'd guess this probably isn't applicable to verifying training runs. Would it be usefully applicable to some distilled inference models?


There are two other reasons I found Risc0 very interesting. One is just that I'm working in social computing, but a deeper interest is in relation to the ongoing conversation in techno-eschatology about whether advanced agents tend towards or away from mutual transparency, which is to say, towards coordination, peace, trade, or towards war and oppression (eg, The Dark Forest). Risc0 compresses the execution trace of any program, however long, down to a proof checkable in constant time. That's a strong indication to me of the feasibility of transparency between information systems, even in the most adverse possible conditions where we can assume no trusted hardware infrastructure at all.
With trusted hardware, verified compute can run at normal speeds, and mutual transparency becomes trivial. The only costs of hardware security are perhaps the addition of tamper-proofing, but it seems to me that a developed coordination infrastructure, at least on the human scale, would be able to just monitor the surrounding physical bodies and sensors and scream when they seem to be doing a tampering, at pretty much no cost.
I suppose ZK systems like Risc0 are for cases where even a very, very small risk that a Trusted Execution Environment (TEE) could be cracked is unacceptable. In decentralized finance, that is common, because the integrity of the network's shared state is all entangled, if we assumed certainty of each TEE's integrity, we'd build houses of cards where even a single success at cracking a TEE storing one person's wallet could undermine the entire network by allowing them to undetectably mint, inflate, deface the currency. In contrast, Risc0 guarantees the integrity of the computation with mathematical certainty, which makes me more hopeful for the project of parallelizing a global verifiable computation.

  1. ^

    At least, that's my sense after 2 days of study. Discerned from watching some Risc0 Study Club videos. Somewhere it was mentioned that the receipt could be compressed down to one codeword, but is instead only reduced to 256 codewords for efficiency reasons. I guess that codewords are one riscv machine word large, 4 bytes, so 1024 bytes in total. Although running on secret programs was not mentioned, I infer that it must be possible, because of course we can encode a program as a data input and have a riscv implementation of an interpreter run them, but also, in a comparison between riscv and wasm I seem to remember seeing it mentioned that riscv can modify instructions in its execution region.

  2. ^

    Assuming that the typical laptop is around 3GHz, comparing to the developer's stated 30KHz, this would be a 100,000 fold difference. A 10x performance improvement in risc0 is conceivable. A 100x increase is harder to imagine. Perhaps with hardware support, but it's hard to see who would develop that, given that TEEs are a much more tempting/cheap approach to hardware-based compute integrity.

2 comments

Comments sorted by top scores.

comment by jacob_cannell · 2023-10-25T21:32:49.683Z · LW(p) · GW(p)

I've spent some time investigating/working on this. There are two approximate proof standards of relevance: constant error tolerance and relative error tolerance. A constant error bound proof (which can guarantee whp that there are less than c errors for some small constant c) seems to require complex zk-STARK/SNARK type circuit conversions. For the relaxed constraint of a c relative error bound for larger parallel computations much faster small overhead approaches are possible. So it seems you can combine these - use the very fast low overhead relative error bound proof for one or a few steps of your large parallel ANN,and then use the slower zk-STARK circuit for a serial proof over the chain of inner receipts.

The tradeoff is you no longer can prove strict correctness, and instead can only prove that the actual computation graph was similar to the claimed graph, and used about the same amount of compute. For DL training in particular seems like that is probably enough as you can prove the claimed circuit was the result of the claimed training algorithm on the claimed dataset up to a small constant number of 'bitflip' type dropout/dropin errors (which of course could be adversarial), and then you can include a tighter proof of a bound of the loss on a small hold out test set (which requires a more complex mechanism to prove that the test set wasn't part of the training set or wasn't available at that time yet, but that also seems doable).

So I think it is possible to get reasonable low cost DL training proofs, but that is all assuming deterministic computation, which is unfortunate as much of the remaining hardware gains at the end of moore's law require non-deterministic approx computation (hardware level stochastic rounding, as in brains/neuromorphic computers). There may be some way to still scale probabilistic proofs in that regime, not sure yet.

One is just that I'm working in social computing, but a deeper interest is in relation to the ongoing conversation in techno-eschatology about whether advanced agents tend towards or away from mutual transparency, which is to say, towards coordination, peace, trade, or towards war and oppression

Compute proofs have interesting use cases but this seems a bit optimistic? If you are conversing with an open source agent A that can provide a complete proof of it's computation graph that seems useful sure but you can only be certain that A exists in some sense, and never certain that A is not actually 'imprisoned' in some form and controlled by some other agent B.

Also there is a significant cost for being a fully open source agent. Not only does it give up much of the economic potential invested in the agent's training, but it also - from the agent's perspective - opens it up to sim imprisonment and associated torture/manipulation.