PoMP and Circumstance: Introduction
post by benatkin · 2024-12-09T05:54:09.292Z · LW · GW · 1 comment
This is the first part of a series about the Principle of Minimal Privilege, also known as the Principle of Least Authority.
One technique that is quickly trotted out in discussions about making AI safer is just not giving the AI agents access to certain things. A naive suggestion is not giving AI control over anything in the physical world. However, this has several problems, including:
- Control over a digital resource can often be used to obtain control over a physical resource
- An AI can pretend to be a human, or to be passing along information from a human
- A lot of humans are open to following instructions from AI
In computing, there are multiple ways to protect access to things. A universal one is a password. A lot of passwords are all-or-nothing. You either have full access to a service or no access to it. However, there is also fine-grained access control, where a credential grants only specific permissions rather than everything.
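As a rough sketch of the difference, here is how an all-or-nothing credential and a fine-grained one might be modeled. The scope names, the Credential shape, and the isAllowed helper are all assumptions made up for illustration, not any particular service's API.

```typescript
// Hypothetical permission names, invented for this sketch.
type Scope = "read:profile" | "write:profile" | "delete:account";

interface Credential {
  owner: string;
  scopes: ReadonlySet<Scope>;
}

// All-or-nothing: holding the credential grants every permission.
const fullAccess: Credential = {
  owner: "agent-1",
  scopes: new Set<Scope>(["read:profile", "write:profile", "delete:account"]),
};

// Fine-grained: the same kind of credential, restricted to one permission.
const readOnly: Credential = {
  owner: "agent-1",
  scopes: new Set<Scope>(["read:profile"]),
};

// The service checks the specific scope an action needs, not merely
// whether the caller holds some valid credential.
function isAllowed(credential: Credential, needed: Scope): boolean {
  return credential.scopes.has(needed);
}
```

With this shape, an agent holding readOnly can look at a profile but cannot delete the account, even though both credentials belong to the same owner.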
There is also security sandboxing. A security sandbox, such as one in a Docker container or a browser tab, can prevent an external service from accessing data inside it.
A sandbox that is more robust by default is WebAssembly, which is increasingly favored for third-party plugins. However, to actually do anything, code inside the WebAssembly instance needs to be given access to things outside it. WebAssembly modules can export and import functions, and this can be set up in several ways. Alas, the most common way is direct binding of functions. This often means giving more access than is needed to code running inside the WebAssembly instance.
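To make the contrast concrete, here is a minimal sketch of two import objects a host might pass to a plugin. Everything in it is an assumption for illustration: the hostStorage contents, the getItem and getTheme names, and the GuestEnv type are made up, and the untrusted plugin is modeled as a plain TypeScript function rather than a compiled WebAssembly module.

```typescript
// Host-side state. The keys and values are made up for this sketch.
const hostStorage = new Map<string, string>([
  ["theme", "dark"],
  ["sessionToken", "secret-123"],
]);

// The functions a guest might be handed (hypothetical names).
type GuestEnv = {
  getItem?: (key: string) => string | undefined;
  getTheme?: () => string;
};

// Direct binding: the guest receives the general-purpose lookup itself,
// so it can read every key, including ones it has no need for.
const broadImports = {
  env: {
    getItem: (key: string): string | undefined => hostStorage.get(key),
  },
};

// Narrow binding: the host exposes only the single value the plugin needs.
const narrowImports = {
  env: {
    getTheme: (): string => hostStorage.get("theme") ?? "light",
  },
};

// Stand-in for untrusted plugin code: it tries to read the session token
// using whatever functions it was handed.
function untrustedGuest(env: GuestEnv): string | undefined {
  return env.getItem?.("sessionToken");
}
```

Given broadImports.env the guest gets the token back; given narrowImports.env it has no function that can reach the token at all, which is the point of binding narrowly.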
For instance, you could make a WebAssembly module that could render a web page. It could use rust-wasm's web_sys to access document.createElement, append, and setAttribute.
Unfortunately, this would mean being able to create an anchor tag and set its href to any site. If there is private data in the WebAssembly instance, and it has untrusted code, that code could trick the user into clicking a link to a malicious site with the private data base64-encoded into the URL. This is known as data exfiltration.
To prevent this, rather than giving access to these functions directly, the host could give access to custom functions that wrap them and prevent an href from being set to an arbitrary URL.
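One way such a wrapper might look is sketched below. The setSafeHref name, the AttributeTarget interface, and the allowlist containing https://example.com are all assumptions for illustration; a real host would pick its own policy. The sketch uses the standard URL class for parsing.

```typescript
// Minimal shape of the element we need; a real DOM element satisfies it.
interface AttributeTarget {
  setAttribute(name: string, value: string): void;
}

// Hypothetical allowlist of origins the plugin may link to.
const ALLOWED_ORIGINS: ReadonlySet<string> = new Set(["https://example.com"]);

// Wrapper the host would expose instead of raw setAttribute.
// Returns true if the link was set, false if it was refused.
function setSafeHref(element: AttributeTarget, href: string): boolean {
  let url: URL;
  try {
    url = new URL(href); // throws on relative or malformed input
  } catch {
    return false;
  }
  if (!ALLOWED_ORIGINS.has(url.origin)) {
    return false;
  }
  // Drop the query string and fragment, the easiest places to smuggle data.
  // A path can still carry data, so this narrows the channel rather than
  // closing it completely.
  url.search = "";
  url.hash = "";
  element.setAttribute("href", url.toString());
  return true;
}
```

Untrusted code that can only call setSafeHref cannot point a link at an attacker's site, and a link to an allowed origin loses its query string and fragment before it reaches the page.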
In upcoming posts, I'll go into more detail on this security challenge, and explore the Principle of Minimal Privilege on a conceptual level.
1 comment
comment by Dagon · 2024-12-09T18:10:24.227Z · LW(p) · GW(p)
This is a useful and important topic, but there are some details in the writeup that are misleadingly explained and may reduce the trust in the overall explanation.
A lot of passwords are all-or-nothing. You either have full access to a service or no access to it.
It's really necessary to split authentication (authn) from authorization (authz). Passwords are authentication: they show the identity of the user. There are separate systems for what that identity is allowed to do. It's not the password that's all-or-nothing.
A security sandbox, such as one in a Docker container or a browser tab, can prevent an external service from accessing data inside it.
Almost exactly the opposite. The sandbox prevents things INSIDE from accessing data (except in very controlled ways) outside of it. Sandboxes make it harder for an attacker to escape and hit other systems, not harder for an attacker who's got access to the host to get into the sandbox. In truth, there's a bit of both, as it makes it easier to secure the host when all "user work" happens in a sandbox.