Comments

Comment by Vadim Fomin (vadim-fomin) on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-25T06:39:20.094Z · LW · GW

What is the connection between the concepts of intelligence and optimization?

I see that optimization implies intelligence (that optimizing a sufficiently hard task sufficiently well requires sufficient intelligence). But it feels like the case for existential risk from superintelligence depends on the idea that intelligence is optimization, or implies optimization, or something like that. (If I remember correctly, sometimes people suggest creating "non-agentic AI", or "AI with no goals/utility", and EY says that they are trying to invent non-wet water, or something like that?)

It makes sense if we describe intelligence as a general problem-solving ability. But intuitively, intelligence is also about making good models of the world, which sounds like it could be done in a non-agentic / non-optimizing way. One example that throws me off is Solomonoff induction - which feels like a superintelligence, and indeed contains good models of the world, but doesn't seem to be pushing toward any specific state of the world.

I know there's the concept of AIXI, basically an agent armed with Solomonoff induction as its epistemology, but it feels like agency is added separately. Like, there's the intelligence part (Solomonoff induction) and the agency part, and they are clearly different, rather than agency automatically popping out because the system is superintelligent.

Comment by Vadim Fomin (vadim-fomin) on Open & Welcome Thread — March 2023 · 2023-04-06T06:40:36.277Z · LW · GW

Is there currently any place for possibly stupid or naive questions about alignment? I don't wish to bother people with questions that have probably been addressed, but I don't always know where to look for existing approaches to a question I have.

Comment by Vadim Fomin (vadim-fomin) on Security Mindset and Ordinary Paranoia · 2023-01-28T10:14:40.506Z · LW · GW

> The OpenBSD project to build a secure operating system has also, in passing, built an extremely robust operating system, because from their perspective any bug that potentially crashes the system is considered a critical security hole. An ordinary paranoid sees an input that crashes the system and thinks, “A crash isn't as bad as somebody stealing my data. Until you demonstrate to me that this bug can be used by the adversary to steal data, it's not extremely critical.” Somebody with security mindset thinks, “Nothing inside this subsystem is supposed to behave in a way that crashes the OS. Some section of code is behaving in a way that does not work like my model of that code. Who knows what it might do? The system isn't supposed to crash, so by making it crash, you have demonstrated that my beliefs about how this system works are false.”

Hey there,

I was showing this post to a friend who's into OpenBSD. He felt that this is not a good description, and wanted me to post his comment. I'm curious what you think about this specific case and how it bears on the point of the post as a whole. Here's his comment:

This isn't an accurate description of what OpenBSD does and how it differs from other systems.

> any bug that potentially crashes the system is considered a critical security hole

For the kernel, this is not true: OpenBSD, just like many other systems, has a concept of crashing in a controlled manner when it's the right thing to do; see e.g. [here](https://man.openbsd.org/crash). As far as I understand [KARL](https://why-openbsd.rocks/fact/karl), avoiding crashes at any cost would make the system less secure:

- with crashing: attacker guesses incorrectly => the system crashes => the system boots a new randomized kernel => attacker is back at square one
- without crashing: attacker guesses incorrectly => the system continues working as usual => attacker guesses again with new knowledge

For the other parts of the system, the opposite is true: OpenBSD consistently introduces new interesting restrictions, and if a program violates them, it crashes immediately.

Example 1: printf and %n

Printf manual page for OpenBSD: http://man.openbsd.org/printf.3

"The %n conversion specifier has serious security implications, so it was changed to no longer store the number of bytes written so far into the variable indicated by the pointer argument. Instead a syslog(3) message will be generated, after which the program is aborted with SIGABRT."

Printf manual page for Linux: https://man7.org/linux/man-pages/man3/printf.3.html

"Code such as printf(foo); often indicates a bug, since foo may contain a % character.  If foo comes from untrusted user input, it may contain %n, causing the printf() call to write to memory and creating a security hole."

Printf manual page for macOS: https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/printf.3.html

"%n can be used to write arbitrary data to potentially carefully-selected addresses.  Programmers are therefore strongly advised to never pass untrusted strings as the format argument, as an attacker can put format specifiers in the string to mangle your stack, leading to a possible security hole."

As we see, on Linux and macOS, the potential security issue is well-known and documented, but a program that uses it is supposed to work. On OpenBSD, it's supposed to crash.
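
To make the contrast concrete, here is a minimal sketch of the bug being described (my example, not taken from the manual pages); `name` stands in for untrusted user input:

```c
#include <stdio.h>

int main(void)
{
	char name[64];

	if (fgets(name, sizeof(name), stdin) == NULL)
		return 1;

	/* BUG: untrusted data used as the format string. If the input
	 * contains "%n", printf tries to write the count of bytes printed
	 * so far through whatever pointer it picks up as the next argument.
	 * Per the man pages quoted above: Linux and macOS document the
	 * danger but let the call proceed; OpenBSD logs via syslog(3) and
	 * aborts the program with SIGABRT. */
	printf(name);

	/* Safe form: a fixed format string, untrusted data as an argument. */
	printf("%s", name);

	return 0;
}
```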

Example 2: [pledge](http://man.openbsd.org/pledge.2)

This system call allows a program to sandbox itself, basically saying "I only need this particular system functionality to operate properly; if I ever attempt to use anything else, may I crash immediately".
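
A minimal sketch of what that looks like in practice (OpenBSD-specific; the particular promise strings and file path here are just an illustration):

```c
#include <err.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Promise only stdio and read-only filesystem access. Any later
	 * attempt to use functionality outside these promises kills the
	 * process with an uncatchable SIGABRT. */
	if (pledge("stdio rpath", NULL) == -1)
		err(1, "pledge");

	FILE *fp = fopen("/etc/hostname", "r");	/* permitted by "rpath" */
	if (fp != NULL)
		fclose(fp);

	/* Anything outside the promises above - executing another program,
	 * opening a socket, writing a file - would crash the process here. */
	return 0;
}
```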

Example 3: [KERN_WXABORT](http://man.openbsd.org/sysctl.2#KERN_WXABORT)

Like many other systems, OpenBSD doesn't allow you to have memory that is both writable and executable. However, by default it's an error the program can recover from. By setting a kernel parameter, you can make the error unrecoverable: a program that attempts to use memory like that will crash.
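
A minimal sketch of the two behaviours (my example, based on the sysctl(2) description above, not code from the man page):

```c
#include <err.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	/* Ask for a mapping that is writable and executable at once,
	 * which violates W^X. */
	void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
	    MAP_PRIVATE | MAP_ANON, -1, 0);

	if (p == MAP_FAILED) {
		/* Default behaviour: a recoverable error the program can
		 * handle and continue from. */
		warn("mmap");
		return 1;
	}

	/* With the kernel parameter KERN_WXABORT set, the kernel instead
	 * kills the process with SIGABRT, so execution never gets here. */
	printf("writable+executable mapping at %p\n", p);
	return 0;
}
```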

I hope I've made my case clear.