post by [deleted]

Comments sorted by top scores.

comment by Charlie Steiner · 2022-07-18T08:03:03.898Z · LW(p) · GW(p)

Suppose I have a self-driving car planning a route, and a superintelligent traffic controller "trying to help the car." The superintelligent traffic controller knows what route my car will pick, and so it tweaks the timing of the lights slightly so that if the car takes the route it will in fact pick, everything is smoother and safer, but if it takes any other route, it will hit red lights and go slower.

Is this the sort of loss of freedom you mean?

What if, if my car tried to deviate from the route that's best according to its normal routing algorithm, the superintelligent traffic controller would for some reason not turn on some of the lights, so that if the car deviated it would actually get stuck permanently (though of course we never observe this)?

This definitely seems like a loss of freedom. Actually, it kind of reminds me of a great sci-fi story about free will by Paul Torek. But also, it doesn't produce any effects and doesn't really seem like anything worth worrying about.

Replies from: winstonne
comment by winstonne · 2022-07-18T14:16:43.278Z · LW(p) · GW(p)

Hmm, looks like I should add an examples section and more background on what I mean by freedom. What you are describing sounds like a traffic system that values the ergodic efficiency of its managed network, and you are showing a way that a participant can have very non-ergodic results. It sounds like that is more of an engineering problem than what I'm imagining.

Examples off the top of my head of what I mean with respect to loss of freedom resulting from a powerful agent's value system include things like:

  • A paperclip maximizer terraforming the earth prevents any value systems other than paperclip maximization from sharing the earth's environment.
  • Humans' value for cheap foodstuffs results in monoculture crop fields, which cuts off the values of forest and grassland ecosystems (hiding places, alternating food sources that last through the seasons, etc.).
  • A drug-dependent parent changes a child's environment, removing the freedom of a reliable schedule, security, etc.
  • Or, riffing off your example: a superintelligent traffic controller starts city planning, bulldozing blocks of car-free neighborhoods because they stood in the way of a 5% city-wide traffic-flow improvement.

Essentially, what I'm trying to describe is that freedoms need to be a value unto themselves, with characteristics that are functionally different from the common utility-function framing that revolves around metric maximization (like gradient descent). Freedoms describe boundary conditions within which metric maximization is allowed, but impose steep penalties for surpassing their bounds. Their general mathematical form is a manifold surrounding some region of state space, whereas it seems the general form of most utility-function talk is finding a minimum/maximum over some state space.
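To make that distinction concrete, here's a toy sketch in Python (all numbers and function names are invented for illustration): a metric to maximize, plus a steep penalty that only kicks in outside a freedom boundary.

```python
def efficiency(x):
    """Toy metric the agent wants to maximize (made up): peaks at x = 3."""
    return -(x - 3.0) ** 2

def freedom_penalty(x, lower=-1.0, upper=2.0, steepness=100.0):
    """Steep penalty for stepping outside the allowed region [lower, upper].

    Inside the boundary the penalty is zero, so ordinary metric maximization
    proceeds unhindered; outside, it quickly dominates the objective.
    """
    overshoot = max(0.0, x - upper) + max(0.0, lower - x)
    return steepness * overshoot ** 2

def objective(x):
    return efficiency(x) - freedom_penalty(x)

# The unconstrained optimum (x = 3) lies outside the freedom boundary, so the
# penalized optimum ends up pinned right at the edge of the boundary, near x = 2.
xs = [i / 100.0 for i in range(-500, 501)]
best = max(xs, key=objective)
print(f"metric-only optimum: 3.0, freedom-respecting optimum: {best:.2f}")
```

The point is just the shape of the objective: a region where the metric is free to run, and a wall around it, rather than a single global optimum that bulldozes everything else.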

Replies from: Charlie Steiner
comment by Charlie Steiner · 2022-07-18T15:34:02.796Z · LW(p) · GW(p)

The bad things in your examples seem bad for reasons unrelated to their effect on freedom. I don't want car-free neighborhoods bulldozed because I like neighborhoods more than piles of rubble.

Maybe think about examples where the state space available to an agent is changed without the optima changing.

And more subtly, since "available state space" is an emergent concept used when talking about agents that make choices, not a fundamental part of the world, consider whose point of view you're taking. If I make plans as if many actions are possible, but Laplace's demon predicts me as if they're not, this isn't a contradiction, it's an unresolved choice of perspective.

Replies from: winstonne
comment by winstonne · 2022-07-18T19:35:39.665Z · LW(p) · GW(p)

I'm not following your final point. Regardless of determinism, the "state space" I can explore as an embedded agent is constrained by the properties of the local environment. If I value things like a walkable neighborhood, but I'm stuck in a pile of rubble, that's going to constrain my available state space and accordingly it's going to constrain my ability to have any rewarding outcome. McTraffic, by not allotting freedoms to me when executing its transportation redesign, infringed on my freedom (which was mostly afforded to me through my and my neighbors' property rights).

Freedoms (properly encoded), I believe, are the proper framing for creating utility functions/value systems for critters like our friendly neighborhood traffic agent. Sure, the traffic agent values transportation efficiency, but since it also values other agents' freedoms (here, property rights), it will limit the execution of its traffic-efficiency preferences within a multi-agent shared environment to minimize the restriction of property rights. To me, this seems simpler and less error-prone than any approach that tries to infer my values (or human preferences more generally) and act according to that inference.
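As a toy sketch of that decision rule (plan names and numbers are invented for illustration), the shape is "filter by freedoms, then optimize the metric" rather than "infer values, then optimize":

```python
from dataclasses import dataclass

@dataclass
class Plan:
    name: str
    efficiency_gain: float    # hypothetical % improvement in traffic flow
    enters_property: bool     # does the plan cross anyone's property-rights freedom?

# Invented candidate plans for the traffic agent
plans = [
    Plan("retime signals", 0.02, False),
    Plan("add a bus lane on the arterial road", 0.03, False),
    Plan("bulldoze the car-free neighborhood", 0.05, True),
]

# No model of what residents value about their neighborhood is needed;
# the agent only needs to know which plans cross the freedom boundary.
permissible = [p for p in plans if not p.enters_property]
best = max(permissible, key=lambda p: p.efficiency_gain)
print(best.name)  # -> "add a bus lane on the arterial road"
```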

Freedoms assume awareness of external (embedded) agency; they are values you afford to other agents. They have a payoff because you are then afforded them in return. This helps ensure agents do not unilaterally bulldoze (literally or figuratively) the "available state space" for other agents to explore and exploit.

Replies from: Charlie Steiner
comment by Charlie Steiner · 2022-07-18T19:55:01.046Z · LW(p) · GW(p)

> If I value things like a walkable neighborhood, but I'm stuck in a pile of rubble, that's going to constrain my available state space and accordingly it's going to constrain my ability to have any rewarding outcome.

And if you value rubble, having it replaced by a walkable neighborhood would constrain your available state space. It's symmetrical.

I worry you are just "seeing the freedom" inherent in the neighborhood more easily because you like freedom and you also like walkable neighborhoods. But this leads you to picking examples where the two different kinds of liking are all mixed up.

> Sure, the traffic agent values transportation efficiency, but since it also values other agents' freedoms (here, property rights), it will limit the execution of its traffic-efficiency preferences within a multi-agent shared environment to minimize the restriction of property rights. To me, this seems simpler and less error-prone than any approach that tries to infer my values (or human preferences more generally) and act according to that inference.

But this is precisely an approach that tries to infer your values! It has to model you a certain way (as having certain freedoms, like turning right vs. left, but not other freedoms, like being able to fly). And I like the vision overall, but I think if you make it too strict you'll end up with an AI that's not making choices to defend a certain state of affairs, and so it's going to do silly things and get outcompeted by other forces.

Replies from: winstonne
comment by winstonne · 2022-07-18T20:23:19.931Z · LW(p) · GW(p)

Ah ok, I think I'm following you. To me, freedom describes a kind of bubble around a certain physical or abstract dimension, whose center is at another agent. Its main use is to limit computational complexity when sharing an environment with other agents. If I have a set of freedom values, I don't have to infer another agent's values so long as I don't enter their freedom bubbles. In the traffic example, how the neighborhood is constructed should be irrelevant to McTraffic; all it needs to know is a) there are other agents already present in the neighborhood, and b) it wants to change the nature of the neighborhood, which will enter those agents' freedom bubbles. Therefore it needs to negotiate with the inhabitants (so yes, at this step there's an inference via dialogue going on).
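A rough sketch of that check (purely spatial toy bubbles, with invented names and coordinates; a real version would need bubbles along abstract dimensions too):

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    center: tuple    # toy 2-D location of the agent's freedom bubble
    radius: float    # extent of the bubble along this (spatial) dimension

def bubbles_entered(change_location, agents):
    """Return every agent whose freedom bubble the proposed change would enter."""
    x, y = change_location
    return [
        a for a in agents
        if (x - a.center[0]) ** 2 + (y - a.center[1]) ** 2 <= a.radius ** 2
    ]

residents = [Agent("household A", (0.0, 0.0), 1.0), Agent("household B", (5.0, 0.0), 1.0)]
proposal = (0.5, 0.2)   # where McTraffic wants to make a change

affected = bubbles_entered(proposal, residents)
if affected:
    # No value inference was needed up to this point; dialogue starts only now.
    print("negotiate with:", [a.name for a in affected])
else:
    print("proceed: the change enters no one's freedom bubble")
```

The computational point is that this check is cheap and local, whereas inferring what each household actually values about its neighborhood is not.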