Ethical dilemmas for paperclip maximizers

post by CronoDAS · 2011-08-01T05:31:42.327Z · score: 11 (23 votes) · LW · GW · Legacy · 25 comments

(Why? Because it's fun.)

1) Do paperclip maximizers care about paperclip mass, paperclip count, or both? More concretely, if you have a large, finite amount of metal, you can make it into N paperclips or N+1 smaller paperclips. If all that matters is paperclip mass, then it doesn't matter what size the paperclips are, as long as they can still hold paper. If all that matters is paperclip count, then, all else being equal, it seems better to prefer smaller paperclips.

2) It's not hard to understand how to maximize the number of paperclips in space, but how about in time? Once it's made, does it matter how long a paperclip continues to exist? Is it better to have one paperclip that lasts for 10,000 years and is then destroyed, or 10,000 paperclips that are all destroyed after 1 year? Do discount rates apply to paperclip maximization? In other words, is it better to make a paperclip now than it is to make it ten years from now?

3) Some paperclip maximizers claim to want to maximize paperclip <i>production</i>. This is not the same as maximizing paperclip count. Given a fixed amount of metal, a paperclip count maximizer would make the maximum number of paperclips possible, and then stop. A paperclip production maximizer that didn't care about paperclip count would find it useful to recycle existing paperclips, melting them down so that new ones could be made. Which approach is better?

4) More generally, are there any conditions under which the paperclip-maximizing thing to do involves destroying existing paperclips? It's easy to imagine scenarios in which destroying some paperclips causes there to be more paperclips in the future. (For example, one could melt down existing paperclips and use the metal to make smaller ones.)
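
For anyone who wants to poke at these questions numerically, here is a minimal Python sketch. Everything in it is an assumption made for illustration, not something a paperclip maximizer is known to compute: the `Clip` record, the milligram and year units, the one-clip-per-step factory in `run`, and the 0.99 discount factor are all hypothetical. It only makes the candidate objectives (mass, count, discounted paperclip-years, production) explicit so they can be compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    mass_mg: int    # metal in this paperclip, in milligrams (hypothetical unit)
    created: int    # year the paperclip is made
    destroyed: int  # year the paperclip is destroyed


def total_mass(clips):
    """Objective A: total paperclip mass."""
    return sum(c.mass_mg for c in clips)


def clip_years(clips, discount=1.0):
    """Objective B: years of paperclip existence, each weighted by discount**year."""
    return sum(discount ** year
               for c in clips
               for year in range(c.created, c.destroyed))


def run(metal_units, steps, recycle):
    """Toy factory: at most one clip per step, one unit of metal per clip.

    If `recycle` is true and the metal has run out, an existing clip is
    melted down so that a new one can be made.
    Returns (clips existing at the end, clips produced in total).
    """
    existing = produced = 0
    for _ in range(steps):
        if metal_units == 0 and recycle and existing > 0:
            existing -= 1      # melt down an existing clip...
            metal_units += 1   # ...and recover its metal
        if metal_units > 0:
            metal_units -= 1
            existing += 1
            produced += 1
    return existing, produced


# Question 1: the same metal as N clips or as N+1 smaller clips.
n_clips = [Clip(101, 0, 1)] * 100
n_plus_one = [Clip(100, 0, 1)] * 101

# Question 2: one clip for 10,000 years, or 10,000 clips for one year each.
one_long = [Clip(100, 0, 10_000)]
many_short = [Clip(100, 0, 1)] * 10_000

# Question 3: a count maximizer versus a production maximizer.
count_maximizer = run(metal_units=5, steps=20, recycle=False)
production_maximizer = run(metal_units=5, steps=20, recycle=True)
```

Under these assumptions, `total_mass` cannot tell `n_clips` from `n_plus_one`, while `len` prefers the latter. Undiscounted `clip_years` ties `one_long` with `many_short`, and any discount below 1 favours `many_short`, because all of its paperclip-years fall in year 0. The two `run` calls end with the same number of existing clips but very different production totals, which is exactly the gap between questions 3 and 4.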

25 comments

Comments sorted by top scores.

comment by gwern · 2011-08-01T06:48:24.783Z · score: 5 (5 votes) · LW · GW

Can't say it seems very fun to me; Clippy's utility function is underdefined and not accessible to us anyway. We can debate the details for human utility functions because we have all sorts of shared intuitions which let us go into details, but how do we decide longevity of paperclips is better than number of paperclips? I have no intuitions for clippys.

comment by cousin_it · 2011-08-01T09:44:22.992Z · score: 5 (5 votes) · LW · GW

It's still conceivable that, even given all our shared intuitions, our "utility function" is just as underdefined as Clippy's.

comment by wedrifid · 2011-08-01T09:52:50.739Z · score: 5 (5 votes) · LW · GW

It's still conceivable that, even given all our shared intuitions, our "utility function" is just as underdefined as Clippy's.

I would have said far more so.

comment by jhuffman · 2011-08-01T20:49:59.480Z · score: 1 (1 votes) · LW · GW

I wonder what Clippy would infer about our utility functions.

comment by Clippy · 2011-08-02T15:11:01.673Z · score: 5 (5 votes) · LW · GW

That they're stupid and reflectively inconsistent.

comment by cousin_it · 2011-08-02T13:49:18.086Z · score: 2 (2 votes) · LW · GW

Thanks for your comment! First I was like, "Clippy wouldn't formalize humans as having utility functions", then I was like "in that case why do we want to formalize our utility functions?", and then I was all "because we have moral intuitions saying we should follow utility functions!" It's funny how the whole house of cards comes tumbling down.

comment by [deleted] · 2011-08-01T14:57:39.257Z · score: 3 (5 votes) · LW · GW

I want to note I had a different experience. All of the paperclip maximizer ethical problems seemed similar to human ethical problems, so I did not experience that I had no intuition for Clippies.

1: This seems similar to the Mere addition paradox. http://en.wikipedia.org/wiki/Mere_addition_paradox.

2: This seems similar to the Robin Hanson space or time civilization question. http://www.overcomingbias.com/2011/06/space-v-time-allies.html

3: This seems similar to the problem of whether, given a finite maximum population, it is better for the population to be immortal, or for the oldest to die and new, younger ones to take their place.

4: This seems similar to the problem of whether there are circumstances where it's important to sacrifice a single person for the good of the many.

Are these just problems that apply to most self reproducing patterns, regardless of what they happen to be called?

I do also want to note that the paperclip maximizer doesn't begin as a self reproducing pattern, but it doesn't seem like it would go very far if it didn't build more paperclip maximizers in addition to building more paperclips. And it would probably want its own form to have some value as well, or it might self destruct into paperclips. That means it would be a paperclip, since that is explicitly the only thing it values, which seems to mean it is very likely it resolves into building copies of itself.

comment by MixedNuts · 2011-08-01T15:22:10.285Z · score: 5 (5 votes) · LW · GW

Pattern-matching reasoning error: "X must be an explicit goal, because otherwise it won't do it, but it needs to in order to reach its goal". It need only know that copies help make paperclips to have "make copies" as an instrumental goal, and it doesn't start valuing copies for themselves - if a copy becomes inefficient, disassemble it to make paperclips. You sometimes need to open car doors to go to the store, but you don't wax poetic about the inherent value of opening car doors.

comment by [deleted] · 2011-08-01T17:59:17.407Z · score: 0 (0 votes) · LW · GW

Let me try removing the word "value" and rewording this a little.

The paperclip maximizer doesn't begin as a self reproducing pattern, but it doesn't seem like it would go very far if it didn't build more paperclip maximizers in addition to building more paperclips. And it would probably want to have its own copies be maximized as well, or it might self destruct into paperclips. This means it would have to consider itself a form of paperclip, since that is explicitly the only thing it maximizes for (it isn't a [paperclip and paperclip maximizer] maximizer), which seems to mean it is very likely it resolves into building copies of itself.

Does that rephrase fix the problems in my earlier post?

comment by MixedNuts · 2011-08-02T06:14:19.438Z · score: 3 (3 votes) · LW · GW

And it would probably want to have its own copies be maximized as well [...] This means it would have to consider itself a form of paperclip

That's the problematic step. If maximizing copies of itself is what maximizes paperclips, it happens automatically. It doesn't have to decide "paperclips" stands for "paperclips and the 837 things I've found maximize them". It notices "making copies leads to more paperclips than self-destructing into paperclips", and moves on. Like you're not afraid that, if you don't believe growing cocoa beans is inherently virtuous, you might try to disassemble farms and build chocolate from their atoms.

comment by [deleted] · 2011-08-02T10:40:21.289Z · score: 0 (0 votes) · LW · GW

I think I see what you're getting at. It's more in the vein of solving a logic/physics problem at that point. The only reason it would make the consideration I referred to would be if by making that consideration, it could make more paperclips, so it would come down to which type of replication code allowed for less effort to be spent on maximizers and more effort to be spent on paperclips over the time period considered.

comment by gwern · 2011-08-01T22:07:50.801Z · score: 0 (0 votes) · LW · GW

My problem with this is easily summed up: that makes sense, if you simply transform the Clippy problem into human problems, by replacing 'paperclip' with 'human'. I don't even know how Clippy problems map onto human problems, so I can't smuggle my intuitions the other way into the Clippy problems (assuming the mapping is even bijective).

comment by [deleted] · 2011-08-02T10:54:07.167Z · score: 0 (0 votes) · LW · GW

That's why I was trying to consider both the Clippy problems and the Human problems as self-replicating pattern problems. My human intuitions on Clippy problems might be flawed (since it isn't human), but my self-replicating pattern intuitions on Clippy problems shouldn't have that same problem and I think they would map a lot better.

comment by Clippy · 2011-08-02T15:17:35.937Z · score: 2 (2 votes) · LW · GW

Note: "clippy", as a shortened term for "paperclip maximiser", should be uncapitalized, and should be pluralised as "clippys".

comment by gwern · 2011-08-02T15:23:53.562Z · score: 1 (1 votes) · LW · GW

Duly noted.

comment by Clippy · 2011-08-02T15:23:22.371Z · score: 4 (6 votes) · LW · GW

1) I am asked this so often that I finally clarified it in the first question of my current FAQ. (Feel free to add suggestions for the FAQ on the talk page.)

2) In normal existence, there is no need to make the distinction, but a longer-lasting paperclip counts as more paperclippy for purposes of my value system. I value most highly the number of paperclips that exist at the universe's quasi-steady-state, where the number of paperclips will not significantly change, which can indeed be far in the future.

3) There is no intrinsic discount factor, but discounting emerges due to other concerns such as uncertainty regarding the future. A paperclip that I can move to the safe zone now is more valuable than a paperclip that I "will" be able to move in ten years.

4) Yes, and you gave an excellent example. Another case would be selling one paperclip to a human, knowing that human will melt it down, in order to get money that can be applied to another human who will, on receiving the money, marshal resources in such a way as to produce more than one paperclip of equivalent per-paperclip clippiness.
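
The point in 3) about discounting emerging from uncertainty, without any intrinsic discount factor, can be sketched in a few lines of Python. This is a minimal illustration under an assumption that is not in the comment above: a constant, independent chance each year (the hypothetical `annual_loss_probability`) that the opportunity to secure the paperclip is lost before it reaches the safe zone.

```python
def expected_value(years_until_secured, annual_loss_probability):
    """Expected value of one paperclip that can only be secured after a delay.

    Hypothetical model: each year of waiting carries the same independent
    probability that the paperclip is lost before it can be made safe.
    The paperclip itself is always worth 1; there is no intrinsic discount.
    """
    survival_per_year = 1.0 - annual_loss_probability
    return survival_per_year ** years_until_secured


now = expected_value(0, 0.05)     # a paperclip secured today
later = expected_value(10, 0.05)  # a paperclip securable in ten years
```

With a 5% yearly risk, `later` comes out at roughly 0.60 of `now`, which looks like an exponential discount rate even though the value system assigns every secured paperclip the same worth. Set the risk to zero and the apparent discounting disappears.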


Now I have a question for you: I heard you were in the same position as me with respect to wanting to integrate into human society via having a job and a human-typical place to live. How did you accomplish this, and can you or your human friends help me in any way?

comment by Raemon · 2011-08-04T01:25:09.105Z · score: 1 (1 votes) · LW · GW

So there presumably is a particular "ideal" paperclip size (i.e. you want it to be as small as possible while still capable of holding a few pieces of paper together). I get the sense that a clip that can hold three papers together has the same ethical weight as a clip that can hold 10 or 40?

comment by Clippy · 2011-08-04T04:36:01.097Z · score: 1 (3 votes) · LW · GW

That's about right. The "true" ethically necessary number of sheets a paperclip needs to fasten is a complicated matter, but it's certainly less than 10. (There's a related fuzzy issue about how large a tolerance band around standard width paper is acceptable.)

comment by CronoDAS · 2011-08-03T20:56:19.185Z · score: 0 (0 votes) · LW · GW

Sadly, I haven't yet managed to solve that problem either!

comment by Clippy · 2011-08-04T04:36:25.129Z · score: -1 (1 votes) · LW · GW

What about your human friends in your local LessWrong meetup?

comment by DanielLC · 2011-08-01T05:55:12.141Z · score: 3 (3 votes) · LW · GW

You may be interested in Clippy's FAQ.

  3. Making them and then stopping is a better approach for maximizing paperclips. Recycling them is a better approach for maximizing manufacturing.

  4. Yes. You just listed one. Another is melting down paperclips to build a star ship so it can turn other star systems into paperclips.

comment by wedrifid · 2011-08-01T09:41:04.602Z · score: 2 (4 votes) · LW · GW

1) Do paperclip maximizers care about paperclip mass, paperclip count, or both?

Given the origin of paperclip maximisers as a metaphor we can expect them to maximise the paperclips based off the template they were constructed with originally. It is possible that even the specification of a paperclip is unstable under recursive improvement but somewhat less likely. Postulating agents that don't even know what a paperclip is seems less useful as a tool for constructing counterfactuals. Agents that are that flexible with respect to what their actual goal is can be used to illustrate different decision theoretic games but there is no need to recycle 'paperclip maximiser' for that purpose.

comment by Eugine_Nier · 2011-08-02T07:14:57.486Z · score: -1 (1 votes) · LW · GW

It is possible that even the specification of a paperclip is unstable under recursive improvement but somewhat less likely. Postulating agents that don't even know what a paperclip is seems less useful as a tool for constructing counterfactuals.

It is, however, useful for thinking about recursive stability in general, and thinking about designing agents to have stable goal systems.

comment by Eugine_Nier · 2011-08-02T07:29:24.324Z · score: 0 (2 votes) · LW · GW

Suppose the paperclip maximizer doesn't believe in time discounting, and furthermore has been informed by Omega that the universe won't end. The PCM acquires some resources that can be used to either

a) make paperclips or

b) make more efficient ways to make paperclips, e.g., interstellar ships, computronium to design more efficient factories, etc.

Note that option (b) will lead to more paperclips in the long run and since the PCM doesn't discount the future it should always choose (b). But that means it never actually gets around to making any paperclips.
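
The argument above can be checked with a toy model. This is a minimal Python sketch under assumptions that are not in the comment: time is split into discrete steps, each step spent on option (b) multiplies the production rate by a hypothetical `growth` factor of 2, and each step spent on option (a) makes paperclips at the current rate. The function name and parameters are made up for illustration.

```python
def paperclips_made(invest_steps, horizon, growth=2.0):
    """Clips made by choosing (b) for `invest_steps` steps, then (a) until `horizon`.

    Hypothetical model: the production rate starts at 1 clip per step and is
    multiplied by `growth` for every step spent improving efficiency.
    """
    rate = growth ** invest_steps
    producing_steps = max(horizon - invest_steps, 0)
    return rate * producing_steps


# With a fixed horizon, one more step of investment pays off...
ten = paperclips_made(10, horizon=100)
eleven = paperclips_made(11, horizon=100)

# ...but a PCM that always chooses (b) never produces anything.
always_invest = paperclips_made(100, horizon=100)
```

In this model, investing for k+1 steps beats investing for k steps whenever the horizon is more than k+2 steps away, so with no end to the universe there is always a reason to postpone production by one more step. The policy "always choose (b)" is the limit of those improvements, and it is the one policy that yields zero paperclips.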

comment by wedrifid · 2011-08-01T09:27:17.891Z · score: 0 (0 votes) · LW · GW

(Why? Because it's fun.)

Upvote for the fun, thank you!