The value of preserving reality

post by lukstafi · 2010-11-08T23:51:17.130Z · LW · GW · Legacy · 13 comments

A comment on http://singinst.org/blog/2010/10/27/presentation-by-joshua-foxcarl-shulman-at-ecap-2010-super-intelligence-does-not-imply-benevolence/: Given as in the naive reinforcement learning framework (and that can approximate some more complex notions of value) that the value is in the environment, you don't want to be too hasty with the environment lest you destroy a higher value you haven't yet discovered! So you especially wouldn't replace high complexity systems like humans with low entropy systems like computer chips, without first analyzing them.
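
The argument can be read as a value-of-information comparison. Below is a minimal sketch under stated assumptions: the function name `compare_policies`, the two-outcome prior, and all numbers are hypothetical illustrations, not anything taken from the presentation. It compares replacing a system immediately with analyzing it first and then keeping whichever option is better.

```python
def compare_policies(prior, known_value, analysis_cost):
    """Return (replace_now, analyze_first) expected values.

    prior         -- list of (probability, hidden_value) pairs describing the
                     existing system's undiscovered value (hypothetical numbers)
    known_value   -- value of the replacement, assumed known in advance
    analysis_cost -- cost of analyzing the system before deciding
    """
    # Replacing immediately yields the known value and discards the hidden one.
    replace_now = known_value
    # After analysis the agent keeps whichever option turned out better.
    analyze_first = sum(p * max(v, known_value) for p, v in prior) - analysis_cost
    return replace_now, analyze_first


# Hypothetical prior: a 10% chance that the existing system hides a large value.
prior = [(0.9, 0.0), (0.1, 100.0)]
replace_now, analyze_first = compare_policies(prior, known_value=5.0, analysis_cost=1.0)
print(replace_now, analyze_first)
```

With these made-up numbers, analyzing first has the higher expected value. The conclusion reverses if the analysis cost is large or the prior puts almost no weight on a high hidden value, so the sketch illustrates the shape of the argument rather than settling it.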

 

13 comments

comment by ThomasR · 2010-11-09T08:58:50.071Z · LW(p) · GW(p)

The talk you link to is below the level of the 1960s and 1970s discussions of that issue. Are there no better contemporary discussions of such issues?

comment by Manfred · 2010-11-09T06:12:20.923Z · LW(p) · GW(p)

How widespread is "curiosity" in current cutting-edge AI systems? How exactly does this correlate with benevolent outcomes, and what are the chances of something bad happening even if an AI is "curious"?

I, for one, do not want to be analyzed, especially not destructively.

comment by Jack · 2010-11-09T03:57:09.771Z · LW(p) · GW(p)

Given as in the naive reinforcement learning framework (and that can approximate some more complex notions of value) that the value is in the environment,

I'm confused about what this means.

Replies from: lukstafi
comment by lukstafi · 2010-11-09T11:47:02.659Z · LW(p) · GW(p)

It means that the agent maximizes the cumulative sum of a function of the environment's states, where the value of that function is revealed to the agent only for the states it visits.

Replies from: Jack
comment by Jack · 2010-11-09T18:58:25.715Z · LW(p) · GW(p)

I'm afraid this didn't clarify anything for me. Sorry! Pretend you're explaining it to someone stupid.

Replies from: DSimon
comment by DSimon · 2010-11-09T19:24:29.776Z · LW(p) · GW(p)

Actually, I think I understand it, but only with moderate confidence, and I have no prior experience with this particular terminology. So let me take this opportunity to put my thoughts on the firing line. :-) Whether they get shot down or confirmed, either way I'll have learned something about my ability to figure out things at first glance.

So: we're agents and we live in an environment.

We value certain things in the environment, and try to make decisions so that the environment arrives at the states we like, the states that have more value. Furthermore, as time passes the environment will continue to change state whether we like it or not. So we don't just want the environment to arrive at a given high-value state, we want to think longer term: we want the environment to keep on going through highly valuable states forever.

However, there's a problem: our prediction abilities are crappy. We can estimate how much value a given environmental state will have, but we won't really know until it's arrived. We know even less about states that are farther away in the future.

So the OP's argument is that we need to be careful about setting loose a superintelligent (but still not perfectly intelligent) FAI that rewrites the universe from scratch, because it might accidentally exclude a path farther down the line that's even more valuable than the near, fairly-high-value path it can predict. There are similar problems with less superpowerful FAIs, since they'll also guide humanity onto the best path they can predict, which might not be as good as a weirder path farther out that they cannot predict.

How close am I?

Replies from: lukstafi
comment by lukstafi · 2010-11-09T20:10:25.534Z · LW(p) · GW(p)

You are spot on, though you provided more context than can be traced directly from the cited sentence. When I referred to the naive RL, I had in mind (PO)MDPs with an unknown reward function. The reward of an unseen state can be predicted only in the sense of Occam's razor-type induction.
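
As a rough illustration of the setting described here, the following sketch models a tiny, fully observable environment whose reward function is hidden from the agent and revealed only for visited states. The class `HiddenRewardEnv`, the function `run`, the chain layout, and the reward numbers are all hypothetical choices made for the example; partial observability and any induction over unseen states are left out.

```python
class HiddenRewardEnv:
    """Toy chain of states 0..n-1 with a reward function the agent cannot read."""

    def __init__(self, rewards):
        self._rewards = list(rewards)  # hidden: only step() exposes single values
        self.state = 0

    def step(self, action):
        """Move left (-1) or right (+1), clamped to the chain; reveal that state's reward."""
        self.state = max(0, min(len(self._rewards) - 1, self.state + action))
        return self.state, self._rewards[self.state]


def run(env, policy, steps):
    """Accumulate reward while recording what the agent has learned so far."""
    known = {}   # state -> reward, filled in only for states entered via step()
    total = 0.0
    for _ in range(steps):
        action = policy(env.state, known)
        state, reward = env.step(action)
        known[state] = reward
        total += reward
    return total, known


# Hypothetical rewards; the agent always moves right for three steps.
env = HiddenRewardEnv([0, 1, 0, 5])
total, known = run(env, lambda state, known: 1, steps=3)
print(total, known)
```

After the run, `known` covers only the states the agent entered. The reward of every other state, including the start state it never stepped into, stays unknown to the agent and would have to be guessed by some form of induction.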

comment by [deleted] · 2010-11-09T01:47:09.066Z · LW(p) · GW(p)

I'm not so sure. If we're talking about a God-like unFriendly AI, it might do a quick survey of the human race atom by atom and then replace it with a lower entropy system. That way, it can analyze human beings without having them increase entropy, which is something even the AI cannot undo.

Replies from: lukstafi, lukstafi
comment by lukstafi · 2010-11-09T02:27:31.576Z · LW(p) · GW(p)

I wasn't arguing against the possibility of atrocities (within the abstract discourse of "God-like AIs", which BTW feels contrived to me); just imagine how much redundancy could be dispensed with while keeping much of the information content of humanity. I was arguing that there is more room for benevolence than recognized in the presentation -- benevolence from uncertainty of value. (Extending my "computation argument" from the "discounting" comment thread by Perplexed with an "information argument".)

comment by lukstafi · 2010-11-09T14:44:30.087Z · LW(p) · GW(p)

That way, it can analyze human beings without having them increase entropy, which is something even the AI cannot undo.

But if you kill patterns that can be reused, you just waste entropy. So our argument is in favor of freezing, not killing.

Replies from: None
comment by [deleted] · 2010-11-09T15:08:26.140Z · LW(p) · GW(p)

Only if freezing expends less energy than killing. If it doesn't, the most energy efficient choice would be to scan humanity and then wipe them out before they use more energy.

Replies from: lukstafi
comment by lukstafi · 2010-11-09T19:57:54.373Z · LW(p) · GW(p)

I'm confused about what you mean by scanning. If you mean "scan and preserve the information in a databank", then it's a form of the freezing I've been referring to (not necessarily literal freezing), though perhaps a very weak one, depending on how much information relevant to us is actually retained. If you mean "scan and compute some statistics, then discard the information", it is killing.

Replies from: None
comment by [deleted] · 2010-11-10T01:54:47.472Z · LW(p) · GW(p)

I was thinking about the former type, which is indeed more like freezing. However, it is unlikely that an unFriendly AI would ever re-implement humanity (especially if it mostly cares about entropy), so it's practically akin to killing.