Comments

Comment by Simulation_Brain on All AGI Safety questions welcome (especially basic ones) [~monthly thread] · 2023-02-01T17:15:07.333Z · LW · GW

I think the main concern is that feedforward nets will be used as components in systems that achieve full AGI. For instance, DeepMind's agent systems include a few networks and run them a few times before selecting an action. Current networks are more like individual pieces of the human brain, such as a visual system or a language system. Putting them together and getting them to choose and pursue goals and subgoals appropriately seems all too plausible.

Now, some people also think that just increasing the size of nets and training datasets will produce AGI, because progress has been so good so far. Those people seem to be less concerned with safety, probably because such feedforward nets would be more like tools than agents. I tend to agree with you that this approach seems unlikely to produce real AGI, much less ASI, but it could produce very useful systems that are superhuman in limited areas. It already has in a few, such as protein folding.

Comment by Simulation_Brain on Accurate Models of AI Risk Are Hyperexistential Exfohazards · 2022-12-26T22:08:45.050Z · LW · GW

I think those are perfectly good concerns. But they don't seem so likely that they make me want to exterminate humanity to avoid them.

I think you're describing a failure of corrigibility, which could certainly happen, for the reason you give. But it does seem quite possible (and perhaps likely) that an agentic system will be designed primarily for corrigibility, or alternatively for alignment by obedience.

The second seems like a failure of morality, which could certainly happen. But I see very few people who both enjoy inflicting suffering and would continue to enjoy it even given unlimited time and resources to become happy themselves.

Comment by Simulation_Brain on Accurate Models of AI Risk Are Hyperexistential Exfohazards · 2022-12-26T06:22:18.844Z · LW · GW

You are probably guessing correctly. I'm hoping that whoever gets ahold of aligned AGI will also make it corrigible, and that over time they'll trend toward a similar moral view to that generally held in this community. It doesn't have to be fast.

To be fair, I'm probably pretty biased against the idea that all we can realistically hope for is extinction. The recent [case against AGI alignment](https://www.lesswrong.com/posts/CtXaFo3hikGMWW4C9/the-case-against-ai-alignment) post was the first time I'd seen arguments that strong in that direction. I haven't really assimilated them yet.

My take on human nature is that, while humans are often stunningly vicious, they are also often remarkably generous. Further, the viciousness usually shows up when they feel materially threatened. Someone in charge of an aligned AGI will not feel very threatened for very long, and generosity will be safer and easier than it usually is.

Comment by Simulation_Brain on Accurate Models of AI Risk Are Hyperexistential Exfohazards · 2022-12-26T03:48:17.136Z · LW · GW

Yes. But that seems awfully unlikely to me. What would it need to be, two years from now? AI hype is going to keep ramping up as ChatGPT and its successors are more widely used and improved.

If the odds of slipping it by governments and militaries are slight, wouldn't the conclusion be the opposite: we should spread understanding of AGI alignment issues so that those in power have thought about them by the time they appropriate the leading projects?

This strikes me as a really practically important question. I personally may be rearranging my future based on what the community comes to believe about this.

Edit: I think the community tends to agree with you and is working in the hope that we reach the finish line before the broader world takes note. But this seems more like wishful thinking than a realistic guess about likely futures.

Comment by Simulation_Brain on Accurate Models of AI Risk Are Hyperexistential Exfohazards · 2022-12-26T03:12:48.856Z · LW · GW

I think there's a possibility that their lives, or some of them, would be vastly worse than death. See the recent post [The Case Against AI Alignment](https://www.lesswrong.com/posts/CtXaFo3hikGMWW4C9/the-case-against-ai-alignment) for some pretty convincing concerns.

Comment by Simulation_Brain on Accurate Models of AI Risk Are Hyperexistential Exfohazards · 2022-12-26T03:05:31.667Z · LW · GW

I totally agree with the core logic. I've been refraining from spreading these ideas, as much as I want to.

Here's the problem: do you really think the whole government and military complex is dumb enough to miss this logic, right up until successful AGI? Don't you think they'll roll in and nationalize the efforts as the power of AI keeps freaking people out more and more?

I think a lot of folks in the military are a lot smarter than you give them credit for. Or the issue will become much more obvious than you assume, as we get closer to general AI.

But I don't think that's necessarily going to spell doom.

I hope that emphasizing corrigibility might be adequate. That would at least let the one group that controls the creation of AGI change its mind down the road.

I think a lot of folks in the government and military might be swayed by logic, once they can perfectly protect and provide abundantly for themselves and everyone they value. Their circle of compassion can expand, just like everyone here has expanded theirs.

Comment by Simulation_Brain on Superintelligence 19: Post-transition formation of a singleton · 2015-03-16T22:56:44.096Z · LW · GW

Really? Can you say a little more about why you think you have that value? I guess I'm not convinced that it's really a terminal value if it varies so widely across people of otherwise similar beliefs. Presumably that's what lalartu meant as well, but I just don't get it. I like myself, so I'd like more of myself in the world!

Comment by Simulation_Brain on How to Beat Procrastination · 2014-08-01T22:43:04.301Z · LW · GW

Perhaps you're thinking of the dopamine spike when reward is actually given? I had thought the predictive spike was purely proportional to the odds of success and the amount of reward, which would indeed change with boring tasks, but not in any linear way. If you're right about that basic structure of the predictive spike, I should know about it for my research; can you give a reference?
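
To make sure we're talking about the same structure, here's a toy sketch of the reading I have in mind: the cue-time ("predictive") signal as probability times magnitude, and the outcome-time signal as the prediction error. The numbers and names are purely illustrative.

```python
# Toy sketch only: the predictive signal read as expected value
# (probability of success times reward magnitude), and the
# outcome-time signal as a simple reward-prediction error.

def predictive_signal(p_success: float, reward_magnitude: float) -> float:
    """Cue-time signal under the expected-value reading."""
    return p_success * reward_magnitude

def outcome_signal(actual_reward: float, expected: float) -> float:
    """Signal when the outcome arrives: actual minus expected."""
    return actual_reward - expected

expected = predictive_signal(p_success=0.8, reward_magnitude=1.0)
print("cue-time signal:", expected)                        # 0.8
print("reward delivered:", outcome_signal(1.0, expected))  # +0.2
print("reward omitted:", outcome_signal(0.0, expected))    # -0.8
```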

Comment by Simulation_Brain on Book review: The Reputation Society. Part II · 2014-06-08T19:23:51.815Z · LW · GW

Less Wrong seems like the ideal community to think up better reputation systems. Doctorow's Whuffie is reasonably well thought out but intended for a post-scarcity economy; still, its idea of distinguishing right-handed reputation (from people who generally agree with you) from left-handed reputation (from people who generally don't) seems like one useful ingredient. Reducing the influence of those who tend to vote together seems like another potential win.

I like to imagine a face-based system: snap an image with a smartphone and pull up someone's reputation.

I hope to see more discussion, in particular of VAuroch's suggestion.

Comment by Simulation_Brain on AI risk, executive summary · 2014-04-08T20:10:23.034Z · LW · GW

I think the example is weak; the software was not that dangerous, and the researchers were idiots who broke a vial they knew was insanely dangerous.

I think it dilutes the argument to broaden it to software in general. Software could be very dangerous under exactly those circumstances (with terrible physical safety measures), but the dangers of superhuman AGI are vastly larger IMHO and deserve to remain the focus, particularly in the ultra-reduced bullet points.

I think this is as crisp and convincing a summary as I've ever seen; nice work! I also liked the book, but condensing it even further is a great idea.

Comment by Simulation_Brain on The Evil AI Overlord List · 2014-04-08T20:07:02.708Z · LW · GW

"Pleased to meet you! Soooo... how is YOUR originating species doing?..."

That actually seems like an extremely reasonable question for the first interstellar meeting of superhuman AIs.

I disagree with EY on this one (I rarely do). I don't think it's so likely as to ensure that a rationally acting AI would be Friendly, but I do think the possibility of encountering an equally powerful AI, one with a head start on resource acquisition, shouldn't be dismissed by a rational actor.

Comment by Simulation_Brain on LWers living in Boulder/Denver area: any interest in an AI-philosophy reading group? · 2014-03-19T17:20:37.793Z · LW · GW

I'm game. These are some of my favorite topics. I do computational cognitive neuroscience, and my principal concern with it is how it can/will be used to build minds.

Comment by Simulation_Brain on [deleted post] 2014-02-05T19:53:04.216Z

I may be confused, but it seems to me that the issue in generalizing from decision utility to utilitarian utility simply comes down to making an assumption that allows utilities of different people to be compared, that is, to put them on the same scale. I think there's a pretty strong argument that we can do so, springing from the fact that we are all running essentially the same neural hardware. Whatever experiential value is, it's made of patterns of neural firing, and we all have basically the same patterns. While we don't all run our brains exactly the same way, the mood- and reward-processing circuitry is pretty tightly feedback-controlled, so saying that everyone's relative utilities are equal shouldn't be too far from the truth.

But that's when one adopts an unbiased view, and neither I nor (almost?) anyone else in history has done so. We consider our own happiness more important than anyone else's; we weight it higher in our own decisions, and that's perfectly rational. The end point of this line of logic is that there is no objective ethics; it's up to the individual.

But one ethic makes more sense than others for group decisions, and that's sum utilitarianism. It's the best candidate for an AI's utility function. Approximations must be made, but they're going to be approximately right, and they can be improved by simply asking people about their preferences.

The common philosophical concern that you can't put different individuals' preferences on the same scale doesn't hold water against our current knowledge of how brains register value and so create preferences.
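
To make the aggregation concrete, here's a toy sketch of the procedure I have in mind: rescale each person's reported utilities to a common range (leaning on the shared-hardware assumption above), then sum across people and pick the option with the highest total. The preferences and the normalization choice are purely illustrative.

```python
from typing import Dict, List

def normalize(utilities: Dict[str, float]) -> Dict[str, float]:
    """Rescale one person's utilities to [0, 1] so scales are comparable
    across people (the 'same neural hardware' assumption above)."""
    lo, hi = min(utilities.values()), max(utilities.values())
    span = (hi - lo) or 1.0
    return {option: (u - lo) / span for option, u in utilities.items()}

def sum_utilitarian_choice(people: List[Dict[str, float]]) -> str:
    """Pick the option with the highest summed (normalized) utility."""
    totals: Dict[str, float] = {}
    for person in people:
        for option, u in normalize(person).items():
            totals[option] = totals.get(option, 0.0) + u
    return max(totals, key=totals.get)

# Preferences elicited by simply asking people (made-up numbers):
people = [
    {"park": 0.9, "mall": 0.1, "library": 0.5},
    {"park": 0.2, "mall": 0.8, "library": 0.7},
    {"park": 0.6, "mall": 0.3, "library": 0.9},
]
print(sum_utilitarian_choice(people))  # 'library' has the highest total
```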

Comment by Simulation_Brain on Meetup : Lesswrong Boulder CO · 2014-01-20T20:38:00.801Z · LW · GW

I'm out of town or I'd be there. Hope to catch the next one.

Comment by Simulation_Brain on Luck II: Expecting White Swans · 2013-12-19T02:06:53.084Z · LW · GW

Wow, I feel for you. I wish you good luck and good analysis.

Comment by Simulation_Brain on Meetup : Meetup Bolder CO · 2013-11-17T18:03:54.566Z · LW · GW

Ha, I was there the week prior. I hope this is going to happen again. Note also that I'm re-launching a defunct Singularity meetup group for Boulder/Broomfield if anyone is interested.

Comment by Simulation_Brain on Meetup : Boulder CO · 2013-11-17T18:00:51.652Z · LW · GW

Sorry I missed it. I hope there will be more Boulder LW meetups?

Comment by Simulation_Brain on Pascal's Muggle: Infinitesimal Priors and Strong Evidence · 2013-09-03T01:17:13.382Z · LW · GW

Given how many underpaid science writers are out there, I'd have to say that ~$50k/year would probably do it for a pretty good one, especially given the 'good cause' bonus to happiness that any qualified individual would understand and value. But is even $1k/week in donations realistic? What are the page-view numbers? I'd pay $5 for a good article on a valuable topic; how many others would as well? I suspect the numbers don't add up, but I don't even have an order-of-magnitude estimate of current or potential readers, so I can't say.
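
Spelling out the arithmetic: $1k/week is roughly $52k/year, which lines up with the ~$50k salary estimate, so at $5 per reader the question becomes whether something like 200 paying readers a week is realistic.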

Comment by Simulation_Brain on The Importance of Self-Doubt · 2010-08-23T04:55:28.342Z · LW · GW

Upvoted; the issue of FAI itself is more interesting than whether Eliezer is making an ass of himself and thereby hurting the SIAI message (probably a bit; claiming you're smart isn't really smart, but then he's also doing a pretty good job as publicist).

One form of productive self-doubt is to have the LW community critically examine Eliezer's central claims. Two of my attempted simplifications of those claims are posted here and here on related threads.

Those posts don't really address whether strong AI is feasible; I think most AI researchers agree that it will become so, but disagree on the timeline. I believe it's crucial, but rarely recognized, that the timeline really depends on how many resources are devoted to the problem. Those appear to be steadily increasing, so it might not be that long.

Comment by Simulation_Brain on Existential Risk and Public Relations · 2010-08-20T20:58:47.922Z · LW · GW

I'm not sure what you mean by 1), but certainly, recurrent neural nets are more powerful. 2) is no longer true; see, for example, the GeneRec algorithm. It does something much like backpropagation, but with no derivatives explicitly calculated, so there's no problem with recurrent loops.
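
For anyone unfamiliar with it, here's a toy sketch of the two-phase idea: settle the network with the output free (the "minus" phase), settle it again with the target clamped (the "plus" phase), and update the weights from the difference in activations, with no derivatives anywhere. The tiny network, settling loop, and toy problem are illustrative, not the published algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_in, n_hid, n_out = 4, 8, 2
W_ih = rng.normal(0, 0.5, (n_in, n_hid))   # input -> hidden
W_ho = rng.normal(0, 0.5, (n_hid, n_out))  # hidden <-> output (used symmetrically)

def settle(x, y_clamped=None, steps=20):
    """Let hidden and output activations settle under bidirectional influence."""
    y = np.zeros(n_out) if y_clamped is None else y_clamped
    h = np.zeros(n_hid)
    for _ in range(steps):
        h = sigmoid(x @ W_ih + y @ W_ho.T)  # top-down feedback via the transpose
        if y_clamped is None:
            y = sigmoid(h @ W_ho)
    return h, y

def generec_update(x, target, lr=0.2):
    """GeneRec-style update: sender activity times receiver phase difference."""
    global W_ih, W_ho
    h_minus, y_minus = settle(x)             # minus phase: output free
    h_plus, _ = settle(x, y_clamped=target)  # plus phase: target clamped
    W_ho += lr * np.outer(h_minus, target - y_minus)
    W_ih += lr * np.outer(x, h_plus - h_minus)

# Toy problem: one input pattern mapped to a two-unit target.
x = np.array([1.0, 0.0, 1.0, 0.0])
target = np.array([1.0, 0.0])
for _ in range(200):
    generec_update(x, target)
print("output after training:", settle(x)[1])  # should end up near the target
```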

On the whole, neural net research has slowed dramatically based on the common view you've expressed, but progress continues apace, and neural nets are not far behind cutting-edge vision and speech processing algorithms while working much more like the brain does.

Comment by Simulation_Brain on Other Existential Risks · 2010-08-20T06:02:43.365Z · LW · GW

I think this is an excellent question. I'm hoping it leads to more actual discussion of the possible timeline of GAI.

Here's my answer, important points first, and not quite as brief as I'd hoped.

1) Even if uFAI isn't the biggest existential risk, the very low investment and interest in it might make it the best marginal value for an investment of time or money. As someone noted, having at least a few people thinking about the risk far in advance seems like a great strategy if the risk is unknown.

2) No one but SIAI is taking donations to mitigate the risk (as far as I know), so your point 2 is all but immaterial right now.

3) I personally estimate the risk of uFAI to be vastly higher than any other, although, as you point out, I am quite biased in that direction. I don't think other existential threats come close (although I don't have the expertise to evaluate "gray goo" self-replicator dangers): a) AI is a new risk (plagues and nuclear wars have failed to get us so far); b) it can be deadly in new ways (outsmarting/out-teching us); c) we don't know for certain that it won't happen soon.

How hard is AI? We actually don't know. I study not just the brain but how it gets computation and thinking done (a rare and fortunate job; most neuroscientists study neurons, not the whole mind), and I think its principles aren't actually all that complex. To put it this way: algorithms are rapidly approaching the human level in speech and vision, and the principles of higher-level thinking appear to be similar. (As an aside, EY's now-outdated Levels of Organization in General Intelligence does a remarkably good job of converging with my independently developed opinion on principles of brain function.) In my limited (and biased) experience, those with similar jobs tend to have similar opinions. But the bottom line is that we don't know either how hard or how easy it could turn out to be. Failure to this point is not strong evidence of continued failure.

And people will certainly try. The financial and power incentives are such that people will continue their efforts on narrow AI and proceed to general AI when it helps solve problems. Recent military and intelligence grants indicate increasing interest in getting beyond narrow AI to more useful AI: things that can make intelligence and military decisions and actions more cheaply (and eventually more reliably) than a human. Industry similarly has a strong interest in narrow AI (e.g., sensory processing), but it will probably be later to the GAI party given its track record of short-term thinking. Academics certainly are doing GAI research, in addition to lots of narrow AI work. Have a look at the BICA (biologically inspired cognitive architectures) conference for some academic enthusiasts with baby GAI projects.

So, it could happen soon. If it gets much smarter than us, it will do whatever it wants; and if we didn't build its motivational system veeery carefully, doing what it wants will eventually involve using all the stuff we need to live.

Therefore, I'd say the threat is on the order of 10-50%, depending on how fast it develops, how easy making GAI friendly turns out to be, and how much attention the issue gets. That seems huge relative to other truly existential threats.

If it matters, I believed very similar things before stumbling on LW and EY's writings.

I hope this thread is attracting some of the GAI sceptics; I'd like to stress-test this thinking.

Comment by Simulation_Brain on Existential Risk and Public Relations · 2010-08-18T06:20:49.545Z · LW · GW

I work in this field and was under approximately the opposite impression: that voice and visual recognition are rapidly approaching human levels. If I'm wrong and there are sharp limits, I'd like to know. Thanks!

Comment by Simulation_Brain on Should I believe what the SIAI claims? · 2010-08-13T19:27:19.651Z · LW · GW

Now this is an interesting thought. Even a satisficer with several goals but no upper bound on each will use all available matter on the mix of goals it's working toward. But a limited goal (make money for GiantCo, unless you reach one trillion, then stop) seems as though it would be less dangerous. I can't remember this coming up in Eliezer's CFAI document, but I suspect it's in there with holes poked in its reliability.
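
A toy way to put the distinction, with purely made-up numbers: the capped goal stops rewarding further acquisition at the threshold, while the unbounded one never does.

```python
TRILLION = 1_000_000_000_000

def unbounded_utility(money: float) -> float:
    """More is always better: every extra dollar still pays off."""
    return money

def capped_utility(money: float) -> float:
    """'Make money for GiantCo, unless you reach one trillion, then stop':
    the marginal value of further acquisition drops to zero at the cap."""
    return min(money, TRILLION)

# Past the cap, the capped agent gains nothing from grabbing more resources;
# the unbounded one always gains something.
print(unbounded_utility(2 * TRILLION) - unbounded_utility(TRILLION))  # 1e12
print(capped_utility(2 * TRILLION) - capped_utility(TRILLION))        # 0
```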

Comment by Simulation_Brain on Should I believe what the SIAI claims? · 2010-08-13T19:22:22.896Z · LW · GW

I think the concern stands even without a FOOM; if AI gets a good bit smarter than us, however that happens (design plus learning, or self-improvement), it's going to do whatever it wants.

As for your "ideal Bayesian" intuition, I think the challenge is deciding WHAT to apply it to. The amount of computational power needed to apply it to every thing and every concept on earth is truly staggering. There is plenty of room for algorithmic improvement, and it doesn't need to get that good to outwit (and out-engineer) us.

Comment by Simulation_Brain on Should I believe what the SIAI claims? · 2010-08-13T07:56:24.651Z · LW · GW

I think there are very good questions in here. Let me try to simplify the logic:

First, the sociological logic: if this is so obviously serious, why is no one else proclaiming it? I think the simple answer is that a) most people haven't considered it deeply, and b) someone has to be first in making a fuss. Kurzweil, Stross, and Vinge (to name a few who have thought about it at least a little) seem to acknowledge a real possibility of AI disaster, though they don't make probability estimates.

Now to the logical argument itself:

a) We are probably at risk from the development of strong AI. b) The SIAI can probably do something about that.

The other points in the OP are not terribly relevant; Eliezer could be wrong about a great many things, but right about these.

This is not a castle in the sky.

Now to argue for each: there's no good reason to think AGI will NOT happen within the next century. Our brains produce general intelligence; why shouldn't artificial systems? Artificial systems didn't produce anything a century ago; even without a strong exponential, they're clearly getting somewhere.

There are lots of arguments for why AGI WILL happen soon; see Kurzweil among others. I personally give it 20-40 years, even allowing for our remarkable cognitive weaknesses.

Next, will it be dangerous? a) Something much smarter than us will do whatever it wants, and very thoroughly (this doesn't require godlike AI, just smarter than us; self-improving helps, too). b) The vast majority of possible "wants", done thoroughly, will destroy us (any goal taken to extremes will use all available matter in accomplishing it). Therefore, it will be dangerous if not VERY carefully designed. Humans are notably greedy and bad planners individually, and often worse in groups.

Finally, it seems that SIAI might be able to do something about it. If not, they'll at least help raise awareness of the issue. And as someone pointed out, achieving FAI would have a nice side effect of preventing most other existential disasters.

While there is a chain of logic, each of the steps seems likely, so multiplying the probabilities gives a significant estimate of disaster, justifying some resource expenditure to prevent it (especially if you want to be nice). (Although spending ALL your money or time on it probably isn't rational, since effort and money generally have sublinear payoffs toward happiness.)
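
Purely as an illustration of that multiplication (made-up numbers, not anyone's considered estimates): 0.8 that strong AI arrives this century, times 0.7 that it ends up much smarter than us, times 0.5 that its motivational system wasn't built carefully enough, already gives 0.28, which is plenty to justify spending something on prevention.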

Hopefully this lays out the logic; now, which of the above do you NOT think is likely?

Comment by Simulation_Brain on MWI, copies and probability · 2010-06-25T18:07:15.089Z · LW · GW

I think the point is that not valuing non-interacting copies of oneself might be inconsistent. I suspect it's true: consistency requires valuing parallel copies of ourselves just as we value future variants of ourselves and so preserve our lives. Our future selves also can't "interact" with our current self.

Comment by Simulation_Brain on Defeating Ugh Fields In Practice · 2010-06-21T22:09:24.884Z · LW · GW

Quality matters if you have a community that's interested in your work; you'll get more "nice job" comments if it IS a nice job.

Comment by Simulation_Brain on What if AI doesn't quite go FOOM? · 2010-06-21T19:31:59.484Z · LW · GW

I don't think the lack of an earth-shattering ka-FOOM changes much of the logic of FAI. Smart enough to take over the world is enough to make human existence way better, or end it entirely.

It's quite tricky to ensure that your superintelligent AI does anything like what you wanted it to. I don't share the intuition that creating a "homeostasis" AI is any easier than an FAI. I think one move Eliezer makes in his "Creating Friendly AI" strategy is to minimize the goals you're trying to give the machine: just CEV.

I think this makes apparent what a good CEV-seeker needs anyway: some sense of restraint when CEV can't be reliably extrapolated in one giant step. It's less than certain that even a full-FOOM AI could reliably extrapolate to some final, most-preferred world state.

I'd like to see a program where humanity actually chooses its own future: we skip the extrapolation and just use CV repeatedly, letting people live out their own extrapolation.

Does just CV work all right? I don't know, but it might. Sure, Palestinians want to kill Israelis and vice versa; but they both want to NOT be killed way more than they want to kill, and most other folks don't want to see either of them killed.

Or perhaps we need a much more cautious, "OK, let's vote on improvements, but they can't kill anybody and benefits have to be available to everyone..." policy for the central guide of AI.

CEV is a well-thought-out proposal (perhaps the only one; counterexamples?), but we need more ideas in the realm of AI motivation/ethics systems. In particular, we need ways to get from a practical AI with goals like "design neat products for GiantCo" or "obey orders from my commanding officer" to a guarantee that it doesn't ruin everything if it starts to self-improve. Not everyone is going to want to give their AI CEV as its central goal, at least not until it's clear it can and will self-improve, at which point it's probably too late.

Comment by Simulation_Brain on Rationality quotes: June 2010 · 2010-06-12T06:37:25.234Z · LW · GW

Well, yes; it's not straightforward to go from brains to preferences. But for any particular definition of preference, a given brain's "preference" is just a fact about that brain. If this is true, it's important to understanding morality/ethics/volition.