Posts

The Social Alignment Problem 2023-04-28T14:16:17.825Z
irving's Shortform 2023-04-28T13:57:10.589Z

Comments

Comment by irving (judith) on The Social Alignment Problem · 2023-05-02T04:10:45.778Z · LW · GW

I honestly can't say. I wish I could.

Comment by irving (judith) on The Social Alignment Problem · 2023-05-02T03:51:44.278Z · LW · GW

Hmm, not necessarily the researchers, but the founders undoubtedly. OpenAI was specifically formed to increase AI safety.

Comment by irving (judith) on The Social Alignment Problem · 2023-05-02T03:50:23.337Z · LW · GW

I've seen the latter but much more of the former.

Comment by irving (judith) on irving's Shortform · 2023-05-02T03:37:47.500Z · LW · GW

This post was meant as a summary of common rebuttals. I haven't actually heard much questioning of motivation, since instrumental convergence seems fairly intuitive. The more common question is how an AI could actually, physically, bring about the destruction.

Comment by irving (judith) on Realistic near-future scenarios of AI doom understandable for non-techy people? · 2023-04-28T23:30:55.802Z · LW · GW

I just started a writing contest for detailed scenarios of how we get from our current situation to AI ending the world. I want to compile the results on a website so we have an easily shareable link with more scenarios than can be dismissed ad hoc: individual scenarios plucked from a huge list are easy to argue against, which discredits the whole list, but a critical mass of them presented at once defeats that effect. If anyone has good examples, I'll add them to the website.

Comment by irving (judith) on The Social Alignment Problem · 2023-04-28T22:46:53.433Z · LW · GW

Yes, I should have been clearer that I was addressing people with a very high p(doom). The prisoner/bomb analogy is indeed something of a simplification, but I do think there's a valid connection: half-heartedly attempting to get the assistance of people more powerful than you, and prematurely giving it up as hopeless.

Thank you for your kind words! I was expecting most reactions to be fairly anti-"we should", but I figured it was worth a try.

Comment by irving (judith) on irving's Shortform · 2023-04-28T13:57:11.144Z · LW · GW

The most common anti-safety arguments I see in the wild, neither steel-manned nor straw-manned:

  • There’s no evidence of a malign superintelligence existing currently, therefore it can be dismissed without evidence
  • We're faking being worried because if we truly were, we would use violence
  • Yudkowsky is calling for violence
  • Saying that something as important as the end of the world could happen might influence people to commit violence, therefore warning about the end of the world is bad
  • Doomers can’t provide the exact steps a superintelligence would take to eliminate humanity
  • When the time comes we’ll just figure it out
  • There were other new technologies that people warned would cause bad outcomes
  • We didn’t know whether nuclear experimentation would end the world, but we went ahead with it anyway and we didn’t end the world (omitting that careful effort was put in first to ensure the risk was minuscule)
  • My personal favorite: AI doom would happen in the future, and anything happening in the future is unfalsifiable, therefore it is not a scientific claim and should not be taken seriously.

Comment by irving (judith) on AI scares and changing public beliefs · 2023-04-08T07:50:09.368Z · LW · GW

Count me in!

Comment by irving (judith) on Catching the Eye of Sauron · 2023-04-08T07:44:54.061Z · LW · GW

Hardcore agree. I'm planning a documentary and trying to find interested parties.

Comment by irving (judith) on Catching the Eye of Sauron · 2023-04-08T07:26:05.252Z · LW · GW

Honestly I don't think fake stories are even necessary, and becoming associated with fake news could be very bad for us. I don't think we've seriously tried to convince people of the real big bad AI. What, two podcasts and an opinion piece in Time? We've never done a real media push, but all indications are that people are ready to hear it. "AI researchers believe there's a 10% chance AI will end life" is all the headline you need.

Comment by irving (judith) on Catching the Eye of Sauron · 2023-04-08T07:15:01.394Z · LW · GW

The discussion in the comments is extremely useful and we've sorely needed much more of it. I think we need a separate place purely for sharing and debating our thoughts about strategies like this, and ideally also working on actual praxis based on these strategies. The ideal solution for me would be a separate "strategy" section on LessWrong or at least a tag, with much weaker moderation to encourage out-of-the-box ideas. So as not to pass the buck I'm in the process of building my own forum in the absence of anything better.

Some ideas for praxis I had, to add to the ones in this post and the comments: gather a database of experiences people have had actually convincing different types of people of AI risk, then quantitatively distill the most convincing arguments for each segment; proofread content expected to be mass-consumed (this could have prevented the Time nukes gaffe); and, I strongly believe, make a mass-appeal documentary, which could go a long way toward alignment-pilling a critical mass of the public. It's possible these are terrible ideas, but I lack a useful place to even discuss them.

Comment by irving (judith) on Catching the Eye of Sauron · 2023-04-08T06:45:13.674Z · LW · GW

It might be almost literally impossible for any issue at all to avoid getting politicized right down the middle once it gets big, but if any issue could escape that fate, one would expect it to be the imminent extinction of life. If it's not possible, I tend to think the left side would be preferable, since they pretty much get everything they ever want. I tentatively lean towards just focusing on winning the left and letting the right be reactionary, but this is a question that deserves a ton of discussion.

Comment by irving (judith) on I asked my senator to slow AI · 2023-04-08T06:18:36.802Z · LW · GW

This is exactly the kind of praxis we need to see more of.

Comment by judith on [deleted post] 2023-04-04T23:29:57.446Z

There are three possible futures: 1) nobody ever cares and nothing happens until AI ruin, 2) the public is finally spooked by capabilities advancement and the government acts, but out of ignorance does something like building a literal box, and 3) the public and the government gain an appreciation of the reality of the situation and take actually useful actions. What I was trying to convey is that Future 3 surely has a higher probability in a universe where we decide to think about how to increase its probability than in a universe where we don't think about it and let the default outcome happen.

And however low our probability of reaching a good solution, surely it's higher than the probability that the public and the government will reach a good solution on their own. If we don't have enough information to take probability-increasing action, it seems like it would be useful to keep thinking until we either do, or have enough information to decide that the optimal strategy is not to act. What worries me is that our strategy doesn't appear to have been thought about very much at all.

Comment by judith on [deleted post] 2023-04-04T04:07:55.813Z

Any feedback is of course welcomed.

Comment by irving (judith) on We have to Upgrade · 2023-03-25T16:53:29.423Z · LW · GW

For the people disagreeing, I'm curious what part you're disagreeing with.

Comment by irving (judith) on We have to Upgrade · 2023-03-24T07:52:38.682Z · LW · GW

I have been shocked by the lack of effort put into social technology to lengthen timelines. As I see it one of the only chances we have is increasing the number of people (specifically normies, as that is the group with true scale) who understand and care about the risk arguments, but almost nobody seems to be trying to achieve this. Am I missing something?