Posts

laserfiche's Shortform 2023-04-19T13:50:23.232Z
An example elevator pitch for AI doom 2023-04-15T12:29:17.303Z

Comments

Comment by laserfiche on laserfiche's Shortform · 2023-08-26T14:19:34.905Z · LW · GW

Yes, thank you, I think that's it exactly. I don't think that people are communicating this well when they are reporting predictions.

Comment by laserfiche on laserfiche's Shortform · 2023-08-25T21:58:10.082Z · LW · GW

Are we misreporting p(doom)s?

I usually say that my p(doom) is 50%, but that doesn't mean the same thing that it does in a weather forecast.

In a weather forecast, the percentage means that the forecasters ran a series of simulations and that fraction of the simulations produced rain. A forecast of a 100% chance of rain, then, does not mean that the chance of rain is actually near 100%. Forecasts still have error bars; 10 days out, a forecast will be wrong about 50% of the time. Therefore, a 10-day forecast of a 100% chance of rain means the actual chance of rain is closer to 50%.

In my mental simulations, the outcome is bad 100% of the time. I can't construct a convincing scenario in my mind where things work out, at least contingent on the continued development of AI. But I know that there is much I don't know, things I haven't yet considered, and so on. Hence the 50% error margin. But as with the weather forecast, reporting 50% can be misinterpreted as me thinking that things work out half the time.
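To make the arithmetic explicit (the assumption that doom is unlikely in the worlds where my model is wrong is mine, purely for illustration):

$$P(\text{doom}) \approx P(\text{model right}) \cdot 1 + P(\text{model wrong}) \cdot P(\text{doom} \mid \text{model wrong}) \approx 0.5 \cdot 1 + 0.5 \cdot 0 = 0.5$$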

Is there existing terminology that accounts for this? If not, does that mean p(doom)s are being misunderstood, or reported with different meanings?

Comment by laserfiche on My views on “doom” · 2023-04-29T09:48:40.310Z · LW · GW

Are you assuming that avoiding doom in this way will require a pivotal act? It seems that, absent policy intervention and societal change, even if some firms exhibit a proper amount of concern, many others will not.

Comment by laserfiche on Don't die with dignity; instead play to your outs · 2023-04-25T21:45:54.501Z · LW · GW

A similar principle I have about this situation is: Don't get too clever.

Don't do anything questionable or too complicated. If you do, you're just as likely to cause harm as to cause good. The psychological warfare campaign you've envisioned against OpenAI is going to backfire on you and undermine your team.

Keep it simple. Promote alignment research. Persuade your friends. Volunteer on one of the many relevant projects.

Comment by laserfiche on laserfiche's Shortform · 2023-04-19T16:29:58.978Z · LW · GW

Upvoted. I agree with the gist of what you're saying, with some caveats. I think I would have expected the two posts to end up with a score of 0 to 5, but there is a world of difference between a 5 and a -12.

It's worth noting that the example explainer you linked to doesn't appeal to me at all.  And that's fine.  It doesn't mean that there's something wrong with the argument, or with you, or with me.  But it's important to note that it demonstrates a gap.  I've read all the alignment material[1], and I still see huge chunks of the population that will not be compelled by the existing arguments.  Also, many of the arguments are outdated and less applicable to the current state of affairs.

 

  1. ^

    https://docs.google.com/document/d/1zx_WpcwuT3Stpx8GJJHcvJLSgv6dLje0eslVKvuk1yQ/edit

Comment by laserfiche on laserfiche's Shortform · 2023-04-19T13:50:23.533Z · LW · GW

Under the tag of AI Safety Materials, 48 posts come up.  There are exactly two posts by sprouts:

An example elevator pitch for AI doom (Score: -8[1])

On urgency, priority and collective reaction to AI-Risks: Part I (Score: -12)

These are also the only two posts with negative scores.  

In both cases, it was the user's first post.  For Denreik in particular, you can tell that he agonized over it and put many hours into it.

Is it counterproductive to discourage new arrivals attempting to assist in the AI alignment effort?

Is there a systemic bias against new posters?

  1. ^

    Full disclosure, this was posted by me.  

Comment by laserfiche on On urgency, priority and collective reaction to AI-Risks: Part I · 2023-04-19T13:28:41.725Z · LW · GW

Denreik, I think this is a quality post and I know you spent a lot of time on it. I found your paragraphs on threat complexity enlightening - it is in hindsight an obvious point that a sufficiently complex or subtle threat will be ignored by most people regardless of its certainty, and that is an important feature of the current situation.

Comment by laserfiche on An example elevator pitch for AI doom · 2023-04-18T17:09:43.296Z · LW · GW

I agree that there are many situations where this cannot be used. But there does appear to be a gap, missed by the existing explanations, that arguments like this can fill.

Comment by laserfiche on An example elevator pitch for AI doom · 2023-04-15T16:14:49.631Z · LW · GW

I find those first two and Lethalities to be too long and complicated for convincing an uninitiated, marginally interested person. Zvi's Basics is actually my current preference along with stories like It Looks Like You're Trying To Take Over The World (Clippy).

Comment by laserfiche on Catching the Eye of Sauron · 2023-04-08T21:19:03.816Z · LW · GW

The best primer that I have found so far is Basics of AI Wiping Out All Value in the Universe by Zvi.  It's certainly not going to pass peer review, but it's very accessible, compact, covers the breadth of the topics, and links to several other useful references.  It has the downside of being buried in a very long article, though the link above should take you to the correct section.

Comment by laserfiche on Catching the Eye of Sauron · 2023-04-08T19:36:20.774Z · LW · GW

Let's not bury this comment. Here is someone we have failed: there are comprehensive, well-argued explanations for all of this, and this person couldn't find them. Even the responses to the parent comment don't conclusively answer this - let's make sure that everyone can find excellent arguments with little effort.

Comment by laserfiche on Eliezer Yudkowsky’s Letter in Time Magazine · 2023-04-06T11:51:26.136Z · LW · GW

Thank you for pointing this perspective out. Although Eliezer is from the West, I assure you he cares nothing for that sort of politics. The whole point is that the ban would have to be universally supported, with a tight alliance between the US, China, Russia, and ideally every other country in the world. No one wants to do any airstrikes and, you're right, they are distracting from the real conversation.

Comment by laserfiche on Giant (In)scrutable Matrices: (Maybe) the Best of All Possible Worlds · 2023-04-05T12:00:55.975Z · LW · GW

That's a very interesting observation. As far as I understand it, deep neural networks have essentially unlimited rewirability - a particular "function" can exist anywhere in the network, be duplicated in multiple places, or be spread out between and within layers. And if you retrain that same network, the function will turn up somewhere else, in another form. It makes it seem like you would need something like a CNN to successfully identify functional groups within another model, if that's even possible.

Comment by laserfiche on Speed running everyone through the bad alignment bingo. $5k bounty for a LW conversational agent · 2023-04-02T12:08:32.146Z · LW · GW

Thank you Arthur.  I'd like to offer my help on continuing to develop this project, and helping any of the other teams (@ccstan99, @johnathan, and others) on their projects.  We're all working towards the same thing.  PM me, and let me know if there are any other forums (Discord, Slack, etc) where people are actively working on or need programming help for AI risk mitigation.

Comment by laserfiche on Stop pushing the bus · 2023-03-31T19:11:52.812Z · LW · GW

I think we need to move public opinion first, which hopefully is slowly starting to happen.  We need one of two things:

  1. A breakthrough in AI alignment research
  2. Major shifts in policy

A strike does not currently help either of those.  

Edit:  Actually, I do agree that if you could get ALL AI researchers to strike - a true general strike - that would serve the purpose of delay, and I would be in favor.  I do not think that is realistic.  A lesser strike might also serve to drum up attention; I was initially afraid that it might drum up negative attention.

Comment by laserfiche on Speed running everyone through the bad alignment bingo. $5k bounty for a LW conversational agent · 2023-03-26T19:35:47.475Z · LW · GW

I have a well-functioning offline Python pipeline that integrates the OpenAI API and the entire alignment research dataset.  If this is still needed, I need to consider how to make it available online without tying it to my API key.  Perhaps I should switch to using the new OpenAI plugins instead.  Suggestions welcomed.
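For context, here is a minimal sketch of the kind of retrieval-augmented pipeline I mean; the dataset file, column name, chunking, and model choices below are illustrative assumptions rather than my actual implementation:

```python
# Minimal retrieval-augmented QA sketch over a local copy of the alignment
# research dataset. The CSV path, "text" column, and model names are
# illustrative assumptions, not the actual pipeline.
import os

import numpy as np
import openai
import pandas as pd

openai.api_key = os.environ["OPENAI_API_KEY"]

def embed(texts):
    # Returns one embedding vector per input string.
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([item["embedding"] for item in resp["data"]])

# Assumed layout: one pre-chunked passage of the dataset per row.
chunks = pd.read_csv("alignment_research_dataset_chunks.csv")["text"].tolist()
chunk_vecs = embed(chunks)  # in practice, precompute and cache these

def answer(question, k=5):
    # Rank chunks by cosine similarity to the question, then ask the chat
    # model to answer using only the top-k chunks as context.
    q_vec = embed([question])[0]
    sims = (chunk_vecs @ q_vec) / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec)
    )
    context = "\n\n".join(chunks[i] for i in np.argsort(sims)[-k:])
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "Answer using only the provided alignment research excerpts."},
            {"role": "user", "content": f"{context}\n\nQuestion: {question}"},
        ],
    )
    return resp["choices"][0]["message"]["content"]

print(answer("Why might a capable AI system pursue goals its designers did not intend?"))
```

On the API-key question, the usual pattern for making something like this public is to keep the key behind a small server endpoint rather than shipping it to clients.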

Comment by laserfiche on GPT-4 · 2023-03-23T23:32:53.596Z · LW · GW

It's easy to construct alternate examples of the Monty Fall problem that clearly weren't in the training data.  For example, from my experience GPT-4 and Bing Chat in all modes always get this prompt wrong:

Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You know that the car is always behind door number 1. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?