What did you change your mind about in the last year?

post by mike_hawke · 2023-11-23T20:53:45.664Z · LW · GW · 1 comment

This is a question post.

Contents

  Answers
    10 Ege Erdil
    4 Noosphere89
    4 Bill Benzon
    3 Bruce Lewis
    2 Valdes
    2 mike_hawke
    2 avancil
    1 rotatingpaguro
None
1 comment

Seems like a good New Year activity for rationalists, so I thought I'd post it early instead of late.

Here are some steps I recommend:

Here are some emotional loadings that I anticipate seeing:

Answers

answer by Ege Erdil · 2023-11-24T20:04:39.393Z · LW(p) · GW(p)

I thought cryonics was unlikely to work because a bunch of information might be lost even at the temperatures that bodies are usually preserved in. I now think this effect is most likely not serious and cryonics can work in principle at the temperatures we use, but present-day cryonics is still unlikely to work because of how much tissue damage the initial process of freezing can do.

comment by Andy_McKenzie · 2023-12-05T05:38:01.002Z · LW(p) · GW(p)

Out of curiosity, what makes you think that the initial freezing process causes too much information loss? 

answer by Noosphere89 · 2023-12-02T16:15:25.125Z · LW(p) · GW(p)

My most obvious changed-my-mind moment was about alignment difficulty, along with a generalized update away from AI x-risk being real/relevant in general.

The things I've changed my mind about are the following:

  1. I no longer believe that deceptive alignment is very likely to happen. A large part of this is that I think aligned behavior is probably quite low-complexity, whether it's achieved via model-based RL as Steven Byrnes would argue, via Direct Preference Optimization (which throws away reward), etc. The point is that I no longer believe that value is as complex as LWers believe it to be, which informs my general skepticism of deceptive alignment. More generally, I think the deceptively aligned program and the actually aligned program are separated by only tens to hundreds of bits in program space.

For some reasoning on why this might be true, I think the main post here has to be the inaccessibility post, which points out that the genome faces fairly harsh limits on how much it can directly encode priors on values; it must instead exert indirect influence, and that limits how much it can rely on specific value priors rather than modifying the algorithms for within-lifetime RL or self-learning.

https://www.lesswrong.com/posts/CQAMdzA4MZEhNRtTp/human-values-and-biases-are-inaccessible-to-the-genome# [LW · GW]

  2. I no longer believe that the security mindset is appropriate for AI in general, primarily because the computer-security/rocket-engineering mindset is a bad fit for many problems: you usually need to extend more trust that your system works to get results than the security mindset would permit, and that trust pays off far more often than LWers generally realize. More specifically, there are also very severe disanalogies between computer security and AI alignment, so much so that the security mindset is an anti-helpful framework for aligning AI.

Quintin Pope makes the point better than I do here:

https://www.lesswrong.com/posts/wAczufCpMdaamF9fy/my-objections-to-we-re-all-gonna-die-with-eliezer-yudkowsky#Yudkowsky_mentions_the_security_mindset__ [LW · GW]

  3. I agree with Jaime Sevilla's claim that AI alignment/AI control is fundamentally profitable, plausibly wildly so, and as a consequence a lot of money will already be spent to control AI. There is no reason to assume that profit motives push in the direction of trading safety for capabilities, for a couple of reasons:

A. Far more of the negative externalities are internalized than is usually the case, because the capitalists bear a far larger portion of the costs if they fail to align AI.

B. Some amount of alignment is necessary for AI to be deployed in the world at all, so there will be efforts to align AI by default, which are either duplicative of or strictly better than LWers' attempts to align AI.

answer by Bill Benzon · 2023-11-23T22:09:32.922Z · LW(p) · GW(p)

At the beginning of the year I thought a decent model of how LLMs work was 10 years or so out. I’m now thinking it may be five years or less. What do I mean? 

In the days of classical symbolic AI, researchers would use a programming language, often some variety of LISP, but not always, to implement a model of some set of linguistic structures and processes, such as those involved in story understanding and generation, or question answering. I see a similar division of conceptual labor in figuring out what’s going on inside LLMs. In this analogy I see mechanistic understanding as producing the equivalent of the programming languages of classical AI. These are the structures and mechanisms of the virtual machine that operates the domain model, where the domain is language in the broadest sense. I’ve been working on figuring out a domain model and I’ve had unexpected progress in the last month. I’m beginning to see how such models can be constructed. Call these domain models meta-models for LLMs.

It’s those meta-models that I’m thinking are five years out. What would the scope of such a meta-model be? I don’t know. But I’m not thinking in terms of one meta-model that accounts for everything a given LLM can do. I’m thinking of more limited meta-models. I figure that various communities will begin creating models in areas that interest them.

I figure we start with some hand-crafting to work out some standards. Then we’ll go to work on automating the process of creating the model. How will that work? I don’t know. No one’s ever done it.

comment by Bill Benzon (bill-benzon) · 2023-11-25T06:30:39.513Z · LW(p) · GW(p)

My confidence in this project has just gone up. It seems that I now have a collaborator. He's familiar with my work in general and my investigations of ChatGPT in particular; we've had some email correspondence and a couple of Zoom conversations. During today's conversation we decided to collaborate on a paper on the theme of 'demystifying LLMs.'

A word of caution: we haven't written the paper yet, so who knows? But all the signs are good. He's an expert on computer vision systems on the faculty of Goethe University in Frankfurt: Visvanathan Ramesh.

These are my most important papers on ChatGPT:

comment by rotatingpaguro · 2023-11-23T23:24:46.954Z · LW(p) · GW(p)

To clarify: do you think that in about five years we will be able to do this for the then state-of-the-art big models?

Replies from: bill-benzon
comment by Bill Benzon (bill-benzon) · 2023-11-23T23:47:16.878Z · LW(p) · GW(p)

Yes. It's more about the structure of language and cognition than about the mechanics of the models. The number of parameters and layers, and the functions assigned to layers, shouldn't change things, nor should going multi-modal. Whatever the mechanics of the models, they have to deal with language as it is, and that's not changing in any appreciable way.

answer by Bruce Lewis · 2023-11-24T04:33:38.949Z · LW(p) · GW(p)

At the beginning of 2023 I thought Google was a good place to work. I changed my mind after receiving new evidence.

answer by Valdes · 2023-11-25T15:52:52.328Z · LW(p) · GW(p)

I have become less skeptical about the ability of Western governments to act and solve issues in a reasonable timeframe. In general, I tend to think political action is doomed and mostly only able to let the status quo evolve by itself. But recent, relatively fast reactions to the evolution of mainstream AI tools have led me to think that I am too cynical about this. I do not know what to think instead, but I am now less confident in my old opinion.

answer by mike_hawke · 2023-11-25T03:37:30.221Z · LW(p) · GW(p)

Many projects are left undone simply because people don't step up to do them. I had heard this a lot, but I now feel it more deeply.

A number of times this year, I sharply changed the mind of a trusted advisor by arguing with them, even though I thought they knew more and should be able to change my mind. It now seems marginally more valuable to argue with people and ask them to show their work.

My antipathy toward Twitter had waned, but then I asked people about it, and did some intentional browsing, and I am back to being as anti-Twitter as ever. Twitter is harming the minds of some of my smartest friends & allies, and they seem to be unable to fully realize this, presumably due to the addiction impairing their judgment.

I have become highly uncertain about public sentiment around AI progress. I have heard multiple conflicting claims about what the median American thinks, always asserted with conviction, but never by anyone anywhere near the median.

comment by mike_hawke · 2023-11-25T16:33:46.524Z · LW(p) · GW(p)

Oh also, I am no longer surprised to find out that someone has an eloquent, insightful online presence while also being perpetually obnoxious and maladjusted in real life. Turns out lots of people are both of those things.

answer by avancil · 2023-11-24T19:18:30.761Z · LW(p) · GW(p)

At the beginning of the year, I had never heard of ChatGPT, and thought AI would continue to progress slowly, in a non-disruptive fashion. At this point, I believe 2023 will be at least as significant as 2007 (iPhone) in terms of marking the beginning of a technological transformation.

answer by rotatingpaguro · 2023-11-23T21:34:25.879Z · LW(p) · GW(p)

Off the top of my head: Q1 2023 I was vaguely scornful of asymptotics, Q4 2023 I think they are a useful tool.

comment by the gears to ascension (lahwran) · 2023-11-23T21:59:52.299Z · LW(p) · GW(p)

can you say more about what evidence produced this change?

Replies from: rotatingpaguro
comment by rotatingpaguro · 2023-11-23T23:30:06.431Z · LW(p) · GW(p)

I started working on an asymptotics problem, partly to see if I would change my mind. I try to keep my eye on the ball in general, so I started noticing the topic's applications and practical implications. Previously, I had encountered the topic mostly by reading up-in-the-clouds theoretical stuff.

I also think a tribal instinct was tinging my past thoughts: asymptotics were "Frequentist" while I was "Bayesian."

1 comment

Comments sorted by top scores.

comment by the gears to ascension (lahwran) · 2023-11-25T16:52:22.349Z · LW(p) · GW(p)

Someone has downvoted and disagreed almost every comment on this post.