Does LessWrong make a difference when it comes to AI alignment?
post by PhilosophicalSoul (LiamLaw) · 2024-01-03T12:21:32.587Z · LW · GW · 2 comments
This is a question post.
Contents
Answers: 12 Seth Herd · 7 AnthonyC · 7 Charlie Steiner · 4 habryka · 3 NicholasKross · 1 mishka
2 comments
I see LessWrong is currently obsessed with AI alignment. I spoke with some others on the unofficial LessWrong Discord, and we agreed that LessWrong is becoming more and more specialised, thus scaring off any newcomers who aren't interested in AI.
That aside, I'm genuinely curious. Do any of the posts on LessWrong make any difference in the general psychosphere of AI alignment? Does anyone who has actual control over the direction of AI and LLMs follow LessWrong? Does Sam Altman or anyone at OpenAI engage with LessWrongers?
Not being condescending here. I'm just asking since there are two (2) important things to note: (1) Since LessWrong has very little focus on anything other than AI at the moment, are these efforts meaningful? (2) What are some basic beginner resources someone can use to understand the flood of complex AI posts currently on the front page? (Maybe I'm being ignorant, but I haven't found a sequence dedicated to AI...yet.)
Answers
Good ideas propagate. Nobody from an AGI org has to read a LessWrong post for any good ideas generated here to reach them. Although they definitely do read Alignment Forum posts and often LessWrong posts. Check out the Alignment Forum FAQ to understand its relationship to LW.
LessWrong and AF also provide something that journals do not: public discussion that includes both expert and outside contributions. This is lacking in other academic forums. After spending a long time in cognitive neuroscience, it looked to me like intellectual progress was severely hampered by people communicating rarely, and in cliques. Labs each had their own viewpoint that was pretty biased and limited, and cross-lab communication was rare, but extremely valuable when it happened. So I think the existence of a common forum is extremely valuable for making rapid progress.
There are specialized filters for LW by tag. If you're not interested in AI, you can turn that topic down as far as you want.
↑ comment by PhilosophicalSoul (LiamLaw) · 2024-01-04T06:48:54.748Z · LW(p) · GW(p)
Ah okay, thanks. I wasn't aware of the Alignment Forum, I'll check it out.
I don't disagree that informal forums are valuable. I share Jacques Ellul's view in The Technological Society that science firms held by monopolies tend to have their growth stunted, for exactly the reasons you pointed out.
I think it's more that places like LessWrong are susceptible to having the narrative around them warped (referencing the article about Scott Alexander). Though this is slightly off-topic now.
Lastly, I am interested in AI; I'm just feeling around for the best way to get into it. So thanks.
Just wanted to point out that AI Safety ("Friendliness" at the time) was the original impetus for LW. Only, they (esp. EY, early on) kept noticing other topics that were prerequisites for even having a useful conversation about AI, and topics that were prerequisites for those, etc., and that's how the Sequences came to be. So in that sense, "LW is more and more full of detailed posts about AI that newcomers can't follow easily" is a sign that everything is going as intended, and yes, it really is important to read a lot of the prerequisite background material if you want to participate in that part of the discussion.
On the other hand, if you want a broader participation in the parts of the community that are about individual and collective rationality, that's still here too! You can read the Sequence Highlights [? · GW], or the collections of resources listed by CFAR, or everything else in the Library. And if there's something you want to ask or discuss, make a post about it, and you'll most likely get some good engagement, or at least people directing you to other places to investigate or discuss it. There are also lots of other forums and blogs and substacks with current or historical ties to LW that are more specialized, now that the community is big enough to support that. The diaspora/fragmentation will continue for many of the same reasons we no longer have Natural Philosophers.
↑ comment by PhilosophicalSoul (LiamLaw) · 2024-11-08T11:10:50.008Z · LW(p) · GW(p)
I was naive during the period in which I made this particular post. I'm happy with the direction LW is going in, having experienced more of the AI world, and read many more posts. Thank you for your input regardless.
LessWrong is becoming more and more specialised, thus scaring off any newcomers who aren't interested in AI.
Yup, sorry.
Do any of the posts on LessWrong make any difference in the general psychosphere of AI alignment?
Sometimes. E.g. the Waluigi effect post [LW · GW] was in March, and I've seen that mentioned by random LLM users. CNN had Connor Leahy on as an AI expert about the same time, and news coverage about Bing chat sometimes glossed Evan Hubinger's post [LW · GW] about it.
Does Sam Altman or anyone at OpenAI engage with LessWrongers?
Yeah. And I don't just mean on Twitter, I mean it's kinda hard not to talk to e.g. Jan Leike when he works there.
What are some basic beginner resources someone can use to understand the flood of complex AI posts currently on the front page?
Yeah, this is pretty tricky, because fields accumulate things you have to know to be current in them. For "what's going on with AI in general" there are certainly good posts on diverse topics here, but nothing as systematic and in-depth as a textbook. I'd say just look for generically good resources to learn about AI, learning theory, and neural networks. Some people have reading lists (e.g. MIRI, Vanessa Kosoy, John Wentworth [LW · GW]), many of which are quite long and specialized - obviously how deep down the rabbit hole you go depends on what you want to learn. For alignment topics, various people have made syllabi (this one, I think primarily based on Richard Ngo's, is convenient to recommend despite occasional disagreements).
↑ comment by PhilosophicalSoul (LiamLaw) · 2024-01-03T18:33:14.095Z · LW(p) · GW(p)
Thanks for that.
Out of curiosity then, do people use the articles here as part of bigger articles in academic journals? Is this place sort of a 'launching pad' for ideas and raw data?
(2) What are some basic beginner resources someone can use to understand the flood of complex AI posts currently on the front page? (Maybe I'm being ignorant, but I haven't found a sequence dedicated to AI...yet.)
You can check out the recommended sequences at the top of the AI Alignment Forum (which is a subset of LessWrong).
To add onto other people's answers:
People have disagreements over what the key ideas about AI/alignment even are.
People with different basic intuitions notoriously remain unconvinced by each other's arguments, analogies, and even (the significance of) experiments. This has not been solved yet.
Alignment researchers usually spend most time on their preferred vein of research, rather than trying to convince others [LW · GW].
To (try to) fix this, the community's added concepts like "inferential distance [? · GW]" and "cruxes [? · GW]" to our vocabulary. These should be discussed and used explicitly.
One researcher has some shortform notes (here [LW(p) · GW(p)] and here [LW(p) · GW(p)]) on how hard it is to communicate about AI alignment. I myself wrote some longer, more emotionally-charged notes [LW · GW] on why we'd expect this.
But there's hope yet! This chart format [LW · GW] makes it easier to communicate beliefs on key AI questions. And better ideas can always be lurking around the corner...
↑ comment by PhilosophicalSoul (LiamLaw) · 2024-01-04T06:51:34.963Z · LW(p) · GW(p)
Do you think these disagreements stem from a sort of egoistic desire to be known as the 'owner' of that concept? Or to be a forerunner for that vein of research should it become popular?
Or is it a genuinely good faith disagreement on the future of AI and what the best approach is? (Perhaps these questions are outlined in the articles you've linked, which I'll begin reading now. Though I do think it's still useful to perhaps include a summary here too.) Thanks for your help.
↑ comment by Nicholas / Heather Kross (NicholasKross) · 2024-01-04T19:24:17.313Z · LW(p) · GW(p)
Seems to usually be good faith. People can still be biased of course (and they can't all be right on the same questions, with the current disagreements), but it really is down to differing intuitions, which background-knowledge posts have been read by which people, etc.
The impact of the LessWrong community as a whole on the field of AI and especially on the field of AI safety seems to be fairly strong, even if difficult to estimate in a precise fashion.
For example, a lot of papers related to interpretability of AI models are publicized and discussed here, so I would expect that interpretability researchers do often read those discussions.
One of the most prominent examples of LessWrong impact is Simulator Theory, which was initially published on LessWrong (Simulators [LW · GW]). Simulator Theory is a great deconfusion framework regarding what LLMs are and are not, helping people avoid mistakenly interpreting properties of particular inference runs as properties of the LLMs themselves. It was recently featured in Nature as part of a joint publication by M. Shanahan and the authors of Simulator Theory, "Role play with large language models", Nov 8, 2023, open access.
But I also think that people who end up working on AI existential safety in major AI labs are often influenced in their career choice and initial orientation by the AI safety discourse on LessWrong, although I don't know if it's possible to track that well.
2 comments
Comments sorted by top scores.
comment by 1a3orn · 2024-01-03T23:51:01.230Z · LW(p) · GW(p)
What are some basic beginner resources someone can use to understand the flood of complex AI posts currently on the front page? (Maybe I'm being ignorant, but I haven't found a sequence dedicated to AI...yet.)
There is no answer to that question that isn't specific to a particular tradition of thought.
That is, people will give you radically different answers depending on what they believe. Resources that are full of just.... bad misconceptions, from one perspective, will be integral for understanding the world, from another.
For instance, the "study guide" referred to in another post lists the "List of Lethalities" by Yudkowsky as an important resource. Yet if you go to the only current review [LW · GW] of it on LessWrong thinks that it is basically just confused, extremely badly, and that "deeply engaging with this post is, at best, a waste of time." I agree with this assessment, but my agreement is worthless in the face of the vast agreements and disagreements swaying back and forth.
Your model here should be that you are in a room full of Lutherans, Presbyterians, Methodists, Baptists, Anabaptists, Hutterites, and other various and sundry Christian groups, asking them for the best introduction to interpreting the Bible. You'll get lots of different responses! You might be able to pick out the leading thinkers for each group. But there will be no consensus about what the right introductory materials are, because there is no consensus in the group.
For myself, I think that before you think about AI risk you should read about how AI, as it is practiced, actually works. The 3blue1brown course on neural networks; the Michael Nielsen Deep Learning book online; tons of stuff from Karpathy; these are all excellent. But -- this is my extremely biased opinion, and other people doubtless think it is bad.
comment by Seth Herd · 2024-01-03T20:35:08.257Z · LW(p) · GW(p)
On your question of AI posts: it is a complex topic, and understanding all of the nuances is a full-time job. The recent post Shallow review of live agendas in alignment & safety [LW · GW] is an outstanding overview of all of the research currently going on. Opinions on LessWrong are even more varied, but that's a good starting point for understanding what's going on in the field and therefore on LW right now.