LessWrong 2.0 Reader

New LessWrong review winner UI ("The LeastWrong" section and full-art post pages)
kave · 2024-02-28T02:42:05.801Z · comments (64)

[link] A case for AI alignment being difficult
jessicata (jessica.liu.taylor) · 2023-12-31T19:55:26.130Z · comments (56)

Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs
L Rudolf L (LRudL) · 2024-07-08T22:24:38.441Z · comments (28)

On the future of language models
owencb · 2023-12-20T16:58:28.433Z · comments (17)

[question] What convincing warning shot could help prevent extinction from AI?
Charbel-Raphaël (charbel-raphael-segerie) · 2024-04-13T18:09:29.096Z · answers+comments (18)

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks
Lucius Bushnaq (Lblack) · 2024-05-20T17:53:25.985Z · comments (4)

[link] A Chess-GPT Linear Emergent World Representation
Adam Karvonen (karvonenadam) · 2024-02-08T04:25:15.222Z · comments (14)

SAE reconstruction errors are (empirically) pathological
wesg (wes-gurnee) · 2024-03-29T16:37:29.608Z · comments (16)

Scaling and evaluating sparse autoencoders
leogao · 2024-06-06T22:50:39.440Z · comments (6)

In favour of exploring nagging doubts about x-risk
owencb · 2024-06-25T23:52:01.322Z · comments (2)

[link] Transformer Circuit Faithfulness Metrics Are Not Robust
Joseph Miller (Josephm) · 2024-07-12T03:47:30.077Z · comments (5)

Catching AIs red-handed
ryan_greenblatt · 2024-01-05T17:43:10.948Z · comments (22)

[link] Poker is a bad game for teaching epistemics. Figgie is a better one.
rossry · 2024-07-08T06:05:20.459Z · comments (47)

Backdoors as an analogy for deceptive alignment
Jacob_Hilton · 2024-09-06T15:30:06.172Z · comments (2)

I turned decision theory problems into memes about trolleys
Tapatakt · 2024-10-30T20:13:29.589Z · comments (20)

Dreams of AI alignment: The danger of suggestive names
TurnTrout · 2024-02-10T01:22:51.715Z · comments (59)

[link] Carl Sagan, nuking the moon, and not nuking the moon
eukaryote · 2024-04-13T04:08:50.166Z · comments (8)

Key takeaways from our EA and alignment research surveys
Cameron Berg (cameron-berg) · 2024-05-03T18:10:41.416Z · comments (10)

What happens if you present 500 people with an argument that AI is risky?
KatjaGrace · 2024-09-04T16:40:03.562Z · comments (7)

LLMs can learn about themselves by introspection
Felix J Binder (fjb) · 2024-10-18T16:12:51.231Z · comments (38)

LLM Applications I Want To See
sarahconstantin · 2024-08-19T21:10:03.101Z · comments (5)

Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small
Joseph Bloom (Jbloom) · 2024-02-02T06:54:53.392Z · comments (37)

Lsusr's Rationality Dojo
lsusr · 2024-02-13T05:52:03.757Z · comments (17)

Refactoring cryonics as structural brain preservation
Andy_McKenzie · 2024-09-11T18:36:30.285Z · comments (14)

Response to nostalgebraist: proudly waving my moral-antirealist battle flag
Steven Byrnes (steve2152) · 2024-05-29T16:48:29.408Z · comments (29)

[link] Notes from a Prompt Factory
Richard_Ngo (ricraz) · 2024-03-10T05:13:39.384Z · comments (19)

On Dwarkesh’s Podcast with Leopold Aschenbrenner
Zvi · 2024-06-10T12:40:03.348Z · comments (7)

Live Theory Part 0: Taking Intelligence Seriously
Sahil · 2024-06-26T21:37:10.479Z · comments (3)

[link] Advice for journalists
Nathan Young · 2024-10-07T16:46:40.929Z · comments (53)

A simple model of math skill
Alex_Altair · 2024-07-21T18:57:33.697Z · comments (16)

General Thoughts on Secular Solstice
Jeffrey Heninger (jeffrey-heninger) · 2024-03-23T18:48:43.940Z · comments (60)

[link] LessOnline (May 31—June 2, Berkeley, CA)
Ben Pace (Benito) · 2024-03-26T02:34:00.000Z · comments (24)

[link] Advice for Activists from the History of Environmentalism
Jeffrey Heninger (jeffrey-heninger) · 2024-05-16T18:40:02.064Z · comments (8)

Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren't scheming
Buck · 2024-10-10T13:36:53.810Z · comments (4)

[link] The Minority Coalition
Richard_Ngo (ricraz) · 2024-06-24T20:01:27.436Z · comments (7)

Hierarchical Agency: A Missing Piece in AI Alignment
Jan_Kulveit · 2024-11-27T05:49:04.241Z · comments (19)

Why comparative advantage does not help horses
Sherrinford · 2024-09-30T22:27:57.450Z · comments (10)

MIRI’s 2024 End-of-Year Update
Rob Bensinger (RobbBB) · 2024-12-03T04:33:47.499Z · comments (1)

[link] CIV: a story
Richard_Ngo (ricraz) · 2024-06-15T22:36:50.415Z · comments (6)

[link] "Deep Learning" Is Function Approximation
Zack_M_Davis · 2024-03-21T17:50:36.254Z · comments (28)

Announcing the London Initiative for Safe AI (LISA)
James Fox · 2024-02-02T23:17:47.011Z · comments (0)

On attunement
Joe Carlsmith (joekc) · 2024-03-25T12:47:34.856Z · comments (8)

[link] My cover story in Jacobin on AI capitalism and the x-risk debates
garrison · 2024-02-12T23:34:16.526Z · comments (5)

OpenAI #8: The Right to Warn
Zvi · 2024-06-17T12:00:02.639Z · comments (8)

Comments on Anthropic's Scaling Monosemanticity
Robert_AIZI · 2024-06-03T12:15:44.708Z · comments (8)

Access to powerful AI might make computer security radically easier
Buck · 2024-06-08T06:00:19.310Z · comments (14)

[link] Seven lessons I didn't learn from election day
Eric Neyman (UnexpectedValues) · 2024-11-14T18:39:07.053Z · comments (33)

Dialogue introduction to Singular Learning Theory
Olli Järviniemi (jarviniemi) · 2024-07-08T16:58:10.108Z · comments (14)

Explaining a Math Magic Trick
Robert_AIZI · 2024-05-05T19:41:52.048Z · comments (10)

OpenAI's Sora is an agent
CBiddulph (caleb-biddulph) · 2024-02-16T07:35:52.171Z · comments (25)

Recent comments

elityre on Why it's so hard to talk about Consciousness

I think this post cleanly and accurately elucidates a dynamic in conversations about consciousness. I hadn't put my finger on this before reading this post, and I now think about it every time I hear or participate in a discussion about consciousness.

euanmclean on Computational functionalism probably can't explain phenomenal consciousness

I'm not saying anything about MCMC. I'm saying random noise is not what I care about; the MCMC example is not capturing what I'm trying to get at when I talk about causal closure.

I don't disagree with anything you've said in this comment, and I'm quite confused about how we're able to talk past each other to this degree.

steve2152 on Computational functionalism probably can't explain phenomenal consciousness

I’m confused by your comment. Let’s keep talking about MCMC.

The following is true: The random inputs to MCMC have “a causal effect on the execution of the algorithm such that the algorithm doesn't do what it's meant to do if you just take the average of those fluctuations”.
- For example, let’s say the MCMC accepts a million inputs in the range (0,100), typically generated by a PRNG in practice. If you replace the PRNG by the function return 50 (“just take the average of those fluctuations”), then the MCMC will definitely fail to give the right answer.
The following is false: “the signals entering…are systematic rather than random”. The random inputs to MCMC are definitely expected and required to be random, not systematic. If the PRNG has systematic patterns, it screws up the algorithm—I believe this happens from time to time, and people doing Monte Carlo simulations need to be quite paranoid about using an appropriate PRNG. Even very subtle long-range patterns in the PRNG output can screw up the calculation.

The MCMC will do a highly nontrivial (high-computational-complexity) calculation and give a highly non-arbitrary answer. The answer does depend to some extent on the stream of random inputs. For example, suppose I do MCMC, and (unbeknownst to me) the exact answer is 8.00. If I use a random seed of 1 in my PRNG, then the MCMC might spit out a final answer of 7.98 ± 0.03. If I use a random seed of 2, then the MCMC might spit out a final answer of 8.01 ± 0.03. Etc. So the algorithm run is dependent on the random bits, but the output is not totally arbitrary.
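The seed dependence described above can be sketched with a toy random-walk Metropolis sampler. This is only an illustration under stated assumptions: the target distribution (a unit-variance Gaussian centred on 8.0), the function name `metropolis_mean`, and all parameter values are hypothetical choices, not anything from the original discussion.

```python
import math
import random


def metropolis_mean(uniform, n_steps=20000, burn_in=1000, step=1.0, target_mean=8.0):
    """Estimate the mean of N(target_mean, 1) with random-walk Metropolis.

    `uniform` is any zero-argument callable that returns numbers in [0, 1).
    Normally that is a PRNG; passing a constant shows what breaks without one.
    """
    def log_density(x):
        # Log of the unnormalised Gaussian density.
        return -0.5 * (x - target_mean) ** 2

    x = 0.0          # deliberately start far from the answer
    total = 0.0
    for i in range(n_steps + burn_in):
        # Symmetric proposal: move by a uniform amount in [-step, +step).
        proposal = x + step * (2.0 * uniform() - 1.0)
        log_ratio = log_density(proposal) - log_density(x)
        # Accept with probability min(1, density ratio).
        if uniform() < math.exp(min(0.0, log_ratio)):
            x = proposal
        if i >= burn_in:
            total += x
    return total / n_steps


# Two different seeds: slightly different estimates, both close to 8.0.
estimate_seed_1 = metropolis_mean(random.Random(1).random)
estimate_seed_2 = metropolis_mean(random.Random(2).random)

# "Just take the average of those fluctuations": every draw is 0.5,
# so every proposal equals the current point and the chain never moves.
estimate_constant = metropolis_mean(lambda: 0.5)

print(estimate_seed_1, estimate_seed_2, estimate_constant)
```

With a real PRNG the two estimates differ in their later digits but agree to within the sampling error; with the constant source the chain stays at its starting point of 0.0 and the estimate is simply wrong.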

All this is uncontroversial background, I hope. You understand all this, right?

"executions would branch conditional on specific charge trajectories, and it would be a rubbish computer."

As it happens, almost all modern computer chips are designed to be deterministic, by putting every signal extremely far above the noise floor. This has a giant cost in terms of power efficiency, but it has a benefit of making the design far simpler and more flexible for the human programmer. You can write code without worrying about bits randomly flipping—except for SEUs, but those are rare enough that programmers can basically ignore them for most purposes.

(Even so, such chips can act non-deterministically in some cases—for example as discussed here, some ML code is designed with race conditions where sometimes (unpredictably) the chip calculates (a+b)+c and sometimes a+(b+c), which are ever-so-slightly different for floating point numbers, but nobody cares, the overall algorithm still works fine.)
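The floating-point point in the parenthetical above is easy to check directly. A minimal sketch, with arbitrarily chosen values and hypothetical variable names:

```python
import math

a, b, c = 0.1, 0.2, 0.3

left_first = (a + b) + c    # one possible evaluation order
right_first = a + (b + c)   # the other order

# The two results are not bit-identical...
print(left_first == right_first)              # False
# ...but they agree to within normal floating-point tolerance.
print(math.isclose(left_first, right_first))  # True
```

The difference is a single rounding step in the last place, which is why an algorithm that is otherwise numerically stable does not care which order the hardware happens to pick.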

But more importantly, it’s possible to run algorithms in the presence of noise. It’s not how we normally do things in the human world, but it’s totally possible. For example, I think an ML algorithm would basically work fine if a small but measurable fraction of bits randomly flipped as you ran it. You would need to design it accordingly, of course—e.g. don’t use floating point representation, because a bit-flip in the exponent would be catastrophic. Maybe some signals would be more sensitive to bit-flips than others, in which case maybe put an error-correcting code on the super-sensitive ones. But for lots of other signals, e.g. the lowest-order bit of some neural net activation, we can just accept that they’ll randomly flip sometimes, and the algorithm still basically accomplishes what it’s supposed to accomplish—say, image classification or whatever.
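To make the exponent-versus-low-order-bit contrast concrete, here is a small sketch that flips individual bits of a 32-bit float. The helper `flip_bit` and the sample activation value are hypothetical, chosen only for illustration; the bit layout assumed is standard IEEE 754 single precision.

```python
import struct


def flip_bit(value, bit):
    """Flip one bit of `value` stored as an IEEE 754 float32.

    Bit 0 is the lowest-order mantissa bit, bits 23-30 are the exponent,
    and bit 31 is the sign.
    """
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


activation = 0.75

low_bit_flip = flip_bit(activation, 0)    # lowest-order mantissa bit
exponent_flip = flip_bit(activation, 30)  # highest-order exponent bit

print(low_bit_flip - activation)  # a change of roughly 6e-8
print(exponent_flip)              # a value on the order of 1e38
```

A flip in the lowest mantissa bit is far below anything a classifier would notice, while a flip in the top exponent bit turns an ordinary activation into an astronomically large number, which is the case for avoiding floating point or protecting those bits with an error-correcting code.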

raemon on The "Think It Faster" Exercise

Ah gotcha. Yeah, this is why Deliberate Grieving is a core rationalist skill.

raemon on The "Think It Faster" Exercise

"The idea of 'thinking it faster' is provocative, because it seems to be over-optimising for speed rather than other values, whereas the way you're implementing it is by generating more meaningful or efficient decisions which are underpinned by a meta-analysis of your process, which is actually about increasing the quality of your decision-making."

I considered changing it to "Think It Sooner", which nudges you a bit away from "try to think frenetically fast" and towards "just learn to steer towards the most efficient parts of your thought process, avoid wasted motion, and use more effective metastrategies." "Think It Sooner" feels noticeably harder to say, so I decided to stick with the original (although I streamlined the phrasing from "Think That Thought Faster" a bit so it rolled off the tongue).

james-stephen-brown on The "Think It Faster" Exercise

Wow, that was quick. I mean, rather than scaffolding work that seems unproductive but is actually necessary, most creative time (for me at least) is wasted in resisting change (my number 3 point was about trying changes even if you don't immediately agree with them).

raemon on The "Think It Faster" Exercise

"I actually think this third thing is likely to be a key lesson learned from meta-analysis, to not be stubborn and to pivot to the better solution more freely, what I call 'back it up and break it'."

I'm not sure I understood this point, could you say more?

james-stephen-brown on The "Think It Faster" Exercise

Thanks for this, nice writing.

The idea of 'thinking it faster' is provocative, because it seems to be over-optimising for speed rather than other values, whereas the way you're implementing it is by generating more meaningful or efficient decisions which are underpinned by a meta-analysis of your process, which is actually about increasing the quality of your decision-making.

I think it's worthwhile seeing where we're wasting time. But often I find wasted time isn't what you'd expect it to be. As someone who also works in the creative industry, I find that criticism is a lot easier than creating something out of whole cloth. Your senior partner doesn't just have more experience, but is also a fresh pair of eyes looking at the product you're creating from a macroscopic (user's) perspective; this is much easier when you're not mired in the minutiae. I get this feedback in my job (as a documentary editor) not only from people more experienced than me, but also from those less experienced.

There are two things I have learned from experience:

1. Blocking out a scene is useful, even though the scene will never be in that form—the boring form of the scene makes it easier to step back and see the more creative way to approach the scene. The time spent making the picture clearer isn't wasted.
2. When working alone, step away and view your work from a fresh perspective (in my case the audience, in yours the user) to be your own director / senior partner.

That being said, I think it's well worth meta-analysing your own process and that of your more experienced colleagues. Another thing I've learned is...

3. When someone you trust gives you changes you don't agree with, try them, they probably have a clearer perspective than you do.

Anyway, thanks for the post. I'm planning to implement your advice in my own job; it sounds like a worthwhile process. I actually think this third thing is likely to be a key lesson learned from meta-analysis, to not be stubborn and to pivot to the better solution more freely, what I call "back it up and break it".

shaedys on Guide to rationalist interior decorating

I love the interior decorating advice. It's quite different from the other posts, but it is really useful when designing and buying for a new room.

raemon on Here's Why I'm Hesitant To Respond In More Depth

Is this a thinly veiled attempt to get Elephant Seal 3 into the Review? :P