LessWrong 2.0 Reader

(The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser
habryka (habryka4) · 2024-11-30T02:55:16.077Z · comments (212)

OpenAI Email Archives (from Musk v. Altman and OpenAI blog)
habryka (habryka4) · 2024-11-16T06:38:03.937Z · comments (80)

Alignment Faking in Large Language Models
ryan_greenblatt · 2024-12-18T17:19:06.665Z · comments (53)

The hostile telepaths problem
Valentine · 2024-10-27T15:26:53.610Z · comments (84)

How I got 4.2M YouTube views without making a single video
Closed Limelike Curves · 2024-09-03T03:52:33.025Z · comments (36)

[link] Survival without dignity
L Rudolf L (LRudL) · 2024-11-04T02:29:38.758Z · comments (28)

[link] I got dysentery so you don’t have to
eukaryote · 2024-10-22T04:55:58.422Z · comments (4)

[link] Biological risk from the mirror world
jasoncrawford · 2024-12-12T19:07:06.305Z · comments (29)

Would catching your AIs trying to escape convince AI developers to slow down or undeploy?
Buck · 2024-08-26T16:46:18.872Z · comments (76)

Overview of strong human intelligence amplification methods
TsviBT · 2024-10-08T08:37:18.896Z · comments (141)

The Great Data Integration Schlep
sarahconstantin · 2024-09-13T15:40:02.298Z · comments (16)

The Best Lay Argument is not a Simple English Yud Essay
J Bostock (Jemist) · 2024-09-10T17:34:28.422Z · comments (15)

The Online Sports Gambling Experiment Has Failed
Zvi · 2024-11-11T14:30:04.371Z · comments (28)

Laziness death spirals
PatrickDFarley · 2024-09-19T15:58:30.252Z · comments (36)

Principles for the AGI Race
William_S · 2024-08-30T14:29:41.074Z · comments (13)

the case for CoT unfaithfulness is overstated
nostalgebraist · 2024-09-29T22:07:54.053Z · comments (40)

[link] Explore More: A Bag of Tricks to Keep Your Life on the Rails
Shoshannah Tekofsky (DarkSym) · 2024-09-28T21:38:52.256Z · comments (15)

AGI Safety and Alignment at Google DeepMind: A Summary of Recent Work
Rohin Shah (rohinmshah) · 2024-08-20T16:22:45.888Z · comments (33)

You are not too "irrational" to know your preferences.
DaystarEld · 2024-11-26T15:01:42.996Z · comments (50)

"Slow" takeoff is a terrible term for "maybe even faster takeoff, actually"
Raemon · 2024-09-28T23:38:25.512Z · comments (69)

Ayn Rand’s model of “living money”; and an upside of burnout
AnnaSalamon · 2024-11-16T02:59:07.368Z · comments (58)

[link] What TMS is like
Sable · 2024-10-31T00:44:22.612Z · comments (23)

The Sun is big, but superintelligences will not spare Earth a little sunlight
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2024-09-23T03:39:16.243Z · comments (141)

Pay Risk Evaluators in Cash, Not Equity
Adam Scholl (adam_scholl) · 2024-09-07T02:37:59.659Z · comments (19)

Frontier Models are Capable of In-context Scheming
Marius Hobbhahn (marius-hobbhahn) · 2024-12-05T22:11:17.320Z · comments (24)

Making a conservative case for alignment
Cameron Berg (cameron-berg) · 2024-11-15T18:55:40.864Z · comments (68)

The Hopium Wars: the AGI Entente Delusion
Max Tegmark (MaxTegmark) · 2024-10-13T17:00:29.033Z · comments (55)

[link] Understanding Shapley Values with Venn Diagrams
Carson L · 2024-12-06T21:56:43.960Z · comments (32)

What Goes Without Saying
sarahconstantin · 2024-12-20T18:00:06.363Z · comments (9)

[link] The Compendium, A full argument about extinction risk from AGI
adamShimi · 2024-10-31T12:01:51.714Z · comments (52)

Communications in Hard Mode (My new job at MIRI)
tanagrabeast · 2024-12-13T20:13:44.825Z · comments (23)

Cryonics is free
Mati_Roy (MathieuRoy) · 2024-09-29T17:58:17.108Z · comments (42)

[link] Why I’m not a Bayesian
Richard_Ngo (ricraz) · 2024-10-06T15:22:45.644Z · comments (92)

A basic systems architecture for AI agents that do autonomous research
Buck · 2024-09-23T13:58:27.185Z · comments (15)

Skills from a year of Purposeful Rationality Practice
Raemon · 2024-09-18T02:05:58.726Z · comments (18)

Information vs Assurance
johnswentworth · 2024-10-20T23:16:25.762Z · comments (17)

Orienting to 3 year AGI timelines
Nikola Jurkovic (nikolaisalreadytaken) · 2024-12-22T01:15:11.401Z · comments (25)

Contra papers claiming superhuman AI forecasting
nikos (followtheargument) · 2024-09-12T18:10:50.582Z · comments (16)

[question] Why is o1 so deceptive?
abramdemski · 2024-09-27T17:27:35.439Z · answers+comments (24)

Struggling like a Shadowmoth
Raemon · 2024-09-24T00:47:05.030Z · comments (38)

Did Christopher Hitchens change his mind about waterboarding?
Isaac King (KingSupernova) · 2024-09-15T08:28:09.451Z · comments (22)

Three Subtle Examples of Data Leakage
abstractapplic · 2024-10-01T20:45:27.731Z · comments (16)

My motivation and theory of change for working in AI healthtech
Andrew_Critch · 2024-10-12T00:36:30.925Z · comments (37)

[link] Overcoming Bias Anthology
Arjun Panickssery (arjun-panickssery) · 2024-10-20T02:01:23.463Z · comments (14)

The Median Researcher Problem
johnswentworth · 2024-11-02T20:16:11.341Z · comments (69)

o1 is a bad idea
abramdemski · 2024-11-11T21:20:24.892Z · comments (38)

The Summoned Heroine's Prediction Markets Keep Providing Financial Services To The Demon King!
abstractapplic · 2024-10-26T12:34:51.059Z · comments (16)

Neutrality
sarahconstantin · 2024-11-13T23:10:05.469Z · comments (27)

Current safety training techniques do not fully transfer to the agent setting
Simon Lermen (dalasnoin) · 2024-11-03T19:24:51.537Z · comments (8)

[question] things that confuse me about the current AI market.
DMMF · 2024-08-28T13:46:56.908Z · answers+comments (28)

Recent comments

mo-putera on Panology

Eric Drexler wrote two essays that seem related, which I really loved.

The first is How to Understand Everything (and why). It's short enough to be quoted essentially whole, so if you don't mind I'll do so:

In science and technology, there is a broad and integrative kind of knowledge that can be learned, but isn’t taught. It’s important, though, because it makes creative work more productive and makes costly blunders less likely.
Formal education in science and engineering centers on teaching facts and problem-solving skills in a series of narrow topics. It is true that a few topics, although narrow in content, have such broad application that they are themselves integrative: These include (at a bare minimum) substantial chunks of mathematics and the basics of classical mechanics and electromagnetism, with the basics of thermodynamics and quantum mechanics close behind.
Most subjects in science and engineering, however, are narrower than these, and advanced education means deeper and narrower education. What this kind of education omits is knowledge of extent and structure of human knowledge on a trans-disciplinary scale. This means understanding — in a particular, limited sense — everything.
To avoid blunders and absurdities, to recognize cross-disciplinary opportunities, and to make sense of new ideas, requires knowledge of at least the outlines of every field that might be relevant to the topics of interest. By knowing the outlines of a field, I mean knowing the answers, to some reasonable approximation, to questions like these:
What are the physical phenomena?
What causes them?
What are their magnitudes?
When might they be important?
How well are they understood?
How well can they be modeled?
What do they make possible?
What do they forbid?
And even more fundamental than these are questions of knowledge about knowledge:
What is known today?
What are the gaps in what I know?
When would I need to know more to solve a problem?
How could I find what I need?
It takes far less knowledge to recognize a problem than to solve it, yet in key respects, that bit of knowledge is more important: With recognition, a problem may be avoided, or solved, or an idea abandoned. Without recognition, a hidden problem may invalidate the labor of an hour, or a lifetime. Lack of a little knowledge can be a dangerous thing.
Looking back over the last few decades, I can see that I’ve invested considerably more than 10,000 hours in learning about the structures, relationships, contents, controversies, open problems, limitations, capabilities, developing an understanding of how the fields covered in the major journals fit together to constitute the current state of science and technology. In some areas, of course, I’ve dug deeper into the contents and tools of a field, driven by the needs of problem solving; in others, I know only the shape of the box and where it sits.
This sort of knowledge is a kind of specialty, really — a limited slice of learning, but oriented crosswise. Because of this orientation, though, it provides leverage in integrating knowledge from diverse sources. I am surprised by the range of fields in which I can converse with scientists and engineers at about the level of a colleague in an adjacent field. I often know what to ask about their research, and sometimes make suggestions that light their eyes.

The follow-up essay is How to Learn About Everything. It's again short enough to quote wholesale:

Note that the title above isn’t “how to learn everything”, but “how to learn about everything”. The distinction I have in mind is between knowing the inside of a topic in deep detail — many facts and problem-solving skills — and knowing the structure and context of a topic: essential facts, what problems can be solved by the skilled, and how the topic fits with others.
This knowledge isn’t superficial in a survey-course sense: It is about both deep structure and practical applications. Knowing about, in this sense, is crucial to understanding a new problem and what must be learned in more depth in order to solve it. The cross-disciplinary reach of nanotechnology almost demands this as a condition of competence.
Studying to learn about everything
To intellectually ambitious students I recommend investing a lot of time in a mode of study that may feel wrong. An implicit lesson of classroom education is that successful study leads to good test scores, but this pattern of study is radically different. It cultivates understanding of a kind that won’t help pass tests — the classroom kind, that is.
Read and skim journals and textbooks that (at the moment) you only half understand. Include Science and Nature.
Don’t halt, dig a hole, and study a particular subject as if you had to pass a test on it.
Don’t avoid a subject because it seems beyond you — instead, read other half-understandable journals and textbooks to absorb more vocabulary, perspective, and context, then circle back.
Notice that concepts make more sense when you revisit a topic.
Notice which topics link in all directions, and provide keys to many others. Consider taking a class.
Continue until almost everything you encounter in Science and Nature makes sense as a contribution to a field you know something about.
Why is this effective?
You learned your native language by immersion, not by swallowing and regurgitating spoonfuls of grammar and vocabulary. With comprehension of words and the unstructured curriculum of life came what we call “common sense”.
The aim of what I’ve described is to learn an expanded language and to develop what amounts to common sense, but about an uncommonly broad slice of the world. Immersion and gradual comprehension work, and I don’t know of any other way.
This process led me to explore the potential of molecular nanotechnology as a basis for high-throughput atomically precise manufacturing. If broad-spectrum common sense were more widespread among scientists, there would be no air of controversy around the subject, milestones like the U.S. National Academies report on molecular manufacturing would have been reached a decade earlier, and today’s research agenda and perception of global problems would be very different.

I think I prefer any of Drexler's approach, Sarah Constantin's / Scott's fact-posting, and Holden Karnofsky's learning by writing, all of which can start with endless breadth but also require (quoting Drexler) deep structure and practical applications as focusing mechanisms, to the sort of learning that I think might be incentivised by budding panologists having to maximise their minimum score across some standardised battery of tests. I also liked Sarah's suggestion at the end:

Ideally, a group of people writing fact posts on related topics, could learn from each other, and share how they think. I have the strong intuition that this is valuable. It's a bit more active than a "journal club", and quite a bit more casual than "research". It's just the activity of learning and showing one's work in public.

karl-krueger on Panology

One problem I've seen around consilience is that some people say "well, clearly you don't think that there is one thing" when what they actually have evidence for is "you don't agree with my analogies across distant fields". So "believing in the unity of science" can get conflated with "agreeing with a particular set of analogies."

Andy: "Evolutionary biology proves that capitalism is the correct economic system, because competition produces fitness among firms as it does among organisms."
Betty: "I'm not sure you can directly draw inferences from biology to economics like that, without dealing with a whole bunch of disanalogies between the objects of those fields."
Andy: "What, do you deny that biology and economics are both studying the same world?"

lblack on What Have Been Your Most Valuable Casual Conversations At Conferences?

Some casual conversations with strangers that were high instrumental value:

At my first (online) LessWrong Community Weekend in 2020, I happened to chat with Linda Linsefors. That was my first conversation with anyone working in AI Safety. I’d read about the alignment problem for almost a decade at that point and thought it was the most important thing in the world, but I’d never seriously considered working on it. MIRI had made it pretty clear that the field only needed really exceptional theorists, and I didn’t think I was one of those. That conversation with Linda started the process of robbing me of my comfortable delusions on this front. What she said made it seem more like the field was pretty inadequate, and perfectly normal theoretical physicists could maybe help just by applying the standard science playbook for figuring out general laws in a new domain. Horrifying. I didn't really believe it yet, but this conversation was a factor in me trying out AI Safety Camp a bit over a year later.

At my first EAG, I talked to someone who was waiting for the actual event to begin along with me. This turned out to be Vivek Hebbar, who I'd never heard of before. We got to talking about inductive biases of neural networks. We kept chatting about this research area sporadically for a few weeks after the event. Eventually, Vivek called me to talk about the idea that would become this post. Thinking about that idea led to me understanding the connection between basin broadness and representation dimensionality in neural networks, which ultimately resulted in this research. It was probably the most valuable conversation I’ve had at any EAG so far, and it was unplanned.

At my second EAG, someone told me that an idea for comparing NN representations I’d been talking to them about already existed, and was called centred kernel alignment. I don’t quite remember how that conversation started, but I think it might have been a speed friending event.

My first morning in the MATS kitchen area in Berkeley, someone asked me if I’d heard about a thing called Singular Learning Theory. I had not. He went through his spiel on the whiteboard. He didn’t have the explanation down nearly as well back then, but it was still very recognisably connected to how I’d been thinking about NN generalisation and basin broadness, so I kept an eye on the area.

johannes-c-mayer on Vegans need to eat just enough Meat - emperically evaluate the minimum ammount of meat that maximizes utility

I probably did it badly. I would eat whole grain bread pretty regularly, but not consistently. I might not eat it for 1 week in a row sometimes. That was before I knew that amino acids are important.

johannes-c-mayer on Vegans need to eat just enough Meat - emperically evaluate the minimum ammount of meat that maximizes utility

It was ferritin. However, the levels were actually barely within the acceptable range. Because I had been eating steamed blood every day for perhaps 2 weeks prior, and blood contains a lot of heme iron, I hypothesise that I was deficient before.

benito on What Have Been Your Most Valuable Casual Conversations At Conferences?

At my CFAR Workshop in 2015, probably the most valuable bit was meeting Oliver Habryka at the afterparty and talking under the stars for an hour or two. Our working relationship and friendship grew immediately out of that meeting.

The conversation went pretty deep into x-risk and rationality, but it was more that it created a connection around that sort of thinking. I do think that a bunch of the value I get from connecting with people at events like these is that I later feel comfortable sharing them on Google Docs or sending them emails for feedback on ideas.

benito on What Have Been Your Most Valuable Casual Conversations At Conferences?

(I made this a question post, which seemed natural, but I/you can change it back if you disprefer.)

habryka4 on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser

We have met the first of our three fundraising goals! Thank you all so much! Seeing all the outpouring of support from so many different people has been very heartening.

braydenm on AI Control: Improving Safety Despite Intentional Subversion

This was a widely impactful piece of work, beyond the bounds of the LessWrong community.

tetraspace-grouping on shortplav

Dominance/submission dynamics in relationships

In Act I outputs Claudes do a lot of this, e.g. this screenshot of Sonnet 3.6