LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

next page (older posts) →

(The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser
habryka (habryka4) · 2024-11-30T02:55:16.077Z · comments (212)

OpenAI Email Archives (from Musk v. Altman and OpenAI blog)
habryka (habryka4) · 2024-11-16T06:38:03.937Z · comments (80)

Alignment Faking in Large Language Models
ryan_greenblatt · 2024-12-18T17:19:06.665Z · comments (53)

The hostile telepaths problem
Valentine · 2024-10-27T15:26:53.610Z · comments (84)

[link] Survival without dignity
L Rudolf L (LRudL) · 2024-11-04T02:29:38.758Z · comments (28)

[link] I got dysentery so you don’t have to
eukaryote · 2024-10-22T04:55:58.422Z · comments (4)

[link] Biological risk from the mirror world
jasoncrawford · 2024-12-12T19:07:06.305Z · comments (29)

The Online Sports Gambling Experiment Has Failed
Zvi · 2024-11-11T14:30:04.371Z · comments (28)

You are not too "irrational" to know your preferences.
DaystarEld · 2024-11-26T15:01:42.996Z · comments (50)

Ayn Rand’s model of “living money”; and an upside of burnout
AnnaSalamon · 2024-11-16T02:59:07.368Z · comments (58)

[link] What TMS is like
Sable · 2024-10-31T00:44:22.612Z · comments (23)

Frontier Models are Capable of In-context Scheming
Marius Hobbhahn (marius-hobbhahn) · 2024-12-05T22:11:17.320Z · comments (24)

Making a conservative case for alignment
Cameron Berg (cameron-berg) · 2024-11-15T18:55:40.864Z · comments (68)

[link] Understanding Shapley Values with Venn Diagrams
Carson L · 2024-12-06T21:56:43.960Z · comments (32)

What Goes Without Saying
sarahconstantin · 2024-12-20T18:00:06.363Z · comments (9)

[link] The Compendium, A full argument about extinction risk from AGI
adamShimi · 2024-10-31T12:01:51.714Z · comments (52)

Communications in Hard Mode (My new job at MIRI)
tanagrabeast · 2024-12-13T20:13:44.825Z · comments (23)

Information vs Assurance
johnswentworth · 2024-10-20T23:16:25.762Z · comments (17)

Orienting to 3 year AGI timelines
Nikola Jurkovic (nikolaisalreadytaken) · 2024-12-22T01:15:11.401Z · comments (25)

[link] Overcoming Bias Anthology
Arjun Panickssery (arjun-panickssery) · 2024-10-20T02:01:23.463Z · comments (14)

The Median Researcher Problem
johnswentworth · 2024-11-02T20:16:11.341Z · comments (69)

o1 is a bad idea
abramdemski · 2024-11-11T21:20:24.892Z · comments (38)

The Summoned Heroine's Prediction Markets Keep Providing Financial Services To The Demon King!
abstractapplic · 2024-10-26T12:34:51.059Z · comments (16)

Neutrality
sarahconstantin · 2024-11-13T23:10:05.469Z · comments (27)

Current safety training techniques do not fully transfer to the agent setting
Simon Lermen (dalasnoin) · 2024-11-03T19:24:51.537Z · comments (8)

[link] Gradient Routing: Masking Gradients to Localize Computation in Neural Networks
cloud · 2024-12-06T22:19:26.717Z · comments (12)

"It's a 10% chance which I did 10 times, so it should be 100%"
egor.timatkov · 2024-11-18T01:14:27.738Z · comments (57)

A Rocket–Interpretability Analogy
plex (ete) · 2024-10-21T13:55:18.184Z · comments (31)

o3
Zach Stein-Perlman · 2024-12-20T18:30:29.448Z · comments (144)

“Alignment Faking” frame is somewhat fake
Jan_Kulveit · 2024-12-20T09:51:04.664Z · comments (13)

[link] Arithmetic is an underrated world-modeling technology
dynomight · 2024-10-17T14:00:22.475Z · comments (32)

Repeal the Jones Act of 1920
Zvi · 2024-11-27T15:00:06.801Z · comments (23)

[link] o1: A Technical Primer
Jesse Hoogland (jhoogland) · 2024-12-09T19:09:12.413Z · comments (17)

[link] China Hawks are Manufacturing an AI Arms Race
garrison · 2024-11-20T18:17:51.958Z · comments (42)

[link] When Is Insurance Worth It?
kqr · 2024-12-19T19:07:32.573Z · comments (51)

Subskills of "Listening to Wisdom"
Raemon · 2024-12-09T03:01:18.706Z · comments (16)

[question] Which things were you surprised to learn are not metaphors?
Eric Neyman (UnexpectedValues) · 2024-11-21T18:56:18.025Z · answers+comments (79)

"The Solomonoff Prior is Malign" is a special case of a simpler argument
David Matolcsi (matolcsid) · 2024-11-17T21:32:34.711Z · comments (44)

[link] OpenAI's CBRN tests seem unclear
LucaRighetti (Error404Dinosaur) · 2024-11-21T17:28:30.290Z · comments (6)

Why Don't We Just... Shoggoth+Face+Paraphraser?
Daniel Kokotajlo (daniel-kokotajlo) · 2024-11-19T20:53:52.084Z · comments (51)

BIG-Bench Canary Contamination in GPT-4
Jozdien · 2024-10-22T15:40:48.166Z · comments (13)

A bird's eye view of ARC's research
Jacob_Hilton · 2024-10-23T15:50:06.123Z · comments (12)

[link] The Dangers of Mirrored Life
Niko_McCarty (niko-2) · 2024-12-12T20:58:32.750Z · comments (7)

[link] Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded
garrison · 2024-10-23T23:40:57.180Z · comments (1)

Passages I Highlighted in The Letters of J.R.R.Tolkien
Ivan Vendrov (ivan-vendrov) · 2024-11-25T01:47:59.071Z · comments (10)

The Dream Machine
sarahconstantin · 2024-12-05T00:00:05.796Z · comments (6)

The o1 System Card Is Not About o1
Zvi · 2024-12-13T20:30:08.048Z · comments (5)

What o3 Becomes by 2028
Vladimir_Nesov · 2024-12-22T12:37:20.929Z · comments (10)

Scissors Statements for President?
AnnaSalamon · 2024-11-06T10:38:21.230Z · comments (31)

Should CA, TX, OK, and LA merge into a giant swing state, just for elections?
Thomas Kwa (thomas-kwa) · 2024-11-06T23:01:48.992Z · comments (35)

next page (older posts) →

Archive

Recent comments

benito on What Have Been Your Most Valuable Casual Conversations At Conferences?

At my CFAR Workshop in 2015, probably the most valuable bit was meeting Oliver Habryka at the afterparty and talking under the stars for an hour or two, our working relationship and friendship grew immediately out of that meeting.

The conversation went pretty deep into x-risk and rationality, but it was more that it created a connection around that sort of thinking. I do think that a bunch of the value I get from connecting with people at events like these is that then I later feel comfortable sharing them on google docs or sending them emails for feedback on ideas.

benito on What Have Been Your Most Valuable Casual Conversations At Conferences?

(I made this a question post, which seemed natural, but I/you can change it back if you disprefer.)

habryka4 on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser

We have met the first of our three fundraising goals! Thank you all so much! Seeing all the outpouring of support from so many different people has been very heartening.

braydenm on AI Control: Improving Safety Despite Intentional Subversion

Was a widely impactful piece of work, beyond the bounds of the less wrong community

tetraspace-grouping on shortplav

Dominance/submission dynamics in relationships

In Act I outputs Claudes do a lot of this, e.g. this screenshot of Sonnet 3.6

hide on Hire (or become) a Thinking Assistant / Body Double

It’s true any job can find unqualified applicants. What I’m saying is that this in particular relies on an untenably small niche of feasible candidates that will take an enormous amount of time to find/filter through on average.

Sure, you might get lucky immediately, but without a reliable way to find the “independently wealthy guy who’s an intellectual and is sufficiently curious about you specifically that he wants to sit silently and watch you for 8 hours a day for a nominal fee”, your recruitment time will, on average, be very long, especially in comparison to what would likely be a very short average tenure given the many countervailing opportunities that would be presented to such a candidate.

Yes, it’s possible in principle to articulate the perfect candidate, but my point is more about real-world feasibility.

tsvibt on What are the strongest arguments for very short timelines?

The burden is on you because you're saying "we have gone from not having the core algorithms for intelligence in our computers, to yes having them".

https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#The__no_blockers__intuition [LW · GW]

And I think you're admitting that your argument is "if we mush all capabilities together into one dimension, AI is moving up on that one dimension, so things will keep going up".

Would you say the same thing about the invention of search engines? That was a huge jump in the capability of our computers. And it looks even more impressive if you blur out your vision--pretend you don't know that the text that comes up on your screen is written by a humna, and pretend you don't know that search is a specific kind of task distinct from a lot of other activity that would be involved in "True Understanding, woooo"--and just say "wow! previously our computers couldn't write a poem, but now with just a few keystrokes my computer can literally produce Billy Collins level poetry!".

Blurring things together at that level works for, like, macroeconomic trends. But if you look at macroeconomic trends it doesn't say singularity in 2 years! Going to 2 or 10 years is an inside-view thing to conclude! You're making some inference like "there's an engine that is very likely operating here, that takes us to AGI in xyz years".

nim on Hire (or become) a Thinking Assistant / Body Double

Oops! I only realized in your reply that you're considering "reliability" the load-bearing element. Yes, the hiring pipeline will look like a background noise of consistent interest from the unqualified, and sporadic hits from excellent candidates. You're approaching it from the perspective that the background noise of incompetents is the more important part, whereas I think that the availability of an adequate candidate eventually is the important part.

I think this because basically anywhere that hires can reliably find unqualified applicants. For a role where people stay in the job for 6 months, for instance, you only need to find a suitable replacement once every 6 months... so "reliably" being able to find an excellent candidate every day seems simply irrelevant.

ulrik-horn on Last Line of Defense: Minimum Viable Shelters for Mirror Bacteria

Just a note that I intend to answer this comment, but it might be a couple of days.

cakubilo on People aren't properly calibrated on FrontierMath

For example, if the statement to be proved is say independent of ZFC, then no computer that can be computed from a Turing Machine (which includes all LLMs) can resolve the conjecture, and due to independent statements, you can make conjectures that are arbitrarily hard to solve, and even the non-independent conjectures may in practice be unsolvable by any human or AI for a long time, which means the benchmark is less useful for real AIs.

I don't believe this is true, actually! What do you mean by "resolve the conjecture"? If you mean write up with a proof of it, then of course you can write a turing machine that will write a proof of the conjecture, it's just infinite monkeys. ZFC is best thought of as the "minimal set of axioms to do most math". It's not anything particularly special. You can have various foundations such as ETCS, NF, Type theory, etc. If we have a model that can genuinely reason mathematically, then the set of axioms the model uses should be immaterial to its mathematical ability. In fact, it should certainly be able to handle more or less axioms, like replacing full choice with countable choice etc. Maybe I misunderstood your point here.

While I am not experienced at all in formalizing math, and thus am willing to update and be corrected by any expert on mathematics, especially those that formalize mathematics in proof assistants, I'd expect 2 language independent reasons for why formalizing mathematics in proof assistants are difficult:

But my point was that there are things that should be extremely easy, like proving lemmas about elementary row transformations, that have not been done in Lean yet. That is not due to a lack of people formalizing, but due to fundamental limitations with the proof assistant. The point that I'm failing to make explicit is that this seems like a copout. The ultimate naturalistic benchmark for an LLM's math ability is being able to formalize the undergraduate math curriculum! But it starts with having a proof assistant that is amenable to the formalization project, which seems to be the bottleneck today.