LessWrong 2.0 Reader

Intricacies of Feature Geometry in Large Language Models
7vik (satvik-golechha) · 2024-12-07T18:10:51.375Z · comments (0)

An Illustrated Summary of "Robust Agents Learn Causal World Model"
Dalcy (Darcy) · 2024-12-14T15:02:44.828Z · comments (2)

Reading RFK Jr so that you don’t have to
braces · 2024-11-22T00:59:19.583Z · comments (1)

Measuring whether AIs can statelessly strategize to subvert security measures
Alex Mallen (alex-mallen) · 2024-12-19T21:25:28.555Z · comments (0)

Checking in on Scott's composition image bet with imagen 3
Dave Orr (dave-orr) · 2024-12-22T19:04:17.495Z · comments (0)

U.S.-China Economic and Security Review Commission pushes Manhattan Project-style AI initiative
Phib · 2024-11-19T18:42:43.296Z · comments (7)

[link] The Evals Gap
Marius Hobbhahn (marius-hobbhahn) · 2024-11-11T16:42:46.287Z · comments (7)

Neuroscience of human social instincts: a sketch
Steven Byrnes (steve2152) · 2024-11-22T16:16:52.552Z · comments (0)

Win/continue/lose scenarios and execute/replace/audit protocols
Buck · 2024-11-15T15:47:24.868Z · comments (2)

o1 Turns Pro
Zvi · 2024-12-10T17:00:08.036Z · comments (3)

Luck Based Medicine: No Good Very Bad Winter Cured My Hypothyroidism
Elizabeth (pktechgirl) · 2024-12-08T20:10:02.651Z · comments (3)

[link] a space habitat design
bhauth · 2024-11-25T17:28:48.481Z · comments (13)

Estimates of GPU or equivalent resources of large AI players for 2024/5
CharlesD · 2024-11-28T23:01:58.522Z · comments (7)

A Conflicted Linkspost
Screwtape · 2024-11-21T00:37:54.035Z · comments (0)

Correct my H5N1 research ($reward)
Elizabeth (pktechgirl) · 2024-12-09T19:07:03.277Z · comments (23)

[link] Just one more exposure bro
Chipmonk · 2024-12-12T21:37:07.069Z · comments (6)

I Finally Worked Through Bayes' Theorem (Personal Achievement)
keltan · 2024-12-05T02:04:16.547Z · comments (6)

[link] Ideas for benchmarking LLM creativity
gwern · 2024-12-16T05:18:55.631Z · comments (10)

[link] A toy evaluation of inference code tampering
Fabien Roger (Fabien) · 2024-12-09T17:43:40.910Z · comments (0)

[link] Review: Breaking Free with Dr. Stone
TurnTrout · 2024-12-18T01:26:37.730Z · comments (4)

AI #94: Not Now, Google
Zvi · 2024-12-12T15:40:06.336Z · comments (3)

Dave Kasten's AGI-by-2027 vignette
davekasten · 2024-11-26T23:20:47.212Z · comments (8)

[link] Active Recall and Spaced Repetition are Different Things
Saul Munn (saul-munn) · 2024-11-08T20:14:56.092Z · comments (2)

Looking back on the Future of Humanity Institute - Asterisk
jakeeaton · 2024-11-19T00:44:40.928Z · comments (0)

A Solution for AGI/ASI Safety
Weibing Wang (weibing-wang) · 2024-12-18T19:44:29.739Z · comments (20)

[link] What Ketamine Therapy Is Like
Sable · 2024-11-11T11:09:08.602Z · comments (8)

Book a Time to Chat about Interp Research
Logan Riggs (elriggs) · 2024-12-03T17:27:46.808Z · comments (3)

Which evals resources would be good?
Marius Hobbhahn (marius-hobbhahn) · 2024-11-16T14:24:48.012Z · comments (4)

AI #91: Deep Thinking
Zvi · 2024-11-21T14:30:06.930Z · comments (10)

[link] Epistemic status: poetry (and other poems)
Richard_Ngo (ricraz) · 2024-11-21T18:13:17.194Z · comments (5)

Cognitive Biases Contributing to AI X-risk — a deleted excerpt from my 2018 ARCHES draft
Andrew_Critch · 2024-12-03T09:29:49.745Z · comments (2)

Detection of Asymptomatically Spreading Pathogens
jefftk (jkaufman) · 2024-12-05T18:20:02.473Z · comments (7)

[link] A dataset of questions on decision-theoretic reasoning in Newcomb-like problems
Caspar Oesterheld (Caspar42) · 2024-12-16T22:42:03.763Z · comments (1)

[link] The Choice Transition
owencb · 2024-11-18T12:30:56.198Z · comments (4)

[link] Literacy Rates Haven't Fallen By 20% Since the Department of Education Was Created
Maxwell Tabarrok (maxwell-tabarrok) · 2024-11-22T20:53:59.007Z · comments (0)

[link] Dangerous capability tests should be harder
LucaRighetti (Error404Dinosaur) · 2024-11-21T17:20:50.610Z · comments (3)

[link] Conjecture: A Roadmap for Cognitive Software and A Humanist Future of AI
Connor Leahy (NPCollapse) · 2024-12-02T13:28:57.977Z · comments (10)

Monthly Roundup #24: November 2024
Zvi · 2024-11-18T13:20:06.086Z · comments (14)

[link] Careless thinking: A theory of bad thinking
Nathan Young · 2024-12-17T18:23:16.140Z · comments (17)

Preppers Are Too Negative on Objects
jefftk (jkaufman) · 2024-12-18T02:30:01.854Z · comments (2)

Claude's Constitutional Consequentialism?
1a3orn · 2024-12-19T19:53:33.254Z · comments (6)

AXRP Episode 39 - Evan Hubinger on Model Organisms of Misalignment
DanielFilan · 2024-12-01T06:00:06.345Z · comments (0)

Causal Undertow: A Work of Seed Fiction
Daniel Murfet (dmurfet) · 2024-12-08T21:41:48.132Z · comments (0)

Trying to translate when people talk past each other
Kaj_Sotala · 2024-12-17T09:40:02.640Z · comments (12)

ARENA 4.0 Impact Report
Chloe Li (chloe-li-1) · 2024-11-27T20:51:54.844Z · comments (3)

[link] Intrinsic Power-Seeking: AI Might Seek Power for Power’s Sake
TurnTrout · 2024-11-19T18:36:20.721Z · comments (5)

How to use bright light to improve your life.
Nat Martin (nat-martin) · 2024-11-18T19:32:10.667Z · comments (10)

[question] Are You More Real If You're Really Forgetful?
Thane Ruthenis · 2024-11-24T19:30:55.233Z · answers+comments (25)

[link] College technical AI safety hackathon retrospective - Georgia Tech
yix (Yixiong Hao) · 2024-11-15T00:22:53.159Z · comments (2)

[link] FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
Tamay · 2024-11-14T06:13:22.042Z · comments (0)

Recent comments

nim on Hire (or become) a Thinking Assistant / Body Double

Oops! I only realized in your reply that you're considering "reliability" the load-bearing element. Yes, the hiring pipeline will look like a background noise of consistent interest from the unqualified, and sporadic hits from excellent candidates. You're approaching it from the perspective that the background noise of incompetents is the more important part, whereas I think that the availability of an adequate candidate eventually is the important part.

I think this because basically anywhere that hires can reliably find unqualified applicants. For a role where people stay in the job for 6 months, for instance, you only need to find a suitable replacement once every 6 months... so "reliably" being able to find an excellent candidate every day seems simply irrelevant.

ulrik-horn on Last Line of Defense: Minimum Viable Shelters for Mirror Bacteria

Just a note that I intend to answer this comment, but it might be a couple of days.

cakubilo on People aren't properly calibrated on FrontierMath

For example, if the statement to be proved is, say, independent of ZFC, then no computer that can be computed from a Turing Machine (which includes all LLMs) can resolve the conjecture, and due to independent statements, you can make conjectures that are arbitrarily hard to solve, and even the non-independent conjectures may in practice be unsolvable by any human or AI for a long time, which means the benchmark is less useful for real AIs.

I don't believe this is true, actually! What do you mean by "resolve the conjecture"? If you mean come up with a proof of it, then of course you can write a Turing machine that will write a proof of the conjecture; it's just infinite monkeys. ZFC is best thought of as the "minimal set of axioms to do most math". It's not anything particularly special. You can have various foundations such as ETCS, NF, type theory, etc. If we have a model that can genuinely reason mathematically, then the set of axioms the model uses should be immaterial to its mathematical ability. In fact, it should certainly be able to handle more or fewer axioms, like replacing full choice with countable choice, etc. Maybe I misunderstood your point here.

While I am not experienced at all in formalizing math, and thus am willing to update and be corrected by any expert on mathematics, especially those that formalize mathematics in proof assistants, I'd expect 2 language-independent reasons why formalizing mathematics in proof assistants is difficult:

But my point was that there are things that should be extremely easy, like proving lemmas about elementary row transformations, that have not been done in Lean yet. That is not due to a lack of people formalizing, but due to fundamental limitations with the proof assistant. The point that I'm failing to make explicit is that this seems like a copout. The ultimate naturalistic benchmark for an LLM's math ability is being able to formalize the undergraduate math curriculum! But it starts with having a proof assistant that is amenable to the formalization project, which seems to be the bottleneck today.

cole-wyeth on Cole Wyeth's Shortform

Most ordinary people don't know that no one understands how neural networks work (or even that modern "Generative A.I." is based on neural networks). This might be an underrated message since the inferential distance here is surprisingly high.

It's hard to explain the more sophisticated models that we often use to argue that human disempowerment is the default outcome, but it is perhaps much better leverage to explain these three points:

1) No one knows how A.I. models / LLMs / neural nets work (with some explanation of how this is conceptually possible).

2) We don't know how smart they will get how soon.

3) We can't control what they'll do once they're smarter than us.

At least under my state of knowledge, this is also a particularly honest messaging strategy, because it emphasizes the fundamental ignorance of A.I. researchers.

cole-wyeth on Cole Wyeth's Shortform

A "Christmas edition" of the new book on AIXI is freely available in pdf form at http://www.hutter1.net/publ/uaibook2.pdf

joel-burget on o3

Hi Boaz, first let me say that I really like Deliberative Alignment. Introducing a system 2 element is great, not only for higher-quality reasoning, but also for producing a legible, auditable chain of thought. That said, I have a couple of questions I'm hoping you might be able to answer.

I read through the model spec (which DA uses, or at least a closely-related spec). It seems well-suited and fairly comprehensive for answering user questions, but not sufficient for a model acting as an agent (which I expect to see more and more). An agent acting in the real world might face all sorts of interesting situations that the spec doesn't provide guidance on. I can provide some examples if necessary.
Does the spec fed to models ever change depending on the country / jurisdiction that the model's data center or the user is located in? Situations which are normal in some places may be illegal in others. For example, Google tells me that homosexuality is illegal in 64 countries. Other situations are more subtle and may reflect different cultures / norms.

cole-wyeth on Acknowledging Background Information with P(Q|I)

E. T. Jaynes seems to have made the notational choice you discuss earlier. His book was published later, but only long after his death.

hide on Hire (or become) a Thinking Assistant / Body Double

Do you genuinely think that you can find such people “reliably”?

sharmake-farah on shortplav

More accurately, they are aligned to some particular humans' values (which I'd call Western liberal values), and misaligned towards other value systems like conservative/reactionary views, which was always going to be the outcome of any aligned AI.

weibing-wang on A Solution for AGI/ASI Safety

1. The industry is currently not violating the rules mentioned in my paper, because all current AIs are weak AIs, so none of the AIs' power has reached the upper limit of the 7 types of AIs I described. In the future, it is possible for an AI to break through the upper limit, but I think it is uneconomical. For example, an AI psychiatrist does not need to have superhuman intelligence to perform well. An AI mathematician may be very intelligent in mathematics, but it does not need to learn how to manipulate humans or how to design DNA sequences. Of course, having regulations is better, because there may be some careless AI developers who will grant AIs too many unnecessary capabilities or permissions, although this does not improve the performance of AIs in actual tasks.

The difference between my view and Max Tegmark's is that he seems to assume that there will only be one type of superintelligent AI in the world, while I think there will be many different types of AIs. Different types of AIs should be subject to different rules, rather than the same rule. Can you imagine a person who is at once a Nobel Prize-winning scientist, the president, the richest man, and an Olympic champion? This is very strange, right? Our society doesn't need such an all-round person. Similarly, we don't need such an all-round AI either.

The development of a technology usually has two stages: first, achieving capabilities, and second, reducing costs. AI technology is currently in the first stage. When AI develops to the second stage, specialization will occur.

2. Agree.