LessWrong 2.0 Reader

Base LLMs refuse too
Connor Kissane (ckkissane) · 2024-09-29T16:04:21.343Z · comments (20)

Why our politicians aren't Median
Yair Halberstadt (yair-halberstadt) · 2024-11-03T14:03:33.779Z · comments (15)

Intricacies of Feature Geometry in Large Language Models
7vik (satvik-golechha) · 2024-12-07T18:10:51.375Z · comments (0)

[Intuitive self-models] 5. Dissociative Identity (Multiple Personality) Disorder
Steven Byrnes (steve2152) · 2024-10-15T13:31:46.157Z · comments (7)

AI #86: Just Think of the Potential
Zvi · 2024-10-17T15:10:06.552Z · comments (8)

The Geometry of Feelings and Nonsense in Large Language Models
7vik (satvik-golechha) · 2024-09-27T17:49:27.420Z · comments (10)

Mira Murati leaves OpenAI/ OpenAI to remove non-profit control
Sodium · 2024-09-25T21:15:17.315Z · comments (4)

Seeking Collaborators
abramdemski · 2024-11-01T17:13:36.162Z · comments (15)

[link] The Alignment Trap: AI Safety as Path to Power
crispweed · 2024-10-29T15:21:26.545Z · comments (17)

An Illustrated Summary of "Robust Agents Learn Causal World Model"
Dalcy (Darcy) · 2024-12-14T15:02:44.828Z · comments (2)

[link] How much I'm paying for AI productivity software (and the future of AI use)
jacquesthibs (jacques-thibodeau) · 2024-10-11T17:11:27.025Z · comments (16)

[question] Could orcas be (trained to be) smarter than humans? 
Towards_Keeperhood (Simon Skade) · 2024-11-04T23:29:26.677Z · answers+comments (20)

AI #87: Staying in Character
Zvi · 2024-10-29T07:10:08.212Z · comments (3)

AI #84: Better Than a Podcast
Zvi · 2024-10-03T15:00:07.128Z · comments (7)

Reading RFK Jr so that you don’t have to
braces · 2024-11-22T00:59:19.583Z · comments (1)

U.S.-China Economic and Security Review Commission pushes Manhattan Project-style AI initiative
Phib · 2024-11-19T18:42:43.296Z · comments (7)

Measuring whether AIs can statelessly strategize to subvert security measures
Alex Mallen (alex-mallen) · 2024-12-19T21:25:28.555Z · comments (0)

Toward Safety Case Inspired Basic Research
Lucas Teixeira · 2024-10-31T23:06:32.854Z · comments (2)

Safe Predictive Agents with Joint Scoring Rules
Rubi J. Hudson (Rubi) · 2024-10-09T16:38:16.535Z · comments (10)

[link] The Evals Gap
Marius Hobbhahn (marius-hobbhahn) · 2024-11-11T16:42:46.287Z · comments (7)

Neuroscience of human social instincts: a sketch
Steven Byrnes (steve2152) · 2024-11-22T16:16:52.552Z · comments (0)

Win/continue/lose scenarios and execute/replace/audit protocols
Buck · 2024-11-15T15:47:24.868Z · comments (2)

[link] How Likely Are Various Precursors of Existential Risk?
NunoSempere (Radamantis) · 2024-10-28T13:27:31.620Z · comments (4)

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)
Joe Carlsmith (joekc) · 2024-10-28T21:57:12.063Z · comments (5)

Vegans need to eat just enough Meat - emperically evaluate the minimum ammount of meat that maximizes utility
Johannes C. Mayer (johannes-c-mayer) · 2024-12-22T22:08:31.971Z · comments (28)

[link] a space habitat design
bhauth · 2024-11-25T17:28:48.481Z · comments (13)

Luck Based Medicine: No Good Very Bad Winter Cured My Hypothyroidism
Elizabeth (pktechgirl) · 2024-12-08T20:10:02.651Z · comments (3)

o1 Turns Pro
Zvi · 2024-12-10T17:00:08.036Z · comments (3)

[link] The Mysterious Trump Buyers on Polymarket
Annapurna (jorge-velez) · 2024-10-18T13:26:25.565Z · comments (10)

Parental Writing Selection Bias
jefftk (jkaufman) · 2024-10-13T14:00:03.225Z · comments (3)

A Conflicted Linkspost
Screwtape · 2024-11-21T00:37:54.035Z · comments (0)

Estimates of GPU or equivalent resources of large AI players for 2024/5
CharlesD · 2024-11-28T23:01:58.522Z · comments (7)

Claude Sonnet 3.5.1 and Haiku 3.5
Zvi · 2024-10-24T14:50:06.286Z · comments (9)

Correct my H5N1 research ($reward)
Elizabeth (pktechgirl) · 2024-12-09T19:07:03.277Z · comments (23)

[link] Prices are Bounties
Maxwell Tabarrok (maxwell-tabarrok) · 2024-10-12T14:51:40.689Z · comments (13)

[link] Just one more exposure bro
Chipmonk · 2024-12-12T21:37:07.069Z · comments (6)

[link] Anthropic's updated Responsible Scaling Policy
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2024-10-15T16:46:48.727Z · comments (3)

I Finally Worked Through Bayes' Theorem (Personal Achievement)
keltan · 2024-12-05T02:04:16.547Z · comments (6)

[link] Ideas for benchmarking LLM creativity
gwern · 2024-12-16T05:18:55.631Z · comments (10)

[link] A toy evaluation of inference code tampering
Fabien Roger (Fabien) · 2024-12-09T17:43:40.910Z · comments (0)

Metastatic Cancer Treatment Since 2010: The Success Stories
sarahconstantin · 2024-11-04T22:50:09.386Z · comments (2)

[link] Can AI Outpredict Humans? Results From Metaculus's Q3 AI Forecasting Benchmark
ChristianWilliams · 2024-10-10T18:58:46.041Z · comments (2)

Low Probability Estimation in Language Models
Gabriel Wu (gabriel-wu) · 2024-10-18T15:50:05.947Z · comments (0)

[Intuitive self-models] 7. Hearing Voices, and Other Hallucinations
Steven Byrnes (steve2152) · 2024-10-29T13:36:16.325Z · comments (2)

[link] [Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF
Leon Lang (leon-lang) · 2024-10-22T13:57:41.125Z · comments (1)

AI #94: Not Now, Google
Zvi · 2024-12-12T15:40:06.336Z · comments (3)

[link] cancer rates after gene therapy
bhauth · 2024-10-16T15:32:53.949Z · comments (0)

Toy Models of Feature Absorption in SAEs
chanind · 2024-10-07T09:56:53.609Z · comments (8)

[link] Review: Breaking Free with Dr. Stone
TurnTrout · 2024-12-18T01:26:37.730Z · comments (4)

Looking back on the Future of Humanity Institute - Asterisk
jakeeaton · 2024-11-19T00:44:40.928Z · comments (0)


Recent comments

johnswentworth on johnswentworth's Shortform

That's the opposite of my experience. Nearly all the papers I read vary between "trash, I got nothing useful out besides an idea for a post explaining the relevant failure modes" and "high quality but not relevant to anything important". Setting up our experiments is historically much faster than the work of figuring out what experiments would actually be useful.

There are exceptions to this, large projects which seem useful and would require lots of experimental work, but they're usually much lower-expected-value-per-unit-time than going back to the whiteboard, understanding things better, and doing a simpler experiment once we know what to test.

thane-ruthenis on johnswentworth's Shortform

Convincing.

unexpectedvalues on ReSolsticed vol I: "We're Not Going Quietly"

Thank you for making this! My favorite ones are 4, 5, and 12. (Mentioning this in case anyone wants to listen to a few songs but not the full Solstice.)

zach-stein-perlman on DeepSeek beats o1-preview on math, ties on coding; will release weights

Update: the weights and paper are out. Tweet thread, GitHub, paper.

It was super cheap to train — they say 2.8M H800 GPU-hours or $5.6M.
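As a quick sanity check on those two figures, the sketch below derives the hourly rate they imply. It assumes the dollar figure is just GPU-hours multiplied by a flat rental rate; that assumption and the variable names are mine, not something stated in the comment.

```python
# Figures quoted in the comment above.
gpu_hours = 2.8e6        # H800 GPU-hours
total_cost_usd = 5.6e6   # stated training cost in USD

# Assumption: cost = GPU-hours * flat hourly rate.
implied_rate = total_cost_usd / gpu_hours
print(f"Implied rate: ${implied_rate:.2f} per H800 GPU-hour")
```

So the two numbers are consistent with a rate of roughly $2 per GPU-hour.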

It's powerful:

It's cheap to run:

nathan-helm-burger on johnswentworth's Shortform

Consider: https://www.cognitiverevolution.ai/can-ais-generate-novel-research-ideas-with-lead-author-chenglei-si/

I think a different phenomenon is occurring. My guess, updating on my own experience, is that ideas aren't the current bottleneck. 1% inspiration, 99% perspiration.

As someone who has been reading 3-20 papers per month for many years now, in neuroscience and machine learning, I feel overwhelmed with ideas. I average about 0.75 per paper. I write them down, and the lists grow faster than they shrink by two orders of magnitude.

When I was on my favorite industry team, what I most valued about my technical manager was his ability to help me sort through and prioritize them. It was like I created a bunch of LEGO pieces, he picked one to be next, I put it in place by coding it up, he checked the placement by reviewing my PR. If someone had offered me a source of ideas ranging in quality from worse than my worst ideas to almost as good as my best ideas, and skewed towards bad... I'd have laughed and turned them down without a second thought.

For something like a paper, instead of a minor tech idea for a one-week PR... the situation is far more intense. The grunt work of running the experiments and preparing the paper is enormous compared to the time and effort of coming up with the idea in the first place. More like 0.1% to 99.9%.

Current LLMs can speed up creating a paper if given the results and experiment description to write about. That's probably also not the primary bottleneck (although still more than idea generation).

So the current bottleneck, in my estimation, for ML experiments, is the experiments: coding them up accurately and efficiently, running them (and handling the compute costs), and analyzing the results.

So I've been expecting to see an acceleration dependent on that aspect. That's hard to measure though. Are LLMs currently speeding this work up a little? Probably. I've had my work sped up some by the recent Sonnet 3.5.1. Currently, though, it's a trade-off: there's overhead in checking for misinterpretations and correcting bugs. We still seem a long way in "capability space" from me being able to give a background paper and rough experiment description, and then having the model do the rest. Only once that's the case will idea generation become my bottleneck.

ben-turtel on Human, All Too Human - Superintelligence requires learning things we can’t teach

Hey, thanks for reading and for the thoughtful comment!

100% agree with this: "AI should be able to push at least somewhat beyond the limits of what humans have ever concluded from available data, in every field, before needing to obtain any additional, new data."

Current methods can get us to AGI, and full AGI would result in a mind that is practically superhuman because no human mind contains all of these abilities to such a degree. I say as much in the full post: "Models may even recombine known reasoning methods to uncover new breakthroughs, but they remain bound to known human reasoning patterns."

Also agree that simulation is a viable path to exploration / feedback beyond what humans can explicitly provide: "There are many ways we might achieve this, whether in physically embodied intelligence, complex simulations grounded in scientific constraints, or predicting real world outcomes."

I'm mostly pointing out that at some point we will hit a bottleneck between AGI and ASI, which will require breaking free from human labels, and learning new things via exploration / real world feedback.

jeremy-gillen on Evolution provides no evidence for the sharp left turn

I'm curious whether the recent trend toward bi-level optimization via chain-of-thought was any update for you? I would have thought this would have updated people (partially?) back toward actually-evolution-was-a-decent-analogy.

There's this paragraph, which seems right-ish to me:

In order to experience a sharp left turn that arose due to the same mechanistic reasons as the sharp left turn of human evolution, an AI developer would have to:
Deliberately create a (very obvious^[2]) inner optimizer, whose inner loss function includes no mention of human values / objectives.^[3]
Grant that inner optimizer ~billions of times greater optimization power than the outer optimizer.^[4]
Let the inner optimizer run freely without any supervision, limits or interventions from the outer optimizer.^[5]

Extremely long chains-of-thought on hard problems pretty much meet these conditions, right?

anthonyc on Human, All Too Human - Superintelligence requires learning things we can’t teach

This is all true, but I'm not sure the claimed implications are so certain. The problem is, different minds can gain different levels of insight out of the same data and tools.

First, we should assume humanity has enough data to enable the best human minds to reach the highest levels of every capability available to humans with very, very little real-world feedback. It's not ASI in the full sense, but there has never been a human mind that contained all such abilities at once, let alone with an AI's other default advantages.

Second, it seems extremely unlikely to me that the available data does not include patterns no human has ever found and understood. All collected data has yet to be completely correlated and put together in all possible relationships. I don't have a strong sense of the limits of what should be possible with current data. At minimum I expect an ASI to have better pure and applied math tools to apply to any task, and require less data than we do for any given purpose.

Third, with proper tool support, I'm not sure how much physical experimentation and feedback can be substituted with high-quality simulation using software based on known physics, chemistry, and biology. At minimum, this should enable answering a lot of questions that current humanity knows how to answer by formulaic investigation but has never specifically asked or bothered writing down an answer to.

To me this indicates that at the limit of enough compute with better training methods, AI should be able to push at least somewhat beyond the limits of what humans have ever concluded from available data, in every field, before needing to obtain any additional, new data.

sharmake-farah on A shot at the diamond-alignment problem

Randomly read this comment and I really enjoyed it. Turn it into a post? (I understand how annoying structuring complex thoughts coherently can be but maybe do a dialogue or something? I liked this.)

Maybe I should try a dialogue with someone else on this, because I don't think any of my points are very extendible to a full post without someone helping me.

Do you have any specific reason why you're going into QMech when talking about brain-like AGI stuff?

To be frank, this was mostly about clarifying the philosophy around computationalism/human values in general, but I didn't go that deep into QMech for brain-like AGI and don't expect it to be immediately useful for my pursuits, so the only role for QMech here is in clarifying some confusions people have, and QMech wasn't even that necessary to make my points.

When we get into acausality and Everett branches I think we're going a bit off-track. I think computational intractability and observer bias are interesting to bring up, but I find they never lead anywhere. Quantum Mechanics is fundamentally observer invariant and so positing something like MWI is a philosophical stance (that is supported by Occam's razor) but it is still observer dependent, what if there are no observers?

Okay, the thing I think you are pointing to is that the same outcomes/rules can be generated out of ontologically distinct interpretations, and for our purposes, the observer is basically anything that interacts with anything, whether it's a human or particle, and thus saying there are no observers corresponds to saying that there is nothing in the universe, including the forces, and in particular dark energy is exactly 0.

The answer is that it would be a very different universe than our universe is today.

richard_kennaway on Terminal goal vs Intelligence

Leaving aside the conceptualisation of "terminal goals", the agent as described should start up the paperclip factory early enough to produce paperclips when the time comes. Until then it makes cups. But the agent as described does not have a "terminal" goal of cups now and a "terminal" goal of paperclips in future. It has been given a production schedule to carry out. If the agent is a general-purpose factory that can produce a whole range of things, the only "terminal" goal to design it to have is to follow orders. It should make whatever it is told to, and turn itself off when told to.

Unless, of course, people go, "At last, we've created the Sorcerer's Apprentice machine, as warned of in Goethe's cautionary tale, 'The Sorcerer's Apprentice'!"

So if I understand your concept correctly, a superintelligent agent will combine all future terminal goals into a single unchanging goal.

A superintelligent agent will do what it damn well likes, it's superintelligent. :)