sil-ver

Trump says a lot of stuff that he doesn't do, the set of specific things that presidents don't do is larger than the set of things they do, and tariffs didn't even seem like they'd be super popular with his base if in fact they were implemented. So "~nothing is gonna happen wrt tariffs" seemed like the default outcome with not enough evidence to assume otherwise.

I was also not paying a lot of attention to what he was saying. After the election ended, I made a conscious decision to tune out of politics to protect my mental health. So it was a low-information take -- but I don't know if paying more attention would have changed my prediction. I still don't think I actually know why Trump is doing the tariffs, especially to such an extreme extent.

Comment by Rafael Harth (sil-ver) on Why Should I Assume CCP AGI is Worse Than USG AGI? · 2025-04-19T15:54:37.022Z · LW · GW

I've also noticed this assumption. I myself don't have it, at all. My first thought has always been something like "If we actually get AGI then preventing terrible outcomes will probably require drastic actions and if anything I have less faith in the US government to take those". Which is a pretty different approach from just assuming that AGI being developed by government $x$ will automatically lead to a world with the values of government $x$. But this is a very uncertain take and it wouldn't surprise me if someone smart could change my mind pretty quickly.

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-04-15T00:13:49.634Z · LW · GW

Yeah.

Comment by Rafael Harth (sil-ver) on Who wants to bet me $25k at 1:7 odds that there won't be an AI market crash in the next year? · 2025-04-11T10:48:34.245Z · LW · GW

These are very poor odds, to the point that they seem to indicate a bullish rather than a bearish position on AI.

If you think the odds of something are $y$, but lots of other people think they are $x$ with $x \ll y$, then the rational action is not to offer bets at a point close to $x$; it's to find the closest number to $y$ possible. Why would you bet at 1:5 odds if you have reason to believe that some people would be happy to bet at 1:7 odds?

You could make an argument that this type of thinking is too mercenary/materialistic or whatever, but then critique should be about that. In any case the inference that offering a bet close to $x$ indicates beliefs close to $x$ is just not accurate.
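The point about offered odds versus beliefs can be sketched with a small expected-value calculation. This is only an illustration under stated assumptions: "1:5" and "1:7" are read here as risking 1 unit to win 5 or 7, and the function name `expected_value` and the win probability of 0.5 are hypothetical, not taken from the actual bet.

```python
def expected_value(p_win: float, stake: float, payout: float) -> float:
    """Expected profit of a bet: win `payout` with probability p_win, else lose `stake`."""
    return p_win * payout - (1.0 - p_win) * stake


# Hypothetical belief: the bettor thinks they win with probability 0.5.
p = 0.5

# Assumed reading of the odds: risk 1 unit to win 5, or risk 1 unit to win 7.
ev_at_1_to_5 = expected_value(p, stake=1.0, payout=5.0)
ev_at_1_to_7 = expected_value(p, stake=1.0, payout=7.0)

# Both bets are profitable under the same belief; the second is simply better.
print(ev_at_1_to_5, ev_at_1_to_7)
```

Under these assumptions the same belief makes both bets attractive, so the odds someone offers mostly reflect what they expect a counterparty to accept, not where their own estimate sits.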

Comment by Rafael Harth (sil-ver) on Reactions to METR task length paper are insane · 2025-04-11T10:34:57.863Z · LW · GW

I'm glad METR did this work, and I think their approach is sane and we should keep adding data points to this plot.

It sounds like you also think the current points on the plot are accurate? I would strongly dispute this, for all the reasons discussed here and here. I think you can find sets of tasks where the points fit on an exponential curve, but I don't think AI can do 1 hour worth of thinking on all, or even most, practically relevant questions.

Comment by Rafael Harth (sil-ver) on Rafael Harth's Shortform · 2025-04-08T10:14:58.896Z · LW · GW

In the last few months, GPT models have undergone a clear shift toward more casual language. They now often close a post by asking a question. I strongly dislike this from both a 'what will this do to the public's perception of LLMs' and 'how is my personal experience as a customer' perspective. Maybe this is the reason to finally take Gemini seriously.

Comment by Rafael Harth (sil-ver) on Prediction Markets Are Mediocre · 2025-04-05T18:15:52.601Z · LW · GW

I unfortunately don't think this proves anything relevant. The example just shows that there was one question where the market was very uncertain. This neither tells us how certain the market is in general (that depends on its confidence on other policy questions), nor how good this particular estimate was (that, I would argue, depends on how far along the information chart it was, which is not measurable -- but even putting my pet framework aside, it seems intuitively clear "it was 56% and then it happened" doesn't tell you how much information the market utilized).

The point is that even if voters did everything right and checked prediction markets as part of their decision making algorithm, it wouldn't help.

This depends on the first point, which again requires looking at a range of policy markets, not just one. And actually, I personally didn't expect Trump to do any tariffs at all (was 100% wrong there), so for me, the market would have updated me significantly in the right direction.

Comment by Rafael Harth (sil-ver) on AI 2027: What Superintelligence Looks Like · 2025-04-04T09:15:41.918Z · LW · GW

Thanks. I've submitted my own post on the 'change our mind form', though I'm not expecting a bounty. I'd instead be interested in making a much bigger bet (bigger than Cole's 100 USD), gonna think about what resolution criterion is best.

Comment by Rafael Harth (sil-ver) on Rafael Harth's Shortform · 2025-04-01T13:59:12.681Z · LW · GW

I might be misunderstanding how this works, but I don't think I'm gonna win the virtue of The Void anytime soon. Or at all.

Comment by Rafael Harth (sil-ver) on On Downvotes, Cultural Fit, and Why I Won’t Be Posting Again · 2025-04-01T08:42:40.667Z · LW · GW

Yeah, valid correction.

Comment by Rafael Harth (sil-ver) on On Downvotes, Cultural Fit, and Why I Won’t Be Posting Again · 2025-04-01T08:38:26.196Z · LW · GW

If people downvoted because they thought the argument wasn’t useful, fine - but then why did no one say that? Why not critique the focus or offer a counter? What actually happened was silence, followed by downvotes. That’s not rational filtering. That’s emotional rejection.

Yeah, I do not endorse the reaction. The situation pattern-matches to other cases where someone new writes things that are so confusing and all over the place that making them ditch the community (which is often the result of excessive downvoting) is arguably a good thing. But I don't think this was the case here. Your essays look to me to be coherent (and also probably correct). I hadn't seen any of them before this post but I wouldn't have downvoted. My model is that most people are not super strategic about this kind of thing and just go "talking politics -> bad" without really thinking through whether demotivating the author is good in this case.

So if I understand you correctly: you didn’t read the essay, and you’re explaining that other people who also didn’t read the essay dismissed it as “political” because they didn’t read it.

Yes -- from looking at it, it seems like it's something I agree with (or if not, disagree for reasons that I'm almost certain won't be addressed in the text), so I didn't see a reason to read. I mean reading is a time investment, you have to give me a reason to invest that time, that's how it works. But I thought the (lack of) reaction was unjustified, so I wanted to give you a better model of what happened, which also doesn't take too much time.

Most people say capitalism makes alignment harder. I’m saying it makes alignment structurally impossible.

The point isn’t to attack capitalism. It’s to explain how a system optimised for competition inevitably builds the thing that kills us.

I mean that's all fine, but those are nuances which only become relevant after people read, so it doesn't really change the dynamic I've outlined. You have to give people a reason to read first, and then put more nuances into the text. Idk if this helps but I've learned this lesson the hard way by spending a ridiculous amount of time on a huge post that was almost entirely ignored (this was several years ago).

(It seems like you got some reactions now fwiw, hope this may make you reconsider leaving.)

Comment by Rafael Harth (sil-ver) on On Downvotes, Cultural Fit, and Why I Won’t Be Posting Again · 2025-03-31T20:10:26.335Z · LW · GW

I think you probably don't have the right model of what motivated the reception. "AGI will lead to human extinction and will be built because of capitalism" seems to me like a pretty mainstream position on LessWrong. In fact I strongly suspect this is exactly what Eliezer Yudkowsky believes. The extinction part has been well-articulated, and the capitalism part is what I would have assumed is the unspoken background assumption. Like, yeah, if we didn't have a capitalist system, then the entire point about profit motives, pride, and race dynamics wouldn't apply. So... yeah, I don't think this idea is very controversial on LW (reddit is a different story).

I think the reason that your posts got rejected is that the focus doesn't seem useful. Getting rid of capitalism isn't tractable, so what is gained by focusing on this part of the causal chain? I think that's the part you're missing. And because this site is very anti-[political content], you need a very good reason to focus on politics. So I'd guess that what happened is that people saw the argument, thought it was political and not-useful, and consequently downvoted.

Comment by sil-ver on [deleted post] 2025-03-31T07:41:07.533Z

Sorry, but isn't this written by an LLM? Especially since milan's other comments ([1], [2], [3]) are clearly in a different style, the emotional component goes from 9/10 to 0/10 with no middle ground.

I find this extremely offensive (and I'm kinda hard to offend I think), especially since I've 'cooperated' with milan's wish to point to specific sections in the other comment. LLMs in posts is one thing, but in comments, yuck. It's like, you're not worthy of me even taking the time to respond to you.

The guidelines don't differentiate between posts and comments but this violates them regardless (and actually the post does as well) since it very much does have the stereotypical writing style of an AI assistant, and the comment also seems copy-pasted without a human element at all.

A rough guideline is that if you are using AI for writing assistance, you should spend a minimum of 1 minute per 50 words (enough to read the content several times and perform significant edits), you should not include any information that you can't verify, haven't verified, or don't understand, and you should not use the stereotypical writing style of an AI assistant.

Comment by sil-ver on [deleted post] 2025-03-30T21:38:39.703Z

The sentence you quoted is a typo; it's meant to say that formal languages are extremely impractical.

Comment by sil-ver on [deleted post] 2025-03-30T14:43:42.266Z

Here's one section that strikes me as very bad:

At its heart, we face a dilemma that captures the paradox of a universe so intricately composed, so profoundly mesmerizing, that the very medium on which its poem is written—matter itself—appears to have absorbed the essence of the verse it bears. And that poem, unmistakably, is you—or more precisely, every version of you that has ever been, or ever will be.

I know what this is trying to do but invoking mythical language when discussing consciousness is very bad practice since it appeals to an emotional response. Also it's hard to read.

Similar things are true for lots of other sections here: very unnecessarily poetic language. I guess you can say that this is policing tone, but I think it's valid to police tone if the tone is manipulative (on top of just making it harder and more time-intensive to read).

Since you asked for a section that's explicitly nonsense rather than just bad, I think this one deserves the label:

We can encode mathematical truths into natural language, yet we cannot fully encode human concepts—such as irony, ambiguity, or emotional nuance—into formal language. Therefore: Natural language is at least as expressive as formal language.

First of all, if you can't encode something, it could just be that the thing is not well-defined, rather than that the system is insufficiently powerful.

Second, the way this is written (unless the claim is further justified elsewhere) implies that the inability to encode human concepts in formal languages is self-evident, presumably because no one has managed it so far. This is completely untrue; formal[^1] languages are extremely impractical, which is why mathematicians don't write any real proofs in them. If a human concept like irony could be encoded, it would be extremely long and way way beyond the ability of any human to write down. So even if it were theoretically possible, we almost certainly wouldn't have done it yet, which means that it not having been done yet is negligible evidence of it being impossible.

[1]: typo corrected from "natural"

Comment by sil-ver on [deleted post] 2025-03-30T14:31:29.406Z

I agree that this sounds not very valuable; sounds like a repackaging of illusionism without adding anything. I'm surprised about the votes (didn't vote myself).

Comment by Rafael Harth (sil-ver) on Wei Dai's Shortform · 2025-03-29T12:12:34.514Z · LW · GW

The One True Form of Moral Progress (according to me) is using careful philosophical reasoning to figure out what our values should be, what morality consists of, where our current moral beliefs are wrong, or generally, the contents of normativity (what we should and shouldn't do)

Are you interested in hearing other people's answers to these questions (if they think they have them)?

Comment by Rafael Harth (sil-ver) on An argument for asexuality · 2025-03-27T22:00:25.314Z · LW · GW

I agree with various comments that the post doesn't represent all the tradeoffs, but I strong-upvoted this because I think the question is legit interesting. It may be that the answer is no for almost everyone, but it's not obvious.

Comment by Rafael Harth (sil-ver) on Rafael Harth's Shortform · 2025-03-21T11:02:28.746Z · LW · GW

For those who work on Windows, a nice little quality of life improvement for me was just to hide desktop icons and do everything by searching in the task bar. (Would be even better if the search function wasn't so odd.) Been doing this for about two years and like it much more.

Maybe for others, using the desktop is actually worth it, but for me, it was always cluttering up over time, and the annoyance over it not looking the way I want always outweighed the benefits. It really takes barely longer to go CTRL+ESC+"firef"+ENTER than to double click an icon.

Comment by Rafael Harth (sil-ver) on How far along Metr's law can AI start automating or helping with alignment research? · 2025-03-21T10:59:34.084Z · LW · GW

I don't think I get it. If I read this graph correctly, it seems to say that if you let a human play chess against an engine and want it to achieve equal performance, then the amount of time the human needs to think grows exponentially (as the engine gets stronger). This doesn't make sense if extrapolated downward, but upward it's about what I would expect. You can compensate for skill by applying more brute force, but it becomes exponentially costly, which fits the exponential graph.

It's probably not perfect -- I'd worry a lot about strategic mistakes in the opening -- but it seems pretty good. So I don't get how this is an argument against the metric.

Comment by Rafael Harth (sil-ver) on How far along Metr's law can AI start automating or helping with alignment research? · 2025-03-20T23:51:56.023Z · LW · GW

Not answerable because METR is a flawed measure, imho.

Comment by Rafael Harth (sil-ver) on Why am I getting downvoted on Lesswrong? · 2025-03-20T11:56:00.549Z · LW · GW

Should I not have began by talking about background information & explaining my beliefs? Should I have the audience had contextual awareness and gone right into talking about solutions? Or was the problem more along the lines of writing quality, tone, or style?

What type of post do you like reading?

Would it be alright if I asked for an example so that I could read it?

This is a completely wrong way to think about it, imo. A post isn't this thing with inherent terminal value that you can optimize for regardless of content.

If you think you have an insight that the remaining LW community doesn't have, then and only then^[1] should you consider writing a post. Then the questions become: is the insight actually valid, and did I communicate it properly? And yes, the second one is a huge topic -- so if in fact you have something valuable to say, then sure, you can spend a lot of time trying to figure out how to do that, and what e.g. lsusr said is fine advice. But first you need to actually have something valuable to say. If you don't, then the only good action is to not write a post. Starting off by just wanting to write something is bound to be not-fruitful.

[1]: Yes, technically there can be other goals of a post (like if it's fiction), but this is the central case.

Comment by Rafael Harth (sil-ver) on METR: Measuring AI Ability to Complete Long Tasks · 2025-03-20T11:22:32.430Z · LW · GW

I really don't think this is a reasonable measure for ability to do long term tasks, but I don't have the time or energy to fight this battle, so I'll just register my prediction that this paper is not going to age well.

Comment by Rafael Harth (sil-ver) on Metacognition Broke My Nail-Biting Habit · 2025-03-17T10:08:11.475Z · LW · GW

To I guess offer another data point, I've had an obsessive nail-removing^[1] habit for about 20 years. I concur that it can happen unconsciously; however noticing it seems to me like 10-20% of the problem; the remaining 80-90% is resisting the urge to follow the habit when you do notice. (As for enjoying it, I think technically yeah but it's for such a short amount of time that it's never worth it. Maybe if you just gave in and were constantly biting instead of trying to resist for as long as possible, it'd be different.) I also think I've solved the noticing part without really applying any specific technique.

But I don't think this means the post can't still be valuable for cases where noticing is the primary obstacle.

[1]: I'm not calling it nail-biting bc it's not about the biting itself; I can equally remove them with my other fingernails.

Comment by Rafael Harth (sil-ver) on Metacognition Broke My Nail-Biting Habit · 2025-03-17T09:47:45.882Z · LW · GW

Oh, nice! The fact that you didn't make the time explicit in the post made me suspect that it was probably much shorter. But yeah, six months is long enough, imo.

Comment by Rafael Harth (sil-ver) on Metacognition Broke My Nail-Biting Habit · 2025-03-16T13:38:33.339Z · LW · GW

I would highly caution declaring victory too early. I don't know for how long you think you've overcome the habit, but unless it's at least three months, I think you're being premature.

Comment by Rafael Harth (sil-ver) on A Bear Case: My Predictions Regarding AI Progress · 2025-03-07T14:05:32.918Z · LW · GW

A larger number of people, I think, desperately desperately want LLMs to be a smaller deal than what they are.

Can confirm that I'm one of these people (and yes, I worry a lot about this clouding my judgment).

Comment by Rafael Harth (sil-ver) on Why it's so hard to talk about Consciousness · 2025-03-06T23:22:54.107Z · LW · GW

Again, those are theories of consciousness, not definitions of consciousness.

I would agree that people who use consciousness to denote the computational process vs. the fundamental aspect generally have different theories of consciousness, but they're also using the term to denote two different things.

(I think this is bc consciousness is notably different from other phenomena -- e.g., fiber decreasing risk of heart disease -- where the phenomenon is relatively uncontroversial and only the theory about how the phenomenon is explained is up for debate. With consciousness, there are a bunch of "problems" about which people debate whether they're even real problems at all (e.g., binding problem, hard problem). Those kinds of disagreements are likely causally upstream of inconsistent terminology.)

Comment by Rafael Harth (sil-ver) on Have LLMs Generated Novel Insights? · 2025-02-26T12:01:32.401Z · LW · GW

I think the ability to autonomously find novel problems to solve will emerge as reasoning models scale up. It will emerge because it is instrumental to solving difficult problems.

This of course is not a sufficient reason. (Demonstration: telepathy will emerge [as evolution improves organisms] because it is instrumental to navigating social situations.) It being instrumental means that there is an incentive -- or to be more precise, a downward slope in the loss function toward areas of model space with that property -- which is one required piece, but it also must be feasible. E.g., if the parameter space doesn't have any elements that are good at this ability, then it doesn't matter whether there's a downward slope.

Fwiw I agree with this:

Current LLMs are capable of solving novel problems when the user does most the work: when the user lays the groundwork and poses the right question for the LLM to answer.

... though like you I think posing the right question is the hard part, so imo this is not very informative.

Comment by Rafael Harth (sil-ver) on Have LLMs Generated Novel Insights? · 2025-02-25T18:33:23.871Z · LW · GW

Instead of "have LLMs generated novel insights", how about "have LLMs demonstrated the ability to identify which views about a non-formal topic make more or less sense?" This question seems easier to operationalize and I suspect points at a highly related ability.

Comment by Rafael Harth (sil-ver) on Yair Halberstadt's Shortform · 2025-02-24T19:26:46.837Z · LW · GW

Fwiw this is the kind of question that has definitely been answered in the training data, so I would not count this as an example of reasoning.

Comment by Rafael Harth (sil-ver) on The Unearned Privilege We Rarely Discuss: Cognitive Capability · 2025-02-18T21:18:30.038Z · LW · GW

I'm just not sure the central claim, that rationalists underestimate the role of luck in intelligence, is true. I've never gotten that impression. At least my assumption going into reading this was already that intelligence was probably 80-90% unearned.

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-17T14:44:43.263Z · LW · GW

Humans must have gotten this ability from somewhere and it's unlikely the brain has tons of specialized architecture for it.

This is probably a crux; I think the brain does have tons of specialized architecture for it, and if I didn't believe that, I probably wouldn't think thought assessment was as difficult.

The thought generator seems more impressive/fancy/magic-like to me.

Notably people's intuitions about what is impressive/difficult tend to be inversely correlated with reality. The stereotype is (or at least used to be) that AI will be good at rationality and reasoning but struggle with creativity, humor, and intuition. This stereotype contains information since inverting it makes better-than-chance predictions about what AI has been good at so far, especially LLMs.

I think this is not a coincidence but roughly because people use "degree of conscious access" as an inverse proxy for intuitive difficulty. The more unconscious something is, the more it feels like we don't know how it works, the more difficult it intuitively seems. But I suspect degree of conscious access positively correlates with difficulty.

If sequential reasoning is mostly a single trick, things should get pretty fast now. We'll see soon? :S

Yes; I think the "single trick" view might be mostly confirmed or falsified in as little as 2-3 years. (If I introspect I'm pretty confident that I'm not wrong here, the scenario that frightens me is more that sequential reasoning improves non-exponentially but quickly, which I think could still mean doom, even if it takes 15 years. Those feel like short timelines to me.)

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-16T22:21:02.578Z · LW · GW

Whether or not every interpretation needs a way to connect measurements to conscious experiences, or whether they need extra machinery?

If we're being extremely pedantic, then KC is about predicting conscious experience (or sensory input data, if you're an illusionist; one can debate what the right data type is). But this only matters for discussing things like Boltzmann brains. As soon as you assume that there exists an external universe, you can forget about your personal experience and just try to estimate the length of the program that runs the universe.

So practically speaking, it's the first one. I think what quantum physics does with observers falls out of the math and doesn't require any explicit treatment. I don't think Copenhagen gets penalized for this, either. The wave function collapse increases complexity because it's an additional rule that changes how the universe operates, not because it has anything to do with observers. (As I mentioned, I think the 'good' version of Copenhagen doesn't mention observers, anyway.)

If you insist on the point that interpretation relates to an observer, then I'd just say that "interpretation of quantum mechanics" is technically a misnomer. It should just be called "theory of quantum mechanics". Interpretations don't have KC; theories do. We're comparing different source codes for the universe.

steelmanning

I think this argument is analogous to giving white credit for this rook check, which is in fact a good move that allows white to win the queen next move -- when in actual fact white just didn't see that the square was protected and blundered a rook. The existence of the queen-winning tactic increases the objective evaluation of the move, but once you know that white didn't see it, it should not increase your skill estimate of white. You should judge the move as if the tactic didn't exist.

Similarly, the existence of a way to salvage the argument might make the argument better in the abstract, but should not influence your assessment of DeepSeek's intelligence, provided we agree that DeepSeek didn't know it existed. In general, you should never give someone credit for areas of the chess/argument tree that they didn't search.

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-16T14:09:51.273Z · LW · GW

The reason we can expect Copenhagen-y interpretations to be simpler than other interpretations is because every other interpretation also needs a function to connect measurements to conscious experiences, but usually requires some extra machinery in addition to that.

I don't believe this is correct. But I separately think that it being correct would not make DeepSeek's answer any better. Because that's not what it said, at all. A bad argument does not improve because there exists a different argument that shares the same conclusion.

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-15T20:27:40.802Z · LW · GW

Here's my take; not a physicist.

So in general, what DeepSeek says here might align better with intuitive complexity, but the point of asking about Kolmogorov Complexity rather than just Occam's Razor is that we're specifically trying to look at formal description length and not intuitive complexity.
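For reference, a sketch of the standard definition being appealed to here. The symbols $U$ (a fixed universal machine), $p$ (a program), and $|p|$ (the length of $p$ in bits) are notation introduced for illustration and do not appear in the original exchange:

$$K_U(x) = \min \{\, |p| : U(p) = x \,\}$$

On this reading, the question is roughly which interpretation admits the shortest program that reproduces the universe's behavior, regardless of how intuitive that program feels.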

Many Worlds does not need extra complexity to explain the branching. The branching happens due to the part of the math that all theories agree on. (In fact, I think a more accurate statement is that the branching is a description of what the math does.)

Then there's the wavefunction collapse. So first of all, wavefunction collapse is an additional postulate not contained in the remaining math, so it adds complexity. (... and the lack of the additional postulate does not add complexity, as DeepSeek claimed.) And then there's a separate issue with KC arguably being unable to model randomness at all. You could argue that this is a failure of the KC formalism and we need KC + randomness oracle to even answer the question. You could also be hardcore about it and argue that any nondeterministic theory is impossible to describe and therefore has KC $\infty$. In either case, the issue of randomness is something you should probably bring up in response to the question.

And finally there's the observer role. Iiuc the less stupid versions of Copenhagen do not give a special role to an observer; there's a special role for something being causally entangled with the experiment's result, but it doesn't have to be an agent. This is also not really a separate principle from the wave function collapse I don't think, it's what triggers collapse. And then it doesn't make any sense to list as a strength of Copenhagen because if anything it increases description length.

There are variants of KC that penalize the amount of stuff that is created rather than just the description length, I believe, in which case MW would have very high KC. This is another thing DeepSeek could have brought up.

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T15:59:49.754Z · LW · GW

[...] I personally wouldn’t use the word ‘sequential’ for that—I prefer a more vertical metaphor like ‘things building upon other things’—but that’s a matter of taste I guess. Anyway, whatever we want to call it, humans can reliably do a great many steps, although that process unfolds over a long period of time.

…And not just smart humans. Just getting around in the world, using tools, etc., requires giant towers of concepts relying on other previously-learned concepts.

As a clarification for anyone wondering why I didn't use a framing more like this in the post, it's because I think these types of reasoning (horizontal and vertical/A and C) are related in an important way, even though I agree that C might be qualitatively harder than A (hence section §3.1). Or to put it differently, if one extreme position is "we can look entirely at A to extrapolate LLM performance into the future" and the other is "A and C are so different that progress on A is basically uninteresting", then my view is somewhere near the middle.

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T14:56:07.278Z · LW · GW

It's not clear to me that an human, using their brain and a go board for reasoning could beat AlphaZero even if you give them infinite time.

I agree but I dispute that this example is relevant. I don't think there is any step in between "start walking on two legs" to "build a spaceship" that requires as much strictly-type-A reasoning as beating AlphaZero at go or chess. This particular kind of capability class doesn't seem to me to be very relevant.

Also, to the extent that it is relevant, a smart human with infinite time could outperform AlphaGo by programming a better chess/go computer. Which may sound silly but I actually think it's a perfectly reasonable reply -- using narrow AI to assist in brute-force cognitive tasks is something humans are allowed to do. And it's something that LLMs are also allowed to do; if they reach superhuman performance on general reasoning, and part of how they do this is by writing python scripts for modular subproblems, then we wouldn't say that this doesn't count.

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T14:04:56.946Z · LW · GW

I do think the human brain uses two very different algorithms/architectures for thought generation and assessment. But this falls within the "things I'm not trying to justify in this post" category. I think if you reject the conclusion based on this, that's completely fair. (I acknowledged in the post that the central claim has a shaky foundation. I think the model should get some points because it does a good job retroactively predicting LLM performance -- like, why LLMs aren't already superhuman -- but probably not enough points to convince anyone.)

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T14:00:09.265Z · LW · GW

I don't think a doubling every 4 or 6 months is plausible. I don't think a doubling on any fixed time is plausible because I don't think overall progress will be exponential. I think you could have exponential progress on thought generation, but this won't yield exponential progress on performance. That's what I was trying to get at with this paragraph:

My hot take is that the graphics I opened the post with were basically correct in modeling thought generation. Perhaps you could argue that progress wasn't quite as fast as the most extreme versions predicted, but LLMs did go from subhuman to superhuman thought generation in a few years, so that's pretty fast. But intelligence isn't a singular capability; it's ~~two capabilities~~ a phenomenon better modeled as two capabilities, and increasing just one of them happens to have sub-linear returns on overall performance.

So far (as measured by the 7card puzzle, which I think is a fair data point) I think we went from 'no sequential reasoning whatsoever' to 'attempted sequential reasoning but basically failed' (Jun13 update) to now being able to do genuine sequential reasoning for the first time. And if you look at how DeepSeek does it, to me this looks like the kind of thing where I expect difficulty to grow exponentially with argument length. (Based on stuff like it constantly having to go back and double-check even when it got something right.)

What I'd expect from this is not a doubling every N months, but perhaps an ability to reliably do one more step every N months. I think this translates into above-constant returns on the "horizon length" scale -- because I think humans need more than 2x time for 2x steps -- but not exponential returns.
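
To make the difference between the two growth models concrete, here is a small Python sketch. Everything numeric in it is an assumption for illustration: the function names, the starting step count, and in particular the exponent 1.5 standing in for "humans need more than 2x time for 2x steps". It is not fitted to any data.

```python
# Illustrative only: all names and numbers here are assumptions, not measurements.

def steps_doubling(periods: int, start: int = 1) -> int:
    """Reliable reasoning steps if the count doubles every period of N months."""
    return start * 2 ** periods


def steps_additive(periods: int, start: int = 1) -> int:
    """Reliable reasoning steps if one more step is gained every period."""
    return start + periods


def horizon(steps: int, exponent: float = 1.5) -> float:
    """Human time needed for a task of `steps` steps.

    Assumes time grows as steps**exponent; any exponent above 1 encodes
    "more than 2x time for 2x steps". The value 1.5 is hypothetical.
    """
    return steps ** exponent


if __name__ == "__main__":
    print("periods  doubling_steps  additive_steps  additive_horizon")
    for periods in (0, 2, 4, 8, 16):
        additive = steps_additive(periods)
        print(periods, steps_doubling(periods), additive, round(horizon(additive), 1))
```

Under these assumptions the additive model's horizon length grows polynomially (faster than linearly, because the exponent is above 1), while the doubling model's step count grows exponentially.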

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T13:15:33.424Z · LW · GW

This is true but I don't think it really matters for eventual performance. If someone thinks about a problem for a month, the number of times they went wrong on reasoning steps during the process barely influences the eventual output. Maybe they take a little longer. But essentially performance is relatively insensitive to errors if the error-correcting mechanism is reliable.

I think this is actually a reason why most benchmarks are misleading (humans make mistakes there, and they influence the rating).
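
A toy calculation of the error-correction point, as a rough sketch. The model is an assumption: each reasoning step independently goes wrong with probability `p_error`, and in the corrected case a perfectly reliable checker catches every mistake so the step is simply redone. The function names are made up for this example.

```python
# Toy model, all assumptions: independent per-step errors and a perfect checker.

def expected_attempts(n_steps: int, p_error: float) -> float:
    """Expected total attempts when every wrong step is caught and retried.

    Attempts per step follow a geometric distribution with mean 1 / (1 - p_error),
    so errors only stretch the time taken; the final output is still correct.
    """
    return n_steps / (1.0 - p_error)


def success_without_correction(n_steps: int, p_error: float) -> float:
    """Probability that all steps are right when nothing catches mistakes."""
    return (1.0 - p_error) ** n_steps


if __name__ == "__main__":
    n, p = 100, 0.2
    print("expected attempts with correction:", expected_attempts(n, p))
    print("success probability without correction:", success_without_correction(n, p))
```

In this model a 20% per-step error rate costs about 25% more attempts when correction is reliable, whereas without correction a 100-step argument almost never comes out right.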

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T10:50:31.817Z · LW · GW

If thought assessment is as hard as thought generation and you need a thought assessor to get AGI (two non-obvious conditionals), then how do you estimate the time to develop a thought assessor? From which point on do you start to measure the amount of time it took to come up with the transformer architecture?

The snappy answer would be "1956 because that's when AI started; it took 61 years to invent the transformer architecture that led to thought generation, so the equivalent insight for thought assessment will take about 61 years". I don't think that's the correct answer, but neither is "2019 because that's when AI first kinda resembled AGI".

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-13T21:46:33.542Z · LW · GW

I generally think that [autonomous actions due to misalignment] and [human misuse] are distinct categories with pretty different properties. The part you quoted addresses the former (as does most of the post). I agree that there are scenarios where the second is feasible and the first isn't. I think you could sort of argue that this falls under AIs enhancing human intelligence.

Comment by Rafael Harth (sil-ver) on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-13T21:43:47.503Z · LW · GW

So, I agree that there has been substantial progress in the past year, hence the post title. But I think if you naively extrapolate that rate of progress, you get around 15 years.

The problem with the three examples you've mentioned is again that they're all comparing human cognitive work across a short amount of time with AI performance. I think the relevant scale doesn't go from 5th-grade performance through 8th-grade performance to university-level performance or whatever, but from "what a smart human can do in 5 minutes" to "what a smart human can do in an hour" to "what a smart human can do in a day", and so on.

I don't know if there is an existing benchmark that measures anything like this. (I agree that more concrete examples would improve the post, fwiw.)

And then a separate problem is that math problems are in the easiest category from §3.1 (as are essentially all benchmarks).

Comment by Rafael Harth (sil-ver) on Those of you with lots of meditation experience: How did it influence your understanding of philosophy of mind and topics such as qualia? · 2025-02-13T16:51:13.892Z · LW · GW

I don't think the experience of no-self contradicts any of the above.

In general, I think you could probably make some factual statements about the nature of consciousness that are true and that you learn from attaining no-self, if you phrased them very carefully, but I don't think that's the point.

The way I'd phrase what happens would be mostly in terms of attachment. You don't feel as implicated by things that affect you anymore, you have less anxiety, that kind of thing. I think a really good analogy is just that regular consciousness starts to resemble consciousness during a flow state.

Comment by Rafael Harth (sil-ver) on How identical twin sisters feel about nieces vs their own daughters · 2025-02-09T22:25:53.874Z · LW · GW

I would have been shocked if twin sisters cared equally about nieces and kids. Genetic similarity is one factor, not the entire story.

Comment by Rafael Harth (sil-ver) on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-02T11:55:56.693Z · LW · GW

I think this is true but also that "most people's reasons for believing X are vibes-based" is true for almost any X that is not trivially verifiable. And also that this way of forming beliefs works reasonably well in many cases. This doesn't contradict anything you're saying but feels worth adding, like I don't think AI timelines are an unusual topic in that regard.

Comment by Rafael Harth (sil-ver) on o3 · 2025-02-02T09:55:56.263Z · LW · GW

Tricky to answer actually.

I can say more about my model now. The way I'd put it now (h/t Steven Byrnes) is that there are three interesting classes of capabilities:

A: sequential reasoning of any kind
B: sequential reasoning on topics where steps aren't easily verifiable
C: the type of thing Steven mentions here, like coming up with new abstractions/concepts to integrate into your vocabulary to better think about something

Among these, obviously B is a subset of A. And while it's not obvious, I think C is probably best viewed as a subset of B. Regardless, I think all three are required for what I'd call AGI. (This is also how I'd justify the claim that no current LLM is AGI.) Maybe C isn't strictly required, I could imagine a mind getting superhuman performance without it, but I think given how LLMs work otherwise, it's not happening.

Up until DeepSeek, I would have also said LLMs are terrible at A. (This is probably a hot take, but I genuinely think it's true despite benchmark performances continuing to go up.) My tasks were designed to test A, with the hypothesis that LLMs will suck at A indefinitely. For a while, it seemed like people weren't even focusing on A, which is why I didn't want to talk about it. But this concern is no longer applicable; the new models are clearly focused on improving sequential reasoning. However, o1 was terrible at it (imo), almost no improvement from GPT-4 proper, so I actually found o1 reassuring.

This has now mostly been falsified with DeepSeek and o3. (I know the numbers don't really tell the story since it just went from 1 to 2, but like, including which stuff they solved and how they argue, DeepSeek was the point where I went "oh shit they can actually do legit sequential reasoning now".) Now I'm expecting most of the other tasks to fall as well, so I won't do similar updates if it goes to 5/10 or 8/10. The hypothesis "A is an insurmountable obstacle" can only be falsified once.

That said, it still matters how fast they improve. How much it matters depends on whether you think better performance on A is progress toward B/C. I'm still not sure about this, I'm changing my views a lot right now. So idk. If they score 10/10 in the next year, my p(LLMs scale to AGI) will definitely go above 50%, probably if they do it in 3 years as well, but that's about the only thing I'm sure about.

Comment by Rafael Harth (sil-ver) on o3 · 2025-02-01T13:42:58.969Z · LW · GW

o3-mini-high gets 3/10; this is essentially the same as DeepSeek (there were two where DeepSeek came very close, this is one of them). I'm still slightly more impressed with DeepSeek despite the result, but it's very close.

Just chiming in to say that I'm also interested in the correlation between camps and meditation. Especially from people who claim to have experienced the jhanas.
