d0themath

In that case I think your response is a non sequitur, since clearly “really care” in this context means “determiners of what they end up doing in practice re influencing x-risk”.

Comment by Garrett Baker (D0TheMath) on johnswentworth's Shortform · 2025-04-23T02:19:59.520Z · LW · GW

Conjecture seems unusually good at sticking to reality across multiple domains.

I do not get this impression, why do you say this?

Comment by Garrett Baker (D0TheMath) on ErioirE's shortform: · 2025-04-22T21:44:26.923Z · LW · GW

In this case prediction markets will be predictably over-optimistic, and expert consensus is very split.

Comment by Garrett Baker (D0TheMath) on Research Notes: Running Claude 3.7, Gemini 2.5 Pro, and o3 on Pokémon Red · 2025-04-21T17:28:26.208Z · LW · GW

There is a question of interest here though: why does pixel art work so well on humans despite literally nothing in real life being pixel art?

I’m reminded of Gwern’s comments on the difficulty of getting GANs to generate novel pixel art interpolations

Pixel-art anything is derivative of a photorealistic world. If you look at 8-bit art and standard block sizes like Mario in NES Mario, if you were not already intimately familiar with the distribution of human faces, and had to learn starting with a completely blank slate like a GAN would, how would you ever understand what a human face was such that you could imagine correct variations like Princess Peach or Luigi? Or if you wanted to generate Pokemon, which are all based heavily on real animals, how would the model know anything about horses or zebras or hamsters or butterflies and be able to generate a sprite of Butterfree independently? If you look at the Pokemon GAN failure cases closer and compare them to ‘real’ Pokemon, you start to realize to what an extent each Pokemon is derivative of several real-world animals or objects—Pokemon in some ways do not exist in their own right, they are only shorthand or mnemonics of other things. Pikachu is the “electric mouse”: but if you had never seen any electricity iconography like ‘lightning bolts’ or any rodents like hamsters or chinchilla or jerboa, how could you ever understand an image of a ‘Pikachu’ or generate a plausible rodent variation of it? If you could, you’d need a lot more Pikachu training data, that’s for sure. (One is reminded of the joke about the mathematicians telling jokes. They knew all each other’s favorites, you see, so they only needed to call out the number. “#9.” Sensible chuckles. “#81.” Laughter. “#17!” Chortling all around. The new grad student ventures his own joke: “…#112?” Outright hysteria! You see, they had never heard that one before…)

Comment by Garrett Baker (D0TheMath) on Richard Ngo's Shortform · 2025-04-21T07:05:32.893Z · LW · GW

I mean this situation is grounded & formal enough you can just go and implement the relevant RL algorithm and see if it's relevant for that computationally bounded agent, right?

Comment by Garrett Baker (D0TheMath) on A Dissent on Honesty · 2025-04-19T15:14:37.615Z · LW · GW

It seems pretty likely SBF happened because everyone in EA was implicitly trusting everyone else in EA. If people were more suspicious of each other, that seems less likely to have been allowed to happen.

Comment by Garrett Baker (D0TheMath) on Ryan Kidd's Shortform · 2025-04-18T07:58:33.160Z · LW · GW

Don’t double update! I got that information from that same interview!

Comment by Garrett Baker (D0TheMath) on Ryan Kidd's Shortform · 2025-04-17T21:47:32.645Z · LW · GW

My vague understanding is this is kinda what capabilities progress ends up looking like in big labs. Lots of very small experiments playing around with various parameters people with a track-record of good heuristics in this space feel should be played around with. Then a slow scale up to bigger and bigger models and then you combine everything together & "push to main" on the next big model run.

I'd also guess that the bottleneck isn't so much on the number of people playing around with the parameters, but much more on good heuristics regarding which parameters to play around with.

Comment by Garrett Baker (D0TheMath) on jacquesthibs's Shortform · 2025-04-17T21:36:50.779Z · LW · GW

It seems useful for those who disagreed to reflect on this LessWrong comment from ~3 months ago (around the time the Epoch/OpenAI scandal happened).

Comment by Garrett Baker (D0TheMath) on Alexander Gietelink Oldenziel's Shortform · 2025-04-14T00:42:25.928Z · LW · GW

The strong version of this argument seems false (eg Habryka's comment), but I think the weak version is true. That is, energy put into "purposely and deliberately develop a technology Y that is fundamentally different than X that does the same role as X without harm Z but slightly less competitively." is inefficient compared to energy put into strategies (i), (ii), and (iii).

Comment by Garrett Baker (D0TheMath) on shortplav · 2025-04-13T02:49:13.316Z · LW · GW

If it is encoding relevant info then this would be the definition of steganography

Comment by Garrett Baker (D0TheMath) on Will US tariffs push data centers for large model training offshore? · 2025-04-12T21:33:33.601Z · LW · GW

Note that "smartphones, computers and more electronics" are exempt. I'd guess this would include (or end up including) datacenters. The details of the exemption are here.

Comment by Garrett Baker (D0TheMath) on Jemist's Shortform · 2025-04-08T21:01:43.487Z · LW · GW

This hardly seems an argument against the one in the shortform, namely

Neither a physicalist nor a functionalist theory of consciousness can reasonably justify a number like this. Shrimp have 5 orders of magnitude fewer neurons than humans, so whether suffering is the result of a physical process or an information processing one, this implies that shrimp neurons do 4 orders of magnitude more of this process per second than human neurons. The authors get around this by refusing to stake themselves on any theory of consciousness.

If the original authors never thought of this that seems on them.
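To make the quoted arithmetic explicit, here is a rough sketch. It assumes, as the quoted argument does, that the total amount of the relevant process is neuron count times per-neuron rate, so gaps measured in orders of magnitude simply add. The function and constant names are mine, and the resulting ratio is only what the two quoted figures imply, not a number taken from the original report.

```python
def implied_total_gap(neuron_count_gap: int, per_neuron_rate_gap: int) -> int:
    """Orders of magnitude by which a shrimp's total differs from a human's.

    Assumes total = neuron count * per-neuron rate, so the gaps
    (in orders of magnitude) add. This is the quoted argument's
    assumption, not an established fact.
    """
    return neuron_count_gap + per_neuron_rate_gap


# Figures taken from the quoted passage.
NEURON_COUNT_GAP = -5     # shrimp have 5 orders of magnitude fewer neurons
PER_NEURON_RATE_GAP = 4   # claimed 4 orders of magnitude more process per neuron

total_gap = implied_total_gap(NEURON_COUNT_GAP, PER_NEURON_RATE_GAP)
print(f"implied shrimp/human ratio: 10^{total_gap}")
```

So under that assumption the two quoted figures together amount to a shrimp counting for roughly a tenth of a human.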

Comment by Garrett Baker (D0TheMath) on The first AI war will be in your computer · 2025-04-08T15:13:48.864Z · LW · GW

but most of the population will just succumb to the pressure. Okay Microsoft, if you insist that I use Edge, I will; if you insist that I use Bing, I will; if you insist that I have MSN as my starting web page, I will

Only about 5% of people use Edge, with 66% on Chrome and 17% on Safari. Bing is similar, with 4% market share and Google having about 90%. I don’t know the number with MSN as their starting page (my parents had this), but I’d guess it’s also lower than you expect. I think you overestimate the impact of nudge economics.

Comment by Garrett Baker (D0TheMath) on How much progress actually happens in theoretical physics? · 2025-04-05T19:24:19.824Z · LW · GW

That's an inference, presumably Adam believes that for object-level reasons, which could be supported by eg looking at the age at which physicists make major advancements^[1] and the size of those advancements.

Edit: But also this wouldn't show whether or not theoretical physics is actually in a rut, to someone who doesn't know what the field looks like now.

[1] Adjusted for similar but known-to-be-fast-moving fields like AI or biology, to normalize for facts like eg the academic job market just being worse now than previously.

Comment by Garrett Baker (D0TheMath) on Benito's Shortform Feed · 2025-04-05T18:15:36.654Z · LW · GW

Claude says it’s a gray area when I ask, since this isn’t asking for the journalist to make a general change to the story or present Ben or the subject in a particular light.

Comment by Garrett Baker (D0TheMath) on How much progress actually happens in theoretical physics? · 2025-04-05T07:50:31.416Z · LW · GW

This doesn’t seem to address the question, which was why do people believe there is a physics slow-down in the first place.

Comment by Garrett Baker (D0TheMath) on LoganStrohl's Shortform · 2025-04-05T03:04:28.856Z · LW · GW

(you also may want to look into other ways of improving your conscientiousness if you're struggling with that. Things like todo systems, or daily planners, or simply regularly trying hard things)

Comment by Garrett Baker (D0TheMath) on LoganStrohl's Shortform · 2025-04-05T03:02:39.113Z · LW · GW

It seems reasonable to mention that I know of many who have started doing "spells" like this with a rationalized "oh I'm just hypnotizing myself, I don't actually believe in magic" framing, who then go off the deep end and start actually believing in magic.

That's not to say this happens in every case or even in most cases. It's also not to say that hypnotizing yourself can't be useful sometimes. But it is to say that if you find this tempting to do because you really like the idea of magic existing in real life, I suggest you re-read some parts of the sequences.

Comment by Garrett Baker (D0TheMath) on Changing my mind about Christiano's malign prior argument · 2025-04-04T20:17:24.460Z · LW · GW

I'm not sure what the type signature of $L_{A}$ is, or what it means to "not take into account $M$'s simulation"

I know you know about logical decision theory, and I know you know it's not formalized, and I'm not going to be able to formalize it in a LessWrong comment, so I'm not sure what you want me to say here. Do you reject the idea of logical counterfactuals? Do you not see how they could be used here?

I think you've misunderstood me entirely. Usually in a decision problem, we assume the agent has a perfectly true world model, and we assume that it's in a particular situation (e.g. with omega and knowing how omega will react to different actions). But in reality, an agent has to learn which kind of world it's in using an inductor. That's all I meant by "get its beliefs".

Because we're talking about priors and their influence, all of this is happening inside the agent's brain. The agent is going about daily life, and thinks "hm, maybe there is an evil demon simulating me who will give me $-10^{10}$ utility if I don't do what they want for my next action". I don't see why this is obviously ill-defined without further specification of the training setup.

Comment by Garrett Baker (D0TheMath) on AI 2027: What Superintelligence Looks Like · 2025-04-04T20:06:23.420Z · LW · GW

Can you give something specific? It seems like pretty much every statement has a footnote grounding the relevant high-level claim in low-level indicators, and in cases where that's not the case, those predictions often seem clear derivatives of precise claims in eg their compute forecast

Comment by Garrett Baker (D0TheMath) on AI 2027: What Superintelligence Looks Like · 2025-04-04T19:18:26.297Z · LW · GW

I mean it's not like they shy away from concrete predictions. Eg their first prediction is

We forecast that mid-2025 agents will score 85% on SWEBench-Verified.

Edit: oh wait nevermind their first prediction is actually

Specifically, we forecast that they score 65% on the OSWorld benchmark of basic computer tasks (compared to 38% for Operator and 70% for a typical skilled non-expert human).

Comment by Garrett Baker (D0TheMath) on leogao's Shortform · 2025-04-04T19:05:39.780Z · LW · GW

The closing off of China after/during Tiananmen Square I don't think happened after a transition of power, though I could be misremembering. See also the one-child policy, which I also don't think happened during a power transition (allowed for 2 children in 2015, then removed all limits in 2021, while Xi came to power in 2012).

I agree the zero-covid policy change ended up being slow. I don't know why it was slow though, I know a popular narrative is that the regime didn't want to lose face, but one fact about China is the reason why many decisions are made is highly obscured. It seems entirely possible to me there were groups (possibly consisting of Xi himself) who believed zero-covid was smart. I don't know much about this though.

I will also say this is one example of China being abnormally slow against many examples of them being abnormally fast, and I think the abnormally fast examples win out overall.

Mao is kind of an exception, but that's because he had so much power that it was impossible to challenge his authority even when he messed up.

Ish? The reason he pursued the Cultural Revolution was that people were starting to question his power after the Great Leap Forward, but yeah, he could be an outlier. I do think that many autocracies are governed by charismatic & powerful leaders though, so not that much of an outlier.

Comment by Garrett Baker (D0TheMath) on Changing my mind about Christiano's malign prior argument · 2025-04-04T18:37:54.058Z · LW · GW

Let $M$ be an agent which can be instantiated in a much simpler world and has different goals from our limited Bayesian agent $A$. We say $M$ is malign with respect to $A$ if $p(q | O) < p(q_{M, A} | O)$, where $q$ is the "real" world and $q_{M, A}$ is the world where $M$ has decided to simulate all of $A$'s observations for the purpose of trying to invade their prior.

Now what influences $p(q_{M, A} | O)$? Well, $M$ will only simulate all of $A$'s observations if it expects this will give it some influence over $A$. Let $L_{A}$ be an unformalized logical counterfactual operation that $A$ could make.

Then $p(q_{M, A} | O, L_{A})$ is maximal when $L_{A}$ takes into account $M$'s simulation, and $0$ when $L_{A}$ doesn't take into account $M$'s simulation. In particular, if $L_{A, \neg M}$ is a logical counterfactual which doesn't take $M$'s simulation into account, then

$p(q_{M, A} | O, L_{A, \neg M}) = 0 < p(q | O, L_{A, \neg M})$

So the way in which the agent "gets its beliefs" about the structure of the decision theory problem is via these logical-counterfactual-conditional operations, same as in causal decision theory, and same as in evidential decision theory.

Comment by Garrett Baker (D0TheMath) on Changing my mind about Christiano's malign prior argument · 2025-04-04T17:25:16.771Z · LW · GW

no, I am not going to do what the evil super-simple-simulators want me to do because they will try to invade my prior iff (I would act like they have invaded my prior iff they invade my prior)

Comment by Garrett Baker (D0TheMath) on AI 2027: What Superintelligence Looks Like · 2025-04-04T17:17:35.970Z · LW · GW

This seems a pretty big backpedal from "I expect this to start not happening right away."

Comment by Garrett Baker (D0TheMath) on Changing my mind about Christiano's malign prior argument · 2025-04-04T17:15:53.045Z · LW · GW

My world model would have a loose model of myself in it, and this will change which worlds I'm more or less likely to be found in. For example, a logical decision theorist, trying to model omega, will have very low probability that omega has predicted it will two box.

Comment by Garrett Baker (D0TheMath) on leogao's Shortform · 2025-04-04T16:59:35.938Z · LW · GW

Autarchies, including China, seem more likely to reconfigure their entire economic and social systems overnight than democracies like the US, so this seems false.

Comment by Garrett Baker (D0TheMath) on Changing my mind about Christiano's malign prior argument · 2025-04-04T16:55:42.961Z · LW · GW

Oh my point wasn't against Solomonoff in general. Maybe more crisply, my claim is that different decision theories will find different "pathologies" in the Solomonoff prior. In particular, for causal and evidential decision theorists, I could totally buy the misaligned prior bit, and I could totally buy that, if formalized, the whole thing rests on the interaction between bad decision theory and Solomonoff.

Comment by Garrett Baker (D0TheMath) on Changing my mind about Christiano's malign prior argument · 2025-04-04T08:09:59.530Z · LW · GW

I think I mostly agree with this, I think things possibly get more complicated when you throw decision theory into the mix. I think it unlikely I'm being adversarially simulated in part. I could believe that such malign prior problems are actually decision theory problems much more than epistemic problems. Eg "no, I am not going to do what the evil super-simple-simulators want me to do because they will try to invade my prior iff (I would act like they have invaded my prior iff they invade my prior)".

Comment by Garrett Baker (D0TheMath) on Towards a scale-free theory of intelligent agency · 2025-04-02T20:09:29.631Z · LW · GW

I'm curious about what neuroscience evidence you're thinking of which supports that model.

Comment by Garrett Baker (D0TheMath) on CapResearcher's Shortform · 2025-04-01T01:04:08.368Z · LW · GW

I'm curious which model it was. Can you post some quotes? Especially after the mask dropped?

Comment by Garrett Baker (D0TheMath) on adamzerner's Shortform · 2025-03-31T15:11:45.497Z · LW · GW

That is the wrong question to ask. By their nature the results of experiments are unknown. The bar is whether, in expectation and on the margin, they provide positive externalities, and the answer is clearly yes.

Comment by Garrett Baker (D0TheMath) on plex's Shortform · 2025-03-29T00:32:19.689Z · LW · GW

Note the error bars in the original

Comment by Garrett Baker (D0TheMath) on Share AI Safety Ideas: Both Crazy and Not. №2 · 2025-03-28T19:47:45.719Z · LW · GW

After that I was writing shorter posts but without long context the things I write are very counterintuitive. So they got ruined)

This sounds like a rationalization. It seems much more likely the ideas just aren't that high quality if you need a whole hour for a single argument that couldn't possibly be broken up into smaller pieces that don't suck.

Edit: Since if the long post is disliked, you can say "well they just didn't read it", and if the short post is disliked you can say "well it just sucks because it's small". Meanwhile, it should in fact be pretty surprising you don't have any interesting or novel or useful insights in your whole 40 minute post which can't be explained in a reasonable length of blog post time.

Comment by Garrett Baker (D0TheMath) on Share AI Safety Ideas: Both Crazy and Not. №2 · 2025-03-28T19:15:54.735Z · LW · GW

usually if you randomly get a downvote early instead of an upvote, so your post has “-1” karma now, then no one else will open or read it

I will say that I often do read -1 downvoted posts, I will also say that much of the time it is deserved, despite how noisy a signal it may be.

Some of my articles take 40 minutes to read, so it can be anything, downvotes give me zero information and just demotivate more and more.

I think you should try writing shorter posts. Both for your sake (so you get more targeted information), and for the readers' sake.

Comment by Garrett Baker (D0TheMath) on sarahconstantin's Shortform · 2025-03-28T19:10:14.183Z · LW · GW

https://www.science.org/content/blog-post/alkanes-mars there are alkanes -- big organic molecules -- on Mars. these can be produced by abiotic processes, but usually that makes shorter chains than these. so....life? We Shall See.

Very exciting! I think the biggest "loophole" here is probably that they used a novel technique for detection, maybe if we used that technique more we would have to update the view that such big molecules are so unlikely to be produced non-biologically.

Comment by Garrett Baker (D0TheMath) on Daniel Tan's Shortform · 2025-03-28T19:01:03.938Z · LW · GW

I'm a bit skeptical, there's a reasonable amount of passed-down wisdom I've heard claiming (I think justifiably) that

1. If you write messy code and say "I'll clean it later", you probably won't. So insofar as you eventually want to discover something others build upon, you should write it clean from the start.
2. Clean code leads to easier extensibility, which seems pretty important eg if you want to try a bunch of different small variations on the same experiment.
3. Clean code decreases the number of bugs and the time spent debugging. This seems especially useful insofar as you are trying to rule out hypotheses with high confidence, or prove hypotheses with high confidence.
4. Generally (this may be double-counting 2 and 3), and paradoxically, writing clean code is faster than writing dirty code.

You say you came from a more SWE based paradigm though, so you probably know all this already.

Comment by Garrett Baker (D0TheMath) on 2024 Unofficial LessWrong Survey Results · 2025-03-26T20:50:40.293Z · LW · GW

Ok first, when naming things I think you should do everything you can to not use double-negatives. So you should say "gym average" or "no gym average". It's shorter, and much less confusing.

Second, I'm still confused. Translating what you said, we'd have "no gym removed average" -> "gym average" (since you remove everyone who doesn't go to the gym meaning the only people remaining go to the gym), and "gym removed average" -> "no gym average" (since we're removing everyone who goes to the gym meaning the only remaining people don't go to the gym).

Therefore we have,

gym average = no gym removed average < gym removed average = no gym average

So it looks like the gym doesn't help, since those who don't go to the gym have a higher average number of pushups they can do than those who go to the gym.
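To spell out the translation between the two naming schemes, here is a toy sketch. The respondent rows are made up purely for illustration (they are not from the survey), and the helper name is mine. The only point is that "X removed average" is the same quantity as the average over everyone who is not X.

```python
from statistics import mean

# Hypothetical toy rows: (goes_to_gym, pushups). NOT survey data.
respondents = [(True, 20), (True, 30), (False, 35), (False, 45)]


def average_after_removing(rows, remove_gym_goers: bool) -> float:
    """Average pushups after removing either the gym-goers or the non-gym-goers."""
    kept = [pushups for goes_to_gym, pushups in rows if goes_to_gym != remove_gym_goers]
    return mean(kept)


# The double-negative names from the thread.
gym_removed_average = average_after_removing(respondents, remove_gym_goers=True)
no_gym_removed_average = average_after_removing(respondents, remove_gym_goers=False)

# The direct names suggested above.
gym_average = mean(p for goes, p in respondents if goes)
no_gym_average = mean(p for goes, p in respondents if not goes)

# Removing the non-gym-goers leaves only gym-goers, and vice versa.
assert no_gym_removed_average == gym_average
assert gym_removed_average == no_gym_average
```

With these made-up numbers the gym average is below the no gym average, which is the same direction as the inequality above.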

Comment by Garrett Baker (D0TheMath) on johnswentworth's Shortform · 2025-03-26T02:21:37.434Z · LW · GW

Note: You can verify this is the case by filtering for male respondents with male partners and female respondents with female partners in the survey data

Comment by Garrett Baker (D0TheMath) on Will Jesus Christ return in an election year? · 2025-03-25T22:50:00.438Z · LW · GW

I think the math works out to be that the variation is much more extreme when you get to much more extreme probabilities. Going from 4% to 8% is 2x profits, but going from 50% to 58% is only 1.16x profits.
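A quick sketch of that arithmetic, assuming you buy a binary-outcome share at one market price and sell it at another, and ignoring fees and the time value of money. The function name is just for illustration.

```python
def profit_multiple(buy_price: float, sell_price: float) -> float:
    """Multiple on capital from buying a share at buy_price and selling at sell_price.

    Prices are market probabilities in (0, 1]. Fees and interest are ignored,
    which is a simplifying assumption.
    """
    if not 0 < buy_price <= 1 or not 0 < sell_price <= 1:
        raise ValueError("prices must be probabilities in (0, 1]")
    return sell_price / buy_price


# The two moves from the comment above.
tail_move = profit_multiple(0.04, 0.08)    # 4% -> 8%
middle_move = profit_multiple(0.50, 0.58)  # 50% -> 58%

print(f"4% -> 8%:   {tail_move:.2f}x")
print(f"50% -> 58%: {middle_move:.2f}x")
```

The 4-point move near the tail doubles your money, while the larger 8-point move near 50% returns only about 1.16x, because the multiple depends on the ratio of prices, not the difference.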

Comment by Garrett Baker (D0TheMath) on Daniel Tan's Shortform · 2025-03-25T16:17:33.220Z · LW · GW

This seems likely to depend on your preferred style of research, so what is your preferred style of research?

Comment by Garrett Baker (D0TheMath) on Linch's Shortform · 2025-03-25T16:14:44.126Z · LW · GW

And then if we say the bottleneck to meritocracy is mostly c rather than a or b, then in fact it seems like our society is absolutely obsessed with making our institutions highly accessible to as broad a pool of talent as possible. There are people who make a whole career out of just advocating for equality.

Comment by Garrett Baker (D0TheMath) on Recent AI model progress feels mostly like bullshit · 2025-03-25T16:11:58.560Z · LW · GW

I work at GDM so obviously take that into account here, but in my internal conversations about external benchmarks we take cheating very seriously -- we don't want eval data to leak into training data, and have multiple lines of defense to keep that from happening.

What do you mean by "we"? Do you work on the pretraining team, talk directly with the pretraining team, are just aware of the methods the pretraining team uses, or some other thing?

Comment by Garrett Baker (D0TheMath) on Linch's Shortform · 2025-03-25T05:01:15.794Z · LW · GW

More to the point, I haven't seen people try to scale those things either. The closest might be something like TripleByte? Or headhunting companies? Certainly when I think of a typical (or 95th-99th percentile) "person who says they care a lot about meritocracy" I'm not imagining a recruiter, or someone in charge of such a firm. Are you?

I think much of venture capital is trying to scale this thing, and as you said they don't use the framework you use. The philosophy there is much more oriented towards making sure nobody falls through the cracks. Provide the opportunity, then let the market allocate the credit.

That is, the way to scale meritocracy turns out to be maximizing c rather than the other considerations you listed, on current margins.

Comment by Garrett Baker (D0TheMath) on Linch's Shortform · 2025-03-25T01:59:11.768Z · LW · GW

Also this conclusion is highly dependent on you, who has thought about this topic for all of 10 minutes, out-thinking the hypothetical people who are actually serious about meritocracy. For example perhaps they do more one-on-one talent scouting or funding, which is indeed very very common and seems to be much more in-demand than psychometric evaluations.

Comment by Garrett Baker (D0TheMath) on Linch's Shortform · 2025-03-25T01:51:06.253Z · LW · GW

Given that ~ no one really does this, I conclude that very few people are serious about moving towards a meritocracy.

The field you should look at I think is Industrial and Organizational Psychology, as well as the classic Item Response Theory.

Comment by Garrett Baker (D0TheMath) on rhollerith_dot_com's Shortform · 2025-03-24T20:56:13.503Z · LW · GW

I suspect the vast majority of that sort of name-calling is much more politically motivated than based on not seeing the right slogans. For example if you go to Pause AI's website the first thing you see is a big, bold

and AI pause advocates are constantly arguing "no, we don't actually believe that" to the people who call them "luddites", but I have never actually seen anyone change their mind based on such an argument.

Comment by Garrett Baker (D0TheMath) on rhollerith_dot_com's Shortform · 2025-03-24T20:33:16.185Z · LW · GW

I don't think Pause AI's current bottleneck is people being pro AI in general not wanting to join (but of course I could be wrong). Most people are just against AI, and Pause AI's current strategy is to make them care enough about the issue to use their feet, while also telling them "it's much much worse than you would've imagined bro".

Comment by Garrett Baker (D0TheMath) on rhollerith_dot_com's Shortform · 2025-03-24T18:46:08.197Z · LW · GW

That's a whole seven words, most of which are a whole three syllables! There is no way a motto like that catches on.
