Posts

Lighthaven Sequences Reading Group #23 (Tuesday 02/25) 2025-02-23T05:01:25.105Z
Lighthaven Sequences Reading Group #22 (Tuesday 02/18) 2025-02-16T03:51:55.641Z
Lighthaven Sequences Reading Group #21 (Tuesday 02/11) 2025-02-06T20:49:34.290Z
Lighthaven Sequences Reading Group #20 (Tuesday 02/04) 2025-01-30T04:37:48.271Z
Lighthaven Sequences Reading Group #19 (Tuesday 01/28) 2025-01-26T00:02:49.220Z
Lighthaven Sequences Reading Group #18 (Tuesday 01/21) 2025-01-17T02:49:54.060Z
RESCHEDULED Lighthaven Sequences Reading Group #16 (Saturday 12/28) 2024-12-20T06:31:56.746Z
What and Why: Developmental Interpretability of Reinforcement Learning 2024-07-09T14:09:40.649Z
On Complexity Science 2024-04-05T02:24:32.039Z
So You Created a Sociopath - New Book Announcement! 2024-04-01T18:02:18.010Z
Announcing Suffering For Good 2024-04-01T17:08:12.322Z
Neuroscience and Alignment 2024-03-18T21:09:52.004Z
Epoch wise critical periods, and singular learning theory 2023-12-14T20:55:32.508Z
A bet on critical periods in neural networks 2023-11-06T23:21:17.279Z
When and why should you use the Kelly criterion? 2023-11-05T23:26:38.952Z
Singular learning theory and bridging from ML to brain emulations 2023-11-01T21:31:54.789Z
My hopes for alignment: Singular learning theory and whole brain emulation 2023-10-25T18:31:14.407Z
AI presidents discuss AI alignment agendas 2023-09-09T18:55:37.931Z
Activation additions in a small residual network 2023-05-22T20:28:41.264Z
Collective Identity 2023-05-18T09:00:24.410Z
Activation additions in a simple MNIST network 2023-05-18T02:49:44.734Z
Value drift threat models 2023-05-12T23:03:22.295Z
What constraints does deep learning place on alignment plans? 2023-05-03T20:40:16.007Z
Pessimistic Shard Theory 2023-01-25T00:59:33.863Z
Performing an SVD on a time-series matrix of gradient updates on an MNIST network produces 92.5 singular values 2022-12-21T00:44:55.373Z
Don't design agents which exploit adversarial inputs 2022-11-18T01:48:38.372Z
A framework and open questions for game theoretic shard modeling 2022-10-21T21:40:49.887Z
Taking the parameters which seem to matter and rotating them until they don't 2022-08-26T18:26:47.667Z
How (not) to choose a research project 2022-08-09T00:26:37.045Z
Information theoretic model analysis may not lend much insight, but we may have been doing them wrong! 2022-07-24T00:42:14.076Z
Modelling Deception 2022-07-18T21:21:32.246Z
Another argument that you will let the AI out of the box 2022-04-19T21:54:38.810Z
[cross-post with EA Forum] The EA Forum Podcast is up and running 2021-07-05T21:52:18.787Z
Information on time-complexity prior? 2021-01-08T06:09:03.462Z
D0TheMath's Shortform 2020-10-09T02:47:30.056Z
Why does "deep abstraction" lose it's usefulness in the far past and future? 2020-07-09T07:12:44.523Z

Comments

Comment by Garrett Baker (D0TheMath) on outlining is a historically recent underutilized gift to family · 2025-02-27T17:41:11.511Z · LW · GW

outlining is historically recent, since particular digital interfaces (such as Workflowy, Org Mode, Dynalist or Ravel) make it orders of magnitude easier to reorganize and nest text.

This seems false. According to Wikipedia, tables of contents have existed since ancient Rome, for example; they were abandoned when the price of paper became too high, and re-adopted in the 12th century once paper became cheap again.

Comment by Garrett Baker (D0TheMath) on So You Want To Make Marginal Progress... · 2025-02-24T21:56:36.752Z · LW · GW

I think the connection would come from the concept of a Lagrangian dual problem in optimization. See also John's Mazes and Duality.

Comment by Garrett Baker (D0TheMath) on Neural networks generalize because of this one weird trick · 2025-02-21T19:06:41.643Z · LW · GW

Ah, I didn't realize earlier that this was the goal. Are there any theorems that use SLT to quantify out-of-distribution generalization? The SLT papers I have read so far seem to still be talking about in-distribution generalization, with the added comment that Bayesian learning/SGD is more likely to give us "simpler" models and simpler models generalize better. 

Sumio Watanabe has two papers on out-of-distribution generalization:

Asymptotic Bayesian generalization error when training and test distributions are different

In supervised learning, we commonly assume that training and test data are sampled from the same distribution. However, this assumption can be violated in practice and then standard machine learning techniques perform poorly. This paper focuses on revealing and improving the performance of Bayesian estimation when the training and test distributions are different. We formally analyze the asymptotic Bayesian generalization error and establish its upper bound under a very general setting. Our important finding is that lower order terms---which can be ignored in the absence of the distribution change---play an important role under the distribution change. We also propose a novel variant of stochastic complexity which can be used for choosing an appropriate model and hyper-parameters under a particular distribution change.

Experimental Bayesian Generalization Error of Non-regular Models under Covariate Shift

In the standard setting of statistical learning theory, we assume that the training and test data are generated from the same distribution. However, this assumption cannot hold in many practical cases, e.g., brain-computer interfacing, bioinformatics, etc. Especially, changing input distribution in the regression problem often occurs, and is known as the covariate shift. There are a lot of studies to adapt the change, since the ordinary machine learning methods do not work properly under the shift. The asymptotic theory has also been developed in the Bayesian inference. Although many effective results are reported on statistical regular ones, the non-regular models have not been considered well. This paper focuses on behaviors of non-regular models under the covariate shift. In the former study [1], we formally revealed the factors changing the generalization error and established its upper bound. We here report that the experimental results support the theoretical findings. Moreover it is observed that the basis function in the model plays an important role in some cases.

Comment by Garrett Baker (D0TheMath) on Sterrs's Shortform · 2025-02-13T04:08:53.947Z · LW · GW

An infinite supply of, e.g., energy would just push your scarcity onto other resources.

Comment by Garrett Baker (D0TheMath) on Racing Towards Fusion and AI · 2025-02-09T06:00:51.798Z · LW · GW

This seems false given that AI training will be/is bottlenecked on energy.

Comment by Garrett Baker (D0TheMath) on Wild Animal Suffering Is The Worst Thing In The World · 2025-02-07T01:09:32.371Z · LW · GW

I am sympathetic to, but unconvinced of, the importance of animal suffering in general. However, for those who are sympathetic to animal suffering, I could never understand their resistance to caring about wild animal suffering, a resistance which seems relatively common. So this post seems good for them.

This post does seem more like an EA Forum sort of post, though.

Comment by Garrett Baker (D0TheMath) on artifex0's Shortform · 2025-02-06T06:21:54.320Z · LW · GW

SCOTUS decision that said a state had to, say, extradite somebody accused of "abetting an abortion" in another state.

Look no further than how southern states responded to civil rights rulings, and how they responded to Roe v. Wade back when it was still held. Of course, those reactions were much harder than, say, simply neglecting to enforce laws, which, it should be noted, liberal cities & states have been practicing for decades. You say you're trying to enforce the laws, but you just subject all your members to all the requirements of the US bureaucracy, and you can easily stop enforcing laws while complying with the letter of the law. Indeed, it is complying with the letter of the law which prevents you from enforcing them.

Comment by Garrett Baker (D0TheMath) on artifex0's Shortform · 2025-02-06T06:15:32.726Z · LW · GW

... and in the legal arena, there's a whole lot of pressure building up on that state and local resistance. So far it's mostly money-based pressure, but within a few years, I could easily see a SCOTUS decision that said a state had to, say, extradite somebody accused of "abetting an abortion" in another state.

What money-based pressure are you thinking of? Cities, as far as I know, have been and always will be much more liberal than the general populace, and ditto for the states with much of their populace in cities.

Comment by Garrett Baker (D0TheMath) on artifex0's Shortform · 2025-02-05T20:23:30.367Z · LW · GW

For rights, political power in the US is very federated. Even if many states overtly try to harm you, there will be many states you can run to, and most cities within states will fight against this. Note state-by-state weed legalization and sanctuary cities. And the threat of this happening itself discourages such overt acts.

If you're really concerned, then just move to California! It's much easier than moving abroad.

As for war, the most relevant datapoint is this Metaculus question, forecasting a 15% chance of >10k American deaths before 2030. However, it doesn't seem like anyone has updated their forecast there since 2023, and some of the comments seem kinda unhinged. It should also be noted that the question counts all deaths, not just civilian deaths, and not just those in the contiguous US. So I think this is actually a very, very optimistic number, and it implies a lower than 5% chance of such events reaching civilians in the contiguous states.

Comment by Garrett Baker (D0TheMath) on DeepSeek Panic at the App Store · 2025-01-29T01:27:21.254Z · LW · GW

Yeah, these are mysteries; I don't know why. TSMC, I think, did get hit pretty hard though.

Comment by Garrett Baker (D0TheMath) on DeepSeek Panic at the App Store · 2025-01-28T23:34:14.588Z · LW · GW

Politicians announce all sorts of things on the campaign trail; that usually is not much indication of what post-election policy will be.

Comment by Garrett Baker (D0TheMath) on DeepSeek Panic at the App Store · 2025-01-28T19:58:04.402Z · LW · GW

Seems more likely the drop was from Trump tariff leaks than from DeepSeek's app.

Comment by Garrett Baker (D0TheMath) on Ryan Kidd's Shortform · 2025-01-27T19:18:55.160Z · LW · GW

I also note that 30x seems like an underestimate to me, but also too simplified. AIs will make some tasks vastly easier, but won't help too much with others. We will have a new set of bottlenecks once we reach the "AIs vastly helping with your work" phase. The question to ask is "what will the new bottlenecks be, and who do we have to hire to be prepared for them?"

If you are uncertain, this consideration should lean you much more towards adaptive generalists than the standard academic crop.

Comment by Garrett Baker (D0TheMath) on Ryan Kidd's Shortform · 2025-01-27T19:08:14.627Z · LW · GW

There's the standard software engineer response of "You cannot make a baby in 1 month with 9 pregnant women". If you don't have a term in this calculation for the number of research hours that must be done serially vs the number that can be done in parallel, then it will always seem like we have too few people, and that we should invest vastly more in growth, growth, growth!
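To make that concrete, here is a toy Amdahl's-law-style model (my own illustration, not anything from the original discussion); the function name and the numbers are made up:

```python
# Toy illustration: wall-clock time for a research agenda with a serial
# component that extra hires cannot compress. All numbers are made up.
def wall_clock_hours(serial_hours: float, parallel_hours: float, n_researchers: int) -> float:
    """Serial work is unaffected by headcount; parallel work divides across people."""
    return serial_hours + parallel_hours / n_researchers

# If most of the remaining work is serial, a 10x hiring spree barely helps:
print(wall_clock_hours(8000, 2000, 10))   # 8200.0
print(wall_clock_hours(8000, 2000, 100))  # 8020.0
```

Under these made-up numbers, adding 90 more researchers buys roughly a 2% speedup, which is the point of the pregnant-women quip.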

If you find that actually your constraint is serial research output, then you still may conclude you need a lot of people, but you will sacrifice a reasonable amount of growth speed for attracting better serial researchers. 

(Possibly this shakes out to mathematicians and physicists, but I don't want to bring that conversation into here)

Comment by Garrett Baker (D0TheMath) on johnswentworth's Shortform · 2025-01-26T22:13:25.776Z · LW · GW

The most obvious one imo is the immune system & the signals it sends. 

Others:

  • Circadian rhythm
  • Age is perhaps a candidate here, though it may be more or less a candidate depending on whether you're talking about someone before or after 30
  • Hospice workers sometimes talk about the body "knowing how to die"; maybe there's something to that

Comment by Garrett Baker (D0TheMath) on The Hopium Wars: the AGI Entente Delusion · 2025-01-26T11:04:37.716Z · LW · GW

If that’s the situation, then why the “if and only if”? If we magically make them all believe they will die if they make ASI, then they would all individually be incentivized to stop it from happening, independent of China’s actions.

Comment by Garrett Baker (D0TheMath) on The Hopium Wars: the AGI Entente Delusion · 2025-01-26T01:46:53.890Z · LW · GW

I think that China and the US would definitely agree to pause if and only if they can confirm the other also committing to a pause. Unfortunately, this is a really hard thing to confirm, much harder than with nuclear.

This seems false to me. E.g., Trump for one seems likely to do whatever the person who pays him the most & is most loyal to him tells him to do, and AI risk worriers do not have the money or the political loyalty to meet either of those criteria compared to, for example, Elon Musk.

Comment by Garrett Baker (D0TheMath) on Learning By Writing · 2025-01-26T01:44:04.859Z · LW · GW

It's on his LinkedIn at least. Apparently since the start of the year.

Comment by Garrett Baker (D0TheMath) on Instrumental Goals Are A Different And Friendlier Kind Of Thing Than Terminal Goals · 2025-01-24T23:50:44.589Z · LW · GW

I will note this sounds a lot like Turntrout's old Attainable Utility Preservation scheme. Not exactly, but enough that I wouldn't be surprised if a bunch of the math here has already been worked out by him (and possibly, in the comments, a bunch of the failure-modes identified).

Comment by Garrett Baker (D0TheMath) on jacquesthibs's Shortform · 2025-01-23T17:55:44.948Z · LW · GW

Engineers: It's impossible.

Meta management: Tony Stark DeepSeek was able to build this in a cave! With a box of scraps!

Comment by Garrett Baker (D0TheMath) on Detect Goodhart and shut down · 2025-01-23T16:32:44.927Z · LW · GW

Although I don't think the first example is great, seems more like a capability/observation-bandwidth issue.

I think you can have multiple failures at the same time. The reason I think this was also Goodhart is that the failure-mode could have been averted if Sonnet had been told “collect wood WITHOUT BREAKING MY HOUSE” ahead of time.

Comment by Garrett Baker (D0TheMath) on Detect Goodhart and shut down · 2025-01-23T15:31:24.269Z · LW · GW

If you put current language models in weird situations & give them a goal, I’d say they do do edge instantiation, without the missing “creativity” ingredient. E.g., see Claude Sonnet in Minecraft repurposing someone’s house for wood after being asked to collect wood.

Edit: There are other instances of this too, where you can tell Claude to protect you in Minecraft, and it will constantly tp to your position and build walls around you when monsters are around. Protecting you, but also preventing any movement or fun you may have wanted to have.

Comment by Garrett Baker (D0TheMath) on We don't want to post again "This might be the last AI Safety Camp" · 2025-01-21T22:46:57.775Z · LW · GW

I don't understand why Remmelt going "off the deep end" should affect AI Safety Camp's funding. That seems reasonable for speculative bets, but not when there's a strong track-record available.

Comment by Garrett Baker (D0TheMath) on Lighthaven Sequences Reading Group #18 (Tuesday 01/21) · 2025-01-21T19:45:59.813Z · LW · GW

It is; we’ve been limiting ourselves to readings from the Sequences Highlights. I’ll ask around to see if other organizers would like to broaden our horizons.

Comment by Garrett Baker (D0TheMath) on Embee's Shortform · 2025-01-18T08:17:48.590Z · LW · GW

I mean, one of them’s math built bombs and computers & directly influenced pretty much every part of applied math today, and the other one’s math built math. Not saying he wasn’t smart, but there’s no question bombs & computers are more flashy.

Comment by Garrett Baker (D0TheMath) on Lighthaven Sequences Reading Group #18 (Tuesday 01/21) · 2025-01-18T07:36:33.395Z · LW · GW

Fixed!

Comment by Garrett Baker (D0TheMath) on The purposeful drunkard · 2025-01-17T19:17:51.523Z · LW · GW

The paper you're thinking of is probably The Developmental Landscape of In-Context Learning.

Comment by Garrett Baker (D0TheMath) on Lecture Series on Tiling Agents · 2025-01-17T19:13:03.002Z · LW · GW

@abramdemski I think I'm the biggest agree-vote for Alexander (without me Alexander would have -2 agree), and I do see this because I follow both of you on my subscribe tab.

I basically endorse Alexander's elaboration. 

On the "prep for the model that is coming tomorrow, not the model of today" front, I will say that LLMs are not always going to be as dumb as they are today. Even if you can't get them to understand or help with your work now, their rate of learning still makes them in some sense your most promising mentee, and that means trying to get as much of your tacit knowledge into their training data as possible (if you want them to be able to build on your work more easily & sooner). Or (if you don't want to do that for whatever reason) it means just generally not being caught flat-footed once they are smart enough to help you, with all your ideas locked up in videos or otherwise in high-context, understandable-only-to-Abram notes.

In the words of gwern:

Should you write text online now in places that can be scraped? You are exposing yourself to 'truesight' and also to stylometric deanonymization or other analysis, and you may simply have some sort of moral objection to LLM training on your text.

This seems like a bad move to me on net: you are erasing yourself (facts, values, preferences, goals, identity) from the future, by which I mean, LLMs. Much of the value of writing done recently or now is simply to get stuff into LLMs. I would, in fact, pay money to ensure Gwern.net is in training corpuses, and I upload source code to Github, heavy with documentation, rationale, and examples, in order to make LLMs more customized to my use-cases. For the trifling cost of some writing, all the worlds' LLM providers are competing to make their LLMs ever more like, and useful to, me.

Comment by Garrett Baker (D0TheMath) on lemonhope's Shortform · 2025-01-15T15:10:29.575Z · LW · GW

In some sense that’s just like being hired for any other job, and of course if an AGI lab wants you, you end up with greater negotiating leverage at your old place and could get a raise (depending on how tight capital constraints are, which, to be clear, in AI alignment are tight).

Comment by Garrett Baker (D0TheMath) on Nathan Helm-Burger's Shortform · 2025-01-12T00:28:01.455Z · LW · GW

I think it's this.

Comment by Garrett Baker (D0TheMath) on D0TheMath's Shortform · 2025-01-09T20:52:07.708Z · LW · GW

Over the past few days I've been doing a lit review of the different types of attention heads people have found and/or the metrics one can use to detect the presence of those types of heads. 

Here is a rough list from my notes; sorry for the poor formatting, but I did say it's rough!
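As one illustration of the kind of detection metric involved, here is a minimal sketch (my own, not from those notes) of a prefix-matching-style score for flagging induction heads, assuming you already have a single head's attention pattern over a sequence containing a repeated random token block:

```python
import torch

def prefix_matching_score(attn: torch.Tensor, tokens: torch.Tensor) -> float:
    """attn: [seq, seq] attention pattern for one head (row i = query position i).
    tokens: [seq] token ids, ideally a random block repeated at least twice.
    An induction head should put mass from position i onto positions j whose
    *preceding* token matches tokens[i], i.e. onto "what came after last time"."""
    seq_len = tokens.shape[0]
    per_position = []
    for i in range(1, seq_len):
        # candidate targets: positions j < i where the previous token matches tokens[i]
        targets = [j for j in range(1, i) if tokens[j - 1] == tokens[i]]
        if targets:
            per_position.append(attn[i, targets].sum().item())
    return sum(per_position) / len(per_position) if per_position else 0.0
```

A head scoring near 1 on a repeated random sequence is a strong induction-head candidate; other head types in the literature generally get analogous hand-crafted scores over suitably constructed inputs.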

Comment by Garrett Baker (D0TheMath) on The Plan - 2024 Update · 2024-12-31T15:38:40.508Z · LW · GW

And yes, I do think that interp work today should mostly focus on image nets for the same reasons we focus on image nets. The field’s current focus on LLMs is a mistake

A note that word on the street in mech-interp land is that you often get more signal, and a greater number of techniques work, on bigger & smarter language models than on smaller & dumber possibly-not-language models. Presumably this is due to smarter & more complex models having more structured representations.

Comment by Garrett Baker (D0TheMath) on If all trade is voluntary, then what is "exploitation?" · 2024-12-27T21:46:36.262Z · LW · GW

Can you show how a repeated version of this game results in overall better deals for the company? I agree this can happen, but I disagree that it does in this particular circumstance.

Comment by Garrett Baker (D0TheMath) on If all trade is voluntary, then what is "exploitation?" · 2024-12-27T20:00:49.555Z · LW · GW

Then the company is just being stupid, and the previous definition of exploitation doesn't apply. The company is imposing large costs on others at a large cost to itself. If the company does refuse the deal, it's likely because it doesn't have the right kinds of internal communication channels to do negotiations like this, and so this is indeed a kind of stupidity.

Why the distinction between exploitation and stupidity? Well, they require different solutions. Maybe we solve exploitation (if indeed it is a problem) via collective action outside of the company, but we would have to solve stupidity via better information channels & flexibility inside the company. There is also competitive pressure to solve such stupidity problems where there may not be for an exploitation problem. E.g., if a different company or a different department allowed that sort of deal, then the problem would be solved.

Comment by Garrett Baker (D0TheMath) on What Have Been Your Most Valuable Casual Conversations At Conferences? · 2024-12-25T17:11:49.440Z · LW · GW

If conversations are heavy-tailed, then we should in fact expect people to have singular & likely memorable high-value conversations.
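As a quick sanity check of that intuition, here is a toy simulation of my own (not from the original comment) comparing a heavy-tailed value distribution to a thin-tailed one:

```python
# Toy simulation: if per-conversation value is heavy-tailed, the single best
# conversation accounts for a large share of the total value.
import numpy as np

rng = np.random.default_rng(0)
heavy = rng.pareto(a=1.2, size=200) + 1                # 200 conversations, heavy-tailed values
thin = rng.exponential(scale=heavy.mean(), size=200)   # thin-tailed baseline, comparable mean

print(f"heavy-tailed: best conversation is {heavy.max() / heavy.sum():.0%} of total value")
print(f"thin-tailed:  best conversation is {thin.max() / thin.sum():.0%} of total value")
```

In the heavy-tailed case one conversation tends to dominate the total; in the thin-tailed case no single conversation matters much.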

Comment by Garrett Baker (D0TheMath) on sarahconstantin's Shortform · 2024-12-10T16:00:09.514Z · LW · GW

otoh I also don't think cutting off contact with anyone "impure", or refusing to read stuff you disapprove of, is either practical or necessary. we can engage with people and things without being mechanically "nudged" by them.

I think the reason not to do this is peer pressure. Ideally, you should have the bad pressures from your peers cancel out, and in order to accomplish this you need your peers to be somewhat decorrelated from each other; you can't really do that if all your peers and everyone you listen to are in the same social group.

Comment by Garrett Baker (D0TheMath) on sarahconstantin's Shortform · 2024-12-10T15:55:24.986Z · LW · GW

there is no neurotype or culture that is immune to peer pressure

Seems like the sort of thing that would correlate pretty robustly with Big Five agreeableness, and in that sense there are neurotypes immune to peer pressure.

Edit: One may also suspect a combination of agreeableness and non-openness.

Comment by Garrett Baker (D0TheMath) on Should you be worried about H5N1? · 2024-12-06T15:36:27.622Z · LW · GW

Some assorted Polymarket and Metaculus forecasts on the subject:

They are not exactly low.

Comment by Garrett Baker (D0TheMath) on Open Thread Fall 2024 · 2024-12-02T05:28:40.197Z · LW · GW

Those invited to the Foresight workshop (also the 2023 one) are probably a good start, as well as Foresight’s 2023 and 2024 lectures on the subject.

Comment by Garrett Baker (D0TheMath) on dirk's Shortform · 2024-11-30T05:52:43.860Z · LW · GW

I will take Zvi's takeaways from his experience in this round of SFF grants as significant outside-view evidence for my inside view of the field.

Comment by Garrett Baker (D0TheMath) on leogao's Shortform · 2024-11-28T18:44:50.368Z · LW · GW

I think you are possibly better than most others at selecting (or optimize harder for) conferences & events you actually want to attend. Even with work, I think many get value out of having those spontaneous conversations because it often shifts what they're going to do. The number one spontaneous conversation is "what are you working on" or "what have you done so far", which forces you to re-explain what you're doing & the reasons for doing it to a skeptical & ignorant audience. My understanding is you and David already do this very often with each other.

Comment by Garrett Baker (D0TheMath) on Eli's shortform feed · 2024-11-26T22:08:09.166Z · LW · GW

I think it's reasonable for the conversion to be at the original author's discretion rather than an automatic process.

Comment by Garrett Baker (D0TheMath) on Shortform · 2024-11-23T08:00:56.699Z · LW · GW

Back in May, when the Crowdstrike bug happened, people were posting wild takes on Twitter and in my signal groupchats about how Crowdstrike is only used everywhere because the government regulators subject you to copious extra red tape if you try to switch to something else.

Here’s the original claim:

Microsoft blamed a 2009 antitrust agreement with the European Union that they said forced them to sustain low-level kernel access to third-party developers.[286][287][288] The document does not explicitly state that Microsoft has to provide kernel-level access, but says Microsoft must provide access to the same APIs used by its own security products.[287]

This seems consistent with your understanding of regulatory practices (“they do not give a rats ass what particular software vendor you use for anything”), and is consistent with the EU’s antitrust regulations being at fault—or at least Microsoft’s cautious interpretation of the regulations, which indeed is the approach you want to take here.

Comment by Garrett Baker (D0TheMath) on Which things were you surprised to learn are not metaphors? · 2024-11-22T01:05:35.587Z · LW · GW

I believed “bear spray” was a metaphor for a gun. E.g., if you were posting online about camping and concerned about the algorithm disliking your use of the word gun, were going into a state park where guns are banned, or didn’t want to mention “gun” for some other reason, then you’d say “bear spray”, since bear spray is such an absurd & silly concept that people will certainly understand what you really mean.

Turns out, bear spray is real. It's pepper spray on steroids, and is actually more effective than a gun, since it's easier to aim and is optimized to blind & actually cause pain rather than just damage. [EDIT:] Though see Jimmy's comment below for a counter-point.

Comment by Garrett Baker (D0TheMath) on Open Thread Fall 2024 · 2024-11-21T00:50:23.447Z · LW · GW

[Bug report]: The Popular Comments section's comment preview ignores spoiler tags

As seen on Windows/Chrome

Comment by Garrett Baker (D0TheMath) on What are the good rationality films? · 2024-11-20T22:09:55.618Z · LW · GW

Film: The Martian

Rationality Tie-in: The virtue of scholarship is threaded throughout, but Watney is generally an intelligent person tackling a seemingly impossible-to-solve problem.

Comment by Garrett Baker (D0TheMath) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-20T06:18:06.481Z · LW · GW

Moneyball

Comment by Garrett Baker (D0TheMath) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-20T06:17:57.981Z · LW · GW

The Martian

Comment by Garrett Baker (D0TheMath) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-20T06:13:02.505Z · LW · GW

A Boy and His Dog -- a weird one, but good for talking through & a heavy inspiration for Fallout

Comment by Garrett Baker (D0TheMath) on Lighthaven Sequences Reading Group #12 (Tuesday 11/26) · 2024-11-20T06:07:49.693Z · LW · GW

RRR