Posts

On Complexity Science 2024-04-05T02:24:32.039Z
So You Created a Sociopath - New Book Announcement! 2024-04-01T18:02:18.010Z
Announcing Suffering For Good 2024-04-01T17:08:12.322Z
Neuroscience and Alignment 2024-03-18T21:09:52.004Z
Epoch wise critical periods, and singular learning theory 2023-12-14T20:55:32.508Z
A bet on critical periods in neural networks 2023-11-06T23:21:17.279Z
When and why should you use the Kelly criterion? 2023-11-05T23:26:38.952Z
Singular learning theory and bridging from ML to brain emulations 2023-11-01T21:31:54.789Z
My hopes for alignment: Singular learning theory and whole brain emulation 2023-10-25T18:31:14.407Z
AI presidents discuss AI alignment agendas 2023-09-09T18:55:37.931Z
Activation additions in a small residual network 2023-05-22T20:28:41.264Z
Collective Identity 2023-05-18T09:00:24.410Z
Activation additions in a simple MNIST network 2023-05-18T02:49:44.734Z
Value drift threat models 2023-05-12T23:03:22.295Z
What constraints does deep learning place on alignment plans? 2023-05-03T20:40:16.007Z
Pessimistic Shard Theory 2023-01-25T00:59:33.863Z
Performing an SVD on a time-series matrix of gradient updates on an MNIST network produces 92.5 singular values 2022-12-21T00:44:55.373Z
Don't design agents which exploit adversarial inputs 2022-11-18T01:48:38.372Z
A framework and open questions for game theoretic shard modeling 2022-10-21T21:40:49.887Z
Taking the parameters which seem to matter and rotating them until they don't 2022-08-26T18:26:47.667Z
How (not) to choose a research project 2022-08-09T00:26:37.045Z
Information theoretic model analysis may not lend much insight, but we may have been doing them wrong! 2022-07-24T00:42:14.076Z
Modelling Deception 2022-07-18T21:21:32.246Z
Another argument that you will let the AI out of the box 2022-04-19T21:54:38.810Z
[cross-post with EA Forum] The EA Forum Podcast is up and running 2021-07-05T21:52:18.787Z
Information on time-complexity prior? 2021-01-08T06:09:03.462Z
D0TheMath's Shortform 2020-10-09T02:47:30.056Z
Why does "deep abstraction" lose it's usefulness in the far past and future? 2020-07-09T07:12:44.523Z

Comments

Comment by Garrett Baker (D0TheMath) on Examples of Highly Counterfactual Discoveries? · 2024-04-25T17:20:39.102Z · LW · GW

[edit: never mind, I see you already know about the following quotes. There's other evidence of the influence in Sedley's book, which I link below]

In De Rerum Natura, around line 716:

Add, too, whoever make the primal stuff Twofold, by joining air to fire, and earth To water; add who deem that things can grow Out of the four- fire, earth, and breath, and rain; As first Empedocles of Acragas, Whom that three-cornered isle of all the lands Bore on her coasts, around which flows and flows In mighty bend and bay the Ionic seas, Splashing the brine from off their gray-green waves. Here, billowing onward through the narrow straits, Swift ocean cuts her boundaries from the shores Of the Italic mainland. Here the waste Charybdis; and here Aetna rumbles threats To gather anew such furies of its flames As with its force anew to vomit fires, Belched from its throat, and skyward bear anew Its lightnings' flash. And though for much she seem The mighty and the wondrous isle to men, Most rich in all good things, and fortified With generous strength of heroes, she hath ne'er Possessed within her aught of more renown, Nor aught more holy, wonderful, and dear Than this true man. Nay, ever so far and pure The lofty music of his breast divine Lifts up its voice and tells of glories found, That scarce he seems of human stock create.

Or, for a more modern translation, from Sedley's Lucretius and the Transformation of Greek Wisdom:

Of these [sc. the four-element theorists] the foremost is Empedocles of Acragas, born within the three-cornered terrestrial coasts of the island [Sicily] around which the Ionian Sea, flowing with its great windings, sprays the brine from its green waves, and from whose boundaries the rushing sea with its narrow strait divides the coasts of the Aeolian land with its waves. Here is destructive Charybdis, and here the rumblings of Etna give warning that they are once more gathering the wrath of their flames so that her violence may again spew out the fire flung from her jaws and hurl once more to the sky the lightning flashes of flame. Although this great region seems in many ways worthy of admiration by the human races, and is said to deserve visiting for its wealth of good things and the great stock of men that fortify it, yet it appears to have had in it nothing more illustrious than this man, nor more holy, admirable, and precious. What is more, the poems sprung from his godlike mind call out and expound his illustrious discoveries, so that he scarcely seems to be born of mortal stock.

Comment by Garrett Baker (D0TheMath) on Examples of Highly Counterfactual Discoveries? · 2024-04-24T19:49:50.898Z · LW · GW

I find this very hard to believe. Shouldn't Chinese merchants, traveling long distances using maps, eventually have figured out that the Earth was a sphere? I wonder whether the "scholars" of ancient China actually represented the state-of-the-art practical knowledge the Chinese had.

Nevertheless, I don't think this is all that counterfactual. If you're obsessed with measuring everything, and like to travel (like the Greeks), I think eventually you'll have to discover this fact.

Comment by Garrett Baker (D0TheMath) on Examples of Highly Counterfactual Discoveries? · 2024-04-24T18:10:03.627Z · LW · GW

I've heard an argument that Mendel was actually counter-productive to the development of genetics: if you go and actually study peas like he did, you'll find they don't make perfect Punnett squares, and from the deviations you can derive recombination effects. The claim is that he fudged his data a little in order to make it nicer, and this then held back others from figuring out the topological structure of genotypes.

Comment by Garrett Baker (D0TheMath) on Examples of Highly Counterfactual Discoveries? · 2024-04-24T04:14:17.602Z · LW · GW

A precursor to Lucretius's thoughts on natural selection is Empedocles, from whom we have far fewer surviving writings, but whose position clearly anticipates Lucretius's. Lucretius himself cites & praises Empedocles on this subject.

Comment by Garrett Baker (D0TheMath) on Examples of Highly Counterfactual Discoveries? · 2024-04-23T23:33:51.981Z · LW · GW

Possibly Watanabe's singular learning theory. The math is recent for math, but I think only like '70s recent, which is long given you're impressed by a 20-year math gap for Einstein. The first book was published in 2010, and the second in 2019, so possibly attributable to the deep learning revolution, but I don't know of anyone else doing the same math--except empirical stuff like the "neuron theory" of neural network learning which I was told about by you, empirical results like those here, and high-dimensional probability (which I haven't read, but whose cover alone indicates similar content).

Comment by Garrett Baker (D0TheMath) on David Udell's Shortform · 2024-04-23T23:04:08.381Z · LW · GW

Many who believe in God derive meaning from the fact that He chose not to do the tasks they are good at, and left them tasks to try to accomplish, despite God theoretically being able to do anything they can do, but better. It's common for such people to believe that this meaning would disappear if God disappeared, but whenever such a person does come to no longer believe in God, they often continue to see meaning in their life[1].

Now atheists worry about building God because it may destroy all meaning to our actions. I expect we'll adapt.

(edit: That is to say, I don't think you've adequately described what "meaning of life" is if you're worried about it going away in the situation you describe)


  1. If anything, they're more right than wrong: there has been much written about the "meaning crisis" we're in, possibly attributable to greater levels of atheism. ↩︎

Comment by Garrett Baker (D0TheMath) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-23T03:22:15.095Z · LW · GW

Post the chat logs?

Comment by Garrett Baker (D0TheMath) on Priors and Prejudice · 2024-04-22T22:52:19.779Z · LW · GW

Priors are not things you can arbitrarily choose, throwing your hands up and saying "oh well, I guess I just have stuck priors, and that's why I look at the data and conclude neoliberal-libertarian economics is mostly correct, and socialist economics is mostly wrong". To the extent you say this, you are not actually looking at any data; you are just making up an answer that sounds good, and then when you encounter conflicting evidence, you're stating you won't change your mind because of a flaw in your reasoning (stuck priors), and that that's ok, because you have a flaw in your reasoning (stuck priors). It's a circular argument!

If this is what you actually believe, you shouldn't be making donations to either charter cities projects or developing unions projects[1]. Because what you actually believe is that the evidence you've seen is likely under both worldviews, and if you were "using" a non-gerrymandered prior or reasoning without your bottom-line already written, you'd have little reason to prefer one over the other.
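
To spell out the update arithmetic behind that claim (this is just standard Bayes, with H1 and H2 standing in for the two worldviews):

\[
\frac{P(H_1 \mid E)}{P(H_2 \mid E)} = \frac{P(H_1)}{P(H_2)} \cdot \frac{P(E \mid H_1)}{P(E \mid H_2)}
\]

If the evidence E you've actually seen is roughly as likely under both worldviews, the likelihood ratio is about 1, and your posterior odds just reproduce whatever prior odds you started with: the confidence you end up with comes from the prior you picked, not from the data.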

Both of the alternatives you've presented are fools who in the back of their minds know they're fools, but care more about having emotionally satisfying worldviews instead of correct worldviews. To their credit, they have successfully double-thought their way to reasonable donation choices which would otherwise have destroyed their worldview. But they could do much better by no longer being fools.


  1. Alternatively, if you justify your donation anyway in terms of its exploration value, you should be making donations to both. ↩︎

Comment by Garrett Baker (D0TheMath) on Express interest in an "FHI of the West" · 2024-04-19T15:55:34.099Z · LW · GW

I wonder if everyone excited is just engaging by filling out the form rather than publicly commenting.

Comment by Garrett Baker (D0TheMath) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T19:04:31.665Z · LW · GW

There is evidence that transformers are not in fact even implicitly, internally, optimized for reducing global prediction error (except insofar as comp-mech says they must in order to do well on the task they are optimized for).

Do transformers "think ahead" during inference at a given position? It is known transformers prepare information in the hidden states of the forward pass at t that is then used in future forward passes t+τ. We posit two explanations for this phenomenon: pre-caching, in which off-diagonal gradient terms present in training result in the model computing features at t irrelevant to the present inference task but useful for the future, and breadcrumbs, in which features most relevant to time step t are already the same as those that would most benefit inference at time t+τ. We test these hypotheses by training language models without propagating gradients to past timesteps, a scheme we formalize as myopic training. In a synthetic data setting, we find clear evidence for pre-caching. In the autoregressive language modeling setting, our experiments are more suggestive of the breadcrumbs hypothesis.
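
(For intuition, here is a minimal sketch of one way to stop gradients from flowing into earlier positions' attention contributions. This is illustrative only, not the paper's exact formalization of "myopic training", and detaching every key/value is stricter than needed since it also blocks the diagonal term.)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MyopicCausalSelfAttention(nn.Module):
    """Single-head causal self-attention in which losses at later positions
    cannot shape the keys/values computed at earlier positions."""

    def __init__(self, d_model: int):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Stop gradients through keys/values: a future position's loss can no
        # longer reward "pre-caching" information here. (This also blocks the
        # current position's own key/value, i.e. the diagonal term.)
        k, v = k.detach(), v.detach()
        d = q.size(-1)
        scores = q @ k.transpose(-2, -1) / d ** 0.5
        seq_len = x.size(1)
        causal_mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device),
            diagonal=1,
        )
        scores = scores.masked_fill(causal_mask, float("-inf"))
        return self.out(F.softmax(scores, dim=-1) @ v)
```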

Comment by Garrett Baker (D0TheMath) on Open Thread Spring 2024 · 2024-04-16T23:44:42.750Z · LW · GW

A new update

Hi John,

thank you for sharing the job postings. We’re starting something really exciting, and as research leads on the team, we - Paul Lessard and Bruno Gavranović - thought we’d provide clarifications.

Symbolica was not started to improve ML using category theory. Instead, Symbolica was founded ~2 years ago, with its 2M seed funding round aimed at tackling the problem of symbolic reasoning, but at the time, its path to getting there wasn’t via categorical deep learning (CDL). The original plan was to use hypergraph rewriting as means of doing learning more efficiently. That approach however was eventually shown unviable.

Symbolica’s pivot to CDL started about five months ago. Bruno had just finished his Ph.D. thesis laying the foundations for the topic, and we reoriented much of the organization towards this research direction. In particular, we began: a) refining a roadmap to develop and apply CDL, and b) writing a position paper, in collaboration with researchers at Google DeepMind, which you’ve cited below.

Over these last few months, it has become clear that our hunches about applicability are actually exciting and viable research directions. We’ve made fantastic progress, even doing some of the research we planned to advocate for in the aforementioned position paper. Really, we discovered just how much Taking Categories Seriously gives you in the field of Deep Learning.

Many advances in DL are about creating models which identify robust and general patterns in data (see the Transformers/Attention mechanism, for instance). In many ways this is exactly what CT is about: it is an indispensable tool for many scientists, including ourselves, to understand the world around us: to find robust patterns in data, but also to communicate, verify, and explain our reasoning.

At the same time, the research engineering team of Symbolica has made significant, independent, and concrete progress implementing a particular deep learning model that operates on text data, but not in an autoregressive manner as most GPT-style models do.

These developments were key signals to Vinod and other investors, leading to the closing of the 31M funding round.

We are now developing a research programme merging the two, leveraging insights from theories of structure, e.g. categorical algebra, as means of formalising the process by which we find structure in data. This has twofold consequence: pushing models to identify more robust patterns in data, but also interpretable and verifiable ones.

In summary:

a) The push to apply category theory was not based on a singular whim, as the post might suggest,

but that instead

b) Symbolica is developing a serious research programme devoted to applying category theory to deep learning, not merely hiring category theorists

All of this is to add extra context for evaluating the company, its team, and our direction, which does not come across in the recently published tech articles.

We strongly encourage interested parties to look at all of the job ads, which we’ve tailored to particular roles. Roughly, in the CDL team, we’re looking for either

1) expertise in category theory, and a strong interest in deep learning, or

2) expertise in deep learning, and a strong interest in category theory.

at all levels of seniority.

Happy to answer any other questions/thoughts.

Bruno Gavranović,

Paul Lessard

Comment by Garrett Baker (D0TheMath) on D0TheMath's Shortform · 2024-04-16T17:51:52.445Z · LW · GW

From The Guns of August

Old Field Marshal Moltke in 1890 foretold that the next war might last seven years—or thirty—because the resources of a modern state were so great it would not know itself to be beaten after a single military defeat and would not give up [...] It went against human nature, however—and the nature of General Staffs—to follow through the logic of his own prophecy. Amorphous and without limits, the concept of a long war could not be scientifically planned for as could the orthodox, predictable, and simple solution of decisive battle and a short war. The younger Moltke was already Chief of Staff when he made his prophecy, but neither he nor his Staff, nor the Staff of any other country, ever made any effort to plan for a long war. Besides the two Moltkes, one dead and the other infirm of purpose, some military strategists in other countries glimpsed the possibility of prolonged war, but all preferred to believe, along with the bankers and industrialists, that because of the dislocation of economic life a general European war could not last longer than three or four months. One constant among the elements of 1914—as of any era—was the disposition of everyone on all sides not to prepare for the harder alternative, not to act upon what they suspected to be true.

Comment by Garrett Baker (D0TheMath) on Alexander Gietelink Oldenziel's Shortform · 2024-04-15T01:20:13.693Z · LW · GW

But such people are very obvious. You just give them a FizzBuzz test! This is why we have interviews, and work-trials.
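
(For reference, the canonical screening exercise, sketched in Python; it is trivial by design, which is the point:)

```python
def fizzbuzz(n: int) -> str:
    """Return "Fizz" for multiples of 3, "Buzz" for multiples of 5,
    "FizzBuzz" for multiples of both, and the number itself otherwise."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


if __name__ == "__main__":
    for i in range(1, 101):
        print(fizzbuzz(i))
```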

Comment by Garrett Baker (D0TheMath) on Alexander Gietelink Oldenziel's Shortform · 2024-04-14T19:25:09.995Z · LW · GW

This style of argument proves too much. Why not see this dynamic with all jobs and products ever?

Comment by Garrett Baker (D0TheMath) on Open Thread Spring 2024 · 2024-04-13T16:33:19.019Z · LW · GW

I don’t think the bitter lesson strictly applies here. Since they’re doing learning, and the bitter lesson says “learning and search is all that is good”, I think they’re in the clear, as long as what they do is compute scalable.

(this is different from saying there aren't other reasons an ignorant person (a term I like more than "outside view" in this context, since it doesn't hide the lack of knowledge) may use to conclude they won't succeed)

Comment by Garrett Baker (D0TheMath) on Claude wants to be conscious · 2024-04-13T02:49:57.394Z · LW · GW

Sounds right. It would be interesting to see how extremely unconvincing you can get the prompts and still see the same behavior.

Also, ideally you would have a procedure which it's impossible for you to have gamed. Like, a problem right now is you could have tried a bunch of different prompts for each value, and then chosen prompts which cause the results you want, and never reported the prompts which don't.

Comment by Garrett Baker (D0TheMath) on Claude wants to be conscious · 2024-04-13T02:07:07.503Z · LW · GW

The main concern I have with this is whether it's robust to different prompts probing for the same value. I can see a scenario where the model is reacting to how convincing the prompt sounds rather than to high-level features of it.

Comment by Garrett Baker (D0TheMath) on Open Thread Spring 2024 · 2024-04-13T01:55:21.051Z · LW · GW

Is this coming from deep knowledge about Symbolica's method, or just from outside-view considerations like "usually people trying to think too big-brained end up failing when it comes to AI"?

Comment by Garrett Baker (D0TheMath) on How to accelerate recovery from sleep debt with biohacking? · 2024-04-10T17:54:05.196Z · LW · GW

I will tell you the received wisdom from my friends' experience, and their friends' experience with polyphasic sleep: It is in theory doable, but often in practice a disaster because you end up getting less sleep than the minimal theoretical requirements given by polyphasic sleep.

If on polyphasic sleep you are sufficiently undisciplined that you end up racking up 40 hours of sleep debt, this wisdom would say you should probably stop doing polyphasic sleep. And instead of biohacking your way out, just have a few nights of normal sleep.

Comment by D0TheMath on [deleted post] 2024-04-10T07:39:23.635Z

My intention was not to dismiss or downplay the importance of various values, but instead to clarify our values by making careful distinctions. It is reasonable to critique my language for being too dry, detached, and academic when these are serious topics with real-world stakes. But to the extent you're claiming that I am actually trying to dismiss the value of happiness and friendships, that was simply not part of the post.

I can't (and didn't) speak to your intention, but I can speak of the results, which are that you do in fact downplay the importance of values such as love, laughter, happiness, fun, family, and friendship in favor of values like the maximization of pleasure, preference-satisfaction, and short-term increases in wealth & life-spans. I can tell because you talk of the latter, but not of the former.

And regardless of your intention, you do also dismiss their long-term value, by decrying as "speciesist" those who hold that long-term value utmost.

Comment by D0TheMath on [deleted post] 2024-04-10T00:16:56.118Z

This view seems implicit in your dismissal of "human species preservationism". If instead you described that view as "the moral view that values love, laughter, happiness, fun, family, and friends", I'm sure Aysja would be less alarmed by your rhetoric (but perhaps more horrified you're willing to so casually throw away such values).

As it is, you're ready to casually throw away such values, without even acknowledging what you're throwing away, lumping it all unreflectively as "speciesism", which I do think is rhetorically cause for alarm.

Comment by Garrett Baker (D0TheMath) on Any evidence or reason to expect a multiverse / Everett branches? · 2024-04-09T15:39:29.440Z · LW · GW

My apologies

Comment by Garrett Baker (D0TheMath) on Any evidence or reason to expect a multiverse / Everett branches? · 2024-04-09T08:00:22.805Z · LW · GW

The question is not how big the universe under various theories is, but how complicated the equations describing that theory are.

Otherwise, we'd reject the so-called "galactic" theory of star formation in favor of the 2d projection theory, which states that the night sky only appears to contain far-distant galaxies, but is instead the result of a relatively complicated (wrt Newtonian mechanics) cellular automaton projected onto our 2d sky. You see, the galactic theory requires 6 parameters to describe each object, and posits an enormously large number of objects, while the 2d projection theory requires but 4 parameters, and assumes an exponentially smaller number of particles, making it a more efficient compression of our observations.
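
One standard way to make "how complicated the equations are" precise is a Solomonoff-style simplicity prior, where a theory's weight depends only on the length of its shortest description:

\[
P(T) \propto 2^{-\ell(T)},
\]

where \(\ell(T)\) is the length in bits of the shortest program implementing theory T. The number of objects the theory then goes on to posit (galaxies, branches, particles) never enters the formula.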

see also

Comment by Garrett Baker (D0TheMath) on Any evidence or reason to expect a multiverse / Everett branches? · 2024-04-09T05:48:55.185Z · LW · GW

I've usually heard the justification for favoring Everett over pilot wave theory put in terms of simplicity. We can explain everything we need in terms of just wave functions interacting with other wave functions, so why add particles to the mix too? You get more complicated equations (so I'm told), with a greater number of types of objects, and a less elegant theory, for what? More intuitive metaphysics? Bah!

Though the real test is experimental, as you know. I don't think there are any experiments which separate out the two hypotheses, so it really is still up in the air which is actually a better description of our universe.

Comment by Garrett Baker (D0TheMath) on Religion = Cult + Culture · 2024-04-08T07:45:12.138Z · LW · GW

I have a bit of a different prescription than you do: instead of aiming to make the community saner, aim to make yourself saner, and especially in ways as de-correlated from the rest of the community as possible. Which often means staying far away from community drama, talking with more people who think very differently than most in the community, following strings of logic in strange & un-intuitive directions, asking yourself whether claims are actually true when they're made, in proportion to how confident community members seem to be in them (people are often most confident where they're most wrong, for groupthink, tails-come-apart, and un-analyzed-assumption reasons), and learning a lot.

A "put on your own mask before helping others" sort of approach.

Comment by Garrett Baker (D0TheMath) on Religion = Cult + Culture · 2024-04-08T07:27:01.744Z · LW · GW

People have criticized Eliezer for taking time to write fan fiction and indulge in polyamorous orgies, but notice that he hasn't burned out, despite worrying about AI for decades.

Not really relevant to your overall point, but I in fact think Eliezer has burnt out. He doesn't really work on alignment anymore as far as I know.

Comment by Garrett Baker (D0TheMath) on Religion = Cult + Culture · 2024-04-07T17:03:23.364Z · LW · GW

Do these things happen automatically as a consequence of trying to be rational, or did just someone accidentally build the Bay Area community on top of an ancient Indian burial ground?

As someone “on the ground” in the Bay Area, my first guess would be that the EA and rationality community here (and they are mostly a single community here) is very insular. Many have zero friends they meet up with regularly who aren’t rationalists or EAs.

A recipe for insane cults in my book.

Comment by Garrett Baker (D0TheMath) on My intellectual journey to (dis)solve the hard problem of consciousness · 2024-04-06T21:08:06.698Z · LW · GW

My guess is that we (re)perceive our perception as a meta-modality different from ordinary modalities like vision, hearing, etc, and that causes the illusion. It's plausible that being raised in a WEIRD culture contributes to that inclination.

This seems exceedingly unlikely. Virtually every culture has a conception of "soul" which they are confused about, and ascribe supernatural non-materialist properties to.

Comment by Garrett Baker (D0TheMath) on On Complexity Science · 2024-04-05T22:47:51.124Z · LW · GW

The problem with the difficulty frame is that I don't really see any reason to believe you get the same problems & solutions when you increase the difficulty of the problems you try to solve in the following fields:

  • Economics
  • Sociology
  • Biology
  • Evolution
  • Neuroscience
  • AI
  • Probability theory
  • Ecology
  • Physics
  • Chemistry

Except of course from the sources of

  1. Increasing the difficulty of these in some ways plausibly leads to insights about agency & self-reference
  2. There are a bunch of mathematical problems we don't have efficient solution methods for yet (and maybe never will), like nonlinear dynamics and chaos.

I'm happy with 1, and 2 sounds like applied math for which the sea isn't high enough to touch yet. Maybe it's still good to understand "what are the types of things we can say about stuff we don't yet understand", but I often find myself pretty unexcited about the stuff in complex systems theory which takes that approach. Maybe I just haven't been exposed enough to the right people advocating that.

Comment by Garrett Baker (D0TheMath) on On Complexity Science · 2024-04-05T20:51:34.595Z · LW · GW

Anthropic's power laws for scaling are sort of unsurprising, in a certain sense, if you know how ubiquitous some kinds of relationships are given some kinds of underlying dynamics (e.g. minimizing cost dynamics)

Also unsurprising from the comp-mech point of view I'm told.

For the first one, I'm currently making a suite of long-running games/tasks to generate streams of data from LLMs (and some other kinds of algorithms too, like basic RL and genetic algorithms eventually) and am running some techniques borrowed from financial analysis and signal processing (etc) on them because of some intuitions built from experience with models as well as what other nearby fields do

I'm curious about the technical details here, if you're willing to provide them (privately is fine too).

Comment by Garrett Baker (D0TheMath) on On Complexity Science · 2024-04-05T19:12:26.693Z · LW · GW

You seem to be knowledgeable in this area, what would you recommend someone read to get a good picture of things you find interesting in complex systems theory?

Comment by Garrett Baker (D0TheMath) on On Complexity Science · 2024-04-05T19:11:15.528Z · LW · GW

How do you intend to do those 3 things? In particular, 1 seems pretty cool if you can pull it off.

Comment by Garrett Baker (D0TheMath) on On Complexity Science · 2024-04-05T04:00:51.987Z · LW · GW

I have been served well in the past by trying to re-frame problems in terms of networks, as an example.

Comment by Garrett Baker (D0TheMath) on On Complexity Science · 2024-04-05T02:52:20.210Z · LW · GW

I agree that there often seems to be something very shallow about the methods, but I don't necessarily hold that against them. Many very useful results come from very shallow claims, and my impression is that complexity science is often pretty useful at its best, and it's wrong to judge it by its average performers or by the current & past hypes.

Compared to work done by the alignment-style agent-foundations people, you'll probably be disappointed by the "deepness", but I do think you'll be impressed by the applicability.

The test of a good idea is how often you find yourself coming back to it despite not thinking it was all that useful when first learning it. This has happened many times to me after learning about many methods in complex systems theory.

Comment by Garrett Baker (D0TheMath) on So You Created a Sociopath - New Book Announcement! · 2024-04-03T16:37:04.513Z · LW · GW

We aim for the book to be accessible to both the innocent and guilt-ridden.

Of course, if the advice was different for both the guilt-ridden and innocent, then, well, not only would it make the book longer, but many people would be able to tell how you feel about the situation your social movement has gotten itself into by just looking at what set of actions you’re taking.

Our advice is to take these same 5 actions and nothing more, so people can't tell, even in principle! As you say, they work well regardless of your circumstances. And I say that's a big plus!

Comment by Garrett Baker (D0TheMath) on Announcing Suffering For Good · 2024-04-01T20:01:00.552Z · LW · GW

We have; this is against current US regulations in the industry, and would likely make the products of the suffering minds less tasty, and therefore less profitable.

Edit: It is also just more efficient to have rats on heroin, since we get more pleasured minds per smaller dose.

Comment by Garrett Baker (D0TheMath) on Announcing Suffering For Good · 2024-04-01T19:04:32.680Z · LW · GW

I'd propose an extension: generate offsets by failing to be an animal farmer. Even at like a 10:1 ratio, it should be easy to justify suffering of an arbitrary number of animals, since there will be orders of magnitude of animals that don't even exist, and can't suffer. Or even allow self-offsets. Say you torture and kill 1000 pigs a month - you COULD have tortured and killed orders of magnitude more than that, so you're on net improving the welfare of the potential universe of animals.

We have indeed considered this strategy, but we found it was in fact more cost effective to torture & kill >1000 pigs given rat-heroin offsets.

In fact, with the knowledge advantage that activists have in factory farming, they can probably be MORE PROFITABLE at running their own farms than they can by selling the offsets. And these profits can be reinvested in more farming activities, now that we've shown them to be morally-positive (aka: earning to give, in addition to direct action).

This would be correct if we were long-termists, but we in fact find we have fairly steep time discount curves. There will be a startup period, where we just focus on factory farming optimization, but due to security concerns we are unable to release how long that period will be at this time.

Comment by Garrett Baker (D0TheMath) on [April Fools' Day] Introducing Open Asteroid Impact · 2024-04-01T18:47:29.658Z · LW · GW

I put together this quick problem factorization, and I don't think the numbers come back all that impressive:

  1. Locate asteroid (<1%, much of the space in space is not an asteroid)

  2. Get to asteroid (<1%, as in 1, since you have the same problems as 1 even if you know where the asteroid is)

  3. Get back to earth (<1%, as in 1 and 2, essentially the same problems as 1 and 2, most of space isn't the Earth)

  4. Get the asteroid through the atmosphere (5%, the asteroid would likely burn up, but perhaps you have a solution for that)

  5. Locate the asteroid on Earth (<25%, most of Earth is not asteroid, but you could use a GPS for this one. The problem is the asteroid may land in a location you don't have free access to, like... I don't know, anywhere in the ocean? If it lands in the ocean, because it's made of rock, it will surely sink, and that itself will be an entirely new endeavor)

Comment by Garrett Baker (D0TheMath) on [April Fools' Day] Introducing Open Asteroid Impact · 2024-04-01T18:41:45.435Z · LW · GW

This is patently absurd. The vast majority of asteroids are incredibly small, and would most likely burn up in the upper atmosphere. Your chances of finding a large one are incredibly small, even assuming you can locate & get to an asteroid in the first place. In order to even get to the asteroids, and protect them on the way down, your ship would need to be bigger than them!

Don't get me wrong, you may make a few bucks from iron mining, but claiming to be in a "race" or that "safety" is a concern? Please.

Comment by Garrett Baker (D0TheMath) on metachirality's Shortform · 2024-04-01T03:41:37.267Z · LW · GW

You can randomize the default comment ordering in your account settings page.

Comment by Garrett Baker (D0TheMath) on D0TheMath's Shortform · 2024-03-29T20:25:13.044Z · LW · GW

I'm ssh-ing into it. I bet there's a way, but not worth it for me to figure out (but if someone knows the way, please tell).

Comment by Garrett Baker (D0TheMath) on D0TheMath's Shortform · 2024-03-29T01:31:40.740Z · LW · GW

A strange effect: I'm using a GPU in Russia right now, which doesn't have access to copilot, and so when I'm on vscode I sometimes pause expecting copilot to write stuff for me, and then when it doesn't I feel a brief amount of the same kind of sadness I feel when a close friend is far away & I miss them.

Comment by Garrett Baker (D0TheMath) on Alexander Gietelink Oldenziel's Shortform · 2024-03-26T15:49:31.808Z · LW · GW

I think the above would be considered relatively uncontroversial in EA circles.

I don’t think the application to EA itself would be uncontroversial.

Comment by Garrett Baker (D0TheMath) on All About Concave and Convex Agents · 2024-03-24T22:59:19.580Z · LW · GW

I can't as easily think of a general argument against a misaligned AI ending up convex though.

Most goals humans want you to achieve require concave-agent-like behaviors perhaps?

Comment by Garrett Baker (D0TheMath) on Neuroscience and Alignment · 2024-03-24T16:09:45.852Z · LW · GW

Does this comment I wrote clear up my claim?

A clarification about in what sense I claim "biological and artificial neural-networks are based upon the same fundamental principles":

I would not be surprised if the reasons why neural networks "work" are also exploited by the brain.

In particular why I think neuroscience for value alignment is good is because we can expect that the values part of the brain will be compatible with these reasons, and won’t require too much extra fundamental advances to actually implement, unlike say corrigibility, which will first progress from ideal utility maximizers, and then require a mapping from that to neural networks, which seems potentially just as hard as writing an AGI from scratch.

In the case where human values are incompatible with artificial neural networks, again I get much more pessimistic about all alternative forms of value alignment of neural networks.

Comment by Garrett Baker (D0TheMath) on D0TheMath's Shortform · 2024-03-24T05:48:22.513Z · LW · GW

You are right, but I guess the thing I actually care about here is the magnitude of the advancement (which is relevant for determining the sign of the action). How large an effect do you think the model merging stuff has (I'm thinking of the effect where, if you train a bunch of models and then average their weights, they do better)? It seems very likely to me it's essentially zero, but I do admit there's a small negative tail that's greater than the positive, so the average is likely negative.

As for agent interactions, all the (useful) advances there seem things that definitely would have been made even if nobody released any LLMs, and everything was APIs.

Comment by Garrett Baker (D0TheMath) on What does "autodidact" mean? · 2024-03-24T02:48:59.942Z · LW · GW

But in a world where everyone in the last three generations was an autodidact, you most likely wouldn't be good at math, because you most likely wouldn't even know that there was such a thing as math.

This seems false. Often those who are rich get rich off of profitable subjects, and end up spreading awareness of those subjects. Many were never taught programming in school, yet learned to program anyway. Schools could completely neglect that subject, and still it would spread.

Comment by Garrett Baker (D0TheMath) on Shortform · 2024-03-24T00:54:16.904Z · LW · GW

actual memoirs of soldiers and the like who typically state that they were surprised how little they cared compared to the time they lied to their grandmother or whatever.

Recommendations for such memoirs?

Comment by Garrett Baker (D0TheMath) on D0TheMath's Shortform · 2024-03-23T14:50:16.251Z · LW · GW

In LLM land, though not as drastic, we see similar things happening, in particular techniques for merging models to get rapid capability advances, and rapid creation of new patterns for agent interactions and tool use.

The biggest effect open sourcing LLMs seems to have is improving safety techniques. Why think this differentially accelerates capabilities over safety?

Comment by Garrett Baker (D0TheMath) on D0TheMath's Shortform · 2024-03-23T00:59:27.699Z · LW · GW

To be clear: The mechanism you're hypothesizing is:

  1. Critics say "AI alignment is dumb because you want to ban open source AI!"

  2. Naive supporters read this, believe the claim that AI alignment-ers want to ban open sourcing AI and think 'AI alignment is not dumb, therefore open sourcing AI must be bad'. When the next weight release happens they say "This is bad! Open sourcing weights is bad and should be banned!"

  3. Naive supporters read other naive supporters saying this, and believe it themselves. Wise supporters try to explain otherwise, but are either labeled as critics or dismissed as weird & ignored.

  4. Thus, groupthink is born. Perhaps some wise critics "defer to the community" on the subject.