## Posts

## Comments

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on D0TheMath's Shortform · 2024-06-12T09:05:55.834Z · LW · GW

Never? That's quite a bold prediction. Seems more likely than not that AI companies will be effectively nationalized. I'm curious why you think it will never happen.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Dalcy's Shortform · 2024-06-09T23:11:33.036Z · LW · GW

Yes!! Discovered this last week - seems very important. The quantitative regret bounds for approximations are especially promising.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Transformers Represent Belief State Geometry in their Residual Stream · 2024-06-09T17:56:19.864Z · LW · GW

You are absolutely right and I am of course absolutely and embarrassingly wrong.

The minimal optimal predictor as a Hidden Markov Model of the simple nonunifilar source is indeed infinite. This implies that any other architecture must be capable of expressing infinitely many states - but this is quite a weak statement - it's very easy for a machine to dynamically express infinitely many states with finite memory. In particular, a transformer should absolutely be able to learn the MSP of the epsilon machine of the simple nonunifilar source - indeed it can even be solved analytically.

This was an embarrassing mistake I should not have made. I regret my rash overconfidence - I should have taken a moment to think it through since the statement was obviously wrong. Thank you for pointing it out.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Announcing ILIAD — Theoretical AI Alignment Conference · 2024-06-05T16:44:41.437Z · LW · GW

We intend to review applications after the submission deadline of June 30th, but I wouldn't hold off on your application.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Transformers Represent Belief State Geometry in their Residual Stream · 2024-06-05T16:41:47.521Z · LW · GW

Behold

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on What's next for the field of Agent Foundations? · 2024-05-30T10:01:07.108Z · LW · GW

You may be positively surprised to know I agree with you. :)

For context, the dialogue feature had just come out on LW. We gave it a try and this was the result. I think we mostly concluded that the dialogue feature wasn't quite worth the effort. Anyway -

I like what you're suggesting and would be open to doing a dialogue about it!

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on D0TheMath's Shortform · 2024-05-30T09:56:24.583Z · LW · GW

Compare also the central conceit of QM/Koopmania. Take a classical nonlinear finite-dimensional system X described by, say, an ODE. This is a dynamical system with evolution operator X -> X. Now look at the space H(X) of C/R-valued functions on the phase space of X. After completion we obtain a Hilbert space H. Now the evolution operator on X induces a map on H = H(X). We have now turned a finite-dimensional nonlinear problem into an infinite-dimensional linear problem.
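A quick numerical sketch of the point (my own toy illustration, not from the comment): the Koopman/composition operator K acts on observables g by (K g)(x) = g(T(x)). Even when the dynamics T is nonlinear, K is linear on the space of observables.

```python
import numpy as np

def T(x):
    # a nonlinear map: the logistic map
    return 3.7 * x * (1 - x)

def K(g):
    # Koopman operator: pull back an observable along the dynamics
    return lambda x: g(T(x))

g1 = np.sin
g2 = np.square
a, b = 2.0, -0.5

x = np.linspace(0.01, 0.99, 50)
lhs = K(lambda y: a * g1(y) + b * g2(y))(x)   # K(a*g1 + b*g2)
rhs = a * K(g1)(x) + b * K(g2)(x)             # a*K(g1) + b*K(g2)
assert np.allclose(lhs, rhs)                  # K is linear despite nonlinear T
```

The price of linearity is that K acts on an infinite-dimensional function space; in practice one truncates to a finite basis of observables.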

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-27T19:30:49.122Z · LW · GW

Probably within.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-27T19:30:01.226Z · LW · GW

I mostly regard LLMs = [scaling a feedforward network on large numbers of GPUs and data] as a single innovation.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Dalcy's Shortform · 2024-05-27T16:01:18.840Z · LW · GW

One result to mention in computational complexity is the PCP theorem, which not only gives probabilistically checkable proofs but also hardness-of-approximation results. Seems deep, but I haven't understood the proof yet.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-27T12:52:22.928Z · LW · GW

**My mainline prediction scenario for the next decades.**

My mainline prediction*:

- LLMs will not scale to AGI. They will not spawn evil gremlins or mesa-optimizers. But scaling laws will continue to hold, and future LLMs will be very impressive and make a sizable impact on the real economy and science over the next decade.
- There is a single innovation left to make AGI-in-the-Alex-sense work, i.e. coherent, long-term planning agents (LTPA) that are effective and efficient in data-sparse domains over long horizons.
- That innovation will be found within the next 10-15 years.
- It will be clear to the general public that these are dangerous.
- Governments will act quickly and (relatively) decisively to bring these agents under state control. National security concerns will dominate.
- Power will reside mostly with governments' AI safety institutes and national security agencies. Insofar as divisions of tech companies are able to create LTPAs, they will be effectively nationalized.
- International treaties will be made to constrain AI, outlawing the development of LTPAs by private companies. Great power competition will mean the US and China will continue developing LTPAs, possibly largely boxed. Treaties will try to constrain this development with only partial success (similar to nuclear treaties).
- LLMs will continue to exist and be used by the general public.
- Conditional on AI ruin, the closest analogy is probably something like the Cortés-Pizarro-Afonso takeovers. Unaligned AI will rely on human infrastructure and human allies for the earlier parts of takeover - but its inherent advantages in tech, coherence, decision-making and (artificial) plagues will be the deciding factor.
- The world may be mildly multipolar.
- This will involve conflict between AIs.
- AIs may very possibly be able to cooperate in ways humans can't.

- The arrival of AGI will immediately inaugurate a scientific revolution. Sci-fi-sounding progress like advanced robotics, quantum magic, nanotech, life extension, laser weapons, large-scale space engineering, and cures for many/most remaining diseases will become possible within two decades of AGI, possibly much faster.
- Military power will shift to automated manufacturing of drones & weaponized artificial plagues. Drones, mostly flying, will dominate the battlefield. Mass production of drones and their rapid and effective deployment in swarms will be key to victory.

Two points on which I differ with most commentators: (i) I believe AGI is a real (mostly discrete) thing, not a vibe or a general increase of improved tools. I believe it is inherently agentic. I don't think spontaneous emergence of agents is impossible, but I think it is more plausible agents will be built rather than grown.

(ii) I believe that in general the EA/AI safety community is way overrating the importance of individual tech companies vis-à-vis broader trends and the power of governments. I strongly agree with Stefan Schubert's take here on the latent hidden power of government: https://stefanschubert.substack.com/p/crises-reveal-centralisation

Consequently, the EA/AI safety community is often myopically focusing on boardroom politics that are relatively inconsequential in the grand scheme of things.

*where by mainline prediction I mean the scenario that is the mode of what I expect. This is the single likeliest scenario. However, since it contains a large number of details, each of which could go differently, the probability of this specific scenario is still low.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-20T12:15:55.481Z · LW · GW

**Why no prediction markets for large infrastructure projects?**

Been reading this excellent piece on why prediction markets aren't popular. They say that without subsidies prediction markets won't be large enough; the information value of prediction markets is often not high enough.

Large infrastructure projects undertaken by governments and other large actors often go over budget, often hilariously so: 3x, 5x, 10x or more is not uncommon, indeed often even the standard.

One of the reasons is that government officials deciding on billion-dollar infrastructure projects don't have enough skin in the game. Politicians are often not in office long enough to care on the time horizons of large infrastructure projects. Contractors don't gain by being efficient or delivering on time. On the contrary, infrastructure projects are huge cash cows. Another problem is that there are often far too many veto-stakeholders. All too often the initial bid is wildly overoptimistic.

Similar considerations apply to other government projects like defense procurement or IT projects.

Okay - how to remedy this situation? Internal prediction markets could theoretically prove beneficial. All stakeholders & decision-makers are endowed with vested equity with which they are forced to bet on building timelines and other key performance indicators. External traders may also enter the market, selling and buying the contracts. The effective subsidy could be quite large. Key decisions could save billions.

In this world, government officials could gain a large windfall which may be difficult to explain to voters. This is a legitimate objection.

A very simple mechanism would simply ask people to make an estimate of the cost C and the timeline T for completion. Your eventual payout would be proportional to how close you ended up to the real C, T compared to the other bettors. [something something log scoring rule is proper]
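A minimal sketch of such a payout rule (the weighting scheme and names are my own illustration, not a worked-out proper scoring rule): each bettor submits an estimate (C, T), and the pot is split in proportion to a softmax of negative distance to the realized (C, T), so closer guesses earn a larger share.

```python
import math

def payouts(bets, realized, pot=1_000_000.0, temperature=1.0):
    # bets: {name: (cost_estimate, time_estimate)}, realized: (cost, time)
    dists = {name: math.dist(est, realized) for name, est in bets.items()}
    # closer estimates get exponentially more weight
    weights = {name: math.exp(-d / temperature) for name, d in dists.items()}
    total = sum(weights.values())
    return {name: pot * w / total for name, w in weights.items()}

bets = {"optimist": (1.0, 2.0), "realist": (3.0, 5.0), "cynic": (5.0, 8.0)}
shares = payouts(bets, realized=(3.2, 5.5))
assert max(shares, key=shares.get) == "realist"  # closest estimate wins most
```

Unlike a log scoring rule this split is not provably incentive-compatible; it just captures the "proportional to closeness" idea from the paragraph above.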

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-20T07:39:47.406Z · LW · GW

I don't know what you mean by 'general intelligence' exactly, but I suspect you mean something like human+ capability in a broad range of domains. I agree LLMs will become generally intelligent in this sense when scaled, arguably even are, for domains with sufficient data. But that's kind of the kicker, right? Cavemen didn't have the whole internet to learn from, yet somehow did something that not even you seem to claim LLMs will be able to do: create the (data of the) Internet.

(Your last claim seems surprising. Pre-2014 games don't have close to the Elo of AlphaZero. So a next-token predictor would be trained to simulate a human player up to 2800, not 3200+.)

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-19T21:15:41.943Z · LW · GW

I would be genuinely surprised if training a transformer on the pre-2014 human Go data over and over would lead it to spontaneously develop AlphaZero capability. I would expect it to do what it is trained to do: emulate / predict as best as possible the distribution of human play. To some degree I would anticipate the transformer might develop some emergent ability that might make it slightly better than Go-Magnus - as we've seen in other cases - but I'd be surprised if this were unbounded. This is simply not what the training signal is.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-19T20:23:57.882Z · LW · GW

Could you train an LLM on pre-2014 Go games that could beat AlphaZero?

I rest my case.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-18T20:51:59.865Z · LW · GW

In my mainline model there are only a few innovations needed, perhaps only a single big one, to produce an AGI which, just like the Turing machine sits at the top of the Chomsky hierarchy, will be basically the optimal architecture given resource constraints. There are probably some minor improvements to do with bridging the gap between the theoretically optimal architecture and the actual architecture, or parts of the algorithm that can be indefinitely improved but with diminishing returns (these probably exist due to Levin, and possibly matrix multiplication is one of these). On the whole I expect AI research to be very chunky.

Indeed, we've seen that there was really just one big idea behind all current AI progress: scaling, specifically scaling GPUs on maximally large undifferentiated datasets. There were some minor technical innovations needed to pull this off, but on the whole that was the clincher.

Of course, I don't know. Nobody knows. But I find this the most plausible guess based on what we know about intelligence, learning, theoretical computer science and science in general.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-18T20:40:43.434Z · LW · GW

My timelines were not 2026. In fact, I made bets against doomers 2-3 years ago, one will resolve by next year.

I agree iterative improvements are significant. This falls under "naive extrapolation of scaling laws".

By nanotech I mean something akin to Drexlerian nanotech or something similarly transformative in the vicinity. I think it is plausible that a true ASI will be able to make rapid progress (perhaps on the order of a few years or a decade) on nanotech. I suspect that people who don't take this as a serious possibility haven't really thought through what AGI/ASI means + what the limits and drivers of science and tech really are; I suspect they are simply falling prey to status-quo bias.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on D0TheMath's Shortform · 2024-05-18T20:31:23.153Z · LW · GW

Can somebody explain to me what's happening in this paper?

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-15T20:12:28.057Z · LW · GW

Beautifully illustrated and amusingly put, sir!

A variant of what you are saying is that AI may once and for all allow us to calculate the true ~~counterfactual~~ Shapley value of scientific contributions.

( re: ancestor simulations

I think you are onto something here. Compare the Q hypothesis:

https://twitter.com/dalcy_me/status/1780571900957339771

see also speculations about the Zhuangzi hypothesis here)

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-15T19:29:59.076Z · LW · GW

Why do you think there are these low-hanging algorithmic improvements?

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-14T14:21:57.864Z · LW · GW

I didn't intend the causes C_j to equate to direct computation of \phi(x) on the x_i. They are rather other pieces of evidence that the powerful agent has that make it believe \phi(x_i). I don't know if that's what you meant.

I agree that seeing x_i such that \phi(x_i) should increase credence in \forall x \phi(x) even in the presence of knowledge of C_j. And the Shapley value proposal will do so.

(Bad tex. On my phone)

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-14T11:22:49.005Z · LW · GW

**Problem of Old Evidence, the Paradox of Ignorance and Shapley Values**

**Paradox of Ignorance**

Paul Christiano presents the "paradox of ignorance" where a weaker, less informed agent appears to outperform a more powerful, more informed agent in certain situations. This seems to contradict the intuitive desideratum that more information should always lead to better performance.

The example given is of two agents, one powerful and one limited, trying to determine the truth of a universal statement ∀x:ϕ(x) for some Δ0 formula ϕ. The limited agent treats each new value of ϕ(x) as a surprise and evidence about the generalization ∀x:ϕ(x). So it can query the environment about some simple inputs x and get a reasonable view of the universal generalization.

In contrast, the more powerful agent may be able to deduce ϕ(x) directly for simple x. Because it assigns these statements prior probability 1, they don't act as evidence at all about the universal generalization ∀x:ϕ(x). So the powerful agent must consult the environment about more complex examples and pay a higher cost to form reasonable beliefs about the generalization.
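A toy Bayesian rendering of this asymmetry (my own illustration; the likelihoods are made up): for the limited agent each observed ϕ(xᵢ) is a genuine surprise and shifts credence toward the generalization, while the powerful agent has already deduced ϕ(xᵢ) under either hypothesis, so observing it moves nothing.

```python
# Hypothesis H = "forall x: phi(x)", prior 0.5.
# Weak agent: P(phi(x_i) | H) = 1, P(phi(x_i) | not H) = 0.8 (a surprise).
# Powerful agent: phi(x_i) already deduced, so both likelihoods are 1.

def update(prior, likelihood_H, likelihood_notH):
    num = prior * likelihood_H
    return num / (num + (1 - prior) * likelihood_notH)

weak, strong = 0.5, 0.5
for _ in range(5):                      # observe phi(x_1), ..., phi(x_5)
    weak = update(weak, 1.0, 0.8)       # surprise: credence in H rises
    strong = update(strong, 1.0, 1.0)   # already deduced: no update

assert weak > strong
assert strong == 0.5
```

After five examples the weak agent's credence has climbed to about 0.75 while the powerful agent is still at its prior, which is exactly the paradox as stated.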

**Is it really a problem?**

However, I argue that the more powerful agent is actually justified in assigning less credence to the universal statement ∀x:ϕ(x). The reason is that the probability mass provided by examples x₁, ..., xₙ such that ϕ(xᵢ) holds is now distributed among the universal statement ∀x:ϕ(x) and additional causes Cⱼ known to the more powerful agent that also imply ϕ(xᵢ). Consequently, ∀x:ϕ(x) becomes less "necessary" and has less relative explanatory power for the more informed agent.

An implication of this perspective is that if the weaker agent learns about the additional causes Cⱼ, it should also lower its credence in ∀x:ϕ(x).

More generally, we would like the credence assigned to propositions P (such as ∀x:ϕ(x)) to be independent of the order in which we acquire new facts (like xᵢ, ϕ(xᵢ), and causes Cⱼ).

**Shapley Value**

The Shapley value addresses this limitation by providing a way to average over all possible orders of learning new facts. It measures the marginal contribution of an item (like a piece of evidence) to the value of sets containing that item, considering all possible permutations of the items. By using the Shapley value, we can obtain an order-independent measure of the contribution of each new fact to our beliefs about propositions like ∀x:ϕ(x).
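A minimal sketch of this proposal (my own toy formalization; the credence function `v` is hypothetical): treat each piece of evidence as a "player" with value function v(S) = credence in ∀x:ϕ(x) after learning the set S, and compute each fact's marginal contribution averaged over all learning orders.

```python
from itertools import permutations

def shapley(players, v):
    # average marginal contribution of each player over all orderings
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        learned = frozenset()
        for p in order:
            phi[p] += v(learned | {p}) - v(learned)
            learned = learned | {p}
    return {p: total / len(perms) for p, total in phi.items()}

# Hypothetical credence function: example "e" supports the generalization,
# but a known alternative cause "C" screens off part of e's support.
def v(S):
    base = 0.5
    if "e" in S:
        base += 0.3 if "C" not in S else 0.1
    if "C" in S:
        base -= 0.1
    return base

values = shapley(["e", "C"], v)
assert abs(values["e"] - 0.2) < 1e-9  # e's credit, averaged over both orders
```

By construction the result is independent of the order in which e and C are learned, which is the desideratum stated above; the factorial number of orderings makes the exact computation expensive beyond a handful of facts.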

**Further thoughts**

I believe this is closely related, perhaps identical, to the 'Problem of Old Evidence' as considered by Abram Demski.

Suppose a new scientific hypothesis, such as general relativity, explains a well-known observation such as the perihelion precession of Mercury better than any existing theory. Intuitively, this is a point in favor of the new theory. However, the probability for the well-known observation was already at 100%. How can a previously-known statement provide new support for the hypothesis, as if we are re-updating on evidence we've already updated on long ago? This is known as the problem of old evidence, and is usually levelled as a charge against Bayesian epistemology.

[Thanks to @Jeremy Gillen for pointing me towards this interesting Christiano paper]

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-14T08:32:04.306Z · LW · GW

Those numbers don't really accord with my experience actually using gpt-4. Generic prompting techniques just don't help all that much.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-14T05:30:48.980Z · LW · GW

I've never done explicit timelines estimates before so nothing to compare to. But since it's a gut feeling anyway, I'm saying my gut is lengthening.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-14T05:28:06.361Z · LW · GW

Yes agreed.

What I don't get about this position: if it was indeed just scaling - what's AI research for? There is nothing to discover, just scale more compute. Sure, you can maybe improve the speed of deploying compute a little, but at the core of it it seems like a story that's in conflict with itself?

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-13T21:57:27.719Z · LW · GW

You may be right. I don't know of course.

At this moment in time, it seems scaffolding tricks haven't really improved the baseline performance of models that much. Overwhelmingly, the capability comes down to whether the RLHFed base model can do the task.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-13T21:45:18.580Z · LW · GW

To some degree yes, they were not guaranteed to hold. But by that point they held for over 10 OOMs iirc and there was no known reason they couldn't continue.

This might be the particular twitter bubble I was in but people definitely predicted capabilities beyond simple extrapolation of scaling laws.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-13T20:37:14.249Z · LW · GW

*My timelines are lengthening.*

I've long been a skeptic of scaling LLMs to AGI*. I fundamentally don't understand how this is even possible. It must be said that very smart people give this view credence: davidad, dmurfet. On the other side are Vanessa Kosoy and Steven Byrnes. When pushed, proponents don't actually defend the position that a large enough transformer will create nanotech or even obsolete their job. They usually mumble something about scaffolding.

I won't get into this debate here, but I do want to note that my timelines have lengthened, primarily because some of the never-clearly-stated but heavily implied AI developments promised by proponents of very short timelines have not materialized. To be clear, it has only been a year since gpt-4 was released, and gpt-5 is around the corner, so perhaps my judgment is premature. Still, my timelines are lengthening.

A year ago, when gpt-4 came out, progress was blindingly fast. Part of short timelines came from a sense of 'if we got surprised so hard by gpt-2 and gpt-3, we are completely uncalibrated; who knows what comes next'.

People seemed surprised by gpt-4 in a way that seemed uncalibrated to me. gpt-4's performance was basically in line with what one would expect if the scaling laws continued to hold. At the time it was already clear that the only really important drivers were compute and data, and that we would run out of both shortly after gpt-4. Scaling proponents suggested this was only the beginning, that there was a whole host of innovation that would be coming. Whispers of mesa-optimizers and simulators.

One year in: chain-of-thought doesn't actually improve things that much. External memory and super context lengths ditto. A whole list of proposed architectures seems to serve solely as a paper mill. Every month there is new hype about the latest LLM or image model. Yet they never deviate from expectations based on simple extrapolation of the scaling laws. There is only one thing that really seems to matter, and that is compute and data. We have about 3 more OOMs of compute to go. Data may be milked for another OOM.

A big question will be whether gpt-5 will suddenly make agentGPT work ( and to what degree). It would seem that gpt-4 is in many ways far more capable than (most or all) humans yet agentGPT is curiously bad.

All in all, AI progress** is developing according to the naive extrapolations of scaling laws but nothing beyond that. The breathless twitter hype about new models is still there, but it seems to be believed at a simulacrum level higher than I can parse.

Does this mean we'll hit an AI winter? No. In my model there may be only one remaining roadblock to ASI (and I suspect I know what it is). That innovation could come at any time. I don't know how hard it is, but I suspect it is not too hard.

* the term AGI seems to denote vastly different things to different people in a way I find deeply confusing. I notice that the thing that I thought everybody meant by AGI is now being called ASI. So when I write AGI, feel free to substitute ASI.

** or better, AI congress

addendum: since I've been quoted in dmurfet's AXRP interview as believing that there are certain kinds of reasoning that cannot be represented by transformers/LLMs, I want to be clear that this is not really an accurate portrayal of my beliefs. E.g. I don't think that transformers don't truly understand, that they are just stochastic parrots, or that they otherwise can't engage in the abstract reasoning that humans do. I think this is clearly false, as seen by interacting with any frontier model.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Alexander Gietelink Oldenziel's Shortform · 2024-05-13T20:13:17.625Z · LW · GW

**Wildlife Welfare Will Win**

The long arc of history bends towards gentleness and compassion. Future generations will look with horror on factory farming. And already young people are following this moral thread to its logical conclusion, turning their eyes in disgust to mother nature, red in tooth and claw. Wildlife Welfare Done Right - compassion towards our pets followed to its forceful conclusion - would entail the forced uploading of all higher animals and, judging by the memetic virulence of shrimp welfare, of lower animals as well.

Morality-upon-reflection may very well converge on a simple form of pain-pleasure utilitarianism.

There are a few caveats: future society is not dominated, controlled and designed by a singleton AI-supervised state; technology inevitably stalls; and the invisible hand performs its inexorable logic for the eons, so a Malthuso-Hansonian world will emerge once again - the industrial revolution but a short blip of cornucopia.

Perhaps a theory of consciousness is discovered that proves once and for all that homo sapiens, and only homo sapiens, are conscious (to a significant degree). Perhaps society will wirehead itself into blissful oblivion. Or perhaps a superior machine intelligence arises, one whose final telos is the whole of, and nothing but, office supplies. Or perhaps stranger things still happen and the astronomo-cosmic compute of our cosmic endowment is engaged for mysterious purposes. Arise, self-made god of pancosmos. Thy name is UDASSA.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on New intro textbook on AIXI · 2024-05-12T11:48:16.016Z · LW · GW

I have heard of AIXI but haven't looked deeply into it. I'm curious about it. What are some results you think are cool in this field?

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Thomas Kwa's Shortform · 2024-05-10T08:22:50.668Z · LW · GW

This seems valuable! I'd be curious to hear more !!

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Thomas Kwa's Shortform · 2024-05-04T08:42:06.431Z · LW · GW

Interesting...

Wouldn't I expect the evidence to come out in a few big chunks, e.g. OpenAI releasing a new product?

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Dalcy's Shortform · 2024-05-03T23:31:26.467Z · LW · GW

I agree with you.

Epsilon machine (and MSP) construction is most likely computationally intractable [I don't know an exact statement of such a result in the literature, but I suspect it is true] for realistic scenarios.

Scaling an approximate version of epsilon reconstruction therefore seems of prime importance. Real-world architectures and data have highly specific structure & symmetry that make them different from completely generic HMMs. This most likely must be exploited.

The calculi of emergence paper has inspired many people but has not been developed much. Many of the details are somewhat obscure or vague. I also believe that most likely completely different methods are needed to push the program further. Computational mechanics is primarily a theory of hidden Markov models - it doesn't have the tools to easily describe behaviour higher up the Chomsky hierarchy. I suspect more powerful and sophisticated algebraic, logical and categorical thinking will be needed here. I caveat this by saying that Paul Riechers has pointed out that one can actually understand all these gadgets up the Chomsky hierarchy as infinite HMMs, which may be analyzed usefully just as finite HMMs are.

The still-underdeveloped theory of epsilon transducers I regard as the most promising lens on agent foundations. This is uncharted territory; I suspect the largest impact of computational mechanics will come from this direction.

Your point on True Names is well-taken. More basic examples than gauge information or synchronization order are the triple of quantities: entropy rate, excess entropy and Crutchfield's statistical/forecasting complexity. These are the most important quantities to understand for any stochastic process (such as the structure of language and LLMs!).
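For a first-order Markov chain these quantities have closed forms (a standard fact, offered here as a sketch, not something from the comment): the entropy rate is h = -Σᵢ πᵢ Σⱼ Pᵢⱼ log₂ Pᵢⱼ, and the excess entropy reduces to E = H(X₁) - h, with H(X₁) the entropy of the stationary distribution π.

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])            # transition matrix of a 2-state chain

# stationary distribution: left eigenvector of P with eigenvalue 1
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()

h = -sum(pi[i] * P[i, j] * np.log2(P[i, j])
         for i in range(2) for j in range(2))          # entropy rate (bits)
H1 = -sum(p * np.log2(p) for p in pi)                  # H(X_1)
E = H1 - h                                             # excess entropy
assert 0 <= h <= 1 and E >= 0
```

For general HMMs no such closed form exists - the entropy rate already requires tracking the belief-state process - which is part of why the statistical complexity is a genuinely deeper quantity.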

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Transformers Represent Belief State Geometry in their Residual Stream · 2024-05-03T23:13:27.667Z · LW · GW

Non exhaustive list of reasons one could be interested in computational mechanics: https://www.lesswrong.com/posts/GG2NFdgtxxjEssyiE/dalcy-s-shortform?commentId=DdnaLZmJwusPkGn96

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Transformers Represent Belief State Geometry in their Residual Stream · 2024-05-03T14:30:24.648Z · LW · GW

I agree with you that the new/surprising thing is the linearity of the probe. I also agree that it is not entirely clear how surprising & new the linearity of the probe is.

If you understand how the causal state construction & the MSP work in computational mechanics, the experimental results aren't surprising. Indeed, it can't be any other way!
That's exactly the *magic* of the definition of causal states.

What one person might find surprising or new, another thinks trivial. The subtle magic of the right theoretical framework is that it makes the complex simple and surprising phenomena apparent.

Before learning about causal states I would not even have considered that there is a unique (!) optimal minimal predictor canonically constructible from the data. Nor that the geometry of synchronizing belief states is generically a fractal. Of course, once one has properly internalized the definitions, this is almost immediate. Pretty pictures can be helpful in building that intuition!

Adam and I (and many others) have been preaching the gospel of computational mechanics for a while now. Most of it has fallen on deaf ears before. Like you, I have been (positively!) surprised and amused by the sudden outpouring of interest. No doubt it's in part a testimony to the Power of the Visual! Never look a gift horse in the mouth! ^_^

I would say the parts of computational mechanics I am really excited about are a little deeper - downstream of causal states & the MSP. This is just a taster.

I'm confused & intrigued by your insistence that this follows from the good regulator theorem. Like Adam, I don't understand it. My understanding is that the original 'theorem' was wordcelled nonsense, but that John has been able to formulate a nontrivial version of the theorem. My experience is that the theorem is often invoked in a handwavey way that leaves me no less confused than before. No doubt due to my own ignorance!

I would be curious to hear a *precise* statement of why the result here follows from the Good Regulator Theorem.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Can stealth aircraft be detected optically? · 2024-05-02T07:54:59.504Z · LW · GW

Military nerds, correct me if I'm wrong, but I think the answer might be the following. I'm not a pilot etc. etc.

Stealth can be a bit of a misleading term. F-35s aren't actually 'stealth aircraft' - they are low-observable aircraft. You can detect F-35s with longwave radar.

The problem isn't knowing that there is an F-35 but getting a weapons-grade lock on it. This is much harder, and your grainy gpt-interpreted photo isn't close to enough for a missile, I think. You mentioned this already as a possibility.

The Ukrainians pioneered something similar for audio which is used to detect missiles & drones entering Ukrainian airspace.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on dkornai's Shortform · 2024-05-01T19:59:56.461Z · LW · GW

It also suggests that there might be some sort of conservation law for pain for agents.

Conservation of Pain if you will

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Why I stopped being into basin broadness · 2024-04-27T16:05:54.797Z · LW · GW

Sure! I'll try and say some relevant things below. In general, I suggest looking at Liam Carroll's distillation of Watanabe's book (which is quite heavy going, but good as a reference text). There are also some links below that may prove helpful.

The empirical loss and its second derivative are statistical estimators of the population loss and its second derivative. Ultimately the latter controls the properties of the former (though the relation between the second derivative of the empirical loss and the second derivative of the population loss is a little subtle).

The [matrix of] second derivatives of the population loss at the minimum is called the Fisher information metric. It's *always* degenerate [i.e. singular] for any statistical model with hidden states or hierarchical structure. Analyses that don't take this into account are inherently flawed.
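To make the degeneracy concrete, here is a minimal numerical sketch with a made-up toy model (the model `f(x; a, b) = a*b*x` and all names are illustrative, not anything from the SLT literature): the population loss is minimized on the whole curve `a*b = 1`, so the matrix of second derivatives at any minimum necessarily has a zero eigenvalue.

```python
import numpy as np

# Hypothetical toy model: f(x; a, b) = a*b*x fit to the target f(x) = x.
# The population loss L(a, b) = E[(a*b*x - x)^2] = (a*b - 1)^2 * E[x^2]
# is minimized on the whole curve a*b = 1, so the matrix of second
# derivatives at any minimum must be degenerate along that curve.

def pop_loss(a, b, ex2=1.0):
    return (a * b - 1.0) ** 2 * ex2

def hessian(f, a, b, h=1e-5):
    # Central finite differences for the 2x2 matrix of second derivatives.
    d2a = (f(a + h, b) - 2 * f(a, b) + f(a - h, b)) / h**2
    d2b = (f(a, b + h) - 2 * f(a, b) + f(a, b - h)) / h**2
    dab = (f(a + h, b + h) - f(a + h, b - h)
           - f(a - h, b + h) + f(a - h, b - h)) / (4 * h**2)
    return np.array([[d2a, dab], [dab, d2b]])

H = hessian(pop_loss, 1.0, 1.0)   # (1, 1) lies on the minimum a*b = 1
eigs = np.linalg.eigvalsh(H)
print(eigs)  # one eigenvalue is ~0: the metric is degenerate at the minimum
```

Analyses that assume this matrix is invertible (e.g. a naive Laplace approximation) silently divide by that zero eigenvalue.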

SLT tells us that the local geometry around the minimum nevertheless controls the learning and generalization behaviour of any Bayesian learner for large N. N doesn't have to be that large though: empirically, the asymptotic behaviour that SLT predicts is already hit for N=200.

In some sense, SLT says that the broad basin intuition is broadly correct but this needs to be heavily caveated. Our low-dimensional intuition for broad basin is misleading. For singular statistical models (again everything used in ML is highly singular) the local geometry around the minima in high dimensions is very weird.

Maybe you've heard of the behaviour of the volume of a sphere in high dimensions: most of it is contained in a thin shell near the surface. I like to think of the local geometry as some sort of fractal sea urchin. Maybe you like that picture, maybe you don't, but it doesn't matter. SLT gives actual math that is provably the right thing for a Bayesian learner.
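The shell fact is a one-liner to check numerically: volume scales as r^d, so the interior ball of relative radius (1 - t) holds a (1 - t)^d fraction of the total, which vanishes as d grows.

```python
import numpy as np

# Fraction of a d-dimensional ball's volume lying in the thin outer shell
# of relative thickness t. The interior ball of radius (1 - t) holds
# (1 - t)^d of the volume, so the shell holds the rest.
def shell_fraction(d, t=0.05):
    return 1.0 - (1.0 - t) ** d

for d in (2, 10, 100, 1000):
    print(d, shell_fraction(d))
# in 2 dimensions the 5% shell holds ~10% of the volume;
# in 1000 dimensions it holds essentially all of it
```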

[Real ML practice isn't Bayesian learning though? Yes, this is true. Nevertheless, there is both empirical and mathematical evidence that the Bayesian quantities are still highly relevant for actual learning.]

SLT says that the Bayesian posterior is controlled by the local geometry of the minimum. The dominant factor for N >= ~200 is the fractal dimension of the minimum. This is the RLCT, and it is the most important quantity of SLT.

There are some misconceptions about the RLCT floating around. One way to think about it is as an 'effective fractal dimension', but one has to be careful about this. There is a notion of effective dimension in the standard ML literature where one takes the parameter count and mods out parameters that don't do anything (because of symmetries). The RLCT picks up on symmetries, but it is not *just* that. It also picks up on how degenerate directions in the Fisher information metric are, i.e. how broad the basin is in that direction.

Let's consider a maximally simple example to get some intuition. Let the population loss function be $L(w) = w^{2k}$. The number of parameters is $d = 1$ and the minimum is at $w = 0$.

For $k = 1$ the minimum is nondegenerate (the second derivative is nonzero). In this case the RLCT is half the dimension. In our case the dimension is just $d = 1$, so $\lambda = 1/2$.

For $k \geq 2$ the minimum is degenerate (the second derivative is zero). Analyses based on studying the second derivatives will not see the difference between different values of $k$, but in fact the local geometry is vastly different. The higher $k$ is, the broader the basin around the minimum. The RLCT for $L(w) = w^{2k}$ is $\lambda = \frac{1}{2k}$. This means: the lower the RLCT, the 'broader' the basin is.
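A minimal numerical sketch of the volume-scaling characterization of the RLCT: the volume of the sublevel set {w : L(w) < ε} scales as ε^λ for small ε, so λ can be read off as a log-log slope. For the one-parameter loss L(w) = w^(2k) the fitted exponent comes out ≈ 1/(2k), matching the values above (the Monte Carlo estimator here is just for illustration).

```python
import numpy as np

# Estimate the RLCT as a volume-scaling exponent: for small eps,
#   Vol{ w in [-1, 1] : L(w) < eps } ~ C * eps^lambda,
# so lambda is the slope of log-volume against log-eps.
# (For L(w) = w^(2k) the volume is exactly 2 * eps^(1/(2k)).)

def volume_exponent(loss, eps_values, n=400_000, seed=0):
    w = np.random.default_rng(seed).uniform(-1.0, 1.0, size=n)
    L = loss(w)
    log_vol = np.log([np.mean(L < e) for e in eps_values])
    slope, _ = np.polyfit(np.log(eps_values), log_vol, 1)
    return slope

eps = np.geomspace(1e-3, 1e-1, 8)
lams = {k: volume_exponent(lambda w: w ** (2 * k), eps) for k in (1, 2, 3)}
print(lams)  # fitted exponents close to 1/2, 1/4, 1/6
```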

Okay, so far this only recapitulates the broad basin story. But there are some important points:

- this is an actual quantity that can be estimated at scale for real networks, and that provably dominates the learning behaviour for moderately large N.
- SLT says that minima with low RLCT will be preferred. It even says how much they will be preferred. There is a tradeoff between lower-RLCT minima with moderate loss ('simpler solutions') and minima with higher RLCT but lower loss; as N grows, the tradeoff shifts towards lower loss. This means that the RLCT is actually 'the right notion of model complexity/simplicity' in the parameterized Bayesian setting. This is too much to recap in this comment, but I refer you to Hoogland & van Wingerden's post here. This is also the start of the phase transition story, which I regard as the principal insight of SLT.
- The RLCT doesn't just pick up on basin broadness. It also picks up on more elaborate singular structure, e.g. a crossing-valley type minimum like $L(w_1, w_2) = w_1^2 w_2^2$. I won't tell you the answer, but you can calculate it yourself using Shaowei Lin's cheat sheet. This is key - actual neural networks have highly, highly singular structure that determines the RLCT.
- The RLCT is the most important quantity in SLT, but SLT is not just about the RLCT. For instance, the second most important quantity, the 'singular fluctuation', is also quite important. It has a strong influence on generalization behaviour and is the largest factor in the variance of trained models. It controls how well approximations to Bayesian learning, like the way neural networks are actually trained, behave.
- We've seen that analyses based on the directions defined by the matrix of second derivatives are fundamentally flawed because neural networks are highly singular. Still, there is something noncrazy about studying these directions. There is upcoming work, which I can't discuss in detail yet, that explains to a large degree how to correct this naive picture, both mathematically and empirically.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Why I stopped being into basin broadness · 2024-04-26T19:57:33.439Z · LW · GW

This is all answered very elegantly by singular learning theory.

You seem to have a strong math background! I really encourage you take the time and really study the details of SLT. :-)

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Examples of Highly Counterfactual Discoveries? · 2024-04-26T19:53:49.122Z · LW · GW

I would not say that the central insight of SLT is about priors. Under weak conditions the prior is almost irrelevant. Indeed, the RLCT is independent of the prior under very weak nonvanishing conditions.

The story that symmetries mean that the parameter-to-function map is not injective is true but already well-understood outside of SLT. It is a common misconception that this is what SLT amounts to.

To be sure - generic symmetries are seen by the RLCT. But these are, in some sense, the uninteresting ones. The interesting thing is the local singular structure and its unfolding in phase transitions during training.

The issue of the true distribution not being contained in the model is called 'unrealizability' in Bayesian statistics. It is dealt with in Watanabe's second 'green' book. Unrealizability is key to the most important insight of SLT, contained in the last sections of the second-to-last chapter of the green book: algorithmic development during training through phase transitions in the free energy.

I don't have the time to recap this story here.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Examples of Highly Counterfactual Discoveries? · 2024-04-26T19:41:10.391Z · LW · GW

All proofs are contained in Watanabe's standard text, see here

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Examples of Highly Counterfactual Discoveries? · 2024-04-25T15:42:56.687Z · LW · GW

Did I just say SLT is the Newtonian gravity of deep learning? Hubris of the highest order!

But also yes... I think I am saying that

**Singular Learning Theory is the first highly accurate model of breadth of optima.**

- SLT tells us to look at a quantity Watanabe calls $\lambda$, which has the highly-technical name 'real log canonical threshold' (RLCT). He proves several equivalent ways to describe it, one of which is as the (fractal) volume scaling dimension around the optima.
- By computing simple examples (see Shaowei's guide in the links below) you can check for yourself how the RLCT picks up on basin broadness.
- The RLCT = the first-order term for in-distribution generalization error and also for Bayesian learning (technically the 'Bayesian free energy'). This justifies the name 'learning coefficient' for $\lambda$. I emphasize that these are mathematically precise statements that have complete proofs, not conjectures or intuitions.
- Knowing a little SLT will inoculate you against many wrong theories of deep learning that abound in the literature. I won't be going into it, but suffice to say that any paper assuming that the Fisher information metric is regular for deep neural networks, or for any kind of hierarchical structure, is fundamentally flawed. And you can be sure this assumption is sneaked in all over the place. For instance, this is almost always the case when people talk about the Laplace approximation.
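For reference, the precise asymptotics behind the 'first-order term' claims above are Watanabe's free energy expansion, where $\lambda$ is the RLCT, $m$ its multiplicity, and $n$ the sample size (regularity conditions omitted; see the standard text for the exact statement):

```latex
F_n \;=\; n L_n(w_0) \;+\; \lambda \log n \;-\; (m-1)\log\log n \;+\; O_p(1),
\qquad
\mathbb{E}[G_n] \;=\; \frac{\lambda}{n} \;+\; o\!\left(\frac{1}{n}\right).
```

For a regular model $\lambda = d/2$ and these formulas reduce to the classical BIC; the singular case is strictly more general.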

**It's one of the most computationally applicable ones we have**? Yes. SLT quantities like the RLCT can be analytically computed for many statistical models of interest, correctly predict phase transitions in toy neural networks, and can be estimated at scale.

EDIT: no hype about future work. Wait and see ! :)

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Examples of Highly Counterfactual Discoveries? · 2024-04-25T11:58:16.462Z · LW · GW

- Scott Garrabrant's discovery of Logical Inductors.

I remember hearing about the paper from a friend and thinking it couldn't possibly be true in a non-trivial sense. To someone with even a modicum of experience in logic, a computable procedure assigning probabilities to arbitrary logical statements in a natural way is surely going to hit a no-go diagonalization barrier.

Logical Inductors get around the diagonalization barrier in a very clever way. I won't spoil how they do it here. I recommend the interested reader watch Andrew Critch's talk on Logical Induction.

It was the main thing that convinced me that MIRI was not a bunch of clowns but was doing substantial research.

The Logical Induction paper has a fairly thorough discussion of previous work. Relevant previous work to mention is de Finetti's on betting and probability, previous work by MIRI & associates (Herreshof, Taylor, Christiano, Yudkowsky...), the work of Shafer-Vovk on financial interpretations of probability & Shafer's work on aggregation of experts. There is also a field which doesn't have a clear name that studies various forms of expert aggregation. Overall, my best judgement is that nobody else was close before Garrabrant.

- The Antikythera artifact: a Hellenistic Computer.
- You probably learned heliocentrism = good, geocentrism = bad; Copernicus-Kepler-Newton = good, epicycles = bad. But geocentric models and heliocentric models are equivalent; it's just that Kepler & Newton's laws are best expressed in a heliocentric frame. However, the raw data of observations is actually made in a geocentric frame. Geocentric models stay closer to the data in some sense.
- Epicyclic theory is now considered bad, an example of people refusing to see the light of scientific revolution. But actually, it was an enormous innovation. Using high-precision gearing, epicycles could actually be *implemented on a (Hellenistic) computer*, implicitly doing Fourier analysis to predict the motion of the planets. Astounding.
- A Roman author (Pliny the Elder?) describes a similar device in the possession of Archimedes. It seems likely that Archimedes or a close contemporary designed the artifact and that several were made in Rhodes.
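The "epicycles are Fourier analysis" point can be sketched in a few lines: view a closed orbit as a complex curve z(t); a sum of uniformly rotating circles (epicycles) is exactly a truncated Fourier series. Here an illustrative ellipse is reproduced to machine precision by just two epicycles (the orbit and all parameters are made up for the demo).

```python
import numpy as np

# An "orbit" as a closed curve in the complex plane (illustrative ellipse).
N = 256
t = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
z = 3.0 * np.cos(t) + 2.0j * np.sin(t)

# Fourier coefficients: z(t) = sum_k c_k * exp(i * k * t).
# Each term is an epicycle: a circle of radius |c_k| rotating at frequency k.
c = np.fft.fft(z) / N
freqs = (np.fft.fftfreq(N) * N).astype(int)

# Keep the few largest epicycles and reconstruct the orbit from them.
keep = np.argsort(-np.abs(c))[:4]
recon = sum(c[k] * np.exp(1j * freqs[k] * t) for k in keep)

err = float(np.max(np.abs(recon - z)))
print(err)  # the ellipse needs only two epicycles, so the error is ~0
```

Adding more terms handles any (reasonable) closed orbit, which is the sense in which a geared epicycle train is a mechanical Fourier synthesizer.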

Actually, since we're on the subject of scientific discoveries

- Discovery & description of the complete Antikythera mechanism. The actual artifact that was found is just a rusty piece of bronze. Nobody knew how it worked. There were several sequential discoveries over multiple decades that eventually led to the complete solution of the mechanism. The final pieces were found just a few years ago. An astounding scientific achievement. Here is an amazing documentary on the subject:

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Examples of Highly Counterfactual Discoveries? · 2024-04-25T10:50:11.740Z · LW · GW

Singular Learning Theory is another way of "talking about the breadth of optima" in the same sense that Newton's Universal Law of Gravitation is another way of "talking about Things Falling Down".

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Examples of Highly Counterfactual Discoveries? · 2024-04-25T10:32:26.705Z · LW · GW

Don't forget Wallace !

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Examples of Highly Counterfactual Discoveries? · 2024-04-25T10:29:47.003Z · LW · GW

Yes, beautiful example! Van Leeuwenhoek was the one-man ASML of the 17th century. In this case, we actually have evidence of the counterfactual impact, as other lensmakers trailed van Leeuwenhoek by many decades.

It's plausible that high-precision measurement and fabrication is the key bottleneck in most technological and scientific progress - it's difficult to oversell the importance of van Leeuwenhoek.

Antonie van Leeuwenhoek made more than 500 optical lenses. He also created at least 25 single-lens microscopes, of differing types, of which only nine have survived. These microscopes were made of silver or copper frames, holding hand-made lenses. Those that have survived are capable of magnification up to 275 times. It is suspected that Van Leeuwenhoek possessed some microscopes that could magnify up to 500 times. Although he has been widely regarded as a dilettante or amateur, his scientific research was of remarkably high quality.

The single-lens microscopes of Van Leeuwenhoek were relatively small devices, the largest being about 5 cm long. They are used by placing the lens very close in front of the eye. The other side of the microscope had a pin, where the sample was attached in order to stay close to the lens. There were also three screws to move the pin and the sample along three axes: one axis to change the focus, and the two other axes to navigate through the sample. Van Leeuwenhoek maintained throughout his life that there are aspects of microscope construction "which I only keep for myself", in particular his most critical secret of how he made the lenses.

For many years no one was able to reconstruct Van Leeuwenhoek's design techniques, but in 1957, C. L. Stong used thin glass thread fusing instead of polishing, and successfully created some working samples of a Van Leeuwenhoek design microscope. Such a method was also discovered independently by A. Mosolov and A. Belkin at the Russian Novosibirsk State Medical Institute. In May 2021 researchers in the Netherlands published a non-destructive neutron tomography study of a Leeuwenhoek microscope. One image in particular shows a Stong/Mosolov-type spherical lens with a single short glass stem attached. Such lenses are created by pulling an extremely thin glass filament, breaking the filament, and briefly fusing the filament end. The neutron tomography article notes this lens creation method was first devised by Robert Hooke rather than Leeuwenhoek, which is ironic given Hooke's subsequent surprise at Leeuwenhoek's findings.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Examples of Highly Counterfactual Discoveries? · 2024-04-25T10:06:15.583Z · LW · GW

Here are some reflections I wrote on the work of Grothendieck and relations with his contemporaries & predecessors.

Take it with a grain of salt - it is probably too deflationary of Grothendieck's work, pushing back on mythical narratives common in certain mathematical circles where Grothendieck is held to be a Christ-like figure. Nevertheless, it would probably not be an exaggeration to say that Grothendieck's purely scientific contributions [as opposed to real-life consequences] were comparable to those of Einstein.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Examples of Highly Counterfactual Discoveries? · 2024-04-25T09:56:08.301Z · LW · GW

Here's a document called "Upper and lower bounds for Alien Civilizations and Expansion Rate" I wrote in 2016. Hanson et al.'s Grabby Aliens paper was submitted in 2021.

The draft is very rough. Claude summarizes it thusly:

The document presents a probabilistic model to estimate upper and lower bounds for the number of alien civilizations and their expansion rates in the universe. It shares some similarities with Robin Hanson's "Grabby Aliens" model, as both attempt to estimate the prevalence and expansion of alien civilizations, considering the idea of expansive civilizations that colonize resources in their vicinity.

However, there are notable differences. Hanson's model focuses on civilizations expanding at the highest possible speed and the implications of not observing their visible "bubbles," while this document's model allows for varying expansion rates and provides estimates without making strong claims about their observable absence. Hanson's model also considers the idea of a "Great Filter," which this document does not explicitly discuss.

Despite these differences, the document implicitly contains the central insight of Hanson's model – that the expansive nature of spacefaring civilizations and the lack of observable evidence for their existence imply that intelligent life is sparse and far away. The document's conclusions suggest relatively low numbers of spacefaring civilizations in the Milky Way (fewer than 20) and the Local Group (up to one million), consistent with the idea that intelligent life is rare and distant.

The document's model assumes that alien civilizations will become spacefaring and expansive, occupying increasing volumes of space over time and preventing new civilizations from forming in those regions. This aligns with the "grabby" nature of aliens in Hanson's model. Although the document does not explicitly discuss the implications of not observing "grabby" aliens, its low estimates for the number of civilizations implicitly support the idea that intelligent life is sparse and far away.

The draft was never finished as I felt the result wasn't significant enough. To be clear, the Hanson-Martin-McCarter-Paulson paper contains more detailed models and much more refined statistical analysis. I didn't pursue these ideas further.

I wasn't part of the rationality/EA/LW community. Nobody I talked to was interested in these questions.

Let this be a lesson for young people: Don't assume. Publish! Publish in journals. Publish on LessWrong. Make something public even if it's not in a journal!

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Examples of Highly Counterfactual Discoveries? · 2024-04-24T16:40:26.536Z · LW · GW

Idk the Nobel prize committee thought it wasn't significant enough to give out a separate prize 🤷

I am not familiar enough with the particulars to have an informed opinion. My best guess is that, in general, statements to the effect of "yes X also made scientific contribution A but Y phrased it better" overestimate the actual scientific counterfactual impact of Y. It generically weighs how well outsiders can understand the work too much, vis-a-vis specialists/insiders who have enough hands-on experience that the value-add of a simpler/neater formalism is not that high (or is even a distraction).

The reason Dick Feynman is so much more well-known than Schwinger and Tomonaga surely must not be entirely unrelated to the magnetic charisma of Dick Feynman.

**Alexander Gietelink Oldenziel (alexander-gietelink-oldenziel)**on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-24T12:29:08.471Z · LW · GW

Depending on what one means by 'learn', this is provably impossible. The reason has nothing to do with the transformer architecture (which one shouldn't think of as a canonical architecture in the grand scheme of things anyway).

There is a 2-state generative HMM such that the optimal predictor of the output of said generative model provably requires an infinite number of states. This is for any model of computation, any architecture.
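To illustrate the mechanism (with made-up transition matrices, not the actual construction from the literature): the Bayes-optimal predictor must track a belief distribution over the hidden states, and for a nonunifilar HMM each observed history generically produces a fresh belief state. The state set of the optimal predictor therefore blows up even though the generator has only two states.

```python
import numpy as np

# A hypothetical 2-state nonunifilar HMM. T[x][i, j] = P(emit x, go to j | in i).
# The rows of T[0] + T[1] sum to 1.
T = [np.array([[0.4, 0.1],
               [0.2, 0.0]]),
     np.array([[0.1, 0.4],
               [0.3, 0.5]])]

def update(belief, x):
    # Bayes filter: condition the belief over hidden states on emission x.
    b = belief @ T[x]
    return b / b.sum()

# Enumerate all belief states reachable from the uniform prior.
start = np.array([0.5, 0.5])
frontier, seen = [start], {tuple(np.round(start, 10))}
for _ in range(10):  # all histories up to length 10
    frontier = [update(b, x) for b in frontier for x in (0, 1)]
    seen.update(tuple(np.round(b, 10)) for b in frontier)

print(len(seen))  # far more predictive states than the 2 generator states
```

For a unifilar HMM the belief collapses onto finitely many points; nonunifilarity is exactly what lets the reachable belief set keep growing without bound.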

Of course, that's maybe not what you intend by 'learn'. If by 'learn' you mean express the underlying function of an HMM, then the answer is yes, by the Universal Approximation Theorem (a very fancy name for a trivial application of the Stone-Weierstrass theorem).

Hope this helped. 😄