Posts

Transformers Represent Belief State Geometry in their Residual Stream 2024-04-16T21:16:11.377Z
Basic Mathematics of Predictive Coding 2023-09-29T14:38:28.517Z
Geoff Hinton Quits Google 2023-05-01T21:03:47.806Z
Learn the mathematical structure, not the conceptual structure 2023-03-01T22:24:19.451Z
What is a world-model? 2023-02-16T22:39:16.998Z
Mental Abstractions 2022-11-30T05:07:06.383Z
Some thoughts about natural computation and interactions 2022-11-27T18:15:24.724Z
Entropy Scaling And Intrinsic Memory 2022-11-15T18:11:42.219Z
Pondering computation in the real world 2022-10-28T15:57:27.419Z
Beyond Kolmogorov and Shannon 2022-10-25T15:13:56.484Z
What do we mean when we say the brain computes? 2022-01-31T03:33:48.978Z

Comments

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-18T21:55:59.382Z · LW · GW

Thanks!

  • one way to construct an HMM is by finding all past histories of tokens that condition the future tokens with the same probability distribution, and making each such equivalence class a hidden state in your HMM. Then the conditional distributions determine the arrows coming out of your state and which state you go to next (see the sketch after this list). This is called the "epsilon machine" in Comp Mech, and it is unique. It is one presentation of the data generating process, but in general there are an infinite number of HMM presentations that would generate the same data. The epsilon machine is a particular type of HMM presentation - it is the smallest one where the hidden states are the minimal sufficient statistics for predicting the future based on the past. The epsilon machine is one of the most fundamental things in Comp Mech, but I didn't talk about it in this post. In the future we plan to make a more generic Comp Mech primer that will go through these and other concepts.
  • The interpretability of these simplexes is an issue that's in my mind a lot these days. The short answer is I'm still wrestling with it. We have a rough experimental plan to go about studying this issue but for now, here are some related questions I have in my mind:
    • What is the relationship between the belief states in the simplex and what mech interp people call "features"?
    • What are the information-theoretic aspects of natural language (or code databases, or some other interesting training data) that we can instantiate in toy models, so that our understanding of those toy systems can be used to test whether similar findings apply to real systems?
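
To make the first bullet concrete, here is a rough sketch of the equivalence-class idea. It is only an illustration (the function name, the fixed history length, and the tolerance-based merging are mine; a real epsilon-machine reconstruction would also check longer histories and build the transition structure):

```python
import numpy as np
from collections import defaultdict

def estimate_causal_states(sequence, history_len=3, alphabet=(0, 1), tol=0.05):
    """Group length-`history_len` histories whose empirical next-symbol
    distributions agree (within `tol`) into candidate causal states."""
    counts = defaultdict(lambda: np.zeros(len(alphabet)))
    for t in range(history_len, len(sequence)):
        hist = tuple(sequence[t - history_len:t])
        counts[hist][alphabet.index(sequence[t])] += 1

    # Empirical P(next symbol | history) for each observed history.
    cond = {h: c / c.sum() for h, c in counts.items() if c.sum() > 0}

    # Merge histories with (approximately) identical conditional distributions.
    states = []  # each entry: (representative distribution, member histories)
    for hist, dist in cond.items():
        for rep_dist, members in states:
            if np.abs(dist - rep_dist).max() < tol:
                members.append(hist)
                break
        else:
            states.append((dist, [hist]))
    return states

# Toy usage: a biased coin has a single causal state, so all histories merge.
rng = np.random.default_rng(0)
seq = list(rng.choice([0, 1], p=[0.7, 0.3], size=10_000))
for dist, members in estimate_causal_states(seq):
    print(np.round(dist, 2), f"{len(members)} histories")
```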

For something like situational awareness, I have the beginnings of a story in my head but it's too handwavy to share right now. For something slightly more mundane like out-of-distribution generalization or transfer learning or abstraction, the idea would be to use our ability to formalize data-generating structure as HMMs, and then do theory and experiments on what it would mean for a transformer to understand that e.g. two HMMs have similar hidden/abstract structure but different vocabs.

Hopefully we'll have a lot more to say about this kind of thing soon!

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T22:13:09.078Z · LW · GW

Oh wait, one thing that looks not quite right is the initial distribution. Instead of starting randomly, we begin with the optimal initial distribution, which is the steady-state distribution. It can be computed by finding the eigenvector of the transition matrix that has an eigenvalue of 1. Maybe in practice that doesn't matter much for mess3, but in general it could.
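
For concreteness, here's roughly what that computation looks like (the transition matrix below is an arbitrary stand-in, not the actual mess3 matrix):

```python
import numpy as np

# Stand-in row-stochastic transition matrix (rows sum to 1).
T = np.array([[0.90, 0.05, 0.05],
              [0.05, 0.90, 0.05],
              [0.05, 0.05, 0.90]])

# Steady-state distribution: the left eigenvector of T with eigenvalue 1,
# i.e. pi such that pi @ T = pi, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(T.T)
pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
pi = pi / pi.sum()
print(pi)  # [1/3, 1/3, 1/3] for this symmetric example
```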

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T22:10:34.911Z · LW · GW

I should have explained this better in my post.

For every input into the transformer (of every length up to the context window length), we know the ground truth belief state that comp mech says an observer should have over the HMM states. In this case, this is 3 numbers. So for each input we have a 3d ground truth vector.  Also, for each input we have the residual stream activation (in this case a 64D vector). To find the projection we just use standard Linear Regression (as implemented in sklearn) between the 64D residual stream vectors and the 3D (really 2D) ground truth vectors. Does that make sense?
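
In code, the regression step is roughly this (the arrays here are random stand-ins just to show the shapes):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

n_inputs = 1000
acts = np.random.randn(n_inputs, 64)                 # residual stream activations, one row per input
beliefs = np.random.dirichlet(np.ones(3), n_inputs)  # ground-truth belief states (points on the 2-simplex)

reg = LinearRegression().fit(acts, beliefs)  # fits an affine map (sklearn includes an intercept by default)
projected = reg.predict(acts)                # (n_inputs, 3); these are the points we plot
```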

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T22:03:21.123Z · LW · GW

Everything looks right to me! This is the annoying problem where people forget to write down the actual parameters they used in their work (sorry).

Try x=0.05, alpha=0.85. I've edited the footnote with this info as well.

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T20:53:40.104Z · LW · GW

That sounds interesting. Do you have a link to the apperception paper?

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T18:42:32.194Z · LW · GW

That's an interesting framing. From my perspective that is still just local next-token accuracy (cross-entropy, more precisely), but averaged over all prefixes of the data up to the context length. That is distinct from, e.g., an objective function that explicitly includes not just the next token but multiple future tokens in what is needed to minimize loss. Does that distinction make sense?
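
Schematically, the loss I have in mind is just this (a sketch, not our training code):

```python
import numpy as np

def next_token_loss(log_probs, tokens):
    """Average next-token cross-entropy over every position in the context.

    log_probs: (seq_len, vocab) model log-probabilities for the token at
               position t given the prefix tokens[:t]
    tokens:    (seq_len,) integer array of the actual tokens
    """
    return -np.mean(log_probs[np.arange(len(tokens)), tokens])
```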

One conceptual point I'd like to get across is that even though the equation for the predictive cross-entropy loss only has the next token at a given context window position in it, the states internal to the transformer have the information for predictions into the infinite future.

This is a slightly different issue than how one averages over training data, I think.

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T18:26:48.089Z · LW · GW

Thanks! I appreciate the critique. From this comment and from John's it seems correct and I'll keep it in mind for the future.

On the question, by optimize the representation do you mean causally intervene on the residual stream during inference (e.g. a patching experiment)? Or do you mean something else that involves backprop? If the first, then we haven't tried, but definitely want to! It could be something someone does at the Hackathon, if interested ;)

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T18:24:27.366Z · LW · GW

Cool question. This is one of the things we'd like to explore more going forward. We are pretty sure the answer is nuanced and has to do with the relationship between the (minimal) state of the generative model, the token vocab size, and the residual stream dimensionality.

On your last question, I believe so, but one would have to do the experiment! It totally should be done. Check out the Hackathon if you are interested ;)

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T15:39:13.118Z · LW · GW

This looks highly relevant! Thanks!

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T07:10:35.249Z · LW · GW

Good catch! That should be eta_00, thanks! I'll change it tomorrow.

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T03:26:06.510Z · LW · GW

Cool idea! I don't know enough about GANs and their loss so I don't have a prediction to report right now. If it is the case that GAN loss should really give generative and not predictive structure, this would be a super cool experiment.

The structure of generation for this particular process has just 3 points equidistant from each other, no fractal. But in general the shape of generation is a pretty nuanced issue, because it's nontrivial to know for sure that you have the minimal structure of generation. There's a lot more to say about this, but @Paul Riechers knows these nuances better than I do, so I will leave it to him!

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T02:54:55.810Z · LW · GW

Responding in reverse order:

> If there's literally a linear projection of the residual stream into two dimensions which directly produces that fractal, with no further processing/transformation in between "linear projection" and "fractal", then I would change my mind about the fractal structure being mostly an artifact of the visualization method.

There is literally a linear projection (well, we allow a constant offset actually, so affine) of the residual stream into two dimensions which directly produces that fractal. There's no distributions in the middle or anything. I suspect the offset is not necessary but I haven't checked ::adding to to-do list:: 

edit: the offset isn't necessary. There is literally a linear projection of the residual stream into 2D which directly produces the fractal.

But the "fractal-ness" is mostly an artifact of the MSP as a representation-method IIUC; the stochastic process itself is not especially "naturally fractal".

(As I said I don't know the details of the MSP very well; my intuition here is instead coming from some background knowledge of where fractals which look like those often come from, specifically chaos games.)

I'm not sure I'm following, but the MSP is naturally fractal (in this case), at least in my mind. The MSP is a stochastic process, but it's a very particular one - it's the stochastic process of how an optimal observer's beliefs (about which state an HMM is in) change upon seeing emissions from that HMM. The set of optimal beliefs itself is fractal (for this particular case).
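
The belief-update rule behind the MSP is simple to write down. As a sketch (generic code, not tied to any particular HMM):

```python
import numpy as np

def update_belief(belief, T_x):
    """Bayesian belief update after observing symbol x.

    belief: (n_states,) current distribution over hidden states
    T_x:    (n_states, n_states) labeled transition matrix, where
            T_x[i, j] = P(emit x and go to state j | currently in state i)
    """
    new = belief @ T_x
    return new / new.sum()

# Iterating this over an emitted token sequence traces out a path in the
# probability simplex; the set of reachable beliefs is what we plot as the MSP.
```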

Chaos games look very cool, thanks for that pointer!

Comment by Adam Shai (adam-shai) on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T01:42:42.672Z · LW · GW

Can you elaborate on how the fractal is an artifact of how the data is visualized?

From my perspective, the fractal is there because we chose this data-generating structure precisely because it has this fractal pattern as its Mixed State Presentation (i.e. we chose it because then the ground truth would be a fractal, which felt like highly nontrivial structure to us, and thus a good falsifiable test of whether this framework is at all relevant for transformers. Also, yes, it is pretty :) ). The fractal is a natural consequence of that choice of data-generating structure - it is what Computational Mechanics says is the geometric structure of synchronization for the HMM. That there is a 2D linear plane in the residual stream such that projecting onto it gives you that same fractal seems highly non-artifactual, and is what we were testing.

Though it should be said that an HMM with a fractal MSP is a quite generic choice. It's remarkably easy to get such fractal structures. If you randomly choose an HMM from the space of HMMs for a given number of states and vocab size, you will often get synchronization structures with infinitely many transient states and fractals.

This isn't a proof of that previous claim, but here are some examples of fractal MSPs from https://arxiv.org/abs/2102.10487:

Comment by Adam Shai (adam-shai) on A starting point for making sense of task structure (in machine learning) · 2024-02-29T20:07:11.952Z · LW · GW

I find this focus on task structure and task decomposition to be incredibly important when thinking about what neural networks are doing, what they could be doing in the future, and how they are doing it. Asking how a system understands/represents/instantiates task structures and puts them in relation to one another is, as far as I can tell, just a more concrete way of asking "what is it that this neural network knows? What cognitive abilities does it have? What abstractions is it making? Under what out-of-distribution inputs will it succeed/fail?" etc.

This comment isn't saying anything that wasn't in the post, just wanted to express happiness and solidarity with this framing!

I do wonder if the tree structure of which-task followed by task algorithm is what we should expect in general. I have nothing super concrete to say here; my feeling is just that the manners in which a neural network can represent structures and put them in relation to each other may be instantiated differently than a tree (with that specific ordering). The onus is probably on me here though - I should come up with a set of tasks in certain relations that aren't most naturally described with tree structures.

Another question that comes to mind: is there a hard distinction between categorizing which sub-task one is in and the algorithm that carries out the computation for a specific sub-task? Is it all just tasks all the way down?

Comment by Adam Shai (adam-shai) on What’s up with LLMs representing XORs of arbitrary features? · 2024-01-04T19:12:44.225Z · LW · GW

I think you might need to change permissions on your github repository?

Comment by Adam Shai (adam-shai) on What’s up with LLMs representing XORs of arbitrary features? · 2024-01-04T19:12:34.981Z · LW · GW

Comment by Adam Shai (adam-shai) on AI #39: The Week of OpenAI · 2023-11-24T15:00:25.378Z · LW · GW

The blog post linked says it's from August. Is there something new I'm missing?

Comment by Adam Shai (adam-shai) on Graphical tensor notation for interpretability · 2023-10-05T19:22:52.245Z · LW · GW

This is so cool! Thanks so much, I plan to go through it in full when I have some time. For now, I was wondering if the red-circled matrix multiplication should actually be reversed, with the vector as a column (i.e. matrix*column instead of row*matrix). I know the end result is equivalent, but it seems that to be consistent it should be switched: in every other example, a vector with its leg sticking out leftward is a column vector. Maybe this really doesn't matter, since I can just turn the page upside down and then b would be on the left with a leg sticking out to the right... but the fact that A·b and bᵀ·Aᵀ contain the same numbers (one is just the transpose of the other) is itself an interesting fact.

Comment by Adam Shai (adam-shai) on Basic Mathematics of Predictive Coding · 2023-10-01T15:08:44.508Z · LW · GW

Just to add to Carl Feynman's response, which I thought was good.

Part of the reason these systems are inefficient is that they (effectively) require you to run gradient descent even at inference time, even after training is over. Alternatively you can run the RNN, which is mathematically equivalent, but again you can see where the inefficiency comes in: the value at time t=3 is a function of the value at time t=2, which is a function of t=1, and so on, so in order to get the converged value of the activations you have to compute each timestep one by one, in a for loop.

This is in contrast to a feedforward network like a (normal) convnet or transformer, which can run extremely quickly and in parallel on a GPU.
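
To make the "gradient descent at inference" point concrete, here is a minimal two-layer sketch (not the exact model from the post): the weights are frozen, and the latent activity is iterated until the prediction errors settle.

```python
import numpy as np

rng = np.random.default_rng(0)
n_input, n_latent = 16, 4
W = rng.normal(size=(n_input, n_latent)) / np.sqrt(n_latent)  # frozen (already-trained) weights
x = rng.normal(size=n_input)                                  # a single static "image"

r = np.zeros(n_latent)          # latent activity, initialized at zero
lr, prior = 0.05, 0.1
for step in range(500):         # this loop has to run at inference time
    e = x - W @ r               # prediction error at the input layer
    r += lr * (W.T @ e - prior * r)   # gradient step on the latent activity
# After convergence, r is the network's inferred representation of x;
# a feedforward net would produce its representation in a single pass.
```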

Comment by Adam Shai (adam-shai) on Basic Mathematics of Predictive Coding · 2023-10-01T15:01:47.667Z · LW · GW

Thanks! 

I think your thinking makes sense, and if, for instance, on every timestep you presented a different image in a stereotyped sequence, or with a certain correlation structure, you would indeed get information about those correlations into the weights. However, this model was designed to be used in the restricted setting where you show a single still image for many timesteps until convergence. In that setting, the weights give you image features for static images (in a hierarchical manner), and priors for low-level features feed back from activations in higher-level areas.

There are extensions to this model that deal with video, where there are explicit spatiotemporal expectations built into the network. You can see one of those networks in this paper: https://arxiv.org/abs/2112.10048

But I've never implemented such a network myself.

Comment by Adam Shai (adam-shai) on how do short-timeliners reason about the differences between brain and AI? · 2023-09-29T09:32:15.329Z · LW · GW

First, brains (and biological systems more generally) have many constraints that artificial networks do not. Brains exist in the context of a physically instantiated body, with heavy energy constraints. Further, they exist in specific niches, with particular evolutionary histories, which has enormous effects on structure and function.

Second, biological brains have different types of intelligence from AI systems, at least currently. A bird is able to land fluidly on a thin branch in windy conditions, while GPT-4 can help you code. In general, the intelligences that one thinks of in the context of AGI do not totally overlap with the varied, often physical and metabolic, intelligences of biology.

All that being said, who knows what future AI systems will look like

Comment by Adam Shai (adam-shai) on Pondering computation in the real world · 2023-09-16T22:07:55.108Z · LW · GW

Thanks so much for this comment (and sorry for taking ~1 year to respond!!). I really liked everything you said.

For 1 and 2, I agree with everything and don't have anything to add.

3. I agree that there is something meaningful about the input/output mapping, but it is not everything. Having a full theory of exactly what the difference is, and of what distinguishes structure that counts as interesting internal computation (not a great descriptor of what I mean, but I can't think of anything better right now) from input/output computation, would be great.

4. I also think a great goal would be generalizing and formalizing what an "observer" of a computation is. I have a few ideas but they are pretty half-baked right now.

5. That is an interesting point. I think it's fair. I do want to be careful to make sure that any "disagreements" are substantial and not just semantic squabbling here. I like your distinction between representation work and computational work. The idea of using vs. performing a computation is also interesting. At the end of the day I am always left craving some formalism where you could really see the nature of these distinctions.

6. Sounds like a good idea!

7. Agreed on all counts.

8. I was trying to ask whether there is anything that tells us that the output node is semantically meaningful without reference to e.g. the input images of cats, or even knowledge of the input data distribution. Interpretability work, both in artificial neural networks and more traditionally in neuroscience, always uses knowledge of input distributions or even input identity to correlate the activity of neurons to the input, and in that way assigns semantics to neural activity (e.g., recently, Othello board states, or in neuroscience, Jennifer Aniston neurons or orientation-tuned neurons). But when I'm sitting down with my eyes closed and just thinking, there's no homunculus there with access to input distributions on my retina that can correlate some activity pattern to "cat." So how can the neural states in my brain "represent" (or embody, or whatever word you want to use) the semantic information of "cat" without this process of correlating to some ground-truth data? Where does "cat" come from when there's no cat there in the activity?!

9. SO WILD

Comment by Adam Shai (adam-shai) on Douglas Hofstadter changes his mind on Deep Learning & AI risk (June 2023)? · 2023-07-05T17:20:21.517Z · LW · GW

Can you explain what you mean by second or third order dynamics? That sounds interesting. Do you mean e.g. the order of the differential equation or something else?

Comment by Adam Shai (adam-shai) on Gemini will bring the next big timeline update · 2023-05-29T20:27:08.549Z · LW · GW

This is not obvious to me. It seems somewhat likely that the multimodality actually induces more explicit representations and uses of human-level abstract concepts; e.g., a Jennifer Aniston neuron in a human brain is multimodal.

Comment by Adam Shai (adam-shai) on Correcting a misconception: consciousness does not need 90 billion neurons, at all · 2023-03-31T16:47:42.008Z · LW · GW

This is the standard understanding in neuroscience (and, for what it's worth, my working belief), but there is some evidence that throws a wrench into this idea and needs to be explained; for instance, the review "Consciousness without a cerebral cortex: a challenge for neuroscience and medicine" presents evidence that consciousness can occur without a cortex. In particular, there is a famous case of a human with hardly any cortex who seemed to act normally in most regards.

Comment by Adam Shai (adam-shai) on The algorithm isn't doing X, it's just doing Y. · 2023-03-17T16:03:08.606Z · LW · GW

I think the issue is that what people often mean by "computing matrix multiplication" is something like what you've described here, but when they talk about "recognizing dogs" (at least sometimes; as you've so elegantly talked about in other posts, vibes and context really matter!), they are referring not only to the input/output transformation of the task (or even the physical transformation of world states) but also to the process by which the dog is recognized, which includes lots of internal human abstractions moving about in a particular way in people's brains, and which may or may not be recapitulated in an artificial classification system.

To some degree it's a semantic issue. I will grant you that there is a way of talking about "recognizing dogs" that reduces it to the input/output mapping, but there is another way in which this doesn't work. The reason it makes sense for human beings to have these two different notions of performing a task is that we really care about theory of mind, and social settings, and figuring out what other people are thinking (and not just the state of their muscles or whatever dictates their output).

Although, for precision's sake, maybe they should really have different words associated with them, though I'm not sure what the words should be exactly. Maybe something like "solving a task" vs. "understanding a task", though I don't really like that.

Actually, my thinking can go the other way too. I think there actually is a sense in which the computer is not doing matrix multiplication, and it's really only the system of computer+human that is able to do it, with the human doing A LOT of work here. I recognize this is not the sense people usually mean when they talk about computers doing matrix multiplication, but again, I think there are two senses of performing a computation even though people use the same words.

Comment by Adam Shai (adam-shai) on The algorithm isn't doing X, it's just doing Y. · 2023-03-17T15:52:10.012Z · LW · GW

I think I am the one that is misunderstanding. Why don't your definitions work?

For every Rilke that can turn 0 pages into 1 page, there exists another machine B s.t.

(1) B can turn 1 page into 1 page, while interacting with Rilke. (I can copy a poem from a Rilke book while Rilke writes another poem next to me, or while Rilke reads the poem to me, or while Rilke looks at the first word of the poem and then creates the poem next to me, etc.)

(2) the combined Rilke and B doesn't expend much more physical resources to turn 1 page into 1 page than Rilke expends writing a page of poetry.

I have a feeling I am misinterpreting one or both of the conditions.

Comment by Adam Shai (adam-shai) on The algorithm isn't doing X, it's just doing Y. · 2023-03-17T14:00:26.578Z · LW · GW

Instead of responding philosophically, I think it would be instructive to go through an example and hear your thoughts about it. I will take your definition of physical reduction (focusing on 4.) and assign tasks and machines to the variables.

Here's your definition:

A task X reduces to task Y if and only if...

For every machine A that solves task Y, there exists another machine B such that...

(1) B solves task X by interacting with A.
(2) The combined machine A+B doesn't expend much more physical resources to solve X than A expends to solve Y.

Now I want X to be the task of copying a Rilke poem onto a blank piece of paper, and Y to be the task of Rilke writing a poem onto a blank piece of paper. 

so let's call X = COPY_POEM, Y = WRITE_POEM, and let's call A = Rilke. So plugging into your definition:

A task COPY_POEM reduces to task WRITE_POEM if and only if...

For every Rilke that solves task WRITE_POEM, there exists another machine B such that...

(1) B solves task COPY_POEM by interacting with Rilke.
(2) The combined machine Rilke+B doesn't expend much more physical resources to solve COPY_POEM than Rilke expends to solve WRITE_POEM.

This seems to work. If I let Rilke write the poem and I just copy his work, then the poem will be written on the piece of paper, and Rilke has done most of the physical labor. The issue is that when people say something like "writing a poem is more than just copying a poem," that seems meaningful to me (this is why teachers are generally unhappy when you are assigned to write a poem and they find out you copied one from a book), and to dismiss the difference as not useful seems to be missing something important about what it means to write a poem. How do you feel about this example?

Just for context, I do strongly agree with many of your other examples, I just think this doesn't work in general. And basing all of your intuitions about intelligence on this will leave you missing something fundamental about intelligence (of the type that exists in humans, at least). 

Comment by Adam Shai (adam-shai) on The algorithm isn't doing X, it's just doing Y. · 2023-03-17T00:18:27.574Z · LW · GW

> Like, seriously? What do you mean when you say Google Maps "finds the shortest route from your house to the pub"? Your phone is just displaying certain pixels, it doesn't output an actual physical road! So what do you mean? What you mean is that, by using Google Maps as an oracle with very little overhead, you can find the shortest route from your house to the pub.

This is getting at a deep and important point, but I think this sidesteps an important difference between "writing poetry" (like when a human does it) and "computing addition" (like when a calculator does it). You get really close to it here.

The problem is that when the task is writing poetry (as a human does it), what entity is the "you" who is making use of the physical machinations that produce the poetry "with very little overhead"? There is something different about writing poetry and doing addition with a calculator. The task of writing poetry (as a human) is not just about transforming inputs to outputs; it matters what the internal states are. Unlike the case where "you" make sense of the dynamics of the calculator in order to get the work of addition done, in the case of writing poetry, you are the one who is making sense of your own dynamics.

I'm not saying there's anything metaphysical going on, but I would argue your definition of task is not a good abstraction for humans writing poetry, it's not even a good abstraction for humans performing mathematics (at least when they aren't doing rote symbol manipulation using pencil and paper). 

Maybe this will jog your intuitions in my direction: one can think of the task of recognizing a dog, and think about how a human vs. a convnet does that task. 

I wrote about these issues here a little bit. But I have been meaning to write something more formalized. https://www.lesswrong.com/posts/f6nDFvzvFsYKHCESb/pondering-computation-in-the-real-world

Comment by Adam Shai (adam-shai) on Addendum: basic facts about language models during training · 2023-03-06T19:52:03.786Z · LW · GW

I really appreciate this work and hope you and others continue to do more like it, so I really do mean this criticism with a lot of goodwill. I think even a small amount of effort making the figures/animations look nicer would go a long way toward making this more digestible (and more professional-looking). Things like keeping the axes constant through the animations, and using matplotlib or seaborn styles: https://seaborn.pydata.org/generated/seaborn.set_context.html and https://seaborn.pydata.org/generated/seaborn.set_style.html
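
Concretely, I mean something like this (a generic example, not your data):

```python
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("whitegrid")   # consistent look across all figures
sns.set_context("talk")      # readable font sizes

fig, ax = plt.subplots()
ax.set_xlim(0, 100)          # fix the axes once so they don't jump
ax.set_ylim(0, 1)            # around between frames of an animation
```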

apologies if you already know this and were just saving time!

Comment by Adam Shai (adam-shai) on Learn the mathematical structure, not the conceptual structure · 2023-03-03T15:17:54.282Z · LW · GW

I thought it was a great way to put it and I appreciated it a lot! I'm not even sure the post has more value than the summary; at the very least that one sentence adds a lot of explanatory power imho.

Comment by Adam Shai (adam-shai) on Bayes is Out-Dated, and You’re Doing it Wrong · 2023-02-25T23:44:27.525Z · LW · GW

Well, I don't know SAS at all, but a quick search of the SAS documentation for Dirichlet calls it a "nonparametric Bayes approach"...

 

https://documentation.sas.com/doc/en/casactml/8.3/casactml_nonparametricbayes_details12.htm

Comment by Adam Shai (adam-shai) on Bayes is Out-Dated, and You’re Doing it Wrong · 2023-02-25T23:39:43.403Z · LW · GW

I don't know if I'm missing something, but it sounds like you are arguing for a particular method of picking a prior within a Bayesian context, not arguing against Bayes itself. If anything, it seems to me this is pro-Bayes, just using Dirichlet processes as a prior.

Comment by Adam Shai (adam-shai) on Are there any AI safety relevant fully remote roles suitable for someone with 2-3 years of machine learning engineering industry experience? · 2023-02-20T20:55:48.954Z · LW · GW

You should still apply for jobs even if they say 5+ years experience. The requirements are often more flexible than they seem, and the worst that can happen is they say no.

Comment by Adam Shai (adam-shai) on Schizophrenia as a deficiency in long-range cortex-to-cortex communication · 2023-02-01T19:49:18.943Z · LW · GW

Some quick thoughts, can expand later with refs:

  • There are other similar results where schizophrenics do better than neurotypicals. Two I remember are (1) an experiment where the experimenter pushes on the arm (or palm of the hand, I don't remember) of the subject with a particular force, and the subject is then asked to recreate that force by pushing on themselves. Neurotypicals push harder on themselves than when pushed on by an external source. (2) Motion tracking of a moving ball, especially when there are non-predictive jumps in the ball's trajectory.
  • The theories for both of these tend to be similar to what you said: an error in the signaling having to do with predictions of upcoming sensory stimuli, usually assumed to take place via long-range cortex-to-cortex connections (feedback).
  • For the moment I can recommend a chapter in Surfing Uncertainty, which I'm pretty sure is where I got these examples. Though there are probably predictive processing reviews that cover this.

Comment by Adam Shai (adam-shai) on Consider using reversible automata for alignment research · 2022-12-12T01:21:15.911Z · LW · GW

I share your confusions/intuition about what is meant by optimization here. But I think for the purposes of this post, optimization is defined here, which is linked to at the beginning of this post. In that link, optimization is thought of as a pattern that persists in the face of perturbations and that evolves towards a small set of states. I'm still not totally grokking it though.

Comment by Adam Shai (adam-shai) on Consider using reversible automata for alignment research · 2022-12-11T17:42:21.108Z · LW · GW

This is super interesting. I was wondering if you could give a few more thoughts/intuitions about why you think reversibility is important. I understand that it would make the simulations more physics like, but why is being physics like important to alignment research and/or agency research?

I clicked on the paper by the Critter creator, which seems like it might go deeper into that issue, but don't have the time to read through it right now. Super exciting stuff! Thanks.

Comment by Adam Shai (adam-shai) on Who are some prominent reasonable people who are confident that AI won't kill everyone? · 2022-12-06T01:26:04.918Z · LW · GW

David Deutsch https://www.google.com/url?sa=t&source=web&rct=j&url=https://www.daviddeutsch.org.uk/wp-content/uploads/2019/07/PossibleMinds_Deutsch.pdf&ved=2ahUKEwj16YjV6OP7AhXxL0QIHXU4DdkQFnoECDoQAQ&usg=AOvVaw0giHdn4BKOci3swaQ1bqlN

Comment by Adam Shai (adam-shai) on Call For Distillers · 2022-12-01T20:31:46.752Z · LW · GW

Do you have a particular story that shows the types of negative outcomes that could happen? While it's not impossible for me to imagine an overly sensitive academic getting unreasonably angry or annoyed at a distillation, it hardly seems to me like that would be at all likely. I have fairly high confidence in my understanding of academic mindsets, and a single sentence at the top, "this is a summary of XYZ's work on whatever," with a link would in almost all cases be enough. You could even add another flattering sentence: "I'm very excited about this work because... I find it super exciting, so here are my notes/attempt at understanding it more."

Generally, academics like it when people try to understand their work.

Comment by Adam Shai (adam-shai) on Is RL involved in sensory processing? · 2022-10-29T17:27:45.707Z · LW · GW

Another paper you might be interested in that shows reinforcement learning effects even after training has reached asymptotic performance in a perceptual task:  https://elifesciences.org/articles/49834

Comment by Adam Shai (adam-shai) on Beyond Kolmogorov and Shannon · 2022-10-29T17:10:37.264Z · LW · GW

These points are well taken. I agree re: log-score. We were trying to compare to the most straightforward/naive reward maximization setup. Like in a game where you get +1 point for correct answers and -1 point for incorrect. But I take your point that other scores lead to different (in this case better) results.

Re: cheating. Yes! That is correct. It was too much to explain in this post, but ultimately we would like to argue that we can augment Turing Machines with a heat source (ie a source of entropy) such that it can generate random bits. Under that type of setup the "random number generator" becomes much simpler/easier than having to use a deterministic algorithm. In addition, the argument will go, this augmented Turing machine is in better correspondence with natural systems that we want to understand as computing.

Which leads to your last point, which I think is very fundamental. I disagree a little bit, in a specific sense. While it is true that "randomness" comes from a specific type of "laziness," I think it's equally true that this laziness actually confers computational powers of a certain sort. For now, I'll just say that this has to do with abstraction and uncertainty, and leave the explanation of that for another post.

Comment by Adam Shai (adam-shai) on Beyond Kolmogorov and Shannon · 2022-10-29T16:45:37.637Z · LW · GW

Wow, despite no longer being endorsed, this comment is actually extremely relevant to the upcoming posts! I have to admit I never went through the original paper in detail. It looks like Shannon was even more impressive than I realized! Thanks, I'll definitely have to go through this slowly.

Comment by Adam Shai (adam-shai) on Pondering computation in the real world · 2022-10-29T16:42:02.367Z · LW · GW

Great question. Hopefully soon I'll write a longer post on exactly this topic, but for now you can look at this recent post, Beyond Kolmogorov and Shannon, by me and Alexander Gietelink Oldenziel that tries to explain what is lacking in Turing Machines. This intuition is also found in James Crutchfield's work, e.g. here, and in the article by him in the external resources section in this post.

In short, the extra desirability condition is a mechanism to generate random bits. I think this is fundamental to computation "in the real world" (ie brains and other natural systems) because of the central role uncertainty plays in the functioning of such systems. But a complete explanation for why that is the case will have to wait for a longer post.

Admittedly, I am overloading the word "computation" here, since there is a very well developed mathematical framework for computation in terms of Turing Machines. But I can't think of a better one at the moment, and I do think this more general sense of computation is what many biologists are pointing towards (neuroscientists in particular) when they use the word. Maybe I should call it "natural computation."

Comment by Adam Shai (adam-shai) on Pondering computation in the real world · 2022-10-29T16:32:44.490Z · LW · GW

Embarrassingly, I've never actually thought of how compilers fit into this set of questions/thoughts. Very interesting, I'll definitely give it some thought now. I like the idea of a compiler as some kind of overseer/organizer of computation.

Comment by Adam Shai (adam-shai) on Pondering computation in the real world · 2022-10-29T16:30:48.949Z · LW · GW

Thanks! Sounds like I need to have a better understanding of lambda calculus, and as always, category theory :)

Comment by Adam Shai (adam-shai) on Beyond Kolmogorov and Shannon · 2022-10-26T16:11:41.764Z · LW · GW

These are both great points and are definitely going to be important parts of where the story is going! Probably we could have done a better job with explication, especially with that last point, thanks. Maybe one way to think about it is, what are the most useful ways we can convert data to distributions, and what do they tell us about the data generation process, which is what the next post will be about.

Comment by Adam Shai (adam-shai) on Beyond Kolmogorov and Shannon · 2022-10-26T15:55:58.602Z · LW · GW

No I haven't! That sounds very interesting, I'll definitely take a look, thanks. Do you have a particular introduction to it?

Comment by adam-shai on [deleted post] 2022-10-21T14:58:29.304Z

Taking a t-shirt, folding it over a few times, and tying it around my head works better than any sleep mask, even the expensive ones, in my experience.

Comment by Adam Shai (adam-shai) on A Butterfly's View of Probability · 2022-06-15T16:27:42.450Z · LW · GW

This is great! The issue of timescale is interesting to me here. I am wondering, for different systems at different levels of the ergodic hierarchy, whether there are certain statements you can make (when considering the relevant timescales).

Also, I am wondering how this plays with the issue of observer models. When I say that some event one month from now has 30% probability, are you imagining that I have a chaotic world model that I somehow run forward many times, or push a probability distribution forward in some way, and then count the volume in model space that contains the event? How would that process actually work in practice (i.e. how does my brain do it)?
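
A toy version of what I'm imagining, with the logistic map standing in for a chaotic world model (purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_worlds, n_steps = 10_000, 30               # "one month" of daily steps, say
x = 0.4 + 1e-3 * rng.normal(size=n_worlds)   # small uncertainty about today's state

for _ in range(n_steps):                     # push the whole ensemble forward
    x = 3.9 * x * (1 - x)                    # logistic map in its chaotic regime

print(np.mean(x > 0.8))                      # fraction of worlds where the "event" happens
```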

Comment by Adam Shai (adam-shai) on Unfinished Projects Thread · 2022-04-03T13:05:41.627Z · LW · GW

Great idea! My intuition says this won't work, as you'll just capture half of the mechanism of the type of chaotic attractor we want. This will give you the "stretching" of points close in phase space to some elongated section, but not by itself the folding over of that stretched section, which at least in my current thinking is necessary. But it's definitely worth trying, I could very well be wrong! Thanks for the idea :)

Similarly, it's not obvious to me that constraining the Lyapunov exponent to a certain value gives you the correct "structure". For instance, if instead of ..01R... I wanted to train on ...10R... or ...11R..., etc. But maybe training the Lyapunov exponent would just be one part of the optimization, and then other factors could play into it.