Gary Marcus vs Cortical Uniformity 2020-06-28T18:18:54.650Z · score: 13 (5 votes)
Building brain-inspired AGI is infinitely easier than understanding the brain 2020-06-02T14:13:32.105Z · score: 37 (14 votes)
Help wanted: Improving COVID-19 contact-tracing by estimating respiratory droplets 2020-05-22T14:05:10.479Z · score: 8 (2 votes)
Inner alignment in the brain 2020-04-22T13:14:08.049Z · score: 63 (19 votes)
COVID transmission by talking (& singing) 2020-03-29T18:26:55.839Z · score: 43 (16 votes)
COVID-19 transmission: Are we overemphasizing touching rather than breathing? 2020-03-23T17:40:14.574Z · score: 30 (15 votes)
SARS-CoV-2 pool-testing algorithm puzzle 2020-03-20T13:22:44.121Z · score: 42 (12 votes)
Predictive coding and motor control 2020-02-23T02:04:57.442Z · score: 23 (8 votes)
On unfixably unsafe AGI architectures 2020-02-19T21:16:19.544Z · score: 30 (12 votes)
Book review: Rethinking Consciousness 2020-01-10T20:41:27.352Z · score: 52 (18 votes)
Predictive coding & depression 2020-01-03T02:38:04.530Z · score: 21 (5 votes)
Predictive coding = RL + SL + Bayes + MPC 2019-12-10T11:45:56.181Z · score: 33 (13 votes)
Thoughts on implementing corrigible robust alignment 2019-11-26T14:06:45.907Z · score: 26 (8 votes)
Thoughts on Robin Hanson's AI Impacts interview 2019-11-24T01:40:35.329Z · score: 21 (13 votes)
steve2152's Shortform 2019-10-31T14:14:26.535Z · score: 4 (1 votes)
Human instincts, symbol grounding, and the blank-slate neocortex 2019-10-02T12:06:35.361Z · score: 36 (14 votes)
Self-supervised learning & manipulative predictions 2019-08-20T10:55:51.804Z · score: 17 (6 votes)
In defense of Oracle ("Tool") AI research 2019-08-07T19:14:10.435Z · score: 20 (10 votes)
Self-Supervised Learning and AGI Safety 2019-08-07T14:21:37.739Z · score: 25 (12 votes)
The Self-Unaware AI Oracle 2019-07-22T19:04:21.188Z · score: 24 (9 votes)
Jeff Hawkins on neuromorphic AGI within 20 years 2019-07-15T19:16:27.294Z · score: 164 (59 votes)
Is AlphaZero any good without the tree search? 2019-06-30T16:41:05.841Z · score: 27 (8 votes)
1hr talk: Intro to AGI safety 2019-06-18T21:41:29.371Z · score: 33 (11 votes)


Comment by steve2152 on Goals and short descriptions · 2020-07-03T13:09:52.980Z · score: 3 (2 votes) · LW · GW

Hmm, maybe we're talking past each other. Let's say I have something like AlphaZero, where 50,000 bytes of machine code trains an AlphaZero-type chess-playing agent, whose core is a 1-billion-parameter ConvNet. The ConvNet takes 1 billion bytes to specify. Meanwhile, the reward-calculator p, which calculates whether checkmate has occurred, is 100 bytes of machine code.

Would you say that the complexity of the trained chess-playing agent is 100 bytes or 50,000 bytes or 1 billion bytes?

I guess you're going to say 50,000, because you're imagining a Turing machine that spends a year doing the self-play to calculate the billion-parameter ConvNet, and then immediately the same Turing machine starts running that ConvNet it just calculated. From the perspective of Kolmogorov complexity, it doesn't matter that it spends a year calculating the ConvNet, as long as it does so eventually.
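The train-then-run point can be made concrete with a toy sketch (purely illustrative — `train` here is a stand-in for the year of self-play, not anything like real AlphaZero code): a short deterministic program expands into a large "trained model", so the Kolmogorov complexity of the trained agent is bounded by the length of the short program, not by the parameter count.

```python
import hashlib

def train(seed: bytes, n: int) -> dict:
    """Deterministically expand a short seed into a large 'trained model'
    (here just a lookup table) -- a stand-in for a year of self-play."""
    return {i: hashlib.sha256(seed + i.to_bytes(4, "big")).digest()[0] % 2
            for i in range(n)}

def agent(model: dict, state: int) -> int:
    """Run the 'trained model' on a state."""
    return model[state % len(model)]

# This whole program is a few hundred bytes, yet it specifies an agent whose
# parameters take n entries to write down -- the agent's Kolmogorov
# complexity is bounded by the program length, not the parameter count.
model = train(b"chess", 1000)
print(len(model), agent(model, 42))
```

Because `train` is deterministic, re-running the program always yields the same billion-entry-style table, which is exactly why the Turing-machine accounting works.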

By the same token, you can always turn a search-y agent into an equivalent discriminative-y agent, given infinite processing time and storage, by training the latter on a googol queries of the former. If you're thinking about Kolmogorov complexity, then you don't care about a googol queries, as long as it works eventually.
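As a toy illustration of that equivalence (made-up names, and ten states standing in for the googol queries): tabulate the search agent's answer on every query, and the resulting lookup table is a discriminative policy with identical behavior.

```python
def search_agent(state: int) -> int:
    # SEARCH: scan the action space for the action that best achieves the
    # goal (toy goal: pick the action closest to the state)
    return max(range(10), key=lambda a: -abs(state - a))

# "Distill" by querying the searcher on every state (standing in for a
# googol queries) and memorizing its answers.
policy = {s: search_agent(s) for s in range(10)}

def discriminative_agent(state: int) -> int:
    # DISCRIMINATIVE: a direct input -> action mapping, no run-time search
    return policy[state]
```

The two agents are behaviorally identical; they differ only in how much computation happens at query time versus up front.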

Therefore, my first comment is not really relevant to what you're thinking about. Sorry. I was not thinking about algorithms-that-write-arbitrary-code-and-then-immediately-run-it, I was thinking about the complexity of the algorithms that are actually in operation as the agent acts in the world.

If my hand touches fire and thus immediately moves backwards by reflex, would this be an example of a discriminative policy, because an input signal directly causes an action without being processed in the brain?

Yes. But the lack of processing in the brain is not the important part. A typical ConvNet image classifier does involve many steps of processing, but it is still discriminative, not search-y, because it does not work by trying out different generative models and picking the one that best explains the data. You can build a search-y image classifier that does exactly that, but most people these days don't.

Comment by steve2152 on Goals and short descriptions · 2020-07-02T19:24:51.883Z · score: 3 (2 votes) · LW · GW

If we're talking about algorithmic complexity, there's a really important distinction, I think. In the space of actions, we have:

  • SEARCH: Search over an action space for an action that best achieves a goal
  • DISCRIMINATIVE: There is a function that goes directly from sensory inputs to the appropriate action (possibly with a recurrent state etc.)

Likewise, in the space of passive observations (e.g. classifying images), we have:

  • SEARCH: Search over a space of generative models for the model that best reproduces the observations (a.k.a. "analysis by synthesis")
  • DISCRIMINATIVE: There is a function that goes directly from sensory inputs to understanding / classification.

The search methods are generally:

  • Algorithmically simpler
  • More sample-efficient
  • Slower at run-time
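For concreteness, here's a toy version of the two options in the passive-observation case (made-up numbers; each class's "generative model" is just a one-parameter Gaussian): the search-y classifier tries each generative model and keeps the best explanation, while the discriminative one maps the input straight to a label.

```python
# Each class's generative model: observations x ~ Normal(mean, 1).
means = {"cat": 0.0, "dog": 5.0}

def search_classify(x: float) -> str:
    # SEARCH (analysis by synthesis): try each generative model and pick
    # the one that best explains the observation
    return max(means, key=lambda c: -(x - means[c]) ** 2)

# DISCRIMINATIVE: a direct function from input to label. Here the function
# is a precomputed threshold, so it is fast but silent about "why".
threshold = sum(means.values()) / len(means)  # 2.5

def discriminative_classify(x: float) -> str:
    return "cat" if x < threshold else "dog"
```

The search version is shorter to specify and trivially extends to new classes (just add a mean), while the discriminative version is cheaper at run-time — a cartoon of the trade-offs in the bullets above.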

(Incidentally, I think the neocortex does the "search" option in both these cases (and that they aren't really two separate cases in the brain). Other parts of the brain do the "discriminative" option in certain cases.)

I'm a bit confused about how my comment here relates to your post. If p is the goal ("win at chess"), the simplest search-based agent is just about exactly as complicated as p ("do a minimax search to win at chess"). But RL(p), at least with the usual definition of "RL", will learn a very complicated computation that contains lots of information about which particular configurations of pieces are advantageous or not, e.g. the ResNet at the core of AlphaZero.

Are you imagining that the policy π is searching or discriminative? If the latter, why are you saying that π is just as simple as p? (Or are you saying that?) "Win at chess" seems a lot simpler than "do the calculations described by the following million-parameter ResNet", right?

Comment by steve2152 on Models, myths, dreams, and Cheshire cat grins · 2020-06-24T11:34:27.059Z · score: 11 (2 votes) · LW · GW

Sorta related: my comment here

Comment by steve2152 on [AN #104]: The perils of inaccessible information, and what we can learn about AI alignment from COVID · 2020-06-20T19:14:22.277Z · score: 3 (2 votes) · LW · GW

On the Russell / Pinker debate, I thought Pinker had an interesting rhetorical sleight-of-hand that I hadn't heard before...

When people on the "AGI safety is important" side explain their position, there's kinda a pedagogical dialog:

A: Superintelligent AGI will be awesome, what could go wrong?
B: Well it could outclass all of humanity and steer the future in a bad direction.
A: OK then we won't give it an aggressive goal.
B: Even with an innocuous-sounding goal like "maximize paperclips" it would still kill everyone...
A: OK, then we'll give it a good goal like "maximize human happiness".
B: Then it would forcibly drug everyone.
A: OK, then we'll give it a more complicated goal like ...
B: That one doesn't work either because ...

...And then Pinker reads this back-and-forth dialog, removes a couple pieces of it from their context, and says "The existential risk scenario that people are concerned about is the paperclip scenario and/or the drugging scenario! They really think those exact things are going to happen!" Then that's the strawman that he can easily rebut.

Pinker had other bad arguments too, I just thought that was a particularly sneaky one.

Comment by steve2152 on What's Your Cognitive Algorithm? · 2020-06-20T18:32:45.496Z · score: 2 (1 votes) · LW · GW

Well, sure, you could take bigger gradient-descent steps for some errors than others. I'm not aware of people doing that, but again, I haven't checked. I don't know how well that would work (if at all).

The thing you're talking about here sounds to me like "a means to an end" rather than "an end in itself", right? If writing "Karma 100000: ..." creates the high-karma-ish answer we wanted, does it matter that we didn't use rewards to get there? I mean, if you want algorithmic differences between Transformers and brains, there are loads of them, I could go on and on! To me, the interesting question raised by this post is: to what extent can they do similar things, even if they're doing it in very different ways? :-)

Comment by steve2152 on What's Your Cognitive Algorithm? · 2020-06-20T15:00:08.180Z · score: 10 (5 votes) · LW · GW

Meanwhile, Jeff Hawkins says "Every part of the neocortex is running the same algorithm", and it's looking like maybe brains aren't doing that complicated a set of things.

This is nitpicking, but your post goes back and forth between the "underlying algorithm" level and the "learned model" level. Jeff Hawkins is talking about the underlying algorithm level when he says that it is (more or less) the same in every part of the neocortex. But almost all the things you mention in "My algorithm as I understand it" are habits of thought that you've learned over the years. (By the same token, we should distinguish between "Transformer + SGD" and "whatever calculations are being done by the particular weight settings in the trained Transformer model".)

I don't expect there to be much simplicity or universality at the "learned model" level ... I expect that people use lots of different habits of thought.

Has anyone done anything like "Train a neural net on Reddit, where it's somehow separately rewarded for predicting the next word, and also for predicting how much karma a cluster of words will get, and somehow propagating that back into the language generation?"

I imagine the easiest thing would be to prepend the karma to each post, fine-tune the model, and then generate high-karma posts by just prompting with "Karma 1000: ...". I'm not aware of anyone having done this specific thing, but I didn't check. I vaguely recall something like that for AlphaStar, where they started with imitation learning with the player's skill flagged, and could then adjust the flag to make their system play better or worse.
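The data prep for that idea is simple enough to sketch (illustrative only — the posts are made up, and the actual fine-tuning step is elided):

```python
# Toy data prep for karma-conditioned generation: prefix each post with its
# karma so a language model learns text conditioned on karma.
posts = [
    ("Here is a long effortful analysis of X.", 1000),
    ("first", 2),
]

training_texts = [f"Karma {karma}: {text}" for text, karma in posts]

# At generation time, prompt with a high karma value to steer the model:
prompt = "Karma 1000: "
print(training_texts[0])
```

This is the same trick as conditioning on any metadata flag: the model learns the correlation between the prefix and the style of what follows, and you exploit it at sampling time.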

What's happening in System 2 thought?

If you haven't already, see Kaj's Against System 1 and System 2. I agree with everything he wrote; the way I would describe it is: Our brains house a zoo of compositional generative models, and system 2 is a cool thing where generative models can self-assemble into an ad-hoc crappy serial computer. For example, you can learn a Generative Model X that first summons a different Generative Model Y, and then summons either Generative Model Z₁ or Z₂ conditional on some feature of Generative Model Y. (Something like that ... I guess I should write this up better someday.) Anyway, this is a pretty neat trick. Can a trained Transformer NN do anything like that? I think there's some vague sense in which a 6-layer Transformer can do similar things as a series of 6 serial human thoughts maybe?? I don't know. There's definitely a ton of differences too.
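The "model X summons Y, then Z₁ or Z₂ conditional on a feature of Y" idea can be cartooned in code (all names hypothetical — this is a toy of the control flow, not a model of the brain):

```python
# Toy version of generative models self-assembling into an ad-hoc serial
# computer: model X summons model Y, then summons Z1 or Z2 depending on a
# feature of Y's output.
def model_y() -> dict:
    return {"feature": 7}   # some feature computed by model Y

def model_z1() -> str:
    return "branch 1"

def model_z2() -> str:
    return "branch 2"

def model_x() -> str:
    y = model_y()
    # conditional dispatch: a crappy-but-real serial computation
    return model_z1() if y["feature"] > 5 else model_z2()

print(model_x())  # "branch 1"
```

The point is just that conditional dispatch between learned components is enough to get genuine (if slow and error-prone) serial computation out of parallel machinery.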


My vague sense about foresight (rolling out multiple steps before deciding what to do) is that it's helpful for sample-efficiency but not required in the limit of infinite training data. Some examples: in RL, both TD learning and tree search eventually converge to the same optimal answer; AlphaGo without a tree search is good but not as good as AlphaGo with a tree search.
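The TD-vs-exact-answer claim can be checked numerically on a toy problem (a two-state chain, not a claim about any real RL system):

```python
# Two-state chain: s0 -> s1 (reward 0), s1 -> end (reward 1), gamma = 1.
# The exact (planning) answer is V(s1) = 1 and V(s0) = 1; TD(0) converges
# to the same values without ever rolling out multiple steps at once.
V = {"s0": 0.0, "s1": 0.0}
alpha = 0.1
for _ in range(2000):  # many episodes of TD(0) updates
    V["s0"] += alpha * (0 + V["s1"] - V["s0"])  # bootstrap from V(s1)
    V["s1"] += alpha * (1 + 0 - V["s1"])        # terminal: target is reward
print(V)  # both values converge to ~1.0
```

TD needs many repetitions to propagate value backwards one bootstrap step at a time — which is the sample-efficiency cost of skipping foresight.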

Perhaps not coincidentally, language models are pretty sample inefficient compared to people...

In my everyday life, I feel like my thoughts very often involve a sequence of two or three chunks, like "I will reach into my bag and then pull out my wallet", and somewhat less often is it a longer sequence than that, but I dunno.

Maybe "AlphaStar can't properly block or not block narrow passages using buildings" is an example where it's held back by lack of foresight.

Comment by steve2152 on If AI is based on GPT, how to ensure its safety? · 2020-06-19T17:35:37.386Z · score: 6 (3 votes) · LW · GW

You're asking about pure predictive (a.k.a. self-supervised) learning. As far as I know, it's an open question what the safety issues are for that (if any), even in a very concrete case like "this particular Transformer architecture trained on this particular dataset using SGD". I spent a bit of time last summer thinking about it, but didn't get very far. See my post self-supervised learning and manipulative predictions for one particular possible failure mode that I wasn't able to either confirm or rule out. (I should go back to it at some point.) See also my post self-supervised learning and AGI safety for everything else I know on the topic. And of course I must mention Abram's delightful Parable of Predict-o-matic if you haven't already seen it; again, this is high-level speculation that might or might not apply to any particular concrete system ("this particular Transformer architecture trained by SGD"). Lots of open questions!

An additional set of potential problems comes from your suggestion to put it in a robot body and actually execute the commands. Can it even walk? Of course it could figure out walking if we let it try with a reward signal, but then we're not talking about pure predictive learning anymore. Hmm, after thinking about it, I guess I'm cautiously optimistic that, in the limit of infinite training data from infinitely many robots learning to walk, a large enough Transformer doing predictive learning could learn to read its own sense data and walk without any reward signal. But then how do you get it to do useful things? My suggestion here was to put a metadata flag into inputs where a robot is being super-helpful, and then turn that flag on when the robot starts acting in the real world. Now we're bringing in supervised learning, I guess.

In the event that the robot was actually capable of doing anything at all, I would be very concerned that you press go and then the system wanders farther and farther out of distribution and does weird, dangerous things that have a high impact on the world.

As for concrete advice for the GPT-7 team: I would suggest at least throwing out the robot body and making a text / image prediction system in a box, and then put a human in the loop looking at the screen before going out and doing stuff. This can still be very powerful and economically useful, and it's a step in the right direction: it eliminates the problem of the system just going off and doing something weird and high-impact in the world because it wandered out of distribution. It doesn't eliminate all problems, because the system might still become manipulative. As I mentioned in the 1st paragraph, I don't know whether that's a real problem or not, more research is needed. It's possible that we're all just doomed in your scenario. :-)

Comment by steve2152 on Jeff Hawkins on neuromorphic AGI within 20 years · 2020-06-17T19:52:43.226Z · score: 2 (1 votes) · LW · GW

Ooh, sounds interesting, thanks for the tip!

Hofstadter also has a thing maybe like that when he talks about "analogical reasoning".

Comment by steve2152 on Inner alignment in the brain · 2020-06-17T18:50:29.129Z · score: 5 (3 votes) · LW · GW


I’m curious what you think is going on here that seems relevant to inner alignment.

Hmm, I guess I didn't go into detail on that. Here's what I'm thinking.

For starters, what is inner alignment anyway? Maybe I'm abusing the term, but I think of two somewhat different scenarios.

  • In a general RL setting, one might say that outer alignment is alignment between what we want and the reward function, and inner alignment is alignment between the reward function and "what the system is trying to do". (This one is closest to how I was implicitly using the term in this post.)

  • In the "risks from learned optimization" paper, it's a bit different: the whole system (perhaps an RL agent and its reward function, or perhaps something else entirely) is conceptually bundled together into a single entity, and you do a black-box search for the most effective "entity". In this case, outer alignment is alignment between what we want and the search criterion, and inner alignment is alignment between the search criterion and "what the system is trying to do". (This is not really what I had in mind in this post, although it's possible that this sort of inner alignment could also come up, if we design the system by doing an outer search, analogous to evolution.)

Note that neither of these kinds of "inner alignment" really comes up in existing mainstream ML systems. In the former (RL) case, if you think of an RL agent like AlphaStar, I'd say there isn't a coherent notion of "what the system is trying to do", at least in the sense that AlphaStar does not do foresighted planning towards a goal. Or take AlphaGo, which does have foresighted planning because of the tree search; but here we program the tree search by hand ourselves, so there's no risk that the foresighted planning is working towards any goal except the one that we coded ourselves, I think.

So, "RL systems that do foresighted planning towards explicit goals which it invents itself" are not much of a thing these days (as far as I know), but they presumably will be a thing in the future (among other things, this is essential for flexibly breaking down goals into sub-goals). And the neocortex is in this category. So yeah, it seems reasonable to me to extend the term "inner alignment" to this case too.

So anyway, the neocortex creates explicit goals for itself, like "I want to get out of debt", and uses foresight / planning to try to bring them about. (Of course it creates multiple contradictory goals, and also has plenty of non-goal-seeking behaviors, but foresighted goal-seeking is one of the things people sometimes do! And of course transient goals can turn into all-consuming goals in self-modifying AGIs.) The neocortical goals have something to do with subcortical reward signals, but it's obviously not a deterministic process, and therefore there's an opportunity for inner alignment problems.

...noting there doesn’t seem to be much divergence between their objective functions...

This is getting into a different question I think... OK, so, If we build an AGI along the lines of a neocortex-like system plus a subcortex-like system that provides reward signals and other guidance, will it reliably do the things we designed it to do? My default is usually pessimism, but I guess I shouldn't go too far. I think some of the things that this system is designed to do, it seems to do very reliably. Like, almost everyone learns language. This requires, I believe, a cooperation between the neocortex and a subcortical system that flags human speech sounds as important. And it works almost every time! A more important question is, can we design the system such that the neocortex will wind up reliably seeking pre-specified goals? Here, I just don't know. I don't think humans and animals provide strong evidence either way, or at least it's not obvious to me...

Comment by steve2152 on What are the high-level approaches to AI alignment? · 2020-06-16T23:30:34.356Z · score: 4 (2 votes) · LW · GW

Sorta related is my appendix to this article.

Comment by steve2152 on Research snap-shot: question about Global Workspace Theory · 2020-06-16T16:56:07.524Z · score: 2 (1 votes) · LW · GW

I do think signal propagation time is probably a big contributor. I think activating a generative model in the GNW entails activating a particular set of interconnected neurons scattered around the GNW parts of the neocortex, which in turn requires those neurons to talk with each other. You can think of a probabilistic graphical model ... you change the value of some node and then run the message-passing algorithm a bit, and the network settles into a new configuration. Something like that, I think...
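That "change a node and let the network settle" dynamic can be sketched as a toy relaxation on a four-node ring (purely illustrative of the settling time, not a model of the GNW):

```python
# Toy "settling": perturb one node, then repeat damped local averaging
# until the ring of 4 nodes relaxes into a new consistent configuration.
nodes = [1.0, 0.0, 0.0, 0.0]  # node 0 was just perturbed
for step in range(60):
    nodes = [0.5 * nodes[i] + 0.25 * (nodes[i - 1] + nodes[(i + 1) % 4])
             for i in range(4)]
print(nodes)  # all nodes settle near the consensus value 0.25
```

The relevant feature is that reaching the new configuration takes many rounds of neighbor-to-neighbor message passing — i.e., time proportional to how far the signals have to propagate.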

Comment by steve2152 on Research snap-shot: question about Global Workspace Theory · 2020-06-16T14:18:00.906Z · score: 4 (2 votes) · LW · GW

The GNW seems like it can only broadcast "simple" or "small" things. ... Something like a hypothesis in the PP paradigm seems like too big and complex a thing to be "sent" on the GNW.

My picture is: the GNW can broadcast everything that you have ever consciously thought about. This is an awfully big space. And it is a space of generative models (a.k.a. hypotheses).

The GNW is at the top of the (loose) hierarchy: the GNW sends predictions down to the lower-level regions of the neocortex, which in turn send prediction errors back to the GNW.

If, say, some particular upcoming input is expected to be important, then GNW can learn to build a generative model where the prediction of that input has a high confidence attached to it. That will bias the subsequent behavior of the system towards whatever models tend to be driven by the prediction error coming from that input. Thus we build serial computers out of our parallel minds.

Comment by steve2152 on Research snap-shot: question about Global Workspace Theory · 2020-06-16T14:02:43.715Z · score: 4 (2 votes) · LW · GW

Given the serial, discrete nature of the GNW, it follows that consciousness is fundamentally a discrete and choppy thing, not a smooth continuous stream.

Here's my take. Think of the neocortex as having a zoo of generative models with methods for building them and sorting through them. The models are compositional—compatible models can snap together like legos. Thus I can imagine a rubber wine glass, because the rubber generative models bottom out in a bunch of predictions of boolean variables, the wine glass generative models bottom out in a bunch of predictions of different boolean variables (and/or consistent predictions of the same boolean variables), and therefore I can union the predictions of the two sets of models.

Your GNW has an active generative model built out of lots of component models. I would say that the "tennis-match-flow" case entails little sub-sub-components asynchronously updating themselves as new information comes in—the tennis ball was over there, and now it's over here. By contrast the more typically "choppy" way of thinking involves frequently throwing out the whole manifold of generative models all at once, and activating a wholly new set of interlocking generative models. The latter (unlike the former) involves an attentional blink, because it takes some time for all the new neural codes to become active and synchronized, and in between you're in an incoherent, unstable state with mutually-contradictory generative models fighting it out.

Perhaps the attentional blink literature is a bit complicated because, with practice or intention, you can build a single GNW generative model that predicts both of two sequential inputs.

Comment by steve2152 on Research snap-shot: question about Global Workspace Theory · 2020-06-16T13:39:49.637Z · score: 4 (2 votes) · LW · GW

Q.1: Is the global neuronal workspace a bottleneck for motor control?

I vote strong no, or else I'm misunderstanding what you're talking about. Let's say you're standing up, and your body tips back microscopically, and you slightly tension your ankles to compensate and stay balanced. Are you proposing that this ankle-tension command has to go through the GNW? I'm quite confident that it doesn't. Stuff like that doesn't necessarily even reach the neocortex at all, let alone the GNW. In this post I mentioned the example of Ian Waterman, who could not connect motor control output signals to feedback signals except by routing through the GNW. He had to be consciously thinking about how his body was moving constantly; if he got distracted he would collapse.

Comment by steve2152 on What are some Civilizational Sanity Interventions? · 2020-06-14T16:50:08.314Z · score: 9 (2 votes) · LW · GW

Note also that politicians will strategically choose to be less polarizing, if being less polarizing is the recipe for electoral success. (Or less-polarizing politicians will be the ones who succeed and become prominent contributors to national conversation.) And people take cues from politicians, they don't just elect politicians who agree with their fixed opinions. So anyway, I guess I'm saying, there isn't a clean upstream / downstream flow, I think...

Comment by steve2152 on The "hard" problem of consciousness is the least interesting problem of consciousness · 2020-06-08T22:28:55.205Z · score: 8 (4 votes) · LW · GW

Given that, I think I want to rename Chalmers' categories as the "Boring" and "Interesting" problem of consciousness.

Just to be sure, hard=boring and easy=interesting, right?

Comment by steve2152 on [Link] Lex Fridman Interviews Karl Friston · 2020-06-08T15:46:42.705Z · score: 7 (2 votes) · LW · GW

Update: They're complementary. The Sean Carroll podcast had much more technical details, beyond what even I could follow (as a professional physicist). The Lex one is much more basic. In the Lex one, I think Friston was better at clarifying which parts of his free energy stuff are weird galaxy-brain ways to think about obvious tautological things, and which parts are content-ful falsifiable hypotheses.

Comment by steve2152 on Consciousness as Metaphor: What Jaynes Has to Offer · 2020-06-07T20:04:37.559Z · score: 6 (3 votes) · LW · GW

I think everything you wrote is consistent with my current take on the mechanisms of consciousness. Our conception of consciousness is a mental model (schema) of a mental process (the channeling of information through the global neuronal workspace), and of course this mental model is built out of the parts of various other mental models (as analogies/metaphors), just like our mental models of everything else are constructed that way.

(Do tell me if you disagree with that.)

Looking forward to your subsequent posts exploring how this works in more detail! :-)

Comment by steve2152 on Reply to Paul Christiano's “Inaccessible Information” · 2020-06-06T17:17:38.567Z · score: 2 (1 votes) · LW · GW

OK, well I spend most of my time thinking about a particular AGI architecture (1 2 etc.) in which the learning algorithm is legible and hand-coded ... and let me tell you, in that case, all the problems of AGI safety and alignment are still really really hard, including the "inaccessible information" stuff that Paul was talking about here.

If you're saying that it would be even worse if, on top of that, the learning algorithm itself is opaque, because it was discovered from a search through algorithm-space ... well OK, yeah sure, that does seem even worse.

Comment by steve2152 on Reply to Paul Christiano's “Inaccessible Information” · 2020-06-06T02:29:52.054Z · score: 6 (4 votes) · LW · GW

finding a solution to the design problem for intelligent systems that does not rest on a blind search for policies that satisfy some evaluation procedure

I'm a bit confused by this. If you want your AI to come up with new ideas that you hadn't already thought of, then it kinda has to do something like running a search over a space of possible ideas. If you want your AI to understand concepts that you don't already have yourself and didn't put in by hand, then it kinda has to be at least a little bit black-box-ish.

In other words, let's say you design a beautiful AGI architecture, and you understand every part of it when it starts (I'm actually kinda optimistic that this part is possible), and then you tell the AGI to go read a book. After having read that book, the AGI has morphed into a new smarter system which is closer to "black-box discovered by a search process" (where the learning algorithm itself is the search process).

Right? Or sorry if I'm being confused.

Comment by steve2152 on Human instincts, symbol grounding, and the blank-slate neocortex · 2020-06-03T01:12:57.333Z · score: 5 (3 votes) · LW · GW

Thanks for the comment! When I think about it now (8 months later), I have three reasons for continuing to think CCA is broadly right:

  1. Cytoarchitectural (quasi-) uniformity. I agree that this doesn't definitively prove anything by itself, but it's highly suggestive. If different parts of the cortex were doing systematically very different computations, well maybe they would start out looking similar when the differentiation first started to arise millions of years ago, but over evolutionary time you would expect them to gradually diverge into superficially-obviously-different endpoints that are more appropriate to their different functions.

  2. Narrowness of the target, sorta. Let's say there's a module that takes specific categories of inputs (feedforward, feedback, reward, prediction-error flags) and has certain types of outputs, and it systematically learns to predict the feedforward input and control the outputs according to generative models following this kind of selection criterion (or something like that). This is a very specific and very useful thing. Whatever the reward signal is, this module will construct a theory about what causes that reward signal and make plans to increase it. And this kind of module automatically tiles—you can connect multiple modules and they'll be able to work together to build more complex composite generative models integrating more inputs to make better reward predictions and better plans. I feel like you can't just shove some other computation into this system and have it work—it's either part of this coordinated prediction-and-action mechanism, or not (in which case the coordinated prediction-and-action mechanism will learn to predict it and/or control it, just like it does for the motor plant etc.). Anyway, it's possible that some part of the neocortex is doing a different sort of computation, and not part of the prediction-and-action mechanism. But if so, I would just shrug and say "maybe it's technically part of the neocortex, but when I say "neocortex", I'm using the term loosely and excluding that particular part." After all, I am not an anatomical purist; I am already including part of the thalamus when I say "neocortex" for example (I have a footnote in the article apologizing for that). Sorry if this description is a bit incoherent, I need to think about how to articulate this better.

  3. Although it's probably just the Dunning-Kruger talking, I do think I at least vaguely understand what the algorithm is doing and how it works, and I feel like I can concretely see how it explains everything about human intelligence including causality, counterfactuals, hierarchical planning, task-switching, deliberation, analogies, concepts, etc. etc.

Comment by steve2152 on Building brain-inspired AGI is infinitely easier than understanding the brain · 2020-06-02T15:20:49.116Z · score: 5 (3 votes) · LW · GW

The human neocortical algorithm probably wouldn't work very well if it were applied in a brain 100x smaller

I disagree. As I discussed here, I think the neocortex is uniform-ish, and that a cortical column in humans is doing a similar calculation to a cortical column in rats, or to the equivalent bundle of cells (arranged not as a column) in a bird pallium or lizard pallium. I do think you need lots and lots of cortical columns, initialized with appropriate region-to-region connections, to get human intelligence. Well, maybe that's what you meant by "human neocortical algorithm", in which case I agree. You also need appropriate subcortical signals guiding the neocortex, for example to flag human speech sounds as being important to attend to.

human intelligence minus rat intelligence is probably easier to understand and implement than rat intelligence alone.

Well, I do think that there's a lot of non-neocortical innovations between humans and rats, particularly to build our complex suite of social instincts, see here. I don't think understanding those innovations is necessary for AGI, although I do think it would be awfully helpful to understand them if we want aligned AGI. And I think they are going to be hard to understand, compared to the neocortex.

I don't think we can learn arbitrary domains, not even close

Sure. A good example is temporal sequence learning. If a sequence of things happens, we expect the same sequence to recur in the future. In principle, we can imagine an anti-inductive universe where, if a sequence of things happens, then it's especially unlikely to recur in the future, at all levels of abstraction. Our learning algorithm would crash and burn in such a universe. This is a particular example of the no-free-lunch theorem, and I think it illustrates that, while there are domains that the neocortical learning algorithm can't learn, they may be awfully weird and unlikely to come up.

Comment by steve2152 on Building brain-inspired AGI is infinitely easier than understanding the brain · 2020-06-02T15:05:20.717Z · score: 2 (1 votes) · LW · GW

For example, I mentioned 1, 2, 3 earlier in the article... That should get you started but I'm happy to discuss more. :-)

Comment by steve2152 on [Link] Lex Fridman Interviews Karl Friston · 2020-05-31T13:09:35.019Z · score: 6 (4 votes) · LW · GW

Friston was also on Sean Carroll's podcast recently - link. I found it slightly helpful. I may listen to this one too; if I do I'll comment on how they compare.

Comment by steve2152 on On the construction of the self · 2020-05-29T16:06:38.973Z · score: 6 (3 votes) · LW · GW

I'm really enjoying all these posts, thanks a lot!

something about the argument brought unpleasant emotions into your mind. A subsystem activated with the goal of making those emotions go away, and an effective way of doing so was focusing your attention on everything that could be said to be wrong with your spouse.

Wouldn't it be simpler to say that righteous indignation is a rewarding feeling (in the moment) and we're motivated to think thoughts that bring about that feeling?

the “regretful subsystem” cannot directly influence the “nasty subsystem”

Agreed, and this is one of the reasons that I think normal intuitions about how agents behave don't necessarily carry over to self-modifying agents whose subagents can launch direct attacks against each other, see here.

it looks like craving subsystems run on cached models

Yeah, just like every other subsystem right? Whenever any subsystem (a.k.a. model a.k.a. hypothesis) gets activated, it turns on a set of associated predictions. If it's a model that says "that thing in my field of view is a cat", it activates some predictions about parts of the visual field. If it's a model that says "I am going to brush my hair in a particular way", it activates a bunch of motor control commands and related sensory predictions. If it's a model that says "I am going to get angry at them", it activates, um, hormones or something, to bring about the corresponding arousal and valence. All these examples seem like the same type of thing to me, and all of them seem like "cached models".

Comment by steve2152 on From self to craving (three characteristics series) · 2020-05-29T13:26:25.629Z · score: 2 (1 votes) · LW · GW


I don't really see the idea of hypotheses trying to prove themselves true. Take the example of saccades that you mention. I think there's some inherent (or learned) negative reward associated with having multiple active hypotheses (a.k.a. subagents a.k.a. generative models) that clash with each other by producing confident mutually-inconsistent predictions about the same things. So if model A says that the person coming behind you is your friend and model B says it's a stranger, then that summons model C which strongly predicts that we are about to turn around and look at the person. This resolves the inconsistency, and hence model C is rewarded, making it ever more likely to be summoned in similar circumstances in the future.

You sorta need multiple inconsistent models before it makes sense to talk about one of them trying to prove itself true. How else would you figure out which part of the model to probe? If a model were trying to prevent itself from being falsified, that would predict that we look away from things that we're not sure about rather than towards them.

OK, so here's (how I think of) a typical craving situation. There are two active models.

Model A: I will eat a cookie and this will lead to an immediate reward associated with the sweet taste

Model B: I won't eat the cookie, instead I'll meditate on gratitude and this will make me very happy

From my perspective, this is great evidence that valence and reward are two different things. If becoming happy is the same as reward, why haven't I meditated in the last 5 years even though I know it makes me happy? And why do I want to eat that cookie even though I totally understand that it won't make me smile even while I'm eating it, or make me less hungry, or anything?

When you say "mangling the input quite severely to make it fit the filter", I guess I'm imagining a scenario like, the cookie belongs to Sally, but I wind up thinking "She probably wants me to eat it", even if that's objectively far-fetched. Is that Model A mangling the evidence to fit the filter? I wouldn't really put it that way...

The thing is, Model A is totally correct; eating the cookie would lead to an immediate reward! It doesn't need to distort anything, as far as it goes.

So now there's a Model A+D that says "I will eat the cookie and this will lead to an immediate reward, and later Sally will find out and be happy that I ate the cookie, which will be rewarding as well". So model A+D predicts a double reward! That's a strong selective pressure helping advance that model at the expense of other models, and thus we expect this model to be adopted, unless it's being weighed down by a sufficiently strong negative prior, e.g. if this model has been repeatedly falsified in the past, or if it contradicts a different model which has been repeatedly successful and rewarded in the past.

(This discussion / brainstorming is really helpful for me, thanks for your patience.)

Comment by steve2152 on Wrist Issues · 2020-05-29T01:09:47.538Z · score: 7 (2 votes) · LW · GW

Here's my weird story from 15 years ago where I had a year of increasingly awful debilitating RSI and then eventually I read a book and a couple days later it was gone forever.

My dopey little webpage there has helped at least a few people over the years, including apparently the co-founder of, so I continue to share it, even though I find it mildly embarrassing. :-P

Needless to say, YMMV; I can speak to my own experience, but I don't pretend to know how to cure anyone else.

Anyway, that sucks and I hope you feel better soon!

Comment by steve2152 on Source code size vs learned model size in ML and in humans? · 2020-05-27T14:47:32.804Z · score: 4 (2 votes) · LW · GW

OK, I think that helps.

It sounds like your question should really be more like how many programmer-hours go into putting domain-specific content / capabilities into an AI. (You can disagree.) If it's very high, then it's the Robin-Hanson-world where different companies make AI-for-domain-X, AI-for-domain-Y, etc., and they trade and collaborate. If it's very low, then it's more plausible that someone will have a good idea and Bam, they have an AGI. (Although it might still require huge amounts of compute.)

If so, I don't think the information content of the weights of a trained model is relevant. The weights are learned automatically. Changing the code from num_hidden_layers = 10 to num_hidden_layers = 100 is not 10× the programmer effort. (It may or may not require more compute, and it may or may not require more labeled examples, and it may or may not require more hyperparameter tuning, but those are all different things, and in no case is there any reason to think it's a factor of 10, except maybe some aspects of compute.)
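To make the code-size-vs-model-size point concrete, here's a toy sketch (my own numbers and function, not from the original discussion): a two-character edit to a hyperparameter leaves the source code essentially unchanged while multiplying the learned model's size by roughly 10×.

```python
# Hypothetical illustration: the source code barely changes, but the
# learned-weight count (the "trained model size") changes ~10x.

def mlp_param_count(input_dim, hidden_width, num_hidden_layers, output_dim):
    """Count weights + biases in a plain fully-connected network."""
    params = input_dim * hidden_width + hidden_width            # input layer
    for _ in range(num_hidden_layers - 1):                      # hidden-to-hidden
        params += hidden_width * hidden_width + hidden_width
    params += hidden_width * output_dim + output_dim            # output layer
    return params

small = mlp_param_count(784, 512, num_hidden_layers=10, output_dim=10)
large = mlp_param_count(784, 512, num_hidden_layers=100, output_dim=10)
print(small, large, large / small)  # ~9.5x more parameters, same code
```

The "10 → 100" edit is two characters of programmer effort, while the information content of the weights grows by tens of millions of numbers—which is the sense in which trained-model size seems decoupled from programmer-hours.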

I don't think the size of the PyTorch codebase is relevant either.

I agree that the size of the human genome is relevant, as long as we all keep in mind that it's a massive upper bound, because perhaps a vanishingly small fraction of that is "domain-specific content / capabilities". Even within the brain, you have to synthesize tons of different proteins, control the concentrations of tons of chemicals, etc. etc.

I think the core of your question is generalizability. If you have AlphaStar but want to control a robot instead, how much extra code do you need to write? Do insights in computer vision help with NLP and vice-versa? That kind of stuff. I think generalizability has been pretty high in AI, although maybe that statement is so vague as to be vacuous. I'm thinking, for example, it's not like we have "BatchNorm for machine translation" and "BatchNorm for image segmentation" etc. It's the same BatchNorm.

On the brain side, I'm a big believer in the theory that the neocortex has one algorithm which simultaneously does planning, action, classification, prediction, etc. (The merging of action and understanding in particular is explained in my post here, see also Planning By Probabilistic Inference.) So that helps with generalizability. And I already mentioned my post on cortical uniformity. I think a programmer who knows the core neocortical algorithm and wants to then imitate the whole neocortex would mainly need (1) a database of "innate" region-to-region connections, organized by connection type (feedforward, feedback, hormone receptors) and structure (2D array of connections vs 1D, etc.), (2) a database of region-specific hyperparameters, especially when the region should lock itself down to prevent further learning ("sensitive periods"). Assuming that's the right starting point, I don't have a great sense for how many bits of data this is, but I think the information is out there in the developmental neuroscience literature. My wild guess right now would be on the order of a few KB, but with very low confidence. It's something I want to look into more when I get a chance. Note also that the would-be AGI engineer can potentially just figure out those few KB from the neuroscience literature, rather than discovering it in a more laborious way.

Oh, you also probably need code for certain non-neocortex functions like flagging human speech sounds as important to attend to etc. I suspect that that particular example is about as straightforward as it sounds, but there might be other things that are hard to do, or where it's not clear what needs to be done. Of course, for an aligned AGI, there could potentially be a lot of work required to sculpt the reward function.

Just thinking out loud :)

Comment by steve2152 on From self to craving (three characteristics series) · 2020-05-23T12:53:18.568Z · score: 4 (2 votes) · LW · GW


My running theory so far (a bit different from yours) would be:

  • Motivation = prediction of reward
  • Craving = unhealthily strong motivation—so strong that it breaks out of the normal checks and balances that prevent wishful thinking etc.
  • When empathetically simulating someone's mental state, we evoke the corresponding generative model in ourselves (this is "simulation theory"), but it shows up in attenuated form, i.e. with weaker (less confident) predictions (I've already been thinking that, see here).
  • By meditative practice, you can evoke a similar effect in yourself, sorta distancing yourself from your feelings and experiencing them in a quasi-empathetic way.
  • ...Therefore, this is a way to turn cravings (unhealthily strong motivations) into mild, healthy, controllable motivations

Incidentally, why is the third bullet point true, and how is it implemented? I was thinking about that this morning and came up with the following ... If you have generative model A that in turn evokes (implies) generative model B, then model B inherits the confidence of A, attenuated by a measure of how confident you are that A leads to B (basically, P(A) × P(B|A)). So if you indirectly evoke a generative model, it's guaranteed to appear with a lower confidence value than if you directly evoke it.

In empathetic simulation, A would be the model of the person you're simulating, and B would be the model that you think that person is thinking / feeling.
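The attenuation idea above can be sketched numerically (the function name and probability values are mine, purely for illustration):

```python
def evoked_confidence(p_parent, p_child_given_parent):
    # An indirectly-evoked model inherits its parent's confidence,
    # attenuated: roughly P(A) * P(B|A).
    return p_parent * p_child_given_parent

p_A = 0.9          # confidence in model A (the person being simulated)
p_B_given_A = 0.8  # confidence that A's state implies model B

indirect = evoked_confidence(p_A, p_B_given_A)  # ~0.72
direct = p_B_given_A                            # confidence if B were evoked first-hand

# Since P(A) < 1, the indirectly-evoked model is always weaker:
assert indirect < direct
```

Because P(A) < 1 whenever A is itself uncertain, the chained confidence is strictly lower than the direct one, which is the claimed guarantee.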

Sorry if that's stupid, just thinking out loud :)

Comment by steve2152 on Get It Done Now · 2020-05-22T13:46:39.863Z · score: 16 (9 votes) · LW · GW

it includes detailed advice that approximately no one will follow

Hey, I read the book in 2012, and I still have a GTD-ish alphabetical file, GTD-ish desk "inbox", and GTD-ish to-do list. Of course they've all gotten watered down a bit over the years from the religious fervor of the book, but it's still something.

If you decide to eventually do a task that requires less than two minutes to do, that can efficiently be done right now, do it right now.

Robert Pozen's Extreme Productivity has a closely-related principle he calls "OHIO"—Only Handle It Once. If you have all the decision-relevant information that you're likely to get, then just decide right away. He gives an example of getting an email invitation to something, checking his calendar, and immediately booking a flight and hotel. I can't say I follow that one very well, but at least I acknowledge it as a goal to aspire to.

Comment by steve2152 on Source code size vs learned model size in ML and in humans? · 2020-05-21T20:32:45.070Z · score: 4 (2 votes) · LW · GW

I'm not sure exactly what you're trying to learn here, or what debate you're trying to resolve. (Do you have a reference?)

If almost all the complexity is in architecture, you can have fast takeoff because it doesn't work well until the pieces are all in place; or you can have slow takeoff in the opposite case. If almost all the complexity is in learned content, you can have fast takeoff because there's 50 million books and 100,000 years of YouTube videos and the AI can deeply understand all of them in 24 hours; or you can have slow takeoff because, for example, maybe the fastest supercomputers can just barely run the algorithm at all, and the algorithm gets slower and slower as it learns more, and eventually grinds to a halt, or something like that.

If an algorithm uses data structures that are specifically suited to doing Task X, and a different set of data structures that are suited to Task Y, would you call that two units of content or two units of architecture?

(I personally do not believe that intelligence requires a Swiss-army-knife of many different algorithms, see here, but this is certainly a topic on which reasonable people disagree.)

Comment by steve2152 on Pointing to a Flower · 2020-05-20T10:25:55.032Z · score: 6 (3 votes) · LW · GW

If you're saying that "consistent low-level structure" is a frequent cause of "recurring patterns", then sure, that seems reasonable.

Do they always go together?

  • If there are recurring patterns that are not related to consistent low-level structure, then I'd expect an intuitive concept that's not an OP-type abstraction. I think that happens: for example any word that doesn't refer to a physical object: "emotion", "grammar", "running", "cold", ...

  • If there are consistent low-level structures that are not related to recurring patterns, then I'd expect an OP-type abstraction that's not an intuitive concept. I can't think of any examples. Maybe consistent low-level structures are automatically a recurring pattern. Like, if you make a visualization in which the low-level structure(s) is highlighted, you will immediately recognize that as a recurring pattern, I guess.

Comment by steve2152 on Pointing to a Flower · 2020-05-19T21:48:49.016Z · score: 13 (4 votes) · LW · GW

I think the human brain answer is close to "Flower = instance of a recurring pattern in the data, defined by clustering" with an extra footnote that we also have easy access to patterns that are compositions of other known patterns. For example, a recurring pattern of "rubber" and recurring pattern of "wine glass" can be glued into a new pattern of "rubber wine glass", such that we would immediately recognize one if we saw it. (There may be other footnotes too.)

Given that I believe that's the human brain answer, I'm immediately skeptical that a totally different approach could reliably give the same answer. I feel like either there's gotta be lots of cases where your approach gives results that we humans find unintuitive, or else you're somehow sneaking human intuition / recurring patterns into your scheme without realizing it. Having said that, everything you wrote sounds reasonable, I can't point to any particular problem. I dunno.

Comment by steve2152 on Craving, suffering, and predictive processing (three characteristics series) · 2020-05-17T18:50:16.458Z · score: 2 (1 votes) · LW · GW

Yeah, I haven't read any of these references, but I'll elaborate on why I'm currently very skeptical that "model-free" vs "model-based" is a fundamental difference.

I'll start with an example unrelated to motivation, to take it one step at a time.

Imagine that, every few hours, your whole field of vision turns bright blue for a couple seconds, then turns yellow, then goes back to normal. You have no idea why. But pretty soon, every time your field of vision turns blue, you'll start expecting it to then turn yellow within a couple seconds. This expectation is completely divorced from everything else you know, since you have no idea why it's happening, and indeed all your understanding of the world says that this shouldn't be happening.

Now maybe there's a temptation here to say that the expectation of yellow is model-free pattern recognition, and to contrast it with model-based pattern recognition, which would be something like expecting a chess master to beat a beginner, which is a pattern that you can only grasp using your rich contextual knowledge of the world.

But I would not draw that contrast. I would say that the kind of pattern recognition that makes us expect to see yellow after blue just from direct experience without understanding why, is exactly the same kind of pattern recognition that originally built up our entire world-model from scratch, and which continues to modify it throughout our lives.

For example, to a 1-year-old, the fact that the sequence "1 2 3 4..." is usually followed by "5" is just an arbitrary pattern, a memorized sequence of sounds. But over time we learn other patterns, like seeing two things while someone says "two", and we build connections between all these different patterns, and wind up with a rich web of memorized patterns that comprises our entire world-model.

Different bits of knowledge can be more or less integrated into this web. "I see yellow after blue, and I have no idea why" would be an extreme example—an island of knowledge isolated from everything else we know. But it's a spectrum. For example, take everyone on Earth who knows the phrase "E=mc²". There's a continuum, from people who treat it as a memorized sequence of meaningless sounds in the same category as "yabba dabba doo", to people who know that the E stands for energy but nothing else, to physics students who kinda get it, all the way to professional physicists who find E=mc² to be perfectly obvious and inevitable and then try to explain it on Quora because I guess I had nothing better to do on New Years Day 2014... :-)

So, I think model-based and model-free is not a fundamental distinction. But I do think that with different ways of acquiring knowledge, there are systematic trends in prediction strength, with first-hand experience leading to much stronger predictions than less-direct inferences. If I have repeated direct experience of my whole field of vision filling with yellow after blue, that will develop into a very very strong (confident) prediction. After enough times seeing blue-then-yellow, if I see blue-then-green I might literally jump out of my seat and scream!! By contrast, the kind of expectation that we arrive at indirectly via our world model tends to be a weaker prediction. If I see a chess master lose to a beginner, I'll be surprised, but I won't jump out of my seat and scream. Of course that's appropriate: I only predicted the chess master would win via a long chain of uncertain probabilistic inferences, like "the master was trying to win", "nobody cheated", "the master was sober", "chess is not the kind of game where you can win just by getting lucky", etc. So it's appropriate for me to be predicting the win with less confidence. As yet a third example, let's say a professional chess commentator is watching the same match, in the context of a proper tournament. The commentator actually might jump out of her chair and scream when the master loses! For her, the sight of masters crushing beginners is something that she has repeatedly and directly experienced. Thus her prediction is much stronger than mine. (I'm not really into chess.)

All this is about perception, not motivation. Now, back to motivation. I think we are motivated to do things proportionally to our prediction of the associated reward.

I think we learn to predict reward in a similar way that we learn to predict anything else. So it's the same idea. Some reward predictions will be from direct experience, and not necessarily well-integrated with the rest of our world-model: "Don't know why, but it feels good when I do X". It's tempting to call these "model-free". Other reward predictions will be more indirect, mediated by our understanding of how some plan will unfold. The latter will tend to be weaker reward predictions in general (as is appropriate since they rely on a longer chain of uncertain inferences), and hence they tend to be less motivating. It's tempting to call these "model-based". But I don't think it's a fundamental or sharp distinction. Even if you say "it feels good when I do X", we have to use our world-model to construct the category X and classify things as X or not-X. Conversely, if you make a plan expecting good results, you implicitly have some abstract category of "plans of this type" and you do have previous direct experience of rewards coming from the objects in this abstract category.

Again, this is just my current take without having read the literature :-D

Comment by steve2152 on Craving, suffering, and predictive processing (three characteristics series) · 2020-05-16T20:01:07.009Z · score: 2 (1 votes) · LW · GW

It seems like if you have to choose between bad options, the healthy thing is to declare that all your options are bad, and take the least bad one. This sometimes feels like "becoming resigned to your fate" maybe? The unhealthy thing is to fight against this, and not accept reality.

Why is the latter so tempting? I think it comes from the Temporal Difference Learning algorithm used by the brain's reward system. I think the TD learning algorithm attaches a very strong negative reward to the moment where you start believing that your predicted reward is a lot lower than what you had thought it would be before. So that would create an exceptionally strong motivation to not accept that, even if it's true.
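A minimal sketch of this, using the standard TD-error formula δ = r + γV(s') − V(s) (the specific numbers are mine, just to illustrate the shape of the claim):

```python
def td_error(reward, gamma, v_next, v_current):
    # Standard temporal-difference error: delta = r + gamma * V(s') - V(s).
    return reward + gamma * v_next - v_current

# You believed your situation was worth a lot...
v_before_realization = 10.0
# ...then you accept that all your options are bad:
v_after_realization = 2.0

# The instant of revising your estimate downward generates a large
# negative TD error, even though no actual punishment arrived then.
delta = td_error(reward=0.0, gamma=1.0,
                 v_next=v_after_realization,
                 v_current=v_before_realization)
print(delta)  # -8.0
```

On this story, the aversive jolt is attached to the moment of belief revision itself, which is why "not accepting reality" can be locally reinforced even when the revision is correct.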

This ties into my other comment that maybe craving is fundamentally the same as other motivations, but stronger, and in particular, so strong that it screws up our ability to think straight.

Comment by steve2152 on Craving, suffering, and predictive processing (three characteristics series) · 2020-05-16T19:47:28.198Z · score: 2 (1 votes) · LW · GW

After reading this and lukeprog's post you referenced, I'm still not convinced that there is fundamentally more than one motivational system—although I don't have high confidence and still want to chase down the references.

(Well, I see a distinction between actions not initiated by the neocortex, like flinching away from a projectile, versus everything else—see here—but that's not what we're talking about here.)

It seems to me that what you call "craving" is what I would call "an unhealthily strong motivation". The background picture in my head is this, where wishful thinking is a failure mode built into the deepest foundations of our thoughts. Wishful thinking stays under control mainly because the force of "What we can imagine and expect is constrained by past experience and world-knowledge" can usually defeat the force of Wishful thinking. But if we want something hard enough, it can break through those shackles, so that, for example, it doesn't get immediately suppressed even if our better judgment declares that it cannot work.

Like, take your dentist example:

  • The thought of being at the dentist is aversive.
  • The thought of having clean healthy teeth is attractive.

We make the decision by weighing these against each other, I think. You are categorizing the former as a craving and the latter as motivation-that-is-not-craving (right?), but they seem like fundamentally the same type of thing to me. (After all, we can weigh them against each other.) It seems like the difference is that the former is exceptionally strong—so strong that it prevents us from thinking straight about it. The latter is a normal mild attraction, which is healthy and unproblematic. I see a continuum between the two.

(If this is right, it doesn't undermine the idea that cravings exist and that we should avoid them. I still believe that. I'm just suggesting that maybe craving vs motivations-that-are-not-craving is a difference of degree not kind.)

I dunno, I'm just spitballing here :-D

Comment by steve2152 on Craving, suffering, and predictive processing (three characteristics series) · 2020-05-16T18:48:15.305Z · score: 2 (1 votes) · LW · GW

Craving superficially looks like it cares about outcomes. However, it actually cares about positive or negative feelings (valence).

Really? My model has been that you can want something without really enjoying it or being happy to have it. (That comes mostly from reading Scott's old post on wanting/liking/approving.) Or maybe you're using "feelings (valence)" in a broader sense that encompasses "dopamine rush"? (I may be misunderstanding the exact meaning of "valence"; I haven't dived deep into it, although I've been meaning to.)

Comment by steve2152 on Tips/tricks/notes on optimizing investments · 2020-05-13T00:17:16.276Z · score: 2 (1 votes) · LW · GW

The list of options and interest rates as of right now (2020-05-12) are here, so you can decide for yourself. (If anyone is reading this message in the future, PM me for an updated screenshot.)

I haven't paid much attention to how fast the transfers go through because I've never needed to transfer money in a hurry. My vague impression is that it usually takes a day or so, I guess.

Comment by steve2152 on Corrigibility as outside view · 2020-05-10T00:48:00.121Z · score: 9 (2 votes) · LW · GW

I'm trying to think whether or not this is substantively different from my post on corrigible alignment.

If the AGI's final goal is to "do what the human wants me to do" (or whatever other variation you like), then it's instrumentally useful for the AGI to create an outside-view-ish model of when its behavior does or doesn't accord with the human's desires.

Conversely, if you have some outside-view-ish model of how well you do what the human wants (or whatever) and that information guides your decisions, then it seems kinda implicit that you are acting in pursuit of a final goal of doing what the human wants.

So I guess my conclusion is that it's fundamentally the same idea, and in this post you're flagging one aspect of what successful corrigible alignment would look like. What do you think?

Comment by steve2152 on How uniform is the neocortex? · 2020-05-08T17:03:27.724Z · score: 4 (2 votes) · LW · GW

It's certainly true that you can't slice off a neocortex from the rest of the brain and expect it to work properly by itself. The neocortex is especially intimately connected to the thalamus and hippocampus, and so on.

But I don't think bringing up birds is relevant. Birds don't have a neocortex, but I think they have other structures that have a similar microcircuitry and are doing similar calculations—see this paper.

You can arrange neurons in different ways without dramatically altering the connectivity diagram (which determines the algorithm). The large-scale arrangement in the mammalian neocortex (six-layered structure) is different than the large-scale arrangement in the bird pallium, even if the two are evolved from the same origin and run essentially the same algorithm using the same types of neurons connected in the same way. (As far as I know; I haven't studied this topic beyond skimming that paper I linked above.)

So why isn't it called "neocortex" in birds? I assume it's just because it looks different than the mammalian neocortex. I mean, the people who come up with terminology for naming brain regions, they're dissecting bird brains and describing how they look to the naked eye and under a microscope. They're not experts on neuron micro-circuitry. I wouldn't read too much into it.

I don't know much about fish brains, but certainly different animals have different brains that do different things. Some absolutely lack "higher-order functions"—e.g. nematodes. I am comfortable saying that the moral importance of animals is a function F(brain) ... but what is that function F? I don't know. I do agree that F is not going to be a checklist of gross anatomical features ("three points for a neocortex, one point for a basal ganglia..."), but rather it should refer to the information-processing that this brain is engaged in.

I haven't personally heard anyone suggest that all mammals are more morally important than all birds because mammals have a neocortex and birds don't. But if that is a thing people believe, I agree that it's wrong and we should oppose it.

Comment by steve2152 on Tips/tricks/notes on optimizing investments · 2020-05-08T00:45:09.290Z · score: 2 (1 votes) · LW · GW

I think MaxMyInterest only lists savings accounts that offer unlimited free online transfers. You might think that there must be a trade-off of that requirement against interest rate, but that doesn't seem to be the case; the rates are as good as anything on the market, even including CDs, as far as I've been able to tell the couple times I've quickly looked over the years. PM me if you want a screenshot of their current offerings.

Comment by steve2152 on Covid-19 5/7: Fighting Limbo · 2020-05-07T19:52:58.991Z · score: 2 (1 votes) · LW · GW

I think that, when most people use the term "herd immunity", they mean "herd immunity sufficient to get R<1 while everyone parties like it's 2019". That could require 75% to be infected, don't you think?
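For the arithmetic behind that: under the classic homogeneous-mixing approximation, the herd immunity threshold is 1 − 1/R0, so a 75% figure corresponds to an unmitigated R0 of about 4 (the R0 = 4 value is my assumption here, chosen to match the 75% in the comment):

```python
def herd_immunity_threshold(r0):
    # Classic homogeneous-mixing threshold: fraction immune needed
    # so that effective R drops below 1 with no other mitigation.
    return 1 - 1 / r0

# With an assumed unmitigated R0 of ~4, ~75% need to be immune:
print(herd_immunity_threshold(4))  # 0.75
```

Lower R0 estimates give lower thresholds (e.g. R0 = 2 gives 50%), which is exactly why the phrase "herd immunity" means such different numbers depending on the assumed behavior baseline.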

Comment by steve2152 on Tips/tricks/notes on optimizing investments · 2020-05-07T13:20:49.383Z · score: 12 (8 votes) · LW · GW

I've been using MaxMyInterest since 2015. They list out the highest-interest FDIC-insured savings accounts and make it easy to open them and transfer money between them. They'll also automatically track which of your savings accounts has the highest interest rate (if you have more than one) and move your money there. (Or if you have so much money that you exceed the FDIC limit ... which somehow has never been a problem for me! ... it can split your money into multiple accounts to get around that.) It also links with your low-interest everyday checking account, and will periodically transfer money back and forth to keep the latter balance at whatever amount you tell it. I really like that last feature, it saves me time and mental energy. They charge a fee of (currently) 0.08%/year × however much money you have in the high-interest savings accounts.

Comment by steve2152 on Assessing Kurzweil predictions about 2019: the results · 2020-05-06T15:25:44.257Z · score: 7 (4 votes) · LW · GW

Thanks, that actually helps a lot; I didn't get that it was from the voice of someone in the future. I still don't see any way to make sense of that as a "prediction", i.e. something that is true but was not fully recognized in 1999.

The closest thing I can think of that would make sense is if he were claiming that the neocortex comprises many specialized regions, each with its own topology and architecture of interneuronal connections (cf. zhukeepa's post a couple days ago). But that's not it. Not only would Kurzweil be unlikely to say "brain" when he meant "neocortex", but I also happen to know that Kurzweil is a strong advocate against the idea that the neocortex comprises many architecturally-different regions. Well, at least he advocated for cortical uniformity in his 2012 book, and when I read that I also got the impression that he had believed the same thing for a long time before that.

I think he put that in and phrased it as a prediction just for narrative flow, while setting up the subsequent sentences ... like if he had written

"It is now fully recognized that every object is fundamentally made out of just a few dozen types of atoms. Therefore, molecular assemblers with the right feedstock can make any object on demand..."

or whatever. The first sentence here is phrased as a prediction but it isn't really.

Comment by steve2152 on Assessing Kurzweil predictions about 2019: the results · 2020-05-06T14:14:06.088Z · score: 3 (2 votes) · LW · GW

"It is now fully recognized that the brain comprises many specialized regions, each with its own topology and architecture of interneuronal connections."

Is this really a prediction? I would call it "A blindingly obvious fact." This page says "Herophilus not only distinguished the cerebrum and the cerebellum, but provided the first clear description of the ventricles", and the putamen and corpus callosum were discovered in the 16th century, etc. etc. Sorry if I'm misunderstanding, I don't know the context.

ETA: Maybe I should be more specific and nuanced. I think it's uncontroversial and known for hundreds if not thousands of years that the brain comprises many regions which look different—for example, the cerebellum, the putamen, etc. I think it's also widely agreed for 100+ years that each is "specialized", at least in the sense that different regions have different functions, although the term is kinda vague. The idea that "each [has] its own topology and architecture of interneuronal connections" is I think the default assumption ... if they had the same topology and architecture, why would they look different? And now that we know what neurons are and have good microscopes, this is no longer just a default assumption, but an (I think) uncontroversial observation.

Comment by steve2152 on How uniform is the neocortex? · 2020-05-06T13:58:37.173Z · score: 2 (1 votes) · LW · GW

In case it helps, my main aids-to-imagination right now are the sequence memory / CHMM story (see my comment here) and Dileep George's PGM-based vision model and his related follow-up papers like this, plus miscellaneous random other stuff.

Comment by steve2152 on How uniform is the neocortex? · 2020-05-06T13:50:38.463Z · score: 11 (4 votes) · LW · GW

Hmm, well I should say that my impression is that there's a frustrating lack of consensus on practically everything in systems neuroscience, but "the brain doesn't do backpropagation" seems about as close to consensus as anything. This Yoshua Bengio paper has a quick summary of the reasons:

The following difficulties can be raised regarding the biological plausibility of back-propagation: (1) the back-propagation computation (coming down from the output layer to lower hidden layers) is purely linear, whereas biological neurons interleave linear and non-linear operations, (2) if the feedback paths known to exist in the brain (with their own synapses and maybe their own neurons) were used to propagate credit assignment by backprop, they would need precise knowledge of the derivatives of the non-linearities at the operating point used in the corresponding feedforward computation on the feedforward path, (3) similarly, these feedback paths would have to use exact symmetric weights (with the same connectivity, transposed) of the feedforward connections, (4) real neurons communicate by (possibly stochastic) binary values (spikes), not by clean continuous values, (5) the computation would have to be precisely clocked to alternate between feedforward and back-propagation phases (since the latter needs the former’s results), and (6) it is not clear where the output targets would come from.

Then you ask the obvious followup question: "if not backprop, then what?" Well, this is unknown and controversial; the Yoshua Bengio paper above offers its own answer which I am disinclined to believe (but want to think about more). Of course there is more than one right answer; indeed, my general attitude is that if someone tells me a biologically-plausible learning mechanism, it's probably in use somewhere in the brain, even if it's only playing a very minor and obscure role in regulating heart rhythms or whatever, just because that's the way evolution tends to work.

But anyway, I expect that the lion's share of learning in the neocortex comes from just a few mechanisms. My favorite example is probably high-order sequence memory learning. There's a really good story for that:

  • At the lowest level—biochemistry—we have Why Neurons Have Thousands of Synapses, a specific and biologically-plausible mechanism for the creation and deactivation of synapses.

  • At the middle level—algorithms—we have papers like this and this and this where Dileep George takes pretty much that exact algorithm (which he calls "cloned hidden markov model"), abstracted away from the biological implementation details, and shows that it displays all sorts of nice behavior in practice.

  • At the highest level—behavior—we have observable human behaviors, like the fact that we can hear a snippet of a song, and immediately know how that snippet continues, but still have trouble remembering the song title. And no matter how well we know a song, we cannot easily sing the notes in reverse order. Both of these are exactly as expected from the properties of this sequence memory algorithm.

This sequence memory thing obviously isn't the whole story of what the neocortex does, but it fits together so well, I feel like it has to be one of the ingredients. :-)
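To make the cloned-state idea concrete, here's a toy sketch (my own illustrative code, not Dileep George's actual implementation): each observed symbol gets multiple "clones", and transitions between clones carry the context, so the same symbol can continue differently in different sequences. And since only forward transitions are stored, recall runs forward only, matching the can't-sing-it-in-reverse observation:

```python
# Toy high-order sequence memory with "cloned" states (illustrative only).
# Two sequences share the symbol "B", but each uses its own clone of "B",
# so the memory knows how each continues. Only forward transitions are
# stored, so recall runs forward, never backward.

transitions = {}  # cloned state (symbol, clone_id) -> next cloned state

def learn(sequence):
    """Store forward transitions between consecutive cloned states."""
    for prev, nxt in zip(sequence, sequence[1:]):
        transitions[prev] = nxt

def recall(start, steps):
    """Play the sequence forward from a given cloned state."""
    out = [start]
    for _ in range(steps):
        out.append(transitions[out[-1]])
    return [symbol for symbol, clone in out]

# Same symbol "B", different clones, different continuations:
learn([("A", 0), ("B", 0), ("C", 0)])
learn([("X", 1), ("B", 1), ("Y", 1)])

print(recall(("B", 0), 1))  # ['B', 'C']
print(recall(("B", 1), 1))  # ['B', 'Y']
```

The real cloned-HMM story involves learning which clone is active from context rather than being told, but the disambiguation mechanism is the same.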

Comment by steve2152 on How uniform is the neocortex? · 2020-05-05T15:31:57.500Z · score: 2 (1 votes) · LW · GW

The way I think of "activity is modulated dynamically" is:

We're searching through a space of generative models for the model that best fits the data and leads to the highest reward. The naive strategy would be to execute all the models, and see which one wins the competition. Unfortunately, the space of all possible models is too vast for that strategy to work. At any given time, only a subset of the vast space of all possible generative models is accessible, and only the models in that subset are able to enter the competition. Which subset that is can be modulated by context, prior expectations ("you said this cloud is supposed to look like a dog, right?"), etc. I think (vaguely) that there are region-to-region connections within the brain that can be turned on and off, and different models require different configurations of that plumbing in order to fully express themselves. If there's a strong enough hint that some generative model is promising, that model will flex its muscles and fully actualize itself by creating the appropriate plumbing (region-to-region communication channels) to be properly active and able to flow down predictions.
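A cartoon version of "compete only within the accessible subset" (my own illustrative framing; the model names and scores are made up):

```python
# Toy model competition: every generative model could score the data,
# but only models in the currently-accessible subset get to compete.
def best_model(models, data, accessible):
    """Pick the accessible model with the best score on the data."""
    scores = {name: score_fn(data) for name, score_fn in models.items()
              if name in accessible}
    return max(scores, key=scores.get)

models = {
    "dog-shaped-cloud": lambda d: 0.9 if d == "blobby" else 0.1,
    "just-a-cloud":     lambda d: 0.6,
    "airplane":         lambda d: 0.8 if d == "winged" else 0.0,
}

# Context ("this is supposed to look like a dog, right?") makes the dog
# model accessible; "airplane" never even enters the competition.
winner = best_model(models, "blobby",
                    accessible={"dog-shaped-cloud", "just-a-cloud"})
print(winner)  # dog-shaped-cloud
```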

Or something like that... :-)

Comment by steve2152 on How uniform is the neocortex? · 2020-05-05T15:30:53.661Z · score: 9 (5 votes) · LW · GW

Yeah, I generally liked that discussion, with a few nitpicks. For example, I dislike the word "precision", because I think it's confidence levels attached to predictions of boolean variables (presence or absence of a feature), rather than variances attached to predictions of real numbers. (I think this for various reasons, including trying to think through particular examples, and my vague understanding of the associated neural mechanisms.)

I would state the fog example kinda differently: There are lots of generative models trying to fit the incoming data, and the "I only see fog" model is currently active, but the "I see fog plus a patch of clear road" model is floating in the background ready to jump in and match to data as soon as there's data that it's good at explaining.

I mean, "I am looking at fog" is actually a very specific prediction about visual input—fog has a specific appearance—so the "I am looking at fog" model is falsified (prediction error) by a clear patch. A better example of "low confidence about visual inputs" would be whatever generative models are active when you're very deep in thought or otherwise totally spaced out, ignoring your surroundings.
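The distinction can be sketched as follows (my own toy framing; the features and confidence numbers are invented for illustration): the fog model confidently predicts that certain boolean features are present or absent, so a confidently-predicted-absent feature that shows up anyway is a prediction error that falsifies the model.

```python
# Toy boolean-feature predictions with attached confidences, rather than
# real-valued predictions with variances. "I am looking at fog" confidently
# predicts that 'clear_patch' is ABSENT, so observing a clear patch is a
# prediction error that falsifies the fog model.

fog_model = {                    # feature -> (predicted_present, confidence)
    "uniform_gray": (True, 0.95),
    "clear_patch":  (False, 0.9),
}

def prediction_errors(model, observed_features):
    """Return features whose confident predictions were violated."""
    errors = []
    for feature, (predicted, confidence) in model.items():
        actual = feature in observed_features
        if actual != predicted and confidence > 0.5:
            errors.append(feature)
    return errors

print(prediction_errors(fog_model, {"uniform_gray", "clear_patch"}))
# ['clear_patch'] -> the fog model is falsified by the clear patch
```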