Posts

Matt Botvinick on the spontaneous emergence of learning algorithms 2020-08-12T07:47:13.726Z
[Link] Lex Fridman Interviews Karl Friston 2020-05-31T09:53:27.371Z
At what point should CFAR stop holding workshops due to COVID-19? 2020-02-25T09:59:17.910Z
CFAR: Progress Report & Future Plans 2019-12-19T06:19:58.948Z
Why are the people who could be doing safety research, but aren’t, doing something else? 2019-08-29T08:51:33.219Z
What's the optimal procedure for picking macrostates? 2019-08-26T09:34:15.647Z
If the "one cortical algorithm" hypothesis is true, how should one update about timelines and takeoff speed? 2019-08-26T07:08:19.634Z
Cognitive Benefits of Exercise 2019-08-14T21:40:35.145Z
adam_scholl's Shortform 2019-08-12T00:53:37.221Z

Comments

Comment by adam_scholl on Why indoor lighting is hard to get right and how to fix it · 2020-10-29T02:23:39.931Z · LW · GW

Yeah, makes sense. Fwiw, I have encountered one purportedly 97+ CRI lamp that looked awful to me. 

Comment by adam_scholl on Why indoor lighting is hard to get right and how to fix it · 2020-10-28T10:23:21.093Z · LW · GW

I really appreciate you writing this!

Just wanted to add that my informal impression from a few experiments is that the difference between 90 CRI bulbs and 95+ CRI bulbs is actually large. 

Comment by adam_scholl on adam_scholl's Shortform · 2020-10-11T18:31:08.591Z · LW · GW

Another (unlikely, but more likely than almost all other historical people) candidate for partial future revival: During the 79 AD eruption of Vesuvius, part of this man's brain was vitrified.

Comment by adam_scholl on My computational framework for the brain · 2020-09-16T07:16:56.812Z · LW · GW

Your posts about the neocortex have been a plurality of the posts I've been most excited to read this year. I am super interested in the questions you're asking, and it has long driven me nuts that I don't find these questions asked often in the neuroscience literature.

But there's an aspect of these posts I've found frustrating, which is something like the ratio of "listing candidate answers" to "explaining why you think those candidate answers are promising, relative to nearby alternatives."

Interestingly, I also have this gripe when reading Friston and Hawkins. And I feel like I also have this gripe about my own reasoning, when I think about this stuff—it feels phenomenologically like the only way I know how to generate hypotheses in this domain is by inducing a particular sort of temporary overconfidence.

I don't feel incentivized to do this nearly as much in other domains, and I'm not sure what's going on. My lead hypothesis is that in neuroscience, data is so abundant, and theories/frameworks so relatively scarce, that it's unusually helpful to ignore lots of things—e.g. via the "take as given x, y, z, and p" motion—in order to make conceptual progress. And maybe there's just so much available data here that it would be terribly sisyphean to try to justify all the things one takes as given when forming or presenting intuitions about underlying frameworks. (Indeed, my lead hypothesis for why so many neuroscientists seem to employ strategies like, "contribute to the 'understanding road systems' project by spending their career measuring the angles of stop-sign poles relative to the road," is that they feel it's professionally irresponsible, or something, to theorize about underlying frameworks without first trying to concretely falsify a sisyphean-rock-sized mountain of assumptions).

Still, I think some amount of this motion is clearly necessary to avoid accidentally deluding yourself, and the references in your posts make me think you do at least some of it already. So I guess I just want to politely—and super gratefully, I'm really glad you write these posts regardless! If trying to do this would turn you into a stop sign person, don't do it!—suggest that explicating these more might make it easier for readers to understand and come to share your intuitions.

I have more proto-questions about your model than I have time to flesh out well enough to describe, but here are some that currently feel top-of-mind:

  • Say there exist genes that confer advantage in math-ey reasoning. By what mechanism is this advantage mediated, if the neocortex is uniform? One story, popular among the "stereotypes of early 2000s cognitive scientists" section of my models, is that brains have an "especially suitable for maths" module, and that genes induce various architectural changes which can improve or degrade its quality. What would a neocortical uniformist's story be here—that genes induce architectural changes which alter the quality of the One Learning Algorithm in general? If you explain it as genes having the ability to tweak hyperparameters or the gross wiring diagram in order to degrade or improve certain circuits' ability to run algorithms this domain-specific, is it still explanatorily useful to describe the neocortex as uniform?
    • My quick, ~90 min investigation into whether neuroscience as a field buys the neocortical uniformity hypothesis suggested it's fairly controversial. Do you know why? Are the objections mostly similar to those of Marcus et al.?
  • Do you have the intuition that aspects of the neocortical algorithm itself (or the subcortical algorithms themselves) might be safety-relevant? Or is your safety-relevance intuition mostly about the subcortical steering mechanism? (Fwiw, I have the former intuition—i.e., I'm suspicious that some of the features of the neocortical algorithm that cause humans to differ from "optimizers" exist for safety-relevant reasons).
  • In general I feel intensely frustrated with the focus in neuroscience on the implementational Marr Level, relative to the computational and algorithmic levels. I liked the mostly-computational overview here, and the algorithmic sketch in your Predictive Coding = RL + SL + Bayes + MPC post, but I feel bursting with implementational questions. For example:
    • As I understand it, you mention "PGM-type message-passing" as a candidate class of algorithm that might perform the "select the best from a population of models" function. Do you just mean you suspect there is something in the general vicinity of a belief propagation algorithm going on here, or is your intuition more specific? If the latter, is the Dileep George paper the main thing motivating that intuition?
    • I don't currently know whether the neuroscience lit contains good descriptions of how credit assignment is implemented. Do you? Do you feel like you have a decent guess, or know whether someone else does?
      • I have the same question about whatever mechanism approximates Bayesian priors—I keep encountering vague descriptions of it being encoded in dopamine distributions, but I haven't found a good explanation of how that might actually work.
  • Are you sure PP deemphasizes the "multiple simultaneous generative models" frame? I understood the references to e.g. the "cognitive economy" in Surfing Uncertainty to be drawing an analogy between populations of individuals exchanging resources in a market, and populations of models exchanging prediction error in the brain.
  • Have you thought much about whether there are parts of this research you shouldn't publish? I notice feeling slightly nervous every time I see you've made a new post, I think because I basically buy the "safety and capabilities are in something of a race" hypothesis, and fear that succeeding at your goal and publishing about it might shorten timelines.
Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-23T03:52:18.572Z · LW · GW

Gwern, I'm curious whether you would guess that something like mesa-optimization, broadly construed, is happening in GPT-3?

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-23T00:08:42.897Z · LW · GW
This post primarily argues that a phenomenon is evidence for [learned models being likely to encode search algorithms]

I do mention interpreting the described results "as tentative evidence" about mesa-optimization at the end of the post, and this interpretation was why I wrote the post; fwiw, my impression remains that this interpretation is correct. But the large majority of the post is just me repeating or paraphrasing claims made by DeepMind researchers, rather than making claims myself; I wrote it this way intentionally, since I didn't feel I had sufficient domain knowledge to assess the researchers' claims well myself.

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-22T23:23:36.728Z · LW · GW

I feel confused about why, given your model of the situation, the researchers were surprised that this phenomenon occurred, and seem to think it was a novel finding that it will inevitably occur given the three conditions described. Above, you mentioned the hypothesis that maybe they just "weren't very familiar with AI." Looking at the author list, and at their publications (1, 2, 3, 4, 5, 6, 7, 8), this seems implausible to me. While most of the eight co-authors are neuroscientists by training, three have CS degrees (one of whom is Demis Hassabis), and all but one have co-authored previous ML papers. It's hard for me to imagine their surprise was due simply to them lacking basic knowledge about RL?

And this OpenAI paper (whose authors I think you would describe as familiar with ML), which the summary of Wang et al. on the DeepMind website describes as "closely related work," and which appears to me to describe a very similar setup, describes their result in similar terms:

We structure the agent as a recurrent neural network, which receives past rewards, actions, and termination flags as inputs in addition to the normally received observations. Furthermore, its internal state is preserved across episodes, so that it has the capacity to perform learning in its own hidden activations. The learned agent thus also acts as the learning algorithm, and can adapt to the task at hand when deployed.
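To make the quoted setup concrete, here is a minimal sketch of that kind of recurrent meta-RL agent. The class name, the GRU core, and the linear policy/value heads are my own assumptions for illustration rather than details from either paper; the only points it's meant to capture are that each timestep's input concatenates the observation with the previous action, previous reward, and termination flag, and that the hidden state is not reset at episode boundaries, so any within-task adaptation has to live in the activations rather than the weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecurrentMetaRLAgent(nn.Module):
    """Hypothetical sketch of the agent structure described in the quote above."""

    def __init__(self, obs_dim: int, n_actions: int, hidden_dim: int = 128):
        super().__init__()
        # Input = observation + one-hot previous action + previous reward + done flag.
        input_dim = obs_dim + n_actions + 2
        self.core = nn.GRUCell(input_dim, hidden_dim)
        self.policy_head = nn.Linear(hidden_dim, n_actions)
        self.value_head = nn.Linear(hidden_dim, 1)
        self.n_actions = n_actions

    def forward(self, obs, prev_action, prev_reward, prev_done, hidden):
        prev_action_onehot = F.one_hot(prev_action, self.n_actions).float()
        x = torch.cat(
            [obs, prev_action_onehot, prev_reward.unsqueeze(-1), prev_done.unsqueeze(-1)],
            dim=-1,
        )
        # The hidden state is carried across episode boundaries rather than reset,
        # so the trained weights can implement an adaptation procedure that runs
        # in the activations at deployment time.
        hidden = self.core(x, hidden)
        return self.policy_head(hidden), self.value_head(hidden), hidden
```

The outer RL algorithm only ever updates the weights; whatever task-specific adaptation the deployed agent exhibits is carried by `hidden`.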

The OpenAI authors also seem to me to think they can gather evidence about the structure of the algorithm simply by looking at its behavior. Given a similar series of experiments (mostly bandit tasks, but also a maze solver), they conclude:

the dynamics of the recurrent network come to implement a learning algorithm entirely separate from the one used to train the network weights... the procedure the recurrent network implements is itself a full-fledged reinforcement learning algorithm, which negotiates the exploration-exploitation tradeoff and improves the agent’s policy based on reward outcomes... this learned RL procedure can differ starkly from the algorithm used to train the network’s weights.

They then run an experiment designed specifically to distinguish whether meta-RL was giving rise to a model-free system, or “a model-based system which learns an internal model of the environment and evaluates the value of actions at the time of decision-making through look-ahead planning,” and suggest the evidence implies the latter. This sounds like a description of search to me—do you think I'm confused?

I get the impression from your comments that you think it's naive to describe this result as "learning algorithms spontaneously emerge." You describe the lack of LW/AF pushback against that description as "a community-wide failure," and mention updating as a result toward thinking AF members “automatically believe anything written in a post without checking it.”

But my impression is that OpenAI describes their similar result in basically the same way. Do you think my impression is wrong? Or e.g. that their description is also misleading?

--

I've been feeling very confused lately about how people talk about "search," and have started joking that I'm a search panpsychist. Lots of interesting phenomena look like piles of thermostats when viewed from the wrong angle, and I worry the conventional lens is deceptively narrow.

That said, when I condition on (what I understand to be) the conventional understanding, it's difficult for me to imagine how e.g. the maze-solver described in the OpenAI paper reliably and quickly locates the exit to new mazes, without doing something reasonably describable as searching for them.

And assuming you assign some credence to their hypothesis that this is what's going on in PFC, and to the hypothesis that search occurs in PFC, it seems to me that Wang et al. should be taken as evidence that "learning algorithms producing other search-performing learning algorithms" is convergently useful/likely to be a common feature of future systems—even if you don't think that's what happened in their paper.

If the primary difference between the DeepMind and OpenAI meta-RL architectures and the PFC/DA architecture is scale, then I think there's good reason to suspect that something much like mesa-optimization will emerge in future meta-RL systems, even if it hasn't yet. That is, I interpret this result as evidence for the hypothesis that highly competent general-ish learners might tend to exhibit this feature, since (among other reasons) it increased my credence that it is already exhibited by the only existing member of that reference class.

Upthread, Evan mentions agreeing that this result is "not new evidence in favor of mesa-optimization." But he also mentions that Risks from Learned Optimization references these two papers, describing them as "the closest to producing mesa-optimizers of any existing machine learning research." I feel confused about how to reconcile these two claims. I didn't realize these papers were mentioned in Risks from Learned Optimization, but if I had, I think I would have been even more inclined to post this/try to ensure people knew about the results, since my (perhaps naive, perhaps not understanding ways this is disanalogous) prior is that the closest existing example to this problem might provide evidence about its nature or likelihood.

Comment by adam_scholl on adam_scholl's Shortform · 2020-08-21T08:59:04.787Z · LW · GW

In college, people would sometimes discuss mu-eliciting questions like, "What does it mean to be human?"

I came across this line in a paper tonight and laughed out loud, imagining it as an answer:

"Maximizing this objective is equivalent to minimizing the cumulative pseudo-regret."
Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-21T08:27:21.977Z · LW · GW

I appreciate you writing this, Rohin. I don’t work in ML, or do safety research, and it’s certainly possible I misunderstand how this meta-RL architecture works, or that I misunderstand what’s normal.

That said, I feel confused by a number of your arguments, so I'm working on a reply. Before I post it, I'd be grateful if you could help me make sure I understand your objections, so as to avoid accidentally publishing a long post in response to a position nobody holds.

I currently understand you to be making four main claims:

  1. The system is just doing the totally normal thing “conditioning on observations,” rather than something it makes sense to describe as "giving rise to a separate learning algorithm."
  2. It is probably not the case that in this system, “learning is implemented in neural activation changes rather than neural weight changes.”
  3. The system does not encode a search algorithm, so it provides “~zero evidence” about e.g. the hypothesis that mesa-optimization is convergently useful, or likely to be a common feature of future systems.
  4. The above facts should be obvious to people familiar with ML.

Does this summary feel like it reasonably characterizes your objection?

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-21T02:38:32.736Z · LW · GW

That gwern essay was helpful, and I didn't know about it; thanks.

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-18T21:20:02.622Z · LW · GW

The scenario I had in mind was one where death occurs as a result of damage caused by low food consumption, rather than by suicide.

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-18T17:42:47.263Z · LW · GW
One way catastrophic alignment in this sense is difficult for humans is that the PFC cannot divorce itself from the DA; I'd expect that a failure mode leading to systematically low DA rewards would usually be corrected

I'm not sure such divorce is all that rare. For example, anorexia sometimes causes people to find food anti-rewarding (repulsive/inedible, even when they're dying and don't wish to), and I can imagine that being because PFC actually somehow alters DA's reward function.

That said, I do share the hunch that something like a "divorce resistance" trick occurs and is helpful. I took Kaj and Steve to be gesturing at something similar elsewhere in the thread. But I notice feeling confused about how exactly this trick works. Does it scale...?

I have the intuition that it doesn't—that as the systems increase in power, divorce will occur more easily. That is, I have the intuition that if PFC were trying, so to speak, to divorce itself from DA supervision, it could probably find some easy-ish way to succeed, e.g. by reconfiguring itself to hide activity from DA, or to send reward-eliciting signals to DA regardless of what goal it was pursuing.

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-18T16:11:41.608Z · LW · GW
I think it makes more sense to operationalize "catastrophic" here as "leading to systematically low DA reward

Thanks—I feel pretty convinced that this operationalization makes more sense than the one I proposed.

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-18T00:32:30.161Z · LW · GW

That's a really interesting point, and I hadn't considered it. Thanks!

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-17T20:32:02.105Z · LW · GW

Kaj, the point I understand you to be making is: "The inner RL algorithm in this scenario seems likely to be reliably aligned with the outer RL algorithm, since the former was selected specifically on the basis of it being good at accomplishing the latter's objective, and since if the former deviates from pursuing that objective it will receive less reward from the outer alg, leading it to reconfigure itself to be more aligned. And since the two algorithms operate on similar time scales, we should expect any such misalignment to be noticed/corrected quickly." Does this seem like a reasonable paraphrase?

It doesn't feel obvious to me that the outer layer will be able to reliably steer the inner layer in this sense, especially as the system becomes more powerful. For example, it seems plausible to me that the inner layer might come to optimize for its proxy estimations of outer reward more than for outer reward itself, and that those two things could become decoupled.

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-17T19:49:18.620Z · LW · GW

Ah, I see. The high death rate was what made it seem often-catastrophic to me. Is your objection that the high death rate doesn't reflect something that might reasonably be described as "optimizing for one goal at the expense of all others"? E.g., because many of the deaths are suicides, in which case persistence may have been net negative from the perspective of the rest of their goals too? Or because deaths often result from people calibratedly taking risky but non-insane actions, who just happened to get unlucky with heart muscle integrity or whatever?

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-17T03:38:38.729Z · LW · GW

Yeah, I wrote that confusingly, sorry; edited to clarify. I just meant that of the limited set of candidate examples I'd considered, (my model, which may well be wrong) of anorexia feels most straightforwardly like an example of something capable of causing catastrophic within-brain inner alignment failure. That is, it currently feels natural to me to model anorexia as being caused by an optimizer for thinness arising in brains, which can sometimes gain sufficient power that people begin to optimize for that goal at the expense of essentially all other goals. But I don't feel confident in this model.

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-16T08:00:23.523Z · LW · GW

I agree, in the case of evolution/humans. In the text above, I meant to highlight what seemed to me like a relative lack of catastrophic *within-mind* inner alignment failures, e.g. due to conflicts between PFC and DA. Death of the organism feels to me like a reasonable way to operationalize "catastrophic" in these cases, but I can imagine other reasonable ways.

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-13T21:20:20.364Z · LW · GW

As I understand it, your point about the distinction between "mesa" and "steered" is chiefly that in the latter case, the inner layer is continually receiving reward signal from the outer layer, which in effect heavily restricts the space of possible algorithms the outer layer might give rise to. Does that seem like a decent paraphrase?

One of the aspects of Wang et al.'s paper that most interested me was that the inner layer in their meta-RL model kept learning even once reward signal from the outer layer had ceased. It seems reasonable to me to hypothesize that in fact what's going on between PFC and DA is something closer to "subcortex-supervised learning," where PFC's input signals are quite regularly "labeled" by a DA-supervisor. But it doesn't feel intuitively obvious to me that the portion of PFC input which might be labeled in this way is high—e.g., I feel confused about what portion of the concepts currently active in my working memory while writing this paragraph might be labeled by DA—nor that it much restricts the space of possible algorithms that might arise in PFC.

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-13T02:56:32.792Z · LW · GW

Gah, thanks! Fixed.

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-13T00:50:33.457Z · LW · GW

I mean, it could both be the case that there exists catastrophic inner alignment failure between humans and evolution, and also that humans don't regularly experience catastrophic inner alignment failures internally.

In practice I do suspect humans regularly experience internal (within-brain) inner alignment failures, but given that suspicion I feel surprised by how functional humans manage to be. That is, I notice expecting that regular inner alignment failures would cause far more mayhem than I observe, which makes me wonder whether brains are implementing some sort of alignment-relevant tech.

Comment by adam_scholl on Matt Botvinick on the spontaneous emergence of learning algorithms · 2020-08-12T19:21:31.597Z · LW · GW

The thing I meant by "catastrophic" is "leading to the death of the organism." I'm suspicious that mesa-optimization is common in humans, although I don't feel confident of that. I can imagine it being the case that many examples of e.g. addiction, goodharting, OCD, and even just everyday "personal misalignment"-type problems of the sort IFS/IDC/multi-agent models of mind sometimes help with, are caused by phenomena which might reasonably be described as inner alignment failures (although I can also imagine them being caused by more mundane processes).

But I think these things don't kill people very often? People do sometimes choose to die because of beliefs. And anorexia sometimes kills people, which currently feels to me like the most straightforward candidate example I've considered.

Things could be a lot worse. For example, it could be the case that mind-architectures that give rise to mesa-optimization simply aren't viable—that mesa-optimization always kills the organism. Or e.g. that it always leads to the organism optimizing for a set of goals which is unrecognizably different from the base objective. I don't think you see these things, and I'm interested in figuring out how evolution prevented them.

Comment by adam_scholl on sairjy's Shortform · 2020-08-11T07:06:51.140Z · LW · GW

Yeah, for similar reasons I allocate a small portion of my portfolio toward assets (including Nvidia) that might appreciate rapidly during slow takeoff, on the thinking that there might be some slow takeoff scenarios in which the extra resources prove helpful. My main reservation is Paul Christiano's argument that investment/divestment has more-than-symbolic effects.

Comment by adam_scholl on adam_scholl's Shortform · 2020-08-09T02:51:43.457Z · LW · GW

I made Twitter lists of researchers at DeepMind and OpenAI, and find checking them useful for tracking team zeitgeists.

Comment by adam_scholl on adam_scholl's Shortform · 2020-08-09T02:44:16.583Z · LW · GW

Thought LinkedIn's role/background breakdown of DeepMind employees was interesting. Fewer people listed as having neuroscience backgrounds than I would have predicted.

Comment by adam_scholl on Inner alignment in the brain · 2020-06-17T12:21:03.978Z · LW · GW

I found this post super interesting, and appreciate you writing it. I share the suspicion/hope that gaining better understanding of brains might yield safety-relevant insights.

I’m curious what you think is going on here that seems relevant to inner alignment. Is it that you’re modeling neocortical processes (e.g. face recognizers in visual cortex) as arising as a result of something akin to a search process conducted by similar subcortical processes (e.g. face recognizers in superior colliculus), and noting that there doesn’t seem to be much divergence between their objective functions, perhaps because of helpful features of subcortex-supervised learning like e.g. these subcortical input-dependent dynamic rewiring rules?

Comment by adam_scholl on LessWrong Coronavirus Agenda · 2020-03-25T00:16:32.841Z · LW · GW

I wouldn't describe any posts I've seen as conveying the idea sufficiently well for my taste, but would describe some—like this NY Times piece—as adequately conveying the most decision-relevant points.

When I started writing, there was almost no discussion online (aside from Wei Dai's comment here, and the posts it links to) about what factors might prove limiting for the provision of hospital care, or about the degree to which those limits might be exceeded. By the time I called off the project, the US President and ~every major newspaper were talking about it. I think this is great—I much prefer a world where this knowledge is widespread. But given how fast COVID-related discourse was evolving, I think I erred in trying to make loads of points in a single huge post, rather than publishing it in pieces as they became ready.

There is one potentially decision-relevant point that I hoped to make, that I still haven't seen discussed elsewhere: there may be two relevant hospital overflow thresholds. The ICU bed threshold and the ventilator threshold are fairly low; given our current expected supply in a crisis, we'll exceed them if more than about 70k people require them at once. But I think (not confident in this yet) that our capacity for distributing oxygen is something like 10x higher. And if that threshold gets exceeded, the infection fatality rate may rise by something like 10%. So on this model, while it would obviously be ideal to push the curve below both thresholds, it's imperative to at least flatten the curve beneath the oxygen threshold. Which is easier, since it's higher.
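As a toy illustration of the two-threshold picture (using only the rough, low-confidence numbers above; the function name and structure are just a sketch, not part of the model I'm writing up):

```python
# Toy sketch of the two-threshold hospital-overflow model described above.
# The capacity numbers are the rough, uncertain figures from this comment,
# not vetted estimates.

VENT_ICU_CAPACITY = 70_000                 # ~concurrent patients before ICU/ventilator overflow
OXYGEN_CAPACITY = 10 * VENT_ICU_CAPACITY   # guess: oxygen distribution capacity ~10x higher

def overflow_regime(peak_concurrent_severe_cases: int) -> str:
    """Report which overflow regime a given peak caseload falls into."""
    if peak_concurrent_severe_cases <= VENT_ICU_CAPACITY:
        return "below both thresholds"
    if peak_concurrent_severe_cases <= OXYGEN_CAPACITY:
        return "above the ICU/ventilator threshold, but below the oxygen threshold"
    return "above both thresholds (where the fatality rate may rise sharply)"

for peak in (50_000, 300_000, 1_000_000):
    print(f"{peak:>9,} concurrent severe cases: {overflow_regime(peak)}")
```

The point is just that even if the curve can't be pushed under the lower threshold, keeping it under the higher one may still matter a great deal.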

I'm not sure this model is accurate, and I haven't yet decided whether to try figuring it out/writing it up. I feel a bit hesitant, after having wasted 10 days or so underestimating the efficiency of the coronavirus modeling market, but it does seem useful to propagate if true. If someone else is interested in looking into it, I would happily talk them through what I've learned.

Comment by adam_scholl on LessWrong Coronavirus Agenda · 2020-03-22T20:24:53.745Z · LW · GW

Update: We decided not to finish this post, since the points we wished to convey have now mostly been covered well elsewhere. But Kyle may still write up his notes about the epidemiological parameters at some point.

Comment by adam_scholl on LessWrong Coronavirus Agenda · 2020-03-19T01:02:55.640Z · LW · GW

I'm currently working with Kyle Scott and Anna Salamon on an estimate of deaths due to hospital overflow (lack of access to oxygen, mechanical ventilation, ICU beds), which we'll hopefully post in the next few days. The post will review evidence about basic epidemiological parameters.

Comment by adam_scholl on How to fly safely right now? · 2020-03-07T12:38:46.647Z · LW · GW

This study suggests some airplane seats expose passengers to significantly more infection risk than others. I'm confused by the writing, but my understanding is that window seats are best.

I would also guess, though I can't tell if the paper is suggesting this, that you're at less risk if you don't use the bathroom, don't have row-mates, and sit where people are least likely to pass you to go to the bathroom. If true, one could potentially reduce risk significantly by buying e.g. three seats next to each other halfway between two bathrooms, limiting water intake before the flight and sitting near the window.

Comment by adam_scholl on At what point should CFAR stop holding workshops due to COVID-19? · 2020-02-27T21:47:43.100Z · LW · GW

I think the bodies probably do need to be in the same room for CFAR workshops to work, unfortunately.

Comment by adam_scholl on Jimrandomh's Shortform · 2020-02-15T19:21:12.892Z · LW · GW

I'm curious about your first and second hypothesis regarding obesity?

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-27T07:44:14.153Z · LW · GW

Ben, just to check before I respond—would a fair summary of your position here be, "CFAR should write more in public, e.g. on LessWrong, so that A) it can have better feedback loops, and B) more people can benefit from its ideas?"

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-24T07:10:57.462Z · LW · GW

To be clear, others at CFAR have spent time looking into these things, I think; Anna might be able to chime in with details. I just meant that I haven't personally.

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-23T04:07:53.358Z · LW · GW

Thanks for spelling this out. My guess is that there are some semi-deep cruxes here, and that they would take more time to resolve than I have available to allocate at the moment. If Eli someday writes that post about the Nisbett and Wilson paper, that might be a good time to dive in further.

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-23T03:59:31.263Z · LW · GW

(Unsure, but I'm suspicious that the distinction between these two things might not be clear).

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-23T01:23:21.987Z · LW · GW

I just googled around for pictures of things I think are neat. I think ctenophores are neat, since they look like alien spaceships and maybe evolved neurons independently; I think it's neat that wind sometimes makes clouds do the vortex thing that canoe paddles make water do, etc.

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-23T01:02:12.390Z · LW · GW

Yeah, same; I think this term has experienced some semantic drift, which is confusing. I meant to refer to pre-verbal intuitions in general, not just ones accompanied by physical sensation.

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-23T00:55:30.511Z · LW · GW

I have an interest in making certain parts of philosophy more productive, and in turning some engineers into "people with more of some specific philosophical skills." I just meant that I'm not excited about most ways I can imagine of "making the average AIRCS participant's epistemics more like that of the average professional philosopher."

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-23T00:42:06.911Z · LW · GW

CFAR does spend substantially less time circling now than it did a couple years ago, yeah. I think this is partly because Pete (who spent time learning about circling when he was younger, and hence found it especially easy to notice the lack of circling-type skill among rationalists, much as I spent time learning about philosophy when I was younger and hence found it especially easy to notice the lack of philosophy-type skill among AIRCS participants) left, and partly I think because many staff felt like their marginal skill returns from circling practice were decreasing, so they started focusing more on other things.

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-23T00:29:50.263Z · LW · GW

Said, I appreciate you pointing out that I used the term "extrospection" in a non-standard way—I actually didn't realize that. The way I've heard it used, which is probably idiosyncratic local jargon, it means something like the theory of mind analog of introspection: something like "feeling, yourself, something of what the person you're talking with is feeling." You obviously can't do this perfectly, but I think many people find that e.g. it's easier to gain information about why someone is sad, and about how it feels for them to be currently experiencing this sadness, if you use empathy/theory of mind/the thing I think people are often gesturing at when they talk about "mirror neurons," to try to emulate their sadness in your own brain. To feel a bit of it, albeit an imperfect approximation of it, yourself.

Similarly, I think it's often easier for one to gain information about why e.g. someone feels excited about pursuing a particular line of inquiry, if one tries to emulate their excitement in one's own brain. Personally, I've found this empathy/emulation skill quite helpful for research collaboration, because it makes it easier to trade information about people's vague, sub-verbal curiosities and intuitions about e.g. "which questions are most worth asking."

Circlers don't generally use this skill for research. But it is the primary skill, I think, that circling is designed to train, and my impression is that many circlers have become relatively excellent at it as a result.

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-22T11:26:41.995Z · LW · GW

(I want to be clear that the above is an account of why I personally feel excited about CFAR having investigated circling. I think this account also reasonably describes the motivations of many key staff, and of CFAR's behavior as an institution. But CFAR struggles with communicating research intuitions, too; I think in this case these intuitions did not propagate fully among our staff, and as a result that we did employ a few people for a while whose primary interest in circling was more like "for its own sake," who sometimes discussed it in ways which felt epistemically unhealthy to me. I think people correctly picked up on this as worrying, and I don't want to suggest that didn't happen; just that there is, I think, a sensible reason why CFAR as an institution tends to investigate local blindspots by searching for non-locals with a patch, thereby alarming locals about our epistemic allegiance).

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-22T11:26:23.819Z · LW · GW

I think a crisp summary here is: CFAR is in the business of helping create scientists, more than the business of doing science. Some of the things it makes sense to do to help create scientists look vaguely science-ish, but others don't. And this sometimes causes people to worry (understandably, I think) that CFAR isn't enthused about science, or doesn't understand its value.

Thing is, if you're looking to improve a given culture, one natural move is to explore that culture's blindspots. And exploring that culture's blindspots is, in many cases, I think, not going to look like an activity typical of that culture.

Here's an example: there's a particular bug that I encounter extremely often at AIRCS workshops, but rarely at other sorts of workshops. I don't yet feel like I have a great model of it, but it has something to do with not fully understanding how English words have referents at different levels of abstraction. It's the sort of confusion that I think reading A Human's Guide to Words often resolves in people, and which results in people asking questions like:

  • "Should I replace [my core goal x] with [this list of "ethical" goals I recently heard about]?"
  • "Why is the fact that I have a goal a good reason to optimize for it?"
  • "Are propositions like 'x is good' or 'y is beautiful' even meaningful claims?"

When I encounter this bug I often point to a nearby tree, and start describing it at different levels of abstraction. The word "tree" refers to a bunch of different related things: to a member of an evolutionarily related category of organisms, to the general sort of object humans tend to emit the phonemes "tree" to describe, to this particular mid-sized physical object here in front of us, to the particular arrangement of particles that composes the object, etc. And it's sensible to use the term "tree" anyway, as long as you're careful to track which level of abstraction you're referring to with a given proposition—i.e., as long as you're careful to be precise about exactly which map/territory correspondence you're asserting.

This is obvious to most science-minded people. But it's often less obvious that the same procedure, with the same carefulness, is needed to sensibly discuss concepts like "goal" and "good." Just as it doesn't make sense to discuss whether a given tree is "strong" without internally distinguishing between whether you mean "in terms of its likelihood to fall over" or "in terms of its molecular bonds," it doesn't make sense to discuss whether a goal is "good" without internally distinguishing between whether you mean "relative to societal consensus" or "relative to my current set of preferences" or "relative to the set of preferences I might come to have given more time to think."

This conversation often seems to help resolve the confusion. At some point, I may design a class about this, so that more such confusions can be resolved. But I expect that if I do, some of the engineers in the audience will get nervous, since it will look an awful lot like a philosophy class! (I already get this objection regularly one-on-one). That is, I expect some may wonder whether the AIRCS staff, which claim to be running workshops for engineers, are actually more enthusiastic about philosophy than engineering.

Truth is, we're not. Philosophy strikes me as, on the whole, an unusually unproductive field full of people with highly questionable epistemics. I certainly don't want to turn the engineers into philosophers—I just want to use a particular helpful insight from philosophy to patch a bug which, for whatever reason, seems to commonly afflict AIRCS participants.

CFAR faces this dilemma a lot. For example, we spent a bunch of time circling for a while, and this made many rationalists nervous—was CFAR as an institution, which claimed to be running workshops for science-minded, sequences-reading, law-based-reasoning-enthused rationalists, actually more enthusiastic about woo-laden authentic relating games?

We weren't. But we looked around, and noticed that lots of the promising people around us seemed particularly bad at extrospection—i.e., at simulating the felt senses of their conversational partners in their own minds. This seemed worrying, among other reasons because early-stage research intuitions (e.g. about which lines of inquiry feel exciting to pursue) often seem to be stored sub-verbally. So we looked to specialists in extrospection for a patch.

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-22T08:42:26.129Z · LW · GW

Well, I think it can both be the case that a given staff member thinks the organization's mission is important, and also that due to their particular distribution of comparative advantages, current amount of burnout, etc., that it would be on net better for them to work elsewhere. And I think most of our turnover has resulted from considerations like this, rather than from e.g. people deciding CFAR's mission was doomed.

I think the concern about short median tenure leading to research loss makes sense, and has in fact occurred some. But I'm not all that worried about it, personally, for a few reasons:

  • This cost is reduced because we're in the teaching business. That is, relative to an organization that does pure research, we're somewhat better positioned to transfer institutional knowledge to new staff, since much of the relevant knowledge has already been heavily optimized for easy transferability.
  • There's significant benefit to turnover, too. I think the skills staff develop while working at CFAR are likely to be useful for work at a variety of orgs; I feel excited about the roles a number of former staff are playing elsewhere, and expect I'll be excited about future roles our current staff play elsewhere too.
  • Many of our staff already have substantial "work-related experience," in some sense, before they're hired. For example, I spent a bunch of time in college reading LessWrong, trying to figure out metaethics, etc., which I think helped me become a much better CFAR instructor than I might have been otherwise. I expect many lesswrongers, for example, have already developed substantial skill relevant to working effectively at CFAR.
Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-22T04:26:03.622Z · LW · GW

Yeah, I predict that if one showed Val or Pete the line about fitting naturally into CFAR’s environment without triggering antibodies, they would laugh hard and despairingly. There was definitely friction.

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-22T04:12:01.235Z · LW · GW

I think it would depend a lot on which sort of individual life outcomes you wanted to compare. I have basically no idea where these programs stand, relative to CFAR, on things like increasing participant happiness, productivity, relationship quality, or financial success, since CFAR mostly isn't optimizing for producing effects in these domains.

I would be surprised if CFAR didn't come out ahead in terms of things like increasing participants' ability to notice confusion, communicate subtle intuitions, and navigate pre-paradigmatic technical research fields. But I'm not sure, since in general I model these orgs as having sufficiently different goals than us that I haven't spent much time learning about them.

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-22T04:09:18.391Z · LW · GW

To be honest I haven't noticed much change, except obviously for the literal absence of Duncan (which is a very noticeable absence; among other things Duncan is an amazing teacher, imo better than anyone currently on staff).

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-22T02:40:52.667Z · LW · GW

Thanks to your recommendation I recently read New Atlantis, by Francis Bacon, and it was so great! It's basically Bacon's list of things he wished society had, ranging from "clothes made of sea-water-green satin" and "many different types of beverages" to "research universities that employ full-time specialist scholars."

Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-22T01:23:01.135Z · LW · GW

I have a Google Doc full of ideas. Probably I'll never write most of these, and if I do probably much of the content will change. But here are some titles, as they currently appear in my personal notes:

  • Mesa-Optimization in Humans
  • Primitivist Priors v. Pinker Priors
  • Local Deontology, Global Consequentialism
  • Fault-Tolerant Note-Scanning
  • Goal Convergence as Metaethical Crucial Consideration
  • Embodied Error Tracking
  • Abnormally Pleasurable Insights
  • Burnout Recovery
  • Against Goal "Legitimacy"
  • Computational Properties of Slime Mold
  • Steelmanning the Verificationist Criterion of Meaning
  • Manual Tribe Switching
  • Manual TAP Installation
  • Keep Your Hobbies
Comment by adam_scholl on We run the Center for Applied Rationality, AMA · 2019-12-21T23:53:47.959Z · LW · GW

I expect there are a bunch which never hear about us due to language barrier, and/or because they're geographically distant from most of our alumni. But I would be surprised if there weren't also lots of geographically-near, epistemically-promising people who've just never happened to encounter someone recommending a workshop.