Posts

Reality-Revealing and Reality-Masking Puzzles 2020-01-16T16:15:34.650Z · score: 237 (72 votes)
We run the Center for Applied Rationality, AMA 2019-12-19T16:34:15.705Z · score: 118 (44 votes)
"Flinching away from truth” is often about *protecting* the epistemology 2016-12-20T18:39:18.737Z · score: 116 (103 votes)
Further discussion of CFAR’s focus on AI safety, and the good things folks wanted from “cause neutrality” 2016-12-12T19:39:50.084Z · score: 38 (38 votes)
CFAR's new mission statement (on our website) 2016-12-10T08:37:27.093Z · score: 7 (8 votes)
CFAR’s new focus, and AI Safety 2016-12-03T18:09:13.688Z · score: 31 (32 votes)
On the importance of Less Wrong, or another single conversational locus 2016-11-27T17:13:08.956Z · score: 106 (94 votes)
Several free CFAR summer programs on rationality and AI safety 2016-04-14T02:35:03.742Z · score: 18 (21 votes)
Consider having sparse insides 2016-04-01T00:07:07.777Z · score: 12 (17 votes)
The correct response to uncertainty is *not* half-speed 2016-01-15T22:55:03.407Z · score: 103 (94 votes)
Why CFAR's Mission? 2016-01-02T23:23:30.935Z · score: 41 (42 votes)
Why startup founders have mood swings (and why they may have uses) 2015-12-09T18:59:51.323Z · score: 57 (57 votes)
Two Growth Curves 2015-10-02T00:59:45.489Z · score: 35 (38 votes)
CFAR-run MIRI Summer Fellows program: July 7-26 2015-04-28T19:04:27.403Z · score: 23 (24 votes)
Attempted Telekinesis 2015-02-07T18:53:12.436Z · score: 109 (90 votes)
How to learn soft skills 2015-02-07T05:22:53.790Z · score: 72 (59 votes)
CFAR fundraiser far from filled; 4 days remaining 2015-01-27T07:26:36.878Z · score: 42 (47 votes)
CFAR in 2014: Continuing to climb out of the startup pit, heading toward a full prototype 2014-12-26T15:33:08.388Z · score: 65 (66 votes)
Upcoming CFAR events: Lower-cost bay area intro workshop; EU workshops; and others 2014-10-02T00:08:44.071Z · score: 17 (18 votes)
Why CFAR? 2013-12-28T23:25:10.296Z · score: 71 (74 votes)
Meetup : CFAR visits Salt Lake City 2013-06-15T04:43:54.594Z · score: 4 (5 votes)
Want to have a CFAR instructor visit your LW group? 2013-04-20T07:04:08.521Z · score: 17 (18 votes)
CFAR is hiring a logistics manager 2013-04-05T22:32:52.108Z · score: 12 (13 votes)
Applied Rationality Workshops: Jan 25-28 and March 1-4 2013-01-03T01:00:34.531Z · score: 20 (21 votes)
Nov 16-18: Rationality for Entrepreneurs 2012-11-08T18:15:15.281Z · score: 25 (30 votes)
Checklist of Rationality Habits 2012-11-07T21:19:19.244Z · score: 124 (122 votes)
Possible meetup: Singapore 2012-08-21T18:52:07.108Z · score: 6 (7 votes)
Center for Modern Rationality currently hiring: Executive assistants, Teachers, Research assistants, Consultants. 2012-04-13T20:28:06.071Z · score: 4 (9 votes)
Minicamps on Rationality and Awesomeness: May 11-13, June 22-24, and July 21-28 2012-03-29T20:48:48.227Z · score: 24 (27 votes)
How do you notice when you're rationalizing? 2012-03-02T07:28:21.698Z · score: 12 (13 votes)
Urges vs. Goals: The analogy to anticipation and belief 2012-01-24T23:57:04.122Z · score: 91 (86 votes)
Poll results: LW probably doesn't cause akrasia 2011-11-16T18:03:39.359Z · score: 47 (50 votes)
Meetup : Talk on Singularity scenarios and optimal philanthropy, followed by informal meet-up 2011-10-10T04:26:09.284Z · score: 4 (7 votes)
[Question] Do you know a good game or demo for demonstrating sunk costs? 2011-09-08T20:07:55.420Z · score: 5 (6 votes)
[LINK] How Hard is Artificial Intelligence? The Evolutionary Argument and Observation Selection Effects 2011-08-29T05:27:31.636Z · score: 14 (15 votes)
Upcoming meet-ups 2011-06-21T22:28:40.610Z · score: 3 (6 votes)
Upcoming meet-ups: 2011-06-11T22:16:09.641Z · score: 9 (12 votes)
Upcoming meet-ups: Buenos Aires, Minneapolis, Ottawa, Edinburgh, Cambridge, London, DC 2011-05-13T20:49:59.007Z · score: 29 (34 votes)
Mini-camp on Rationality, Awesomeness, and Existential Risk (May 28 through June 4, 2011) 2011-04-24T08:10:13.048Z · score: 39 (42 votes)
Learned Blankness 2011-04-18T18:55:32.552Z · score: 153 (139 votes)
Talk and Meetup today 4/4 in San Diego 2011-04-04T11:40:05.167Z · score: 6 (7 votes)
Use curiosity 2011-02-25T22:23:54.462Z · score: 60 (60 votes)
Make your training useful 2011-02-12T02:14:03.597Z · score: 97 (98 votes)
Starting a LW meet-up is easy. 2011-02-01T04:05:43.179Z · score: 39 (40 votes)
Branches of rationality 2011-01-12T03:24:35.656Z · score: 81 (81 votes)
If reductionism is the hammer, what nails are out there? 2010-12-11T13:58:18.087Z · score: 14 (17 votes)
Anthropologists and "science": dark side epistemology? 2010-12-10T10:49:41.139Z · score: 10 (11 votes)
Were atoms real? 2010-12-08T17:30:37.453Z · score: 64 (75 votes)
Help request: What is the Kolmogorov complexity of computable approximations to AIXI? 2010-12-05T10:23:55.626Z · score: 4 (5 votes)
Goals for which Less Wrong does (and doesn't) help 2010-11-18T22:37:36.984Z · score: 66 (64 votes)

Comments

Comment by annasalamon on Hanson vs Mowshowitz LiveStream Debate: "Should we expose the youth to coronavirus?" (Mar 29th) · 2020-03-29T20:16:20.149Z · score: 6 (3 votes) · LW · GW

I think we are unlikely to hit herd immunity levels of infection in the US in the next 2 years. I want to see Robin and Zvi discuss whether they think that also or not, since this bears on the value of Robin's proposal (and lots of other things).

Comment by AnnaSalamon on [deleted post] 2020-02-17T22:18:08.885Z

Add lots of sleep and down-time, and activities with a clear feedback loop to the physical world (e.g. washing dishes or welding metals or something).

Comment by annasalamon on Mazes Sequence Roundup: Final Thoughts and Paths Forward · 2020-02-09T10:35:23.080Z · score: 17 (6 votes) · LW · GW

For anyone just tuning in and wanting to follow what I mean by “dominating and submitting,” I have in mind the kinds of interactions that Keith Johnstone describes in the “status” chapter of “Impro” (text here; excerpt and previous overcoming bias discussion here.)

This is the book that indirectly caused us to use the word “status” so often around here, but I feel the term “status” is a euphemism that brings model-distortions, versus discussing “dominating and submitting.” FWIW, Johnstone in the original passage says it is a euphemism, writing: “I should really talk about dominance and submission, but I’d create a resistance. Students who will agree readily to raising or lowering their status may object if asked to ‘dominate’ or ‘submit’.” (Hattip: Divia.)

Comment by annasalamon on Mazes Sequence Roundup: Final Thoughts and Paths Forward · 2020-02-09T10:23:39.146Z · score: 9 (5 votes) · LW · GW

The Snafu Principle, whereby communication is only fully possible between equals, leading to Situation Normal All F***ed Up.

This seems true to me in one sense of “equals” and false in another. It seems true to me that dominating and submitting prohibit real communication. It does not seem true to me that structures of authority (“This is my coffee shop; and so if you want to work here you’ll have to sign a contract with me, and then I’ll be able to stop hiring you later if I don’t want to hire you later”) necessarily prohibit communication, though. I can imagine contexts where free agents voluntarily decide to enter into an authority relationship (e.g., because I freely choose to work at Bob’s coffee shop until such time as it ceases to aid my and/or Bob’s goals), without dominating or submitting, and thereby with the possibility of communication.

Relatedly, folks who are peers can easily enough end up dominating/submitting-to each other, or getting stuck reinforcing lies to each other about how good each other's poetry is or whatever, instead of communicating.

Do you agree that this is the true bit of the “communication is only possible between equals” claim, or do you have something else in mind?

Comment by annasalamon on Player vs. Character: A Two-Level Model of Ethics · 2020-01-20T05:46:33.016Z · score: 14 (6 votes) · LW · GW

I'm a bit torn here, because the ideas in the post seem really important/useful to me (e.g., I use these phrases as a mental pointer sometimes), such that I'd want anyone trying to make sense of the human situation to have access to them (via this post or a number of other attempts at articulating much the same, e.g. "The Elephant in the Brain"). And at the same time I think there's some crucial misunderstanding in it that is dangerous and that I can't articulate. Voting for it anyhow though.

Comment by annasalamon on Reality-Revealing and Reality-Masking Puzzles · 2020-01-20T05:00:32.143Z · score: 49 (18 votes) · LW · GW

Responding partly to Orthonormal and partly to Raemon:

Part of the trouble is that group dynamic problems are harder to understand, harder to iterate on, and take longer to appear and to be obvious. (And are then harder to iterate toward fixing.)

Re: individuals having manic or psychotic episodes, I agree with what Raemon says. About six months to a year into CFAR’s workshop-running experience, a participant had a manic episode a couple weeks after a workshop in a way that seemed plausibly triggered partly by the workshop. (Interestingly, if I’m not mixing people up, the same individual later told me that they’d also been somewhat destabilized by reading the sequences, earlier on.) We then learned a lot about warning signs of psychotic or manic episodes and took a bunch of steps to mostly-successfully reduce the odds of having the workshop trigger these. (In terms of causal mechanisms: It turns out that workshops of all sorts, and stuff that messes with one’s head of all sorts, seem to trigger manic or psychotic episodes occasionally. E.g. Landmark workshops; meditation retreats; philosophy courses; going away to college; many different types of recreational drugs; and various small self-help workshops run by a couple of people from outside the rationality community whom I tried asking about this. So my guess is that it isn’t the “taking ideas seriously” aspect of CFAR as such, although I dunno.)

Re: other kinds of “less sane”:

(1) IMO, there has been a build-up over time of mentally iffy psychological habits/techniques/outlook-bits in the Berkeley “formerly known as rationality” community, including iffy thingies that affect the rate at which other iffy things get created (e.g., by messing with the taste of those receiving/evaluating/passing on new “mess with your head” techniques; and by helping people be more generative of “mess with your head” methods, since having seen several already makes it easier to build more). My guess is that CFAR workshops have accidentally been functioning as a “gateway drug” toward many things of iffy sanity-impact, basically by: (a) providing a healthy-looking context in which people get over their concerns about introspection/self-hacking because they look around and see other happy healthy-looking people; and (b) providing some entry-level practice with introspection, and with “dialoging with one’s tastes and implicit models and so on”, which makes it easier for people to mess with their heads in other, less-vetted ways later.

My guess is that the CFAR workshop has good effects on folks who come from a sane-ish or at least stable-ish outside context, attend a workshop, and then return to that outside context. My guess is that its effects are iffier for people who are living in the bay area, do not have a day job/family/other anchor, and are on a search for “meaning.”

My guess is that those effects have been getting gradually worse over the last five or more years, as a background level of this sort of thing accumulates.

I ought probably to write about this in a top-level post, and may actually manage to do so. I’m also not at all confident of my parsing/ontology here, and would quite appreciate help with it.

(2) Separately, AI risk seems pretty hard for people, including ones unrelated to this community.

(3) Separately, “taking ideas seriously” indeed seems to pose risks. And I had conversations with e.g. Michael Vassar back in ~2008 where he pointed out that this poses risks; it wasn’t missing from the list. (Even apart from tail risks, some forms of “taking ideas seriously” seem maybe-stupid in cases where the “ideas” are not grounded also in one’s inner simulator, tastes, viscera — there is much sense there that isn’t in ideology-mode alone.) I don’t know whether CFAR workshops increase or decrease people’s tendency to take ideas seriously in the problematic sense, exactly. They have mostly tried to connect people’s ideas and people’s viscera in both directions.

“How to take ideas seriously without [the taking ideas seriously bit] causing them to go insane” as such actually still isn’t that high on my priorities list; I’d welcome arguments that it should be, though.

I’d also welcome arguments that I’m just distinguishing 50 types of snow and that these should all be called the same thing from a distance. But for the moment for me the group-level gradual health/wholesomeness shifts and the individual-level stuff show up as pretty different.

Comment by annasalamon on Reality-Revealing and Reality-Masking Puzzles · 2020-01-17T19:27:42.449Z · score: 20 (7 votes) · LW · GW

There are some edge cases I am confused about, many of which are quite relevant to the “epistemic immune system vs Sequences/rationality” stuff discussed above:

Let us suppose a person has two faculties that are both pretty core parts of their “I” -- for example, deepset “yuck/this freaks me out” reactions (“A”), and explicit reasoning (“B”). Now let us suppose that the deepset “yuck/this freaks me out” reactor (A) is being used to selectively turn off the person’s contact with explicit reasoning in cases where it predicts that B “reasoning” will be mistaken / ungrounded / not conducive to the goals of the organism. (Example: a person’s explicit models start saying really weird things about anthropics, and then they have a less-explicit sense that they just shouldn’t take arguments seriously in this case.)

What does it mean to try to “help” a person in such a case, where two core faculties are already at loggerheads, or where one core faculty is already masking things from another?

If a person tinkers in such a case toward disabling A’s ability to disable B’s access to the world… the exact same process, in its exact same aspect, seems “reality-revealing” (relative to faculty B) and “reality-masking” (relative to faculty A).

Comment by annasalamon on Reality-Revealing and Reality-Masking Puzzles · 2020-01-17T19:27:23.700Z · score: 18 (4 votes) · LW · GW

To try yet again:

The core distinction between tinkering that is “reality-revealing” and tinkering that is “reality-masking” is which process is learning to predict/understand/manipulate which other process.

When a process that is part of your core “I” is learning to predict/manipulate an outside process (as with the child who is whittling, and is learning to predict/manipulate the wood and pocket knife), what is happening is reality-revealing.

When a process that is not part of your core “I” is learning to predict/manipulate/screen-off parts of your core “I”s access to data, what is happening is often reality-masking.

(Multiple such processes can be occurring simultaneously, as multiple processes learn to predict/manipulate various other processes all at once.)

The "learning" in a given reality-masking process can be all in a single person's head (where a person learns to deceive themselves just by thinking self-deceptive thoughts), but it often occurs via learning to impact outside systems that then learn to impact the person themselves (like in the example of me as a beginning math tutor learning to manipulate my tutees into manipulating me into thinking I'd explained things clearly)).

The "reality-revealing" vs "reality-masking" distinction is in attempt to generalize the "reasoning" vs "rationalizing" distinction to processes that don't all happen in a single head.

Comment by annasalamon on Reality-Revealing and Reality-Masking Puzzles · 2020-01-17T12:42:39.358Z · score: 11 (5 votes) · LW · GW

I like your example about your math tutoring, where you "had a fun time” and “[weren’t] too results driven” and reality-masking phenomena seemed not to occur.

It reminds me of Eliezer talking about how the first virtue of rationality is curiosity.

I wonder how general this is. I recently read the book “Zen Mind, Beginner’s Mind,” where the author suggests that difficulty sticking to such principles as “don’t lie,” “don’t cheat,” “don’t steal,” comes from people being afraid that they otherwise won’t get a particular result, and recommends that people instead… well, “leave a line of retreat” wasn’t his suggested ritual, but I could imagine “just repeatedly leave a line of retreat, a lot” working for getting unattached.

Also, I just realized (halfway through typing this) that cousin_it and Said Achmiz say the same thing in another comment.

Comment by annasalamon on Reality-Revealing and Reality-Masking Puzzles · 2020-01-17T11:21:32.879Z · score: 9 (4 votes) · LW · GW

Thanks; you naming what was confusing was helpful to me. I tried to clarify here; let me know if it worked. The short version is that what I mean by a "puzzle" is indeed person-specific.

A separate clarification: on my view, reality-masking processes are one of several possible causes of disorientation and error; not the only one. (Sort of like how rationalization is one of several possible causes of people getting the wrong answers on math tests; not the only one.) In particular, I think singularity scenarios are sufficiently far from what folks normally expect that the sheer unfamiliarity of the situation can cause disorientation and errors (even without any reality-masking processes; though those can then make things worse).

Comment by annasalamon on Reality-Revealing and Reality-Masking Puzzles · 2020-01-17T10:56:35.962Z · score: 5 (2 votes) · LW · GW

The difficulties above were transitional problems, not the main effects.

Why do you say they were "transitional"? Do you have a notion of what exactly caused them?

Comment by annasalamon on Reality-Revealing and Reality-Masking Puzzles · 2020-01-17T10:17:53.429Z · score: 28 (9 votes) · LW · GW

A couple people asked for a clearer description of what a “reality-masking puzzle” is. I’ll try.

JamesPayor’s comment speaks well for me here:

There was the example of discovering how to cue your students into signalling they understand the content. I think this is about engaging with a reality-masking puzzle that might show up as "how can I avoid my students probing at my flaws while teaching" or "how can I have my students recommend me as a good tutor" or etc.

It's a puzzle in the sense that it's an aspect of reality you're grappling with. It's reality-masking in that the pressure was away from building true/accurate maps.

To say this more slowly:

Let’s take “tinkering” to mean “a process of fiddling with a [thing that can provide outputs] while having some sort of feedback-loop whereby the [outputs provided by the thing] impact what fiddling is tried later, in such a way that it doesn’t seem crazy to say there is some ‘learning’ going on.”

Examples of tinkering:

  • A child playing with legos. (The “[thing that provides outputs]” here is the [legos + physics], which creates an output [an experience of how the legos look, whether they fall down, etc.] in reply to the child’s “what if I do this?” attempts. That output then affects the child’s future play-choices some, in such a way that it doesn’t seem crazy to say there is some “learning” happening.)
  • A person doodling absent-mindedly while talking on the phone, even if the doodle has little to no conscious attention;
  • A person walking. (Since the walking process (I think) contains at least a bit of [exploration / play / “what happens if I do this?” -- not necessarily conscious], and contains some feedback from “this is what happens when you send those signals to your muscles” to future walking patterns)
  • A person explicitly reasoning about how to solve a math problem
  • A family member A mostly-unconsciously taking actions near another family member B [while A consciously or unconsciously notices something about how B responds, and while A has some conscious or unconscious link between [how B responds] and [what actions A takes in future]].

By a “puzzle”, I mean a context that gets a person to tinker. Puzzles can be person-specific. “How do I get along with Amy?” may be a puzzle for Bob and may not be a puzzle for Carol (because Bob responds to it by tinkering, and Carol responds by, say, ignoring it). A kong toy with peanut butter inside is a puzzle for some dogs (i.e., it gets these dogs to tinker), but wouldn’t be for most people. Etc.

And… now for the hard part. By a “reality-masking puzzle”, I mean a puzzle such that the kind of tinkering it elicits in a given person will tend to make that person’s “I” somehow stupider, or in less contact with the world.

The usual way this happens is that, instead of the tinkering-with-feedback process gradually solving an external problem (e.g., “how do I get the peanut butter out of the kong toy?”), the tinkering-with-feedback process is gradually learning to mask things from part of their own mind (e.g. “how do I not-notice that I feel X”).

This distinction is quite related to the distinction between reasoning and rationalization.

However, it differs from that distinction in that “rationalization” usually refers to processes happening within a single person’s mind. And in many examples of “reality-masking puzzles,” the [process that figures out how to mask a bit of reality from a person’s “I”] is spread across multiple heads, with several different tinkering processes feeding off each other and the combined result somehow being partially about blinding someone.

I am actually not all that satisfied by the “reality-revealing puzzles” vs “reality-masking puzzles” ontology. It was more useful to me than what I’d had before, and I wanted to talk about it, so I posted it. But… I understand what it means for the evidence to run forwards vs backwards, as in Eliezer’s Sequences post about rationalization. I want a similarly clear-and-understood generalization of the “reasoning vs rationalizing” distinction that applies also to processes spread across multiple heads. I don’t have that yet. I would much appreciate help toward this. (Incremental progress helps too.)

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2020-01-11T19:21:22.265Z · score: 15 (7 votes) · LW · GW

No; that isn't the trouble; I could imagine us getting the money together for such a thing, since one doesn't need anything like a consensus to fund a position. The trouble is more that at this point the members of the bay area {formerly known as "rationalist"} "community" are divided into multiple political factions, or perhaps more-chaos-than-factions, which do not trust one another's judgment (even about pretty basic things, like "yes, this person's actions are outside of reasonable behavioral norms"). It is very hard to imagine an individual or a small committee that people would trust in the right way. Perhaps even more so after that individual or committee tried ruling against someone who really wanted to stay, and that person attempted to create "fear, doubt, and uncertainty" or whatever about the institution that attempted to ostracize them.

I think something in this space is really important, and I'd be interested in investing significantly in any attempt that had a decent shot at helping. Though I don't yet have a strong enough read myself on what the goal ought to be.

Comment by annasalamon on AIRCS Workshop: How I failed to be recruited at MIRI. · 2020-01-08T20:54:25.030Z · score: 10 (5 votes) · LW · GW

Hi Mark,

This maybe doesn't make much difference for the rest of your comment, but just FWIW: the workshop you attended in Sept 2016 was not part of the AIRCS series. It was a one-off experiment, funded by an FLI grant, called "CFAR for ML", where we ran most of a standard CFAR workshop and then tacked on an additional day of AI alignment discussion at the end.

The AIRCS workshops have been running ~9 times/year since Feb 2018, have been evolving pretty rapidly, and in recent iterations involve a higher ratio of AI risk content, and of content about the cognitive biases etc. that seem to arise in discussions about AI risk in particular. They have somewhat smaller cohorts for more 1-on-1 conversation (~15 participants instead of 23). They are co-run with MIRI, which "CFAR for ML" was not. They have a slightly different team and are a slightly different beast.

Which... doesn't mean you wouldn't have had most of the same perceptions if you'd come to a recent AIRCS! You might well have. From a distance perhaps all our workshops are pretty similar. And I can see calling "CFAR for ML" "AIRCS", since it was in fact partially about AI risk and was aimed mostly at computer scientists, which is what "AIRCS" stands for. Still, we locally care a good bit about the distinctions between our programs, so I did want to clarify.

Comment by annasalamon on CFAR Participant Handbook now available to all · 2020-01-08T05:17:21.630Z · score: 34 (10 votes) · LW · GW

Some combination of: (a) lots of people still wanted it, and we're not sure our previous "idea inoculation" concerns are actually valid, and there's something to testing the idea of giving people what they want; and (perhaps more significantly) (b) we're making more of an overall push this year toward making our purpose and curriculum and strategy and activities and so on clear and visible so that we can dialog with people about our plans, and we figured that putting the handbook online might help with that.

Comment by annasalamon on AIRCS Workshop: How I failed to be recruited at MIRI. · 2020-01-08T05:14:59.479Z · score: 25 (8 votes) · LW · GW

Hi Arthur,

Thanks for the descriptions — it is interesting for me to hear about your experiences, and I imagine a number of others found it interesting too.

A couple clarifications from my perspective:

First: AIRCS is co-run by both CFAR and MIRI, and is not entirely a MIRI recruiting program, although it is partly that! (You might know this part already, but it seems like useful context.)

We are hoping that different people go on from AIRCS to a number of different AI safety career paths. For example:

  • Some people head straight from AIRCS to MIRI.
  • Some people attend AIRCS workshops multiple times, spaced across months or small years, while they gradually get familiar with AI safety and related fields.
  • Some people realize after an AIRCS workshop that AI safety is not a good fit for them.
  • Some people, after attending one or perhaps many AIRCS workshops, go on to do AI safety research at an organization that isn’t MIRI.

All of these are good and intended outcomes from our perspective! AI safety could use more good technical researchers, and AIRCS is a long-term investment toward increasing the number of good computer scientists (and mathematicians and others) who have some background in the field. (Although it is also partly aimed at helping with MIRI's recruiting in particular.)

Separately, I did not mean to "give people a rule" to "not speak about AI safety to people who do not express interest." I mean, neither I nor AIRCS as an organization have some sort of official request that people follow a “rule” of that sort. I do personally usually follow a rule of that sort, though (with exceptions). Also, when people ask me for advice about whether to try to “spread the word” about AI risk, I often share that I personally am a bit cautious about when and how I talk with people about AI risk; and I often share details about that.

I do try to have real conversations with people that reply to their curiosity and/or to their arguments/questions/etc., without worrying too much which directions such conversations will update them toward.

I try to do this about AI safety, as about other topics. And when I do this about AI safety (or other difficult topics), I try to help people have enough “space” that they can process things bit-by-bit if they want. I think it is often easier and healthier to take in a difficult topic at one’s own pace. But all of this is tricky, and I would not claim to know the one true answer to how everyone should talk about AI safety.

Also, I appreciate hearing about the bits you found distressing; thank you. Your comments make sense to me. I wonder if we’ll be able to find a better format in time. We keep getting bits of feedback and making small adjustments, but it is a slow process. Job applications are perhaps always a bit awkward, but iterating on “how do we make it less awkward” does seem to yield slow-but-of-some-value modifications over time.

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-25T11:20:57.579Z · score: 29 (8 votes) · LW · GW

I’ve worked closely with CFAR since its founding in 2012, with varying degrees of closeness (ranging from ~25 hrs/week to ~60 hrs/week). My degree of involvement in CFAR’s high-level and mid-level strategic decisions has varied some, but at the moment is quite high, and is likely to continue to be quite high for at least the coming 12 months.

During work-type hours in which I’m not working for CFAR, my attention is mostly on MIRI and on MIRI’s technical research. I do a good bit of work with MIRI (though I am not employed by MIRI -- I just do a lot of work with them), much of which also qualifies as CFAR work (e.g., running the AIRCS workshops and assisting with the MIRI hiring process; or hanging out with MIRI researchers who feel “stuck” about some research/writing/etc. type thing and want a CFAR-esque person to help them un-stick). I also do a fair amount of work with MIRI that does not much overlap CFAR (e.g. I am a MIRI board member).

Oliver remained confused after talking with me in April because in April I was less certain how involved I was going to be in upcoming strategic decisions. However, it turns out the answer was “lots.” I have a lot of hopes and vision for CFAR over the coming years, and am excited about hashing it out with Tim and others at CFAR, and seeing what happens as we implement; and Tim and others seem excited about this as well.

My attention oscillates some across the years between MIRI and CFAR, based partly on the needs of each organization and partly on e.g. there being some actual upsides to me taking a backseat under Pete as he (and Duncan and others) made CFAR into more of a functioning institution in ways I would’ve risked reflexively meddling with. But there has been much change in the landscape CFAR is serving, and it’s time I think for there to be much change also in e.g. our curriculum, our concept of “rationality”, our relationship to community, and how we run our internal processes -- and I am really excited to be able to be closely involved with CFAR this year, in alliance with Tim and others.

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-25T05:21:17.939Z · score: 45 (10 votes) · LW · GW

My guesses, in no particular order:

  • Being a first employee is pretty different from being in a middle-stage organization. In particular, the opportunity to shape what will come has an appeal that can I think rightly bring in folks who you can’t always get later. (Folks present base rates for various reference classes below; I don’t know if anyone has one for “founding” vs “later” in small organizations?)

    • Relatedly, my initial guess back in ~2013 (a year in) was that many CFAR staff members would “level up” while they were here and then leave, partly because of that level-up (on my model, they’d acquire agency and then ask if being here as one more staff member was or wasn’t their maximum-goal-hitting thing). I was excited about what we were teaching and hoped it could be of long-term impact to those who worked here a year or two and left, as well as to longer-term people.
  • I and we intentionally hired for diversity of outlook. We asked ourselves: “does this person bring some component of sanity, culture, or psychological understanding -- but especially sanity -- that is not otherwise represented here yet?” And this… did make early CFAR fertile, and also made it an unusually difficult place to work, I think. (If you consider the four founding members of me, Julia Galef, Val, and Critch, I think you’ll see what I mean.)

  • I don’t think I was very easy to work with. I don’t think I knew how to make CFAR a very easy place to work either. I was trying to go with inside views even where I couldn’t articulate them and… really didn’t know how to create a good interface between that and a group of people. Pete and Duncan helped us do otherwise more recently I think, and Tim and Adam and Elizabeth and Jack and Dan building on it more since, with the result that CFAR is much more of a place now (less of a continuous “each person having an existential crisis all the time” than it was for some in the early days; more of a plod of mundane work in a positive sense). (The next challenge here, which we hope to accomplish this year, is to create a place that still has place-ness, and also has more visibility into strategy.)

  • My current view is that being at workshops for too much of a year is actually really hard on a person, and maybe not-good. It mucks up a person’s codebase without enough chance for ordinary check-sums to sort things back to normal again afterward. Relatedly, my guess is also that while stints at CFAR do level a person up in certain ways (~roughly as I anticipated back in 2013), they unfortunately also risk harming a person in certain ways that are related to “it’s not good to live in workshops or workshop-like contexts for too many weeks/months in a row, even though a 4-day workshop is often helpful” (which I did not anticipate in 2013). (Basically: you want a bunch of normal day-to-day work on which to check whether your new changes actually work well, and to settle back into your deeper or more long-term self. The 2-3 week “MIRI Summer Fellows Program” (MSFP) has had… some great impacts in terms of research staff coming out of the program, but also most of our least stable people additionally came out of that. I believe that this year we’ll be experimentally replacing it with repeated shorter workshops; we’ll also be trying a different rest days pattern for staff, and sabbatical months, as well as seeking stability/robustness/continuity in more cultural and less formal ways.)

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-25T02:04:01.100Z · score: 12 (7 votes) · LW · GW

No; this would somehow be near-impossible in our present context in the bay, IMO; although Berkeley's REACH center and REACH panel are helpful here and solve part of this, IMO.

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-25T02:02:05.276Z · score: 14 (4 votes) · LW · GW

:) There's something good about "common sense" that isn't in "effective epistemics", though -- something about wanting not to lose the robustness of the ordinary vetted-by-experience functioning patterns. (Even though this is really hard, plausibly impossible, when we need to reach toward contexts far from those in which our experiences were based.)

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-25T02:01:17.346Z · score: 8 (4 votes) · LW · GW

To clarify: we're not joking about the need to get "what we do" and "what people think we do" more in alignment, via both communicating better and changing our organizational name if necessary. We put that on our "goals for 2020" list (both internally, and in our writeup). We are joking that CfBCSSS is an acceptable name (due to its length making it not-really-that).

(Eli works with us a lot but has been taking a leave of absence for the last few months and so didn't know that bit, but lots of us are not-joking about getting our name and mission clear.)

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-24T21:03:58.650Z · score: 35 (12 votes) · LW · GW

I quite like the open questions that Wei Dai wrote there, and I expect I'd find progress on those problems to be helpful for what I'm trying to do with CFAR. If I had to outline the problem we're solving from scratch, though, I might say:

  • Figure out how to:
    • use reason (and stay focused on the important problems, and remember the “virtue of the void” and the “lens that sees its own flaws”, and be quick where you can) without
    • going nutso, or losing humane values, and while:
    • being able to coordinate well in teams.

Wei Dai’s open problems feel pretty relevant to this!

I think in practice this goal leaves me with subproblems such as:

  • How do we un-bottleneck “original seeing” / hypothesis-generation;
  • What is the “it all adds up to normality” skill based in; how do we teach it;
  • Where does “mental energy” come from in practice, and how can people have good relationships to this;
  • What’s up with people sometimes seeming self-conscious/self-absorbed (in an unfortunate, slightly untethered way) and sometimes seeming connected to “something to protect” outside themselves?
    • It seems to me that “something to protect” makes people more robustly mentally healthy. Is that true? If so why? Also how do we teach it?
  • Why is it useful to follow “spinning plates” (objects that catch your interest for their own sake) as well as “hamming questions”? What’s the relationship between those two? (I sort of feel like they’re two halves of the same coin somehow? But I don’t have a model.)
  • As well as more immediately practical questions such as: How can a person do “rest days” well. What ‘check sums’ are useful for noticing when something breaks as you’re mucking with your head. Etc.

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-24T20:25:55.022Z · score: 54 (13 votes) · LW · GW

Here’s a very partial list of blog post ideas from my drafts/brainstorms folder. Outside view, though, if I took the time to try to turn these into blog posts, I’d end up changing my mind about more than half of the content in the process of writing it up (and then would eventually end up with blog posts with somewhat different theses).

I’m including brief descriptions with the awareness that my descriptions may not parse at this level of brevity, in the hopes that they’re at least interesting teasers.

Contra-Hodgel

  • (The Litany of Hodgell says “That which can be destroyed by the truth should be”. Its contrapositive therefore says: “That which can destroy [that which should not be destroyed] must not be the full truth.” It is interesting and sometimes-useful to attempt to use Contra-Hodgel as a practical heuristic: “if adopting belief X will meaningfully impair my ability to achieve good things, there must be some extra false belief or assumption somewhere in the system, since true beliefs and accurate maps should just help” (e.g., if “there is no Judeo-Christian God” in practice impairs my ability to have good and compassionate friendships, perhaps there is some false belief somewhere in the system that is messing with that).)
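
A minimal formal rendering of that contrapositive step, for clarity (the notation and predicate names are mine, added for illustration; they are not from the litany or any CFAR material):

  % Litany of Hodgell as an implication over beliefs/structures x, with assumed predicates
  %   D(x): "x can be destroyed by the truth"    S(x): "x should be destroyed"
  \[ \forall x:\; D(x) \rightarrow S(x) \]
  % Contrapositive ("Contra-Hodgel"): whatever should not be destroyed
  % cannot be destroyed by the (full) truth.
  \[ \forall x:\; \neg S(x) \rightarrow \neg D(x) \]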

The 50/50 rule

  • The 50/50 rule is a proposed heuristic claiming that about half of all progress on difficult projects will come from already-known-to-be-project-relevant subtasks -- for example, if Archimedes wishes to determine whether the king’s crown is unmixed gold, he will get about half his progress from diligently thinking about this question (plus subtopics that seem obviously and explicitly relevant to this question). The other half of progress on difficult projects (according to this heuristic) will come from taking an interest in the rest of the world, including parts not yet known to be related to the problem at hand -- in the Archimedes example, from Archimedes taking an interest in what happens to his bathwater.
  • Relatedly, the 50/50 rule estimates that if you would like to move difficult projects forward over long periods of time, it is often useful to spend about half of your high-energy hours on “diligently working on subtasks known-to-be-related to your project”, and the other half taking an interest in the world.

Make New Models, but Keep The Old

  • “... one is silver and the other’s gold.”
  • A retelling of: it all adds up to normality.

On Courage and Believing In.

  • Beliefs are for predicting what’s true. “Believing in”, OTOH, is for creating a local normal that others can accurately predict. For example: “In America, we believe in driving on the right hand side of the road” -- thus, when you go outside and look to predict which way people will be driving, you can simply predict (believe) that they’ll be driving on the right hand side.
  • Analogously, if I decide I “believe in” [honesty, or standing up for my friends, or other such things], I create an internal context in which various models within me can predict that my future actions will involve [honesty, or standing up for my friends, or similar].
  • It’s important and good to do this sometimes, rather than having one’s life be an accidental mess with nobody home choosing. It’s also closely related to courage.

Ethics for code colonies

  • If you want to keep caring about people, it makes a lot of sense to e.g. take the time to put your shopping cart back where it goes, or at minimum not to make up excuses about how your future impact on the world makes you too important to do that.
  • In general, when you take an action, you summon up black box code-modification that takes that action (and changes unknown numbers of other things). Life as a “code colony” is tricky that way.
  • Ethics is the branch of practical engineering devoted to how to accomplish things with large sets of people over long periods of time -- or even with one person over a long period of time in a confusing or unknown environment. It’s the art of interpersonal and intrapersonal coordination. (I mean, sometimes people say “ethics” means “following this set of rules here”. But people also say “math” means “following this algorithm whenever you have to divide fractions” or whatever. And the underneath-thing with ethics is (among other things, maybe) interpersonal and intra-personal coordination, kinda like how there’s an underneath-thing with math that is where those rules come from.)
  • The need to coordinate in this way holds just as much for consequentialists or anyone else.
  • It's kinda terrifying to be trying to do this without a culture. Or to be not trying to do this (still without a culture).

The explicit and the tacit (elaborated a bit in a comment in this AMA; but there’s room for more).

Cloaks, Questing, and Cover Stories

  • It’s way easier to do novel hypothesis-generation if you can do it within a “cloak”, without making any sort of claim yet about what other people ought to believe. (Teaching this has been quite useful on a practical level for many at AIRCS, MSFP, and instructor trainings -- seems worth seeing if it can be useful via text, though that’s harder.)

Me-liefs, We-liefs, and Units of Exchange

  • Related to “cloaks and cover stories” -- we have different pools of resources that are subject to different implicit contracts and commitments. Not all Bayesian evidence is judicial or scientific evidence, etc.. A lot of social coordination works by agreeing to only use certain pools of resources in agreement with certain standards of evidence / procedure / deference (e.g., when a person does shopping for their workplace they follow their workplace’s “which items to buy” procedures; when a physicist speaks to laypeople in their official capacity as a physicist, they follow certain procedures so as to avoid misrepresenting the community of physicists).
  • People often manage this coordination by changing their beliefs (“yes, I agree that drunk driving is dangerous -- therefore you can trust me not to drink and drive”). However, personally I like the rule “beliefs are for true things -- social transactions can make requests of my behaviors but not of my beliefs.” And I’ve got a bunch of gimmicks for navigating the “be robustly and accurately seen as prosocial” without modifying one’s beliefs (“In my driving, I value cooperating with the laws and customs so as to be predictable and trusted and trustworthy in that way; and drunk driving is very strongly against our customs -- so you can trust me not to drink and drive.”)

How the Tao unravels

  • A book review of part of CS Lewis’s book “The abolition of man.” Elaborates CS Lewis’s argument that in postmodern times, people grab hold of part of humane values and assert it in contradiction with other parts of humane values, which then assert back the thing that they’re holding and the other party is missing, and then things fragment further and further. Compares Lewis’s proposed mechanism with how cultural divides have actually been going in the rationality and EA communities over the last ten years.

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-24T18:22:09.551Z · score: 67 (17 votes) · LW · GW

Examples of some common ways that people sometimes find Singularity scenarios disorienting:

When a person loses their childhood religion, there’s often quite a bit of bucket error. A person updates on the true fact “Jehovah is not a good explanation of the fossil record” and accidentally confuses that true fact with any number of other things, such as “and so I’m not allowed to take my friends’ lives and choices as real and meaningful.”

I claimed above that “coming to take singularity scenarios seriously” seems in my experience to often cause even more disruption / bucket errors / confusions / false beliefs than does “losing a deeply held childhood religion.” I’d like to elaborate on that here by listing some examples of the kinds of confusions/errors I often encounter.

None of these are present in everyone who encounters Singularity scenarios, or even in most people who encounter it. Still, each confusion below is one where I’ve seen it or near-variants of it multiple times.

(Also note that all of these things are “confusions”, IMO. People semi-frequently have them at the beginning and then get over them. These are not the POV I would recommend or consider correct -- more like the opposite -- and I personally think each stems from some sort of fixable thinking error.)

  • The imagined stakes in a singularity are huge. Common confusions related to this:
    • Confusion about whether it is okay to sometimes spend money/time/etc. on oneself, vs. having to give it all to attempting to impact the future.
    • Confusion about whether one wants to take in singularity scenarios, given that then maybe one will “have to” (move across the country / switch jobs / work all the time / etc.)
    • Confusion about whether it is still correct to follow common sense moral heuristics, given the stakes.
    • Confusion about how to enter “hanging out” mode, given the stakes and one’s panic. (“Okay, here I am at the beach with my friends, like my todo list told me to do to avoid burnout. But how is it that I used to enjoy their company? They seem to be making meaningless mouth-noises that have nothing to do with the thing that matters…”)
    • Confusion about how to take an actual normal interest in one’s friends’ lives, or one’s partner’s lives, or one’s Lyft drivers’ lives, or whatever, given that within the person’s new frame, the problems they are caught up in seem “small” or “irrelevant” or to have “nothing to do with what matters”.
  • The degrees of freedom in “what should a singularity maybe do with the future?” are huge. And people are often morally disoriented by that part. Should we tile the universe with a single repeated mouse orgasm, or what?
    • Are we allowed to want humans and ourselves and our friends to stay alive? Is there anything we actually want? Or is suffering bad without anything being better-than-nothing?
    • If I can’t concretely picture what I’d do with a whole light-cone (maybe because it is vastly larger than any time/money/resources I’ve ever personally obtained feedback from playing with) -- should I feel that the whole future is maybe meaningless and no good?
  • The world a person finds themselves in once they start taking Singularity scenarios seriously is often quite different from what the neighbors think, which itself can make things hard
    • Can I have a “real” conversation with my friends? Should I feel crazy? Should I avoid taking all this in on a visceral level so that I’ll stay mentally in the same world as my friends?
    • How do I keep regarding other peoples’ actions as good and reasonable? The imagined scales are very large, with the result that one can less assume “things are locally this way” is an adequate model.
    • Given this, should I get lost in “what about simulations / anthropics” to the point of becoming confused about normal day-to-day events?
  • In order to imagine this stuff, folks need to take seriously reasoning that is neither formal mathematics, nor vetted by the neighbors or academia, nor strongly based in empirical feedback loops.
    • Given this, shall I go ahead and take random piles of woo seriously also?

There are lots more where these came from, but I’m hoping this gives some flavor, and makes it somewhat plausible why I’m claiming that “coming to take singularity scenarios seriously can be pretty disruptive to common sense,” and why it might be nice to try having a “bridge” that can help people lose less of the true parts of common sense as their world changes (much as it might be nice for someone who has just lost their childhood religion to have a bridge to “okay, here are some other atheists, and they don’t think that God is why they should get up in the morning and care about others, but they do still seem to think they should get up in the morning and care about others”).

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-24T17:47:54.530Z · score: 81 (19 votes) · LW · GW

My closest current stab is that we’re the “Center for Bridging between Common Sense and Singularity Scenarios.” (This is obviously not our real name. But if I had to grab a handle that gestures at our raison d’etre, at the moment I’d pick this one. We’ve been internally joking about renaming ourselves this for some months now.)

To elaborate: thinking about singularity scenarios is profoundly disorienting (IMO, typically worse than losing a deeply held childhood religion or similar). Folks over and over again encounter similar failure modes as they attempt this. It can be useful to have an institution for assisting with this -- collecting concepts and tools that were useful for previous waves who’ve attempted thought/work about singularity scenarios, and attempting to pass them on to those who are currently beginning to think about such scenarios.

Relatedly, the pattern of thinking required for considering AI risk and related concepts at all is pretty different from the patterns of thinking that suffice in most other contexts, and it can be useful to have a group that attempts to collect these and pass them forward.

Further, it can be useful to figure out how the heck to do teams and culture in a manner that can withstand the disruptions that can come from taking singularity scenarios seriously.

So, my best current angle on CFAR is that we should try to be a place that can help people through these standard failure modes -- a place that can try to answer the question “how can we be sane and reasonable and sensible and appropriately taking-things-seriously in the face of singularity scenarios,” and can try to pass on our answer, and can notice and adjust when our answer turns out to be invalid.

To link this up with our concrete activities:

AIRCS workshops / MSFP:

  • Over the last year, about half our staff workshop-days went into attempting to educate potential AI alignment researchers. These programs were co-run with MIRI. Workshops included a bunch about technical AI content; a bunch of practice thinking through “is there AI risk” and “how the heck would I align a superintelligence” and related things; and a bunch of discussion of e.g. how to not have “but the stakes are really really big” accidentally overwhelm one’s basic sanity skills (and other basic pieces of how to not get too disoriented).
  • Many program alumni attended multiple workshops, spaced across time, as part of a slow acculturation process: stare at AI risk; go back to one’s ordinary job/school context for some months while digesting in a back-burner way; repeat.
  • These programs aim at equipping people to contribute to AI alignment technical work at MIRI and elsewhere; in the last two years they’ve helped educate a sizable number of MIRI hires and a smaller but still important number of others (brief details in our 2019 progress report; more details coming eventually). People sometimes try to gloss the impact of AIRCS as “outreach” or “causing career changes,” but, while I think it does in fact fulfill CEA-style metrics, that doesn’t seem to me like a good way to see its main purpose -- helping folks feel their way toward being more oriented and capable around these topics in general, in a context where other researchers have done or are doing likewise.
  • They seem like a core activity for a “Center for bridging between common sense and singularity scenarios” -- both in that they tell us more about what happens when folks encounter AI risk, and in that they let us try to use what we think we know for good. (We hope it’s “good.”)

Mainline workshops, alumni reunions, alumni workshops unrelated to AI risk, etc.:

  • We run mainline workshops (which many people just call “CFAR workshops”), alumni reunions, and some topic-specific workshops for alumni that have nothing to do with AI risk (e.g., a double crux workshop). Together, this stuff constituted about 30% of our staff workshop-days over the last two years.
  • The EA crowd often asks me why we run these. (“Why not just run AI safety workshops, since that is the part of your work that has more of a shot at helping large numbers of people?”) The answer is that when I imagine removing the mainline workshops, CFAR begins to feel like a table without enough legs -- unstable, liable to work for a while but then fall over, lacking enough contact with the ground.
  • More concretely: we’re developing and spreading a nonstandard mental toolkit (inner sim, double crux, Gendlin’s Focusing, etc.). That’s a tricky and scary thing to do. It’s really helpful to get to try it on a variety of people -- especially smart, thoughtful, reflective, articulate people who will let us know what seems like a terrible idea, or what brings help to their lives, or disruption to their lives. The mainline workshops (plus follow-up sessions, alumni workshops, alumni reunions, etc.) let us develop this alleged “bridge” between common sense and singularity scenarios in a way that avoids overfitting it all to just “AI alignment work.” Which is basically to say that they let us develop and test our models of “applied rationality”.

“Sandboxes” toward trying to understand how to have a healthy culture in contact with AI safety:

  • I often treat the AIRCS workshops as “sandboxes”, and try within them to create small temporary “cultures” in which we try to get research to be able to flourish, or try to get people to be able to both be normal humans and slowly figure out how to approach AI alignment, or whatever. I find them a pretty productive vehicle for trying to figure out the “social context” thing, and not just the “individual thinking habits” thing. I care about this experimentation-with-feedback because I want MIRI and other longer-term teams to eventually have the right cultural base.

Our instructor training program, and our attempt to maintain a staff who is skilled at seeing what cognitive processes are actually running in people:

  • There’s a lot of trainable, transferable skill to seeing what people are thinking. CFAR staff have a bunch of this IMO, and we seem to me to be transferring a bunch of it to the instructor candidates too. We call it “seeking PCK”.
  • The “seeking PCK” skillset is obviously helpful for learning to “bridge between common sense and singularity scenarios” -- it helps us see what the useful patterns folks have are, and what the not-so-useful patterns folks have are, and what exactly is happening as we attempt to intervene (so that we can adjust our interventions).
  • Thus, improving and maintaining the “seeking PCK” skillset probably makes us faster at developing any other curriculum.
  • More mundanely, of course, instructor training also gives us guest instructors who can help us run workshops -- many of whom are also out and about doing other interesting things, and porting wisdom/culture/data back and forth between those endeavors and our workshops.

To explain what “bridging between common sense and singularity scenarios” has to do with “applied rationality” and the LW Sequences and so on:

  • The farther off you need to extrapolate, the more you need reasoning (vs being able to lean on either received wisdom, or known data plus empirical feedback loops). And singularity scenarios sure are far from the everyday life our heuristics are developed for, so singularity scenarios benefit more than most from trying to be the lens that sees its flaws, and from Sequences-style thinking more broadly.
Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-22T05:16:41.546Z · score: 55 (15 votes) · LW · GW

Re: 1—“Forked codebases that have a lot in common but are somewhat tricky to merge” seems like a pretty good metaphor to me.

The question near your questions that I'd like to answer is: "What is the minimal patch/bridge that will let us use all of both codebases without running into merge conflicts?"

We do have a candidate answer to this question, which we’ve been trying out at AIRCS to reasonable effect. Our candidate answer is something like: an explicit distinction between “tacit knowledge” (inarticulate hunches, early-stage research intuitions, the stuff people access and see in one another while circling, etc.) and the “explicit” (“knowledge” worthy of the name, as in the LW codebase—the thing I believe Ben Pace is mainly gesturing at in his comment above).

Here’s how we explain it at AIRCS:

  • By “explicit” knowledge, we mean visible-to-conscious-consideration denotative claims that are piecewise-checkable and can be passed explicitly between humans using language.
    • Example: the claim “Amy knows how to ride a bicycle” is explicit.
  • By “tacit” knowledge, we mean stuff that allows you to usefully navigate the world (and so contains implicit information about the world, and can be observationally evaluated for how well people seem to navigate the relevant parts of the world when they have this knowledge) but is not made of explicit denotations that can be fully passed verbally between humans.
    • Example: however the heck Amy actually manages to ride the bicycle (the opaque signals she sends to her muscles, etc.) is in her own tacit knowledge. We can know explicitly “Amy has sufficient tacit knowledge to balance on a bicycle,” but we cannot explicitly track how she balances, and Amy cannot hand her bicycle-balancing ability to Bob via speech (although speech may help). Relatedly, Amy can’t check the individual pieces of her (opaque) motor patterns to figure out which ones are the principles by which she successfully stays up and which are counterproductive superstition.
  • I’ll give a few more examples to anchor the concepts:
    • In mathematics:
      • Explicit: which things have been proven; which proofs are valid.
      • Tacit: which heuristics may be useful for finding proofs; which theorems are interesting/important. (Some such heuristics can be stated explicitly, but I wouldn't call those statements “knowledge.” I can't verify that they're right in the way I can verify “Amy can ride a bike” or “2+3=5.”)
    • In science:
      • Explicit: specific findings of science, such as “if you take a given amount of hydrogen and decrease its volume by half, you double its pressure.” The “experiment” and “conclusion” steps of the scientific method.
      • Tacit: which hypotheses are worth testing.
    • In Paul Graham-style startups:
      • Explicit: what metrics one is hitting, once one achieves an MVP.
      • Tacit: the way Graham’s small teams of cofounders are supposed to locate their MVP. (In calling this “tacit,” I don’t mean you can’t communicate any of this verbally. Of course they use words. But the way they use words is made of ad hoc, spaghetti-code attempts to get gut intuitions back and forth between a small set of people who know each other well. It is quite different from the scalable processes of explicit science/knowledge that can compile across large sets of people and long periods of time. This is why Graham claims that co-founder teams should have 2-4 people, and that if you hire e.g. 10 people to a pre-MVP startup, it won’t scale well.)

In the context of the AIRCS workshop, we share “The Tacit and the Explicit” in order to avoid two different kinds of errors:

  • People taking “I know it in my gut” as zero-value, and attempting to live via the explicit only. My sense is that some LessWrong users like Said_Achmiz tend to err in this direction. (This error can be fatal to early-stage research, and to one’s ability to discuss ordinary life/relationship/productivity “bugs” and solutions, and many other mundanely useful topics.)
  • People taking “I know it in my gut” as vetted knowledge, and attempting to build on gut feelings in the manner of knowledge. (This error can be fatal to global epistemology: “but I just feel that religion is true / the future can’t be that weird / whatever”).

We find ourselves needing to fix both those errors in order to allow people to attempt grounded original thinking about AI safety. They need to be able to have intuitions, and take those intuitions seriously enough to develop them / test them / let them breathe, without mistaking those intuitions for knowledge.

So, at the AIRCS workshop, we introduce the explicit (which is a big part of what I take Ben Pace to be gesturing at above actually) at the same time that we introduce the tacit (which is the thing that Ben Pace describes benefiting from at CFAR IMO). And we introduce a framework to try to keep them separate so that learning cognitive processes that help with the tacit will not accidentally mess with folks’ explicit, nor vice versa. (We’ve been introducing this framework at AIRCS for about a year, and I do think it’s been helpful. I think it’s getting to the point where we could try writing it up for LW—i.e., putting the framework more fully into the explicit.)

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-22T05:14:09.435Z · score: 20 (10 votes) · LW · GW

With regard to whether our staff has read the sequences: five have, and have been deeply shaped by them; two have read about a third, and two have read little. I do think it’s important that our staff read them, and we decided to run this experiment with sabbatical months next year in part to ensure our staff had time to do this over the coming year.

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-22T04:25:33.617Z · score: 30 (10 votes) · LW · GW

I agree very much with what Duncan says here. I forgot I need to point that kind of thing out explicitly. But a good bit of my soul-effort over the last year has gone into trying to inhabit the philosophical understanding of the world that can see as possibilities (and accomplish!) such things as integrity, legibility, accountability, and creating structures that work across time and across multiple people. IMO, Duncan had a lot to teach me and CFAR here; he is one of the core models I go to when I try to understand this, and my best guess is that it is in significant part his ability to understand and articulate this philosophical pole (as well as to do it himself) that enabled CFAR to move from the early-stage pile of un-transferrable "spaghetti code" that we were when he arrived, to an institution with organizational structure capable of e.g. hosting instructor trainings and taking in and making use of new staff.

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-22T03:36:55.917Z · score: 42 (18 votes) · LW · GW

I wish someone would create good bay area community health. It isn't our core mission; it doesn't relate all that directly to our core mission; but it relates to the background environment in which CFAR and quite a few other organizations may or may not end up effective.

One daydream for a small institution that might help some with this health is as follows:

  1. Somebody creates the “Society for Maintaining a Very Basic Standard of Behavior”;
  2. It has certain very basic rules (e.g. “no physical violence”; “no doing things that are really about as over the line as physical violence according to a majority of our anonymously polled members”; etc.)
  3. It has an explicit membership list of folks who agree to both: (a) follow these rules; and (b) ostracize from “community events” (e.g. parties to which >4 other society members are invited) folks who are in bad standing with the society (whether or not they personally think those members are guilty).
  4. It has a simple, legible, explicitly declared procedure for determining who has/hasn’t entered bad standing (e.g.: a majority vote of the anonymously polled membership of the society; or an anonymous vote of a smaller “jury” randomly chosen from the society). (A toy sketch of this step appears just below this list.)
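
Purely to illustrate how simple and legible that kind of procedure could be, here is a toy sketch in Python. Everything in it -- the names, the jury size, the simple-majority threshold -- is a hypothetical placeholder for the example, not a description of anything that actually exists.

```python
import random

def select_jury(members, size, rng=None):
    """Randomly choose a small jury from the society's membership list."""
    rng = rng or random.Random()
    return rng.sample(sorted(members), k=min(size, len(members)))

def in_bad_standing(votes):
    """Anonymous simple-majority vote; each value is one juror's judgment."""
    return sum(votes.values()) > len(votes) / 2

# Usage example (all names made up):
members = {"alice", "bob", "carol", "dana", "evan", "farid", "gwen"}
jury = select_jury(members, size=5, rng=random.Random(0))
ballots = {juror: True for juror in jury}   # stand-in for real, anonymously collected ballots
print(in_bad_standing(ballots))             # True -> the person enters bad standing
```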

Benefits I’m daydreaming might come from this institution:

  • A. If the society had large membership, bad actors could be ostracized from larger sections of the community, and with more simplicity and less drama.
  • B. Also, we could do that while imposing less restraint on individual speech, which would make the whole thing less creepy. Like, if many many people thought person B should be exiled, and person A wanted to defer but was not herself convinced, she could: (a) defer explicitly, while saying that’s what she was doing; and meanwhile (b) speak her mind without worrying that she would destabilize the community’s ability to ever coordinate.
Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-22T02:51:49.904Z · score: 31 (9 votes) · LW · GW

My rough guess is “we survived; most of the differences I could imagine someone fearing didn’t come to pass”. My correction on that rough guess is: “Okay, but insofar as Duncan was the main holder of certain values, skills, and virtues, it seems pretty plausible that there are gaps now today that he would be able to see and that we haven’t seen”.

To be a bit more specific: some of the poles I noticed Duncan doing a lot to hold down while he was here were:

  • Institutional accountability and legibility;
  • Clear communication with staff; somebody caring about whether promises made were kept; somebody caring whether policies were fair and predictable, and whether the institution was creating a predictable context where staff, workshop participants, and others wouldn’t suddenly experience having the rug pulled out from under them;
  • Having the workshop classes start and end on time; (I’m a bit hesitant to name something this “small-seeming” here, but it is a concrete policy that supported the value above, and it is easier to track)
  • Revising the handbook into a polished state;
  • Having the workshop classes make sense to people, have clear diagrams and a clear point, etc.; having polish and visible narrative and clear expectations in the workshop;

AFAICT, these things are doing… alright in the absence of Duncan (due partly to the gradual accumulation of institutional knowledge), though I can see his departure in the organization. AFAICT also, Duncan gave me a good chunk of model of this stuff sometime after his Facebook post, actually -- and worked pretty hard on a lot of this before his departure too. But I would not fully trust my own judgment on this one, because the outside view is that people (in this case, me) often fail to see what they cannot see.

When I get more concrete:

  • Institutional accountability and legibility is I think better than it was;
  • Clear communication with staff, keeping promises, creating clear expectations, etc. -- better on some axes and worse on others -- my non-confident guess is better overall (via some loss plus lots of work);
  • Classes starting and ending on time -- at mainlines: slightly less precise class-timing but not obviously worse thereby; at AIRCS, notable decreases, with some cost;
  • Handbook revisions -- have done very little since he left;
  • Polish and narrative cohesion in the workshop classes -- it’s less emphasized but not obviously worse thereby IMO, due partly to the infusion of the counterbalancing “original seeing” content from Brienne, which was perhaps easier to pull off by toning polish down slightly. Cohesion and polish still seem acceptable, and far far better than before Duncan arrived.

Also: I don’t know how to phrase this tactfully in a large public conversation. But I appreciate Duncan’s efforts on behalf of CFAR; and also he left pretty burnt out; and also I want to respect what I view as his own attempt to disclaim responsibility for CFAR going forward (via that Facebook post) so that he won’t have to track whether he may have left misleading impressions of CFAR’s virtues in people. I don’t want our answers here to mess that up. If you come to CFAR and turn out not to like it, it is indeed not Duncan’s fault (even though it is still justly some credit to Duncan if you do, since we are still standing on the shoulders of his and many others’ past work).

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-21T20:57:15.144Z · score: 15 (8 votes) · LW · GW

I'll interpret this as "Which of the rationalist virtues do you think CFAR has gotten the most mileage from your practicing".

The virtue of the void. Hands down. Though I still haven't done it nearly as much as it would be useful to do it. Maybe this year?

If I instead interpret this as "Which of the rationalist virtues do you spend the most minutes practicing": curiosity. Which would be my runner-up for "CFAR got the most mileage from my practicing".

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-21T18:10:28.904Z · score: 12 (5 votes) · LW · GW

I, too, believe that Critch played a large and helpful role here.

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-21T14:16:54.136Z · score: 26 (8 votes) · LW · GW

Ben Pace writes:

In recent years, when I've been at CFAR events, I generally feel like at least 25% of attendees probably haven't read The Sequences, aren't part of this shared epistemic framework, and don't have any understanding of that law-based approach, and that they don't have a felt need to cache out their models of the world into explicit reasoning and communicable models that others can build on.

The “many alumni haven't read the Sequences” part has actually been here since very near the beginning (not the initial 2012 minicamps, but the very first paid workshops of 2013 and later). (CFAR began in Jan 2012.) You can see it in our old end-of-2013 fundraiser post, where we wrote “Initial workshops worked only for those who had already read the LW Sequences. Today, workshop participants who are smart and analytical, but with no prior exposure to rationality -- such as a local politician, a police officer, a Spanish teacher, and others -- are by and large quite happy with the workshop and feel it is valuable.” We didn't name this explicitly in that post, but part of the hope was to get the workshops to work for a slightly larger/broader/more cognitively diverse set than the set for whom the original Sequences in their written form tended to spontaneously "click".

As to the “aren’t part of this shared epistemic framework” -- when I go to e.g. the alumni reunion, I do feel there are basic pieces of this framework at least that I can rely on. For example, even on contentious issues, 95%+ of alumni reunion participants seem to me to be pretty good at remembering that arguments should not be like soldiers, that beliefs are for true things, etc. -- there is to my eyes a very noticeable positive difference between the folks at the alumni reunion and unselected-for-rationality smart STEM graduate students, say (though STEM graduate students are also notably more skilled than the general population at this, and though both groups fall short of perfection).

Still, I agree that it would be worthwhile to build more common knowledge and [whatever the “values” analog of common knowledge is called] supporting “a felt need to cache out their models of the world into explicit reasoning and communicable models that others can build on” and that are piecewise-checkable (rather than opaque masses of skills that are useful as a mass but hard to build across people and time). This particular piece of culture is harder to teach to folks who are seeking individual utility, because the most obvious payoffs are at the level of the group and of the long-term process rather than at the level of the individual (where the payoffs to e.g. goal-factoring and murphyjitsu are located). It also pays off more in later-stage fields and less in the earliest stages of science within preparadigm fields such as AI safety, where it’s often about shower thoughts and slowly following inarticulate hunches. But still.

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-21T13:43:44.260Z · score: 17 (6 votes) · LW · GW

Agreed

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-21T13:42:41.379Z · score: 20 (10 votes) · LW · GW

Ben Pace writes:

“... The Gwerns and the Wei Dais and the Scott Alexanders of the world won't have learned anything from CFAR's exploration.”

I’d like to distinguish two things:

  1. Whether the official work activities CFAR staff are paid for will directly produce explicit knowledge in the manner valued by Gwern etc.
  2. Whether that CFAR work will help educate people who later produce explicit knowledge themselves in the manner valued by Gwern etc., and who wouldn't have produced that knowledge otherwise.

#1 would be useful but isn’t our primary goal (though I think we’ve done more than none of it). #2 seems like a component of our primary goal to me (“scientists” or “producers of folks who can make knowledge in this sense” isn’t all we’re trying to produce, but it’s part of it), and is part of what I would like to see us strengthen over the coming years.

To briefly list our situation with respect to whether we are accomplishing #2 (according to me):

  • There are in fact a good number of AI safety scientists in particular who seem to me to produce knowledge of this type, and who give CFAR some degree of credit for their present tendency to do this.
  • On a milder level, while CFAR workshops do not themselves teach most of the Sequences’ skills (which would exceed four days in length, among other difficulties), we do try to nudge participants into reading the Sequences (by referencing them with respect at the workshop, by giving all mainline participants and AIRCS participants paper copies of “How to actually change your mind” and HPMOR, and by explicitly claiming they are helpful for various things).
  • At the same time, I do think we should make Sequences-style thinking a more explicit element of the culture spread by CFAR workshops, and of the culture that folks can take for granted at e.g. alumni reunions (although it is there nevertheless to quite an extent).

(I edited this slightly to make it easier to read after Kaj had already quoted from it.)

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-21T11:14:44.867Z · score: 46 (16 votes) · LW · GW

This is my favorite question of the AMA so far (I said something similar aloud when I first read it, which was before it got upvoted quite this highly, as did a couple of other staff members). The things I personally appreciate about your question are: (1) it points near a core direction that CFAR has already been intending to try moving toward this year (and probably across near-subsequent years; one year will not be sufficient); and (2) I think you asking it publicly in this way (and giving us an opportunity to make this intention memorable and clear to ourselves, and to parts of the community that may help us remember) will help at least some with our moving there.

Relatedly, I like the way you lay out the concepts.

Your essay (I mean, “question”) is rather long, and has a lot of things in it; and my desired response sure also has a lot of things in it. So I’m going to let myself reply via many separate discrete small comments because that’s easier.

(So: many discrete small comments upcoming.)

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-20T22:45:52.194Z · score: 15 (6 votes) · LW · GW

That is, we were always focused on high-intensity interventions for small numbers of people -- especially the people who are the very easiest to impact (have free time; smart and reflective; lucky in their educational background and starting position). We did not expect things to generalize to larger sets.

(Mostly. We did wonder about books and things for maybe impacting the epistemics (not effectiveness) of some larger number of people a small amount. And I do personally think that if there were ways to help with the general epistemics, wisdom, or sanity of larger sets of people, even if by a small amount, that would be worth meaningful tradeoffs to create. But we are not presently aiming for this (except in the broadest possible "keep our eyes open and see if we someday notice some avenue that is actually worth taking here" sense), and with the exception of helping to support Julia Galef's upcoming rationality book back when she was working here, we haven't ever attempted concrete actions aimed at figuring out how to impact larger sets of people.)

I agree, though, that one should not donate to CFAR in order to increase the chances of an Elon Musk factory.

Comment by annasalamon on We run the Center for Applied Rationality, AMA · 2019-12-20T22:12:01.219Z · score: 31 (10 votes) · LW · GW

My model is that CFAR is doing the same activity it was always doing, which one may or may not want to call “research”.

I’ll describe that activity here.  I think it is via this core activity (plus accidental drift, or accidental hill-climbing in response to local feedbacks) that we have generated both our explicit curriculum, and a lot of the culture around here.

Components of this core activity (in no particular order):

  1. We try to teach specific skills to specific people, when we think those skills can help them.  (E.g. goal-factoring; murphyjitsu; calibration training on occasion; etc.)
  2. We keep our eyes open while we do #1.  We try to notice whether the skill does/doesn’t match the student’s needs.  (E.g., is this so-called “skill” actually making them worse at something that we or they can see?  Is there a feeling of non-fit suggesting something like that? What’s actually happening as the “skill” gets “learned”?)
    • We call this noticing activity “seeking PCK” and spend a bunch of time developing it in our mentors and instructors.
  3. We try to stay in touch with some of our alumni after the workshop, and to notice what the long-term impacts seem to be (are they actually practicing our so-called “skills”?  Does it help when they do? More broadly, what changes do we just-happen-by-coincidence to see in multiple alumni again and again, and are these positive or negative changes, and what might be causing them?)
    • In part, we do this via the four follow-up calls that participants receive after they attend the mainline workshop; in part we do it through the alumni reunions, the kind of contact that comes naturally from being in the same communities, etc.
    • We often describe some of what we think we’re seeing, and speculate about where to go given that, in CFAR’s internal colloquium.
    • We pay particular attention to alumni who are grappling with existential risk or EA, partly because it seems to pose distinct difficulties that it would be nice if someone found solutions to.
  4. Spend a bunch of time with people who are succeeding at technical AI safety work, trying to understand what skills go into that.  Spend a bunch of time with people who are training to do technical AI safety work (often at the same time that people who can actually do such work are there), working to help transfer useful mindset (while trying also to pay attention to what’s happening).
    • Right now we do this mostly at the AIRCS and MSFP workshops.
  5. Spend a bunch of time engaging smart new people to see what skills/mindsets they would add to the curriculum, so we don’t get too stuck in a local optimum.
    • What this looks like recently:
      • The instructor training workshops are helping us with this.  Many of us found those workshops pretty generative, and are excited about the technique-seeds and cultural content that the new instructor candidates have been bringing.
      • The AIRCS program has also been bringing in highly skilled computer scientists, often from outside the rationality and EA community. My own thinking has changed a good bit in contact with the AIRCS experience.  (They are less explicitly articulate about curriculum than the instructor candidates; but they ask good questions, buy some pieces of our content, get wigged out by other pieces of our content in a non-random manner, answer follow-up questions in ways that sometimes reveal implicit causal models of how to think that seem correct to me, etc.  And so they are a major force for AIRCS curriculum generation in that way.)
    • Gaps in 5:
      • I do wish we had better contact with more and varied highly productive thinkers/makers of different sorts, as a feed-in to our curriculum.  We unfortunately have no specific plans to fix this gap in 2020 (and I don’t think it could fit without displacing some even-more-important planned shift -- we have limited total attention); but it would be good to do sometime over the next five years.  I keep dreaming of a “writers’ workshop” and an “artists’ workshop” and so on, aimed at seeing how our rationality stuff mutates when it hits people with different kinds of visibly-non-made-up productive skill.
  6. We sometimes realize that huge swaths of our curriculum are having unwanted effects and try to change them.  We sometimes realize that our model of “the style of thinking we teach” is out of touch with our best guesses about what’s good, and try to change it.
  7. We try to study any functional cultures that we see (e.g., particular functional computer science communities; particular communities found in history books), to figure out what magic was there.  We discuss this informally and with friends and with e.g. the instructor candidates.
  8. We try to figure out how thinking ever correlates with the world, and when different techniques make this better and worse in different contexts.  And we read the Sequences to remember that this is what we’re doing.
    • We could stand to do this one more; increasing this is a core planned shift for 2020.  But we’ve always done it some, including over the last few years.

The “core activity” exemplified in the above list is, of course, not RCT-style verifiable track records-y social science (which is one common meaning of “research”).  There is a lot of merit to that verifiable social science, but also a lot of slowness to it, and I cannot imagine using it to design the details of a curriculum, although I can imagine using it to see whether a curriculum has particular high-level effects.

We also still do some (but not as much as we wish we could do) actual data-tracking, and have plans to do modestly more of it over the coming year.  I expect this planned modest increase will be useful for our broader orientation but not much of a direct feed-in into curriculum, although it might help us tweak certain knobs upward or downward a little.

Comment by annasalamon on eigen's Shortform · 2019-12-02T06:53:22.351Z · score: 16 (4 votes) · LW · GW

I've reread portions of the Sequences, and have derived notable additional value from it. Particularly fruitful at one point (many years ago) was when I reread a bunch of the "Map and territory" stuff (Noticing Confusion; Mysterious Answers to Mysterious Questions; Fake Beliefs) while substituting in examples of "my beliefs about myself" in place of all of Eliezer's examples -- because somehow that was a different domain I hadn't trained the concepts on when I read it the first time.

I plan to probably do more such exercises soon. I've found "check where my trigger-action patterns are and aren't matching the normative patterns suggested by the Sequences, and design exercises to investigate this" pretty useful in general, and it's been ~5 years since I've done it, which seems like time for a re-do.

Comment by annasalamon on Rule Thinkers In, Not Out · 2019-02-28T00:02:57.342Z · score: 106 (48 votes) · LW · GW

I used to make the argument in the OP a lot. I applied it (among other applications) to Michael Vassar, who many people complained to me about (“I can’t believe he made obviously-fallacious argument X; why does anybody listen to him”), and who I encouraged them to continue listening to anyhow. I now regret this.

Here are the two main points I think past-me was missing:

1. Vetting and common knowledge creation are important functions, and ridicule of obviously-fallacious reasoning plays an important role in discerning which thinkers can (or can’t) help fill these functions.

(Communities — like the community of physicists, or the community of folks attempting to contribute to AI safety — tend to take a bunch of conclusions for granted without each-time-reexamining them, while trying to add to the frontier of knowledge/reasoning/planning. This can be useful, and it requires a community vetting function. This vetting function is commonly built via having a kind of “good standing” that thinkers/writers can be ruled out of (and into), and taking a claim as “established knowledge that can be built on” when ~all “thinkers in good standing” agree on that claim.

I realize the OP kind-of acknowledges this when discussing “social engineering”, so maybe the OP gets this right? But I undervalued this function in the past, and the term “social engineering” seems to me dismissive of a function that in my current view contributes substantially to a group’s ability to produce new knowledge.)

2. Even when a reader is seeking help brainstorming hypotheses (rather than vetting conclusions), they can still be lied-to and manipulated, and such lies/manipulations can sometimes disrupt their thinking for long and costly periods of time (e.g., handing Ayn Rand to the wrong 14-year-old; or, in my opinion, handing Michael Vassar to a substantial minority of smart aspiring rationalists). Distinguishing which thinkers are likely to lie or manipulate is a function more easily fulfilled by a group sharing info that rules thinkers out for past instances of manipulative or dishonest tactics (rather than by the individual listener planning to ignore past bad arguments and to just successfully detect every single manipulative tactic on their own).

So, for example, Julia Galef helpfully notes a case where Steven Pinker straightforwardly misrepresents basic facts about who said what. This is helpful to me in ruling out Steven Pinker as someone who I can trust not to lie to me about even straightforwardly checkable facts.

Similarly, back in 2011, a friend complained to me that Michael would cause EAs to choose the wrong career paths by telling them exaggerated things about their own specialness. This matched my own observations of what he was doing. Michael himself told me that he sometimes lied to people (not his words) and told them that the most helpful thing they could do about AI risk was to continue on their present career path (he said this was useful because that way they wouldn’t rationalize that AI risk must be false). Despite these and similar instances, I continued to recommend people talk to him because I had “ruled him in” as a source of some good novel ideas, and I did this without warning people about the rest of it. I think this was a mistake. (I also think that my recommending Michael led to considerable damage over time, but trying to establish that claim would require more discussion than seems to fit here.)

To be clear, I still think hypothesis-generating thinkers are valuable even when unreliable, and I still think that honest and non-manipulative thinkers should not be “ruled out” as hypothesis-sources for having some mistaken hypotheses (and should be “ruled in” for having even one correct-important-and-novel hypothesis). I just care more about the caveats here than I used to.

Comment by annasalamon on Some Thoughts on My Psychiatry Practice · 2019-01-19T19:08:16.179Z · score: 5 (2 votes) · LW · GW

Thanks for the portraits; I appreciate getting to read them. I'm curious what would happen if you got one of them to read "The Elephant in the Brain". (No idea if it'd be good or bad. Just seems like it might have some chance at causing something different.)

Comment by annasalamon on CFAR 2017 Retrospective · 2017-12-19T20:59:34.864Z · score: 27 (10 votes) · LW · GW

I continue to think CFAR is among the best places to donate re: turning money into existential risk reduction (including this year -- basically because the good we do seems almost linear in the number of free-to-participant programs we can run (because those can target high-impact AI stuff), and because the number of free-to-participant programs we can run is more or less linear in donations, within the range to which donations might plausibly take us). If anyone wants my take on how this works, or on our last year or our upcoming year or anything like that, I'd be glad to talk: anna at rationality dot org.

Comment by annasalamon on In the presence of disinformation, collective epistemology requires local modeling · 2017-12-16T20:59:13.387Z · score: 57 (18 votes) · LW · GW

RyanCarey writes:

If you are someone of median intelligence who just want to carry out a usual trade like making shoes or something, you can largely get by with recieved wisdom.

AFAICT, this only holds if you're in a stable sociopolitical/economic context -- and, more specifically still, the kind of stable sociopolitical environment that provides relatively benign information-sources. Examples of folks who didn't fall into this category: (a) folks living in eastern Europe in the late 1930s (especially if Jewish, but even if not; regardless of how traditional their trade was); (b) folks living in the Soviet Union (required navigating a non-explicit layer of received-from-underground knowledge); (c) folks literally making shoes during time-periods in which shoe-making was disrupted by the industrial revolution. It is to my mind an open question whether any significant portion of the US/Europe/etc. will fall into the "can get by largely with received wisdom" reference class across the next 10 years. (They might. I just actually can't tell.)

Comment by annasalamon on CFAR's new mission statement (on our website) · 2016-12-10T21:36:19.357Z · score: 0 (0 votes) · LW · GW

This is fair; I had in mind basic high school / Newtonian physics of everyday objects. (E.g., "If I drop this penny off this building, how long will it take to hit the ground?", or, more messily, "If I drive twice as fast, what impact would that have on the kinetic energy with which I would crash into a tree / what impact would that have on how badly deformed my car and I would be if I crash into a tree?").
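
For concreteness, here is the arithmetic behind those two everyday-physics prompts, as a minimal worked sketch (the 100 m building height is an arbitrary number chosen for the example, and air resistance is ignored):

```latex
% Penny dropped from height h (idealized free fall):
\[
  h = \tfrac{1}{2} g t^{2}
  \quad\Longrightarrow\quad
  t = \sqrt{\frac{2h}{g}}
  \approx \sqrt{\frac{2 \times 100\,\mathrm{m}}{9.8\,\mathrm{m/s^{2}}}}
  \approx 4.5\,\mathrm{s}.
\]
% Kinetic energy at speed v:
\[
  E_{k} = \tfrac{1}{2} m v^{2},
  \qquad
  E_{k}(2v) = 4\,E_{k}(v),
\]
% so driving twice as fast roughly quadruples (rather than doubles) the crash energy.
```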

Comment by annasalamon on Further discussion of CFAR’s focus on AI safety, and the good things folks wanted from “cause neutrality” · 2016-12-10T18:48:21.092Z · score: 5 (5 votes) · LW · GW

We would indeed love to help those people train.

Comment by annasalamon on Further discussion of CFAR’s focus on AI safety, and the good things folks wanted from “cause neutrality” · 2016-12-10T18:44:57.202Z · score: 9 (9 votes) · LW · GW

Yes. Or will seriously attempt this, at least. It seems required for cooperation and good epistemic hygiene.

Comment by annasalamon on CFAR's new mission statement (on our website) · 2016-12-10T18:41:35.421Z · score: 1 (1 votes) · LW · GW

Thanks; good point; will add links.

Comment by annasalamon on CFAR's new mission statement (on our website) · 2016-12-10T18:40:10.562Z · score: 1 (1 votes) · LW · GW

In case there are folks following Discussion but not Main: this mission statement was released along with:

Comment by annasalamon on CFAR’s new focus, and AI Safety · 2016-12-10T18:32:31.300Z · score: 1 (1 votes) · LW · GW

Oh, sorry, the two new docs are posted and were in the new ETA:

http://lesswrong.com/lw/o9h/further_discussion_of_cfars_focus_on_ai_safety/ and http://lesswrong.com/r/discussion/lw/o9j/cfars_new_mission_statement_on_our_website/

Comment by AnnaSalamon on [deleted post] 2016-12-10T08:31:08.209Z

Apologies; the link is broken and I'm not sure how to edit or delete it; real link is: http://rationality.org/about/mission