Rob B's Shortform Feed 2019-05-10T23:10:14.483Z · score: 19 (3 votes)
Helen Toner on China, CSET, and AI 2019-04-21T04:10:21.457Z · score: 71 (25 votes)
New edition of "Rationality: From AI to Zombies" 2018-12-15T21:33:56.713Z · score: 79 (30 votes)
On MIRI's new research directions 2018-11-22T23:42:06.521Z · score: 57 (16 votes)
Comment on decision theory 2018-09-09T20:13:09.543Z · score: 70 (26 votes)
Ben Hoffman's donor recommendations 2018-06-21T16:02:45.679Z · score: 40 (17 votes)
Critch on career advice for junior AI-x-risk-concerned researchers 2018-05-12T02:13:28.743Z · score: 203 (70 votes)
Two clarifications about "Strategic Background" 2018-04-12T02:11:46.034Z · score: 76 (22 votes)
Karnofsky on forecasting and what science does 2018-03-28T01:55:26.495Z · score: 17 (3 votes)
Quick Nate/Eliezer comments on discontinuity 2018-03-01T22:03:27.094Z · score: 70 (22 votes)
Yudkowsky on AGI ethics 2017-10-19T23:13:59.829Z · score: 88 (39 votes)
MIRI: Decisions are for making bad outcomes inconsistent 2017-04-09T03:42:58.133Z · score: 7 (8 votes)
CHCAI/MIRI research internship in AI safety 2017-02-13T18:34:34.520Z · score: 5 (6 votes)
MIRI AMA plus updates 2016-10-11T23:52:44.410Z · score: 15 (13 votes)
A few misconceptions surrounding Roko's basilisk 2015-10-05T21:23:08.994Z · score: 56 (52 votes)
The Library of Scott Alexandria 2015-09-14T01:38:27.167Z · score: 62 (52 votes)
[Link] Nate Soares is answering questions about MIRI at the EA Forum 2015-06-11T00:27:00.253Z · score: 19 (20 votes)
Rationality: From AI to Zombies 2015-03-13T15:11:20.920Z · score: 85 (84 votes)
Ends: An Introduction 2015-03-11T19:00:44.904Z · score: 3 (3 votes)
Minds: An Introduction 2015-03-11T19:00:32.440Z · score: 4 (6 votes)
Biases: An Introduction 2015-03-11T19:00:31.605Z · score: 73 (115 votes)
Rationality: An Introduction 2015-03-11T19:00:31.162Z · score: 15 (16 votes)
Beginnings: An Introduction 2015-03-11T19:00:25.616Z · score: 4 (3 votes)
The World: An Introduction 2015-03-11T19:00:12.370Z · score: 3 (3 votes)
Announcement: The Sequences eBook will be released in mid-March 2015-03-03T01:58:45.893Z · score: 47 (48 votes)
A forum for researchers to publicly discuss safety issues in advanced AI 2014-12-13T00:33:50.516Z · score: 12 (13 votes)
Stuart Russell: AI value alignment problem must be an "intrinsic part" of the field's mainstream agenda 2014-11-26T11:02:01.038Z · score: 26 (31 votes)
Groundwork for AGI safety engineering 2014-08-06T21:29:38.767Z · score: 13 (14 votes)
Politics is hard mode 2014-07-21T22:14:33.503Z · score: 40 (72 votes)
The Problem with AIXI 2014-03-18T01:55:38.274Z · score: 29 (29 votes)
Solomonoff Cartesianism 2014-03-02T17:56:23.442Z · score: 34 (31 votes)
Bridge Collapse: Reductionism as Engineering Problem 2014-02-18T22:03:08.008Z · score: 54 (49 votes)
Can We Do Without Bridge Hypotheses? 2014-01-25T00:50:24.991Z · score: 11 (12 votes)
Building Phenomenological Bridges 2013-12-23T19:57:22.555Z · score: 67 (60 votes)
The genie knows, but doesn't care 2013-09-06T06:42:38.780Z · score: 57 (63 votes)
The Up-Goer Five Game: Explaining hard ideas with simple words 2013-09-05T05:54:16.443Z · score: 29 (34 votes)
Reality is weirdly normal 2013-08-25T19:29:42.541Z · score: 33 (48 votes)
Engaging First Introductions to AI Risk 2013-08-19T06:26:26.697Z · score: 20 (27 votes)
What do professional philosophers believe, and why? 2013-05-01T14:40:47.028Z · score: 31 (44 votes)


Comment by robbbb on how should a second version of "rationality: A to Z" look like? · 2019-08-24T12:13:15.639Z · score: 2 (1 votes) · LW · GW

For Facebook, I use FBPurity to block my news feed. Then if there are particular individuals I especially want to follow, I add them to a Facebook List.

Comment by robbbb on Partial summary of debate with Benquo and Jessicata [pt 1] · 2019-08-17T22:44:48.018Z · score: 9 (4 votes) · LW · GW

For 'things that aren't an accident but aren't necessarily conscious or endorsed', another option might be to use language like 'decision', 'action', 'choice', etc. but flagged in a way that makes it clear you're not assuming full consciousness. Like 'quasi-decision', 'quasi-action', 'quasi-conscious'... Applied to Zack's case, that might suggest a term like 'quasi-dissembling' or 'quasi-misleading'. 'Dissonant communication' comes to mind as another idea.

When I want to emphasize that there's optimization going on but it's not necessarily conscious, I sometimes speak impersonally of "Bob's brain is doing X", or "a Bob-part/agent/subagent is doing X".

Comment by robbbb on Partial summary of debate with Benquo and Jessicata [pt 1] · 2019-08-17T13:03:57.623Z · score: 29 (8 votes) · LW · GW

I personally wouldn't point to "When Will AI Exceed Human Performance?" as an exemplar on this dimension, because it isn't clear about the interesting implications of the facts it's reporting. Katja's take-away from the paper was:

In the past, it seemed pretty plausible that what AI researchers think is a decent guide to what’s going to happen. I think we've pretty much demonstrated that that’s not the case. I think there are a variety of different ways we might go about trying to work out what AI timelines are like, and talking to experts is one of them; I think we should weight that one down a lot.

I don't know whether Katja's co-authors agree with her about that summary, but if there's disagreement, I think the paper still could have included more discussion of the question and which findings look relevant to it.

The actual Discussion section makes the opposite argument instead, listing a bunch of reasons to think AI experts are good at foreseeing AI progress. The introduction says "To prepare for these challenges, accurate forecasting of transformative AI would be invaluable. [...] The predictions of AI experts provide crucial additional information." And the paper includes a list of four "key findings", none of which even raise the question of survey respondents' forecasting chops, and all of which are worded in ways that suggest we should in fact put some weight on the respondents' views (sometimes switching between the phrasing 'researchers believe X' and 'X is true').

The abstract mentions the main finding that undermines how believable the responses are, but does so in such a way that someone reading through quickly might come away with the opposite impression. The abstract's structure is:

To adapt public policy, we need to better anticipate [AI advances]. Researchers predict [A, B, C, D, E, and F]. Researchers believe [G and H]. These results will inform discussion amongst researchers and policymakers about anticipating and managing trends in AI.

If it slips past the reader's attention that G and H are massively inconsistent, it's easy to come away thinking the abstract is saying 'Here's a list of credible statements from experts about their area of expertise' as opposed to 'Here's a demonstration that what AI researchers think is not a decent guide to what's going to happen'.

Comment by robbbb on Occam's Razor: In need of sharpening? · 2019-08-06T22:11:52.359Z · score: 7 (3 votes) · LW · GW

Humans might not be a low-level atom, but obviously we have to privilege the hypothesis 'something human-like did this' if we've already observed a lot of human-like things in our environment.

Suppose I'm a member of a prehistoric tribe, and I see a fire in the distance. It's fine for me to say 'I have a low-ish prior on a human starting the fire, because (AFAIK) there are only a few dozen humans in the area'. And it's fine for me to say 'I've never seen a human start a fire, so I don't think a human started this fire'. But it's not fine for me to say 'It's very unlikely a human started that fire, because human brains are more complicated than other phenomena that might start fires', even if I correctly intuit how and why humans are more complicated than other phenomena.
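To make the contrast concrete, here's a toy Bayes calculation (all numbers invented purely for illustration): the prior for 'a human started the fire' comes from the observed base rate of humans in the area, and the complexity of human brains never enters into it.

```python
# Toy illustration with made-up numbers: once humans have been observed
# in the environment, the prior on "a human started the fire" comes from
# observed base rates, not from a complexity penalty on brains.
from fractions import Fraction

# Rough base rates for candidate fire-starters (invented):
priors = {
    "lightning": Fraction(5, 10),
    "dry-grass self-ignition": Fraction(4, 10),
    "human": Fraction(1, 10),  # only a few dozen humans in the area
}

# Likelihood of seeing a distant fire like this one, given each cause
# (also invented):
likelihoods = {
    "lightning": Fraction(1, 2),
    "dry-grass self-ignition": Fraction(1, 4),
    "human": Fraction(1, 2),
}

# Standard Bayes: multiply prior by likelihood, then normalize.
unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnormalized.values())
posterior = {h: unnormalized[h] / total for h in priors}
```

The punchline is that 'human' ends up with a low posterior (1/8 here) entirely because of the base rate — few humans around — and not because of anything about how complicated brains are.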

The case of Thor is a bit more complicated, because gods are different from humans. If Eliezer and cousin_it disagree on this point, maybe Eliezer would say 'The complexity of the human brain is the biggest reason why you shouldn't infer that there are other, as-yet-unobserved species of human-brain-ish things that are very different from humans', and maybe cousin_it would say 'No, it's pretty much just the differentness-from-observed-humans (on the "has direct control over elemental forces" dimension) that matters, not the fact that it has a complicated brain.'

If that's a good characterization of the disagreement, then it seems like Eliezer might say 'In ancient societies, it was much more reasonable to posit mindless "supernatural" phenomena (i.e., mindless physical mechanisms wildly different from anything we've observed) than to posit intelligent supernatural phenomena.' Whereas the hypothetical cousin-it might say that ancient people didn't have enough evidence to conclude that gods were any more unlikely than mindless mechanisms that were similarly different from experience. Example question: what probability should ancient people have assigned to

"The regular motion of the planets is due to a random process plus a mindless invisible force, like the mindless invisible force that causes recently-cooked food to cool down all on its own."

versus

"The regular motion of the planets is due to deliberate design / intelligent intervention, like the intelligent intervention that arranges and cooks food."

Comment by robbbb on AI Alignment Open Thread August 2019 · 2019-08-06T18:36:35.494Z · score: 3 (2 votes) · LW · GW

Also the discussion of deconfusion research in and , and the sketch of 'why this looks like a hard problem in general' in and .

Comment by robbbb on AI Alignment Open Thread August 2019 · 2019-08-06T18:30:47.722Z · score: 13 (4 votes) · LW · GW

MIRIx events are funded by MIRI, but we don't decide the topics or anything. I haven't taken a poll of MIRI researchers to see how enthusiastic different people are about formal verification, but AFAIK Nate and Eliezer don't see it as super relevant. See and the idea of a "safety-story" in for better attempts to characterize what MIRI is looking for.

ETA: From the end of the latter dialogue,

In point of fact, the real reason the author is listing out this methodology is that he's currently trying to do something similar on the problem of aligning Artificial General Intelligence, and he would like to move past “I believe my AGI won't want to kill anyone” and into a headspace more like writing down statements such as “Although the space of potential weightings for this recurrent neural net does contain weight combinations that would figure out how to kill the programmers, I believe that gradient descent on loss function L will only access a result inside subspace Q with properties P, and I believe a space with properties P does not include any weight combinations that figure out how to kill the programmer.”
Though this itself is not really a reduced statement and still has too much goal-laden language in it.

Rather than putting the emphasis on being able to machine-verify all important properties of the system, this puts the emphasis on having strong technical insight into the system; I usually think of formal proofs more as a means to that end. (Again caveating that some people at MIRI might think of this differently.)

Comment by robbbb on Feedback Requested! Draft of a New About/Welcome Page for LessWrong · 2019-06-02T01:34:36.448Z · score: 4 (2 votes) · LW · GW

A tricky thing about this is that there's an element of cognitive distortion in how most people evaluate these questions, and play-acting at "this distortion makes sense" can worsen the distortion (at the same time that it helps win more trust from people who have the distortion).

If it turned out to be a good idea to try to speak to this perspective, I'd recommend first meditating on a few reversal tests. Like: "Hmm, I wouldn't feel any need to add a disclaimer here if the text I was recommending were The Brothers Karamazov, though I'd want to briefly say why it's relevant, and I might worry about the length. I'd feel a bit worried about recommending a young adult novel, even an unusually didactic one, because people rightly expect YA novels to be optimized for less useful and edifying things than the "literary classics" reference class. The insights tend to be shallower and less common. YA novels and fanfiction are similar in all those respects, and they provoke basically the same feeling in me, so I can maybe use that reversal test to determine what kinds of disclaimers or added context make sense here."

Comment by robbbb on FB/Discord Style Reacts · 2019-06-01T22:43:33.318Z · score: 2 (1 votes) · LW · GW

(If I want to express stronger gratitude than that, I'd rather write it out.)

Comment by robbbb on FB/Discord Style Reacts · 2019-06-01T22:42:28.296Z · score: 2 (1 votes) · LW · GW

On Slack, the Thumbs Up, OK, and Horns hand signs meet all my minor needs for thanking people.

Comment by robbbb on Drowning children are rare · 2019-05-30T01:28:16.487Z · score: 6 (3 votes) · LW · GW

Can't individuals just list 'Reign of Terror' and then specify in their personalized description that they have a high bar for terror?

Comment by robbbb on Coherent decisions imply consistent utilities · 2019-05-14T19:46:56.205Z · score: 5 (3 votes) · LW · GW

We'd talked about getting a dump out as well, and your plan sounds great to me! The LW team should get back to you with a list at some point (unless they think of a better idea).

Comment by robbbb on Coherent decisions imply consistent utilities · 2019-05-14T03:44:21.316Z · score: 16 (8 votes) · LW · GW

I asked Eliezer if it made sense to cross-post this from Arbital, and did the cross-posting when he approved. I'm sorry it wasn't clear that this was a cross-post! I intended to make this clearer, but my idea was bad (putting the information on the sequence page) and I also implemented it wrong (the sequence didn't previously display on the top of this post).

This post was originally written as a nontechnical introduction to expected utility theory and coherence arguments. Although it begins in medias res stylistically, it doesn't have any prereqs or context beyond "this is part of a collection of introductory resources covering a wide variety of technical and semitechnical topics."

Per the first sentence, the main purpose is for this to be a linkable resource for conversations/inquiry about human rationality and conversations/inquiry about AGI:

So we're talking about how to make good decisions, or the idea of 'bounded rationality', or what sufficiently advanced Artificial Intelligences might be like; and somebody starts dragging up the concepts of 'expected utility' or 'utility functions'. And before we even ask what those are, we might first ask, Why?

There have been loose plans for a while to cross-post content from Arbital to LW (maybe all of it; maybe just the best or most interesting stuff), but as I mentioned downthread, we're doing more cross-post experiments sooner than we would have because Arbital's been having serious performance issues.

Comment by robbbb on Coherent decisions imply consistent utilities · 2019-05-14T03:34:04.620Z · score: 5 (3 votes) · LW · GW

I assume you mean 'no one has this responsibility for Arbital anymore', and not that there's someone else who has this responsibility.

Comment by robbbb on Coherent decisions imply consistent utilities · 2019-05-14T02:01:21.742Z · score: 10 (4 votes) · LW · GW

Arbital has been getting increasingly slow and unresponsive. The LW team is looking for fixes or work-arounds, but they aren't familiar with the Arbital codebase. In the meantime, I've been helping cross-post some content from Arbital to LW so it's available at all.

Comment by robbbb on Any rebuttals of Christiano and AI Impacts on takeoff speeds? · 2019-05-12T01:28:50.738Z · score: 29 (8 votes) · LW · GW

MIRI folks are the most prominent proponents of fast takeoff, and we unfortunately haven't had time to write up a thorough response. Oli already quoted the quick comments I posted from Nate and Eliezer last year, and I'll chime in with some of the factors that I think are leading to disagreements about takeoff:

  • Some MIRI people (Nate is one) suspect we might already be in hardware overhang mode, or closer to that point than some other researchers in the field believe.
  • MIRI folks tend to have different views from Paul about AGI, some of which imply that AGI is more likely to be novel and dependent on new insights. (Unfair caricature: Imagine two people in the early 20th century who don't have a technical understanding of nuclear physics yet, trying to argue about how powerful a nuclear-chain-reaction-based bomb might be. If one side were to model that kind of bomb as "sort of like TNT 3.0" while the other is modeling it as "sort of like a small Sun", they're likely to disagree about whether nuclear weapons are going to be a small v. large improvement over TNT. Note I'm just using nuclear weapons as an analogy, not giving an outside-view argument "sometimes technologies are discontinuous, ergo AGI will be discontinuous".)

This list isn't at all intended to be sufficiently-detailed or exhaustive.

I'm hoping we have time to write up more thoughts on this before too long, because this is an important issue (even given that we're trying to minimize the researcher time we put into things other than object-level deconfusion research). I don't want MIRI to be a blocker on other researchers making progress on these issues, though — it would be bad if people put a pause on hashing out takeoff issues for themselves (or put a pause on alignment research that's related to takeoff views) until Eliezer had time to put out a blog post. I primarily wanted to make sure people know that the lack of a substantive response doesn't mean that Nate+Eliezer+Benya+etc. agree with Paul on takeoff issues now, or that we don't think this disagreement matters. Our tardiness is because of opportunity costs and because our views have a lot of pieces to articulate.

Comment by robbbb on Rob B's Shortform Feed · 2019-05-11T20:21:03.506Z · score: 2 (1 votes) · LW · GW


Comment by robbbb on Rob B's Shortform Feed · 2019-05-11T20:18:55.445Z · score: 2 (1 votes) · LW · GW

That counts! :) Part of why I'm asking is in case we want to build a proper LW glossary, and Rationality Cardinality could at least provide ideas for terms we might be missing.

Comment by robbbb on Rob B's Shortform Feed · 2019-05-10T23:19:00.628Z · score: 4 (2 votes) · LW · GW

Are there any other OK-quality rationalist glossaries out there? is the only one I know of. I vaguely recall there being one on at some point, but I might be misremembering.

Comment by robbbb on Rob B's Shortform Feed · 2019-05-10T23:13:24.150Z · score: 7 (3 votes) · LW · GW

The wiki glossary for the sequences / Rationality: A-Z ( ) is updated now with the glossary entries from the print edition of vol. 1-2.

New entries from Map and Territory:

anthropics, availability heuristic, Bayes's theorem, Bayesian, Bayesian updating, bit, Blue and Green, calibration, causal decision theory, cognitive bias, conditional probability, confirmation bias, conjunction fallacy, deontology, directed acyclic graph, elan vital, Everett branch, expected value, Fermi paradox, foozality, hindsight bias, inductive bias, instrumental, intentionality, isomorphism, Kolmogorov complexity, likelihood, maximum-entropy probability distribution, probability distribution, statistical bias, two-boxing

New entries from How to Actually Change Your Mind:

affect heuristic, causal graph, correspondence bias, epistemology, existential risk, frequentism, Friendly AI, group selection, halo effect, humility, intelligence explosion, joint probability distribution, just-world fallacy, koan, many-worlds interpretation, modesty, transhuman

A bunch of other entries from the M&T and HACYM glossaries were already on the wiki; most of these have been improved a bit or made more concise.

Comment by robbbb on Alignment Newsletter One Year Retrospective · 2019-05-06T06:02:49.888Z · score: 6 (3 votes) · LW · GW

One option that's smaller than link posts might be to mention in the AF/LW version of the newsletter which entries are new to AIAF/LW as far as you know; or make comment threads in the newsletter for those entries. I don't know how useful these would be either, but it'd be one way to create common knowledge 'this is currently the one and only place to discuss these things on LW/AIAF'.

Comment by robbbb on [Meta] Hiding negative karma notifications by default · 2019-05-06T01:54:21.736Z · score: 18 (6 votes) · LW · GW

Possible compromise idea: send everyone their karma upvotes along with downvotes regularly, but send the upvotes in daily batches and the downvotes in monthly batches. Having your downvotes sent to you at known, predictable times rather than in random bursts, and having the updates occur less often, might let users take in the relevant information without having it totally dominate their day-to-day experience of visiting the site. This also makes it easier to spot patterns and to properly discount very small aversive changes in vote totals.

On the whole, I'm not sure how useful this would be as a sitewide default. Some concerns:

  • It's not clear to me that karma on its own is all that useful or contentful. Ray recently noted that a comment of his had gotten downvoted somewhat, and that this had been super salient and pointed feedback for him. But I'm pretty sure that the 'downvote' Ray was talking about was actually just me turning a strong upvote into a normal upvote for minor / not-worth-independently-tracking reasons. Plenty of people vote for obscure or complicated or just-wrong reasons.
  • The people who get downvoted the most are likely to have less familiarity with LW norms and context, so they'll be especially ill-equipped to extract actionable information from downvotes. If all people are learning is '<confusing noisy social disapproval>', I'm not sure that's going to help them very much in their journey as a rationalist.

Upvotes tend to be a clearer signal in my experience, while needing to meet a lower bar. (Cf.: we have a higher epistemic bar for establishing a norm 'let's start insulting/criticizing/calling out our colleagues whenever they make a mistake' than for establishing a norm 'let's start complimenting/praising/thanking our colleagues whenever they do something cool', and it would be odd to say that the latter is categorically bad in any environment where we don't also establish the former norm.)

I'm not confident of what the right answer is; this is just me laying out some counter-considerations. I like Mako's comment because it's advocating for an important value, and expressing a not-obviously-wrong concern about that value getting compromised. I lean toward 'don't make downvotes this salient' right now. I'd like more clarity inside my head about how much the downvote-hiding worry is shaped like 'we need to make downvotes more salient so we can actually get the important intellectual work done' vs. 'we need to make downvotes more salient so we can better symbolize/resemble Rationality'.

Comment by robbbb on Open Thread May 2019 · 2019-05-03T05:06:27.709Z · score: 5 (3 votes) · LW · GW

Hi! I am a biased MIRI person, but I quite dig all the things you mentioned. :)

Comment by robbbb on Habryka's Shortform Feed · 2019-05-02T22:09:44.951Z · score: 7 (4 votes) · LW · GW

I like this shortform feed idea!

Comment by robbbb on Habryka's Shortform Feed · 2019-05-01T18:06:57.710Z · score: 4 (2 votes) · LW · GW

Yeah, strong upvote to this point. Having an Arbital-style system where people's probabilities aren't prominently timestamped might be the worst of both worlds, though, since it discourages updating and makes it look like most people never do it.

I have an intuition that something socially good might be achieved by seeing high-status rationalists treat ass numbers as ass numbers, brazenly assign wildly different probabilities to the same proposition week-by-week, etc., especially if this is a casual and incidental thing rather than being the focus of any blog posts or comments. This might work better, though, if the earlier probabilities vanish by default and only show up again if the user decides to highlight them.

(Also, if a user repeatedly abuses this feature to look a lot more accurate than they really were, this warrants mod intervention IMO.)

Comment by robbbb on Habryka's Shortform Feed · 2019-04-30T23:36:36.309Z · score: 5 (3 votes) · LW · GW

Also, if you do something Arbital-like, I'd find it valuable if the interface encourages people to keep updating their probabilities later as they change. E.g., some (preferably optional) way of tracking how your view has changed over time. Probably also make it easy for people to re-vote without checking (and getting anchored by) their old probability assignment, for people who want that.

Comment by robbbb on Habryka's Shortform Feed · 2019-04-30T23:35:02.135Z · score: 4 (2 votes) · LW · GW

One small thing you could do is to have probability tools be collapsed by default on any AIAF posts (and maybe even on the LW versions of AIAF posts).

Also, maybe someone should write a blog post that's a canonical reference for 'the relevant risks of using probabilities that haven't already been written up', in advance of the feature being released. Then you could just link to that a bunch. (Maybe even include it in the post that explains how the probability tools work, and/or link to that post from all instances of the probability tool.)

Another idea: Arbital had a mix of (1) 'specialized pages that just include a single probability poll and nothing else'; (2) 'pages that are mainly just about listing a ton of probability polls'; and (3) 'pages that have a bunch of other content but incidentally include some probability polls'.

If probability polls on LW mostly looked like (1) and (2) rather than (3), then that might make it easier to distinguish the parts of LW that should be very probability-focused from the parts that shouldn't. I.e., you could avoid adding Arbital's feature for easily embedding probability polls in arbitrary posts (and/or arbitrary comments), and instead treat this more as a distinct kind of page, like 'Questions'.

You could still link to the 'Probability' pages prominently in your post, but the reduced prominence and site support might cause there to be less social pressure for people to avoid writing/posting things out of fears like 'if I don't provide probability assignments for all my claims in this blog post, or don't add a probability poll about something at the end, will I be seen as a Bad Rationalist?'

Comment by robbbb on Habryka's Shortform Feed · 2019-04-30T23:17:48.363Z · score: 2 (1 votes) · LW · GW

I've never checked my karma total on LW 2.0 to see how it's changed.

Comment by robbbb on Habryka's Shortform Feed · 2019-04-28T03:40:56.888Z · score: 5 (3 votes) · LW · GW

I am most worried that this will drastically increase the clutter of comment threads and make things a lot harder to parse. In particular if the order of the reacts is different on each comment, since then there is no reliable way of scanning for the different kinds of information.

I like the reactions UI above, partly because separating it from karma makes it clearer that it's not changing how comments get sorted, and partly because I do want 'agree'/'disagree' to be non-anonymous by default (unlike normal karma).

I agree that the order of reacts should always be the same. I also think every comment/post should display all the reacts (even just to say '0 Agree, 0 Disagree...') to keep things uniform. That means I think there should only be a few permitted reacts -- maybe start with just 'Agree' and 'Disagree', then wait 6+ months and see if users are especially clamoring for something extra.

I think the obvious other reacts I'd want to use sometimes are 'agree and downvote' + 'disagree and upvote' (maybe shortened to Agree+Down, Disagree+Up), since otherwise someone might not realize that one and the same person is doing both, which loses a fair amount of the information I want to be able to fluidly signal. (I don't think there's much value to clearly signaling that the same person agreed and upvoted or disagreed and downvoted a thing.)

I would also sometimes click both the 'agree' and 'disagree' buttons, which I think is fine to allow under this UI. :)

Comment by robbbb on Speaking for myself (re: how the LW2.0 team communicates) · 2019-04-27T20:38:02.955Z · score: 9 (5 votes) · LW · GW

*disagrees with and approves of this relevant, interesting, and non-confused comment*

Comment by robbbb on Helen Toner on China, CSET, and AI · 2019-04-23T19:05:20.792Z · score: 5 (4 votes) · LW · GW

"How are we counting Chinese versus non-Chinese papers? Because often, it seems to be just doing it via, "Is their last name Chinese?" Which seems like it really is going to miscount." seems unreasonably skeptical. It's not too much harder to just look up the country of the university/organization that published the paper.

? "Skeptical" implies that this is speculation on Helen's part, whereas I took her to be asserting as fact that this is the methodology that some studies in this category use, and that this isn't a secret or anything. This may be clearer in the full transcript:

Julia Galef: So, I'm curious -- one thing that people often cite is that China publishes more papers on deep learning than the US does. Deep learning, maybe we explained that already, it's the dominant paradigm in AI that's generating a lot of powerful results.
Helen Toner: Mm-hmm.
Julia Galef: So, would you consider that, “number of papers published on deep learning,” would you consider that a meaningful metric?
Helen Toner: I mean, I think it's meaningful. I don't think it is the be-all and end-all metric. I think it contains some information. I think the thing I find frustrating about how central that metric has been is that usually it's mentioned with no sort of accompanying … I don't know. This is a very Rationally Speaking thing to say, so I'm glad I'm on this podcast and not another one…
But it's always mentioned without sort of any kind of caveats or any kind of context. For example, how are we counting Chinese versus non-Chinese papers? Because often, it seems to be just doing it via, "Is their last name Chinese," which seems like it really is going to miscount.
Julia Galef: Oh, wow! There are a bunch of people with Chinese last names working at American AI companies.
Helen Toner: Correct, many of whom are American citizens. So, I think I've definitely seen at least some measures that do that wrong, which seems just completely absurd. But then there's also, if you have a Chinese citizen working in an American university, how should that be counted? Is that a win for the university or is it win for China? It's very unclear.
And they also, these counts of papers have a hard time sort of saying anything about the quality of the papers involved. You can look at citations, but that's not a perfect metric. But it's better, for sure.
And then, lastly, they rarely say anything about the different incentives that Chinese and non-Chinese academics face in publishing. [...]
Comment by robbbb on Book review: The Sleepwalkers by Arthur Koestler · 2019-04-23T18:57:02.444Z · score: 4 (2 votes) · LW · GW

Maybe someday! :)

Comment by robbbb on Book review: The Sleepwalkers by Arthur Koestler · 2019-04-23T11:44:49.460Z · score: 10 (4 votes) · LW · GW

Overconfidence in sentences like "the moon has craters" may be a sin. (Though I'd disagree that this sin category warrants banning someone from talking about the moon's craters and trapping them within a building with threats of force for nine years. YMMV.)

Thinking that the sentence "the moon has craters" refers to the moon, and asserts of the moon that there are craters on it, doesn't seem like a sin at all to me, regardless of whether some scientific models (e.g., in QM) are sometimes useful for reasons we don't understand.

Comment by robbbb on Evidence other than evolution for optimization daemons? · 2019-04-21T21:02:01.510Z · score: 4 (2 votes) · LW · GW

"Catholicism predicts that all soulless optimizers will explicitly represent and maximize their evolutionary fitness function" is a pretty unusual view (even as Catholic views go)! If you want answers to take debates about God and free will into account, I suggest mentioning God/Catholicism in the title.

More broadly, my recommendation would be to read all of and flag questions and disagreements there before trying to square any AI safety stuff with your religious views.

Comment by robbbb on Slack Club · 2019-04-19T17:44:45.798Z · score: 18 (4 votes) · LW · GW

I agree with a bunch of these concerns. FWIW, it wouldn't surprise me if the current rationalist community still behaviorally undervalues "specialized jargon" (or, rather than jargon, concept handles). I don't have a strong view on whether rationalists undervalue or overvalue this kind of thing, but it seems worth commenting on since it's being discussed a lot here.

When I observe the reasons people ended up 'working smarter' or changing course in a good way, it often involves a new lens they started applying to something. I think one of the biggest problems the rationalist community faces is a lack of dakka and a lack of lead bullets. But I guess I want to caution against treating abstraction and execution as too much of a dichotomy, such that we have to choose between "novel LW posts are useful and high-status" and "conscientiousness and follow-through is useful and high-status" and see-saw between the two.

The important thing is cutting the enemy, and I think the kinds of problems that rationalists are in an especially good position to solve require individuals to exhibit large amounts of execution and follow-through while (on a timescale of years) doing a large number of big and small course-corrections to improve their productivity or change their strategy.

It might be that we're doing too much reflection and too much coming up with lenses. It might also be that we're not doing enough grunt work and not doing enough reflection and lenscrafting. Physical tasks don't care whether we're already doing an abnormal amount of one or the other; the universe just hands us problems of a certain difficulty, and if we fall short on any of the requirements then we fail.

It might also be that this varies by individual, such that it's best to just make sure people are aware of these different concerns so they can check which holds true in their own circumstance.

Comment by robbbb on "Intelligence is impossible without emotion" — Yann LeCun · 2019-04-10T21:47:53.444Z · score: 13 (4 votes) · LW · GW

My prior is that Yann LeCun tends to have unmysterious, thoughtful models of AI (example), even though I strongly disagree with (and am often confused by) his claims about AI safety. So when Yann says "emotion", I wonder if he means anything more than that they "can decide what they do" and have "some intrinsic drive that makes them [...] do particular things" as opposed to having "preprogrammed behavior".

Comment by robbbb on Comparison of decision theories (with a focus on logical-counterfactual decision theories) · 2019-03-18T05:13:53.675Z · score: 6 (3 votes) · LW · GW

Agents need to consider multiple actions and choose the one with the best outcome. But we're supposing that the code implementing the agent's decision has only one possible output. E.g., suppose an agent is choosing between action A and action B, and will end up choosing A. Then a sufficiently close examination of the agent's source code will reveal that the scenario "the agent chooses B" is logically inconsistent. But then it's not clear how the agent can evaluate the desirability of "the agent chooses B" at all, unless it has some mechanism for nontrivially reasoning about the outcomes of logically inconsistent scenarios.
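A toy sketch of the tension above (all names here are hypothetical, not from any actual decision-theory codebase): the agent's decision procedure deterministically returns one action, so a strong enough reasoner inspecting the source code could prove "the agent returns B" false, even though evaluating that very branch is part of how the decision gets made.

```python
def utility(action):
    # A fixed payoff table for the two available actions.
    return {"A": 10, "B": 5}[action]

def agent():
    # This procedure is deterministic: it always returns "A".
    # So a proof-search over this source code can establish that
    # 'agent() == "B"' is inconsistent -- yet the agent still has
    # to evaluate utility("B") in order to reach its decision.
    if utility("A") >= utility("B"):
        return "A"
    return "B"

print(agent())  # prints "A"
```

The puzzle is what the conditional's "B" branch even means from the agent's own perspective, given that (from outside) the branch provably never executes.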

Comment by robbbb on Comparison of decision theories (with a focus on logical-counterfactual decision theories) · 2019-03-17T19:49:37.776Z · score: 6 (3 votes) · LW · GW

The comment starting "The main datapoint that Rob left out..." is actually by Nate Soares. I cross-posted it to LW from an email conversation.

Comment by robbbb on Question: MIRI Corrigbility Agenda · 2019-03-16T18:28:52.502Z · score: 4 (2 votes) · LW · GW

I've now also highlighted Scott's tip from "Fixed Point Exercises":

Sometimes people ask me what math they should study in order to get into agent foundations. My first answer is that I have found the introductory class in every subfield to be helpful, but I have found the later classes to be much less helpful. My second answer is to learn enough math to understand all fixed point theorems.
These two answers are actually very similar. Fixed point theorems span all across mathematics, and are central to (my way of) thinking about agent foundations.

Comment by robbbb on Question: MIRI Corrigbility Agenda · 2019-03-16T14:31:39.854Z · score: 4 (2 votes) · LW · GW

I'd expect Jessica/Stuart/Scott/Abram/Sam/Tsvi to have a better sense of that than me. I didn't spot any obvious signs that it's no longer a good reference.

Comment by robbbb on Question: MIRI Corrigbility Agenda · 2019-03-15T05:44:20.940Z · score: 5 (3 votes) · LW · GW

For corrigibility in particular, some good material that's not discussed in "Embedded Agency" or the reading guide is Arbital's Corrigibility and Problem of Fully Updated Deference articles.

Comment by robbbb on Question: MIRI Corrigbility Agenda · 2019-03-15T05:36:42.850Z · score: 14 (5 votes) · LW · GW

The only major changes we've made to the MIRI research guide since mid-2015 are to replace Koller and Friedman's Probabilistic Graphical Models with Pearl's Probabilistic Inference; replace Rosen's Discrete Mathematics with Lehman et al.'s Mathematics for CS; add Taylor et al.'s "Alignment for Advanced Machine Learning Systems", Wasserman's All of Statistics, Shalev-Shwartz and Ben-David's Understanding Machine Learning, and Yudkowsky's Inadequate Equilibria; and remove the Global Catastrophic Risks anthology. So the guide is missing a lot of new material. I've now updated the guide to add the following note at the top:

This research guide has been only lightly updated since 2015. Our new recommendation for people who want to work on the AI alignment problem is:
1. If you have a computer science or software engineering background: Apply to attend our new workshops on AI risk and to work as an engineer at MIRI. For this purpose, you don’t need any prior familiarity with our research.
If you aren’t sure whether you’d be a good fit for an AI risk workshop, or for an engineer position, shoot us an email and we can talk about whether it makes sense.
You can find out more about our engineering program in our 2018 strategy update.
2. If you’d like to learn more about the problems we’re working on (regardless of your answer to the above): See “Embedded Agency” for an introduction to our agent foundations research, and see our Alignment Research Field Guide for general recommendations on how to get started in AI safety.
After checking out those two resources, you can use the links and references in “Embedded Agency” and on this page to learn more about the topics you want to drill down on. If you want a particular problem set to focus on, we suggest Scott Garrabrant’s “Fixed Point Exercises.”
If you want people to collaborate and discuss with, we suggest starting or joining a MIRIx group, posting on LessWrong, applying for our AI Risk for Computer Scientists workshops, or otherwise letting us know you’re out there.
Comment by robbbb on "AlphaStar: Mastering the Real-Time Strategy Game StarCraft II", DeepMind [won 10 of 11 games against human pros] · 2019-03-13T18:07:48.572Z · score: 7 (3 votes) · LW · GW

After all, they didn't get any less publicity for reporting the system's other limitations either, like its only being able to play Protoss v. Protoss on a single map, or 10 of the 11 agents having whole-camera vision.

They might well have gotten less publicity due to emphasizing those facts as much as they did.

Comment by robbbb on "AlphaStar: Mastering the Real-Time Strategy Game StarCraft II", DeepMind [won 10 of 11 games against human pros] · 2019-03-13T18:04:16.343Z · score: 5 (2 votes) · LW · GW

I mostly agree with this comment. My speculative best guess is that the main reason MaNa did better against the revised version of AlphaStar wasn't due to the vision limitations, but rather some combination of:

MaNa had more time to come up with a good strategy and analyze previous games.

MaNa had more time to warm up, and was generally in a better headspace.

The previous version of AlphaStar was unusually good, and the new version was an entirely new system, so the new version regressed to the mean a bit. (On the dimension "can beat human pros", even though it was superior on the dimension "can beat other AlphaStar strategies".)

Comment by robbbb on Considerateness in OpenAI LP Debate · 2019-03-12T22:24:22.462Z · score: 10 (2 votes) · LW · GW

Eliezer responded to Chollet's post about intelligence explosion here.

Comment by robbbb on Renaming "Frontpage" · 2019-03-12T00:53:11.596Z · score: 4 (2 votes) · LW · GW

Personal Blog ➜ Notebook

Messages ➜ Mailbox

Comment by robbbb on Renaming "Frontpage" · 2019-03-11T16:21:11.300Z · score: 12 (4 votes) · LW · GW

Frontpage ➜ Whiteboard

Art ➜ Canvas

Coordination ➜ Bulletin Board

Meta ➜ Website

Comment by robbbb on In My Culture · 2019-03-10T21:12:27.930Z · score: 6 (4 votes) · LW · GW

I like this comment.

Comment by robbbb on Renaming "Frontpage" · 2019-03-09T04:13:21.305Z · score: 3 (2 votes) · LW · GW

Oooh, I like this. Fewer top-level sections seems good to me.

Comment by robbbb on In My Culture · 2019-03-08T00:03:09.758Z · score: 4 (2 votes) · LW · GW

That was my draft 1. :P

Comment by robbbb on In My Culture · 2019-03-07T21:51:57.113Z · score: 15 (5 votes) · LW · GW

For my personal usage, the way I could imagine using it, "in my culture" sounds a bit serious and final. "Where I'm from, we do X" is nice if I want something to sound weighty and powerful and stable, but I just don't think I've figured myself out enough to do that much yet. There might also be a bit of confusion in that "in my culture" has a structurally similar literal meaning.

"In Robopolis" seems to fix these problems for me, since it more clearly flags that I'm not talking about a literal culture, and it sounds more agnostic about whether this is a deep part of who I am vs. a passing fashion.