Posts

[Review] On the Chatham House Rule (Ben Pace, Dec 2019) 2019-12-10T00:24:57.206Z · score: 36 (10 votes)
[Review] Meta-Honesty (Ben Pace, Dec 2019) 2019-12-10T00:14:30.837Z · score: 24 (5 votes)
The Review Phase: Helping LessWrongers Evaluate Old Posts 2019-12-09T00:54:28.514Z · score: 45 (9 votes)
The Lesson To Unlearn 2019-12-08T00:50:47.882Z · score: 39 (12 votes)
Is the rate of scientific progress slowing down? (by Tyler Cowen and Ben Southwood) 2019-12-02T03:45:56.870Z · score: 34 (10 votes)
Useful Does Not Mean Secure 2019-11-30T02:05:14.305Z · score: 48 (13 votes)
AI Alignment Research Overview (by Jacob Steinhardt) 2019-11-06T19:24:50.240Z · score: 44 (9 votes)
How feasible is long-range forecasting? 2019-10-10T22:11:58.309Z · score: 43 (12 votes)
AI Alignment Writing Day Roundup #2 2019-10-07T23:36:36.307Z · score: 35 (9 votes)
Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More 2019-10-04T04:08:49.942Z · score: 168 (59 votes)
Follow-Up to Petrov Day, 2019 2019-09-27T23:47:15.738Z · score: 83 (27 votes)
Honoring Petrov Day on LessWrong, in 2019 2019-09-26T09:10:27.783Z · score: 138 (52 votes)
SSC Meetups Everywhere: Salt Lake City, UT 2019-09-14T06:37:12.296Z · score: 0 (0 votes)
SSC Meetups Everywhere: San Diego, CA 2019-09-14T06:34:33.492Z · score: 0 (0 votes)
SSC Meetups Everywhere: San Jose, CA 2019-09-14T06:31:06.068Z · score: 0 (0 votes)
SSC Meetups Everywhere: San José, Costa Rica 2019-09-14T06:25:45.112Z · score: 0 (0 votes)
SSC Meetups Everywhere: São José dos Campos, Brazil 2019-09-14T06:18:23.523Z · score: 0 (0 votes)
SSC Meetups Everywhere: Seattle, WA 2019-09-14T06:13:06.891Z · score: 0 (-1 votes)
SSC Meetups Everywhere: Seoul, South Korea 2019-09-14T06:08:26.697Z · score: 0 (0 votes)
SSC Meetups Everywhere: Sydney, Australia 2019-09-14T05:53:45.606Z · score: 0 (0 votes)
SSC Meetups Everywhere: Tampa, FL 2019-09-14T05:49:31.139Z · score: 0 (0 votes)
SSC Meetups Everywhere: Toronto, Canada 2019-09-14T05:45:15.696Z · score: 0 (-1 votes)
SSC Meetups Everywhere: Vancouver, Canada 2019-09-14T05:39:25.503Z · score: 0 (0 votes)
SSC Meetups Everywhere: Victoria, BC, Canada 2019-09-14T05:34:40.937Z · score: 0 (-1 votes)
SSC Meetups Everywhere: Vienna, Austria 2019-09-14T05:27:31.640Z · score: 2 (2 votes)
SSC Meetups Everywhere: Warsaw, Poland 2019-09-14T05:24:16.061Z · score: 0 (0 votes)
SSC Meetups Everywhere: Wellington, New Zealand 2019-09-14T05:17:28.055Z · score: 0 (0 votes)
SSC Meetups Everywhere: West Lafayette, IN 2019-09-14T05:11:28.211Z · score: 0 (0 votes)
SSC Meetups Everywhere: Zurich, Switzerland 2019-09-14T05:03:43.295Z · score: 0 (0 votes)
Rationality Exercises Prize of September 2019 ($1,000) 2019-09-11T00:19:51.488Z · score: 85 (25 votes)
Stories About Progress 2019-09-08T23:07:10.443Z · score: 31 (9 votes)
Political Violence and Distraction Theories 2019-09-06T20:21:23.801Z · score: 18 (7 votes)
Stories About Education 2019-09-04T19:53:47.637Z · score: 41 (16 votes)
Stories About Academia 2019-09-02T18:40:00.106Z · score: 33 (21 votes)
Peter Thiel/Eric Weinstein Transcript on Growth, Violence, and Stories 2019-08-31T02:44:16.833Z · score: 72 (30 votes)
AI Alignment Writing Day Roundup #1 2019-08-30T01:26:05.485Z · score: 34 (14 votes)
Why so much variance in human intelligence? 2019-08-22T22:36:55.499Z · score: 56 (21 votes)
Announcement: Writing Day Today (Thursday) 2019-08-22T04:48:38.086Z · score: 32 (12 votes)
"Can We Survive Technology" by von Neumann 2019-08-18T18:58:54.929Z · score: 35 (11 votes)
A Key Power of the President is to Coordinate the Execution of Existing Concrete Plans 2019-07-16T05:06:50.397Z · score: 116 (35 votes)
Bystander effect false? 2019-07-12T06:30:02.277Z · score: 19 (10 votes)
The Hacker Learns to Trust 2019-06-22T00:27:55.298Z · score: 79 (23 votes)
Welcome to LessWrong! 2019-06-14T19:42:26.128Z · score: 96 (50 votes)
Von Neumann’s critique of automata theory and logic in computer science 2019-05-26T04:14:24.509Z · score: 30 (11 votes)
Ed Boyden on the State of Science 2019-05-13T01:54:37.835Z · score: 64 (16 votes)
Why does category theory exist? 2019-04-25T04:54:46.475Z · score: 35 (7 votes)
Formalising continuous info cascades? [Info-cascade series] 2019-03-13T10:55:46.133Z · score: 17 (4 votes)
How large is the harm from info-cascades? [Info-cascade series] 2019-03-13T10:55:38.872Z · score: 23 (4 votes)
How can we respond to info-cascades? [Info-cascade series] 2019-03-13T10:55:25.685Z · score: 15 (3 votes)
Distribution of info-cascades across fields? [Info-cascade series] 2019-03-13T10:55:17.194Z · score: 15 (3 votes)

Comments

Comment by benito on The Review Phase: Helping LessWrongers Evaluate Old Posts · 2019-12-11T00:02:30.985Z · score: 2 (1 votes) · LW · GW

I do think including notes about what you think should be included in the book is still valuable, but is something it makes more sense to do after you've spent some time in "evaluate and add information" mode.

Yeah, that's the central question for the voting phase, which comes after the reviewing phase.

Comment by benito on Local Validity as a Key to Sanity and Civilization · 2019-12-10T02:06:32.954Z · score: 20 (4 votes) · LW · GW

I think about this post a lot, and sometimes in conjunction with my own post on common knowledge.

As well as it being a referent for when I think about fairness, it also ties in with how I think about LessWrong, Arbital and communal online endeavours for truth. The key line is:

For civilization to hold together, we need to make coordinated steps away from Nash equilibria in lockstep.

You can think of Wikipedia as being a set of communally editable web pages where the content of the page is constrained to be that which we can easily gain common knowledge of its truth. Wikipedia's information is only that which comes from verifiable sources, which is how they solve this problem - all the editors don't have to get in a room and talk forever if there's a simple standard of truth. (I mean, they still do, but it would blow up to an impossible level if the standard were laxer than this.)

I understand a key part of the vision for Arbital was that, instead of the common standard being verifiable facts, it was instead to build a site around verifiable steps of inference, or alternatively phrased, local validity. This would allow us to walk through argument space together without knowing whether the conclusions were true or false yet.

I think about this a lot, in terms of what steps a community can make together. I maybe will write a post on it more some day. I'm really grateful that Eliezer wrote this post.

Comment by benito on The Costly Coordination Mechanism of Common Knowledge · 2019-12-10T02:03:52.758Z · score: 6 (3 votes) · LW · GW

This is my own post. I continue to talk and think a lot about the world from the perspective of solving coordination problems where facilitating the ability for people to build common knowledge is one of the central tools. I'm very glad I wrote the post; it made a lot of my own thinking more rigorous and clear.

Comment by benito on How did academia ensure papers were correct in the early 20th Century? · 2019-12-10T02:02:13.168Z · score: 15 (4 votes) · LW · GW

I wrote this post, and at the time I just wrote it because... well, I thought I'd be able to write a post with a grand conclusion about how science used to check the truth, and then point to how it changed, but I was so surprised to find that journals had not one sentence of criticism in them at all. So I wrote it up as a question post instead, framing my failure to answer the question as 'partial work' that 'helped define the question'.

In retrospect, I'm really glad I wrote the post, because it is a clear datapoint about how science does not work. I have updated down on the need for large amounts of public critique in figuring out what's true. I think about the datapoint from time to time, and expect to think about it more.

Comment by benito on Paul's research agenda FAQ · 2019-12-10T01:59:53.034Z · score: 2 (1 votes) · LW · GW

See my review here.

Comment by benito on Challenges to Christiano’s capability amplification proposal · 2019-12-10T01:58:28.409Z · score: 6 (3 votes) · LW · GW

This post is close in my mind to Alex Zhu's post Paul's research agenda FAQ. They each helped to give me many new and interesting thoughts about alignment. 

This post was maybe the first time I'd seen an actual conversation about Paul's work between two people who had deep disagreements in this area - where Paul wrote things, someone wrote an effort-post response, and Paul responded once again. Eliezer did it again in the comments of Alex's FAQ, which also was a big deal for me in terms of learning.

Comment by benito on [Review] On the Chatham House Rule (Ben Pace, Dec 2019) · 2019-12-10T01:17:04.712Z · score: 2 (1 votes) · LW · GW

Ah, there was an extra 'the' in that sentence. Edited. Let me know if it's still unclear.

Comment by benito on On the Chatham House Rule · 2019-12-10T00:25:51.862Z · score: 2 (1 votes) · LW · GW

I reviewed this post here.

Comment by benito on Meta-Honesty: Firming Up Honesty Around Its Edge-Cases · 2019-12-10T00:14:33.886Z · score: 36 (4 votes) · LW · GW

Here are my thoughts.

  1. Being honest is hard, and there are many difficult and surprising edge-cases, including things like context failures, negotiating with powerful institutions, politicised narratives, and compute limitations.
  2. On top of the rule of trying very hard to be honest, Eliezer's post offers an additional general rule for navigating the edge cases. The rule is that when you’re having a general conversation about the sorts of situations in which you would and wouldn’t lie, you must be absolutely honest. You can explicitly decline to answer certain questions if it seems necessary, but you must never lie.
  3. I think this rule is a good extension of the general principle of honesty, and appreciate Eliezer's theoretical arguments for why this rule is necessary.
  4. Eliezer’s post introduces some new terminology for discussions of honesty - in particular, it uses the term 'meta-honesty' for the rule, rather than 'honesty'.
  5. If the term 'meta-honesty' is common knowledge but the implementation details aren't, and if people try to use it, then they will perceive a large number of norm violations that are actually linguistic confusions. Linguistic confusions are not strongly negative in most fields, merely a nuisance, but in discussions of norm-violation (e.g. a court of law) they have grave consequences, and you shouldn't try to build communal norms on such shaky foundations.
  6. I, and many other people this post was directed at, find that it requires multiple readings to understand, so I think that even if everyone reads this post, it will not be remotely sufficient for making the implementation details common knowledge, even if the term can become that.
  7. In general, I think that everyone should make sure it is acceptable, when asking "Can we operate under the norms of meta-honesty?" for the other person to reply "I'd like to taboo the term 'meta-honesty', because I'm not sure we'll be talking about the same thing if we use that term."
  8. This is a valuable bedrock for thinking about the true laws, but it is not currently at a stage where we can build simple deontological communal rules around it. I’d like to see more posts advancing both fronts.

For more details on all that, I've written up the above at length here.

Comment by benito on Meta-Honesty: Firming Up Honesty Around Its Edge-Cases · 2019-12-09T23:26:44.537Z · score: 4 (2 votes) · LW · GW

Eliezer discusses the fact that replying “I’m fine” to “How are you?” is literally false. In case anyone’s interested, one answer I've taken to using in response to "How are you?" is “High variance," which is helpfully vague about the direction of the variance.

Comment by benito on Understanding “Deep Double Descent” · 2019-12-09T19:57:00.542Z · score: 8 (4 votes) · LW · GW

This post was on Hacker News for a while.

Comment by benito on BrienneYudkowsky's Shortform · 2019-12-08T20:48:59.929Z · score: 5 (2 votes) · LW · GW

Some of these reminded me of when Weft asked a few slightly related questions previously.

Comment by benito on The Lesson To Unlearn · 2019-12-08T00:51:55.885Z · score: 10 (6 votes) · LW · GW

I found it a surprisingly fresh take, given so many shared starting assumptions. I really enjoy reading someone thinking aloud for themselves, even on a topic that's been talked about so much. And a surprisingly optimistic conclusion.

Comment by benito on How do you assess the quality / reliability of a scientific study? · 2019-12-05T03:39:39.987Z · score: 4 (2 votes) · LW · GW

Yeah, I think you're right. Edited.

Comment by benito on BrienneYudkowsky's Shortform · 2019-12-05T00:10:07.643Z · score: 9 (5 votes) · LW · GW

I was honestly a bit surprised how well you managed to pull the exact moment from my childhood where I learned the word 'monograph'. I read every page of a beautiful red book that contained all of the Sherlock Holmes stories, and I distinctly recall the line about having written a monograph on the subject of cigar ash, and being able to discern the different types.

Comment by benito on Caring less · 2019-12-02T19:35:02.771Z · score: 2 (1 votes) · LW · GW

Seconding Habryka.

Comment by benito on Competitive Markets as Distributed Backprop · 2019-12-02T19:16:52.339Z · score: 2 (1 votes) · LW · GW

Seconding Jacob.

Comment by benito on Research: Rescuers during the Holocaust · 2019-12-02T19:07:06.745Z · score: 2 (1 votes) · LW · GW

This is both fascinating and very valuable for understanding human psychology, and I'd like to see it reviewed.

Comment by benito on The Bat and Ball Problem Revisited · 2019-12-02T19:05:52.708Z · score: 2 (1 votes) · LW · GW

Seconding Habryka. I’d really like to see this reviewed.

Comment by benito on Metaphilosophical competence can't be disentangled from alignment · 2019-12-02T19:00:13.728Z · score: 3 (2 votes) · LW · GW

Seconding Habryka. I have thought often about this post.

Comment by benito on Preliminary thoughts on moral weight · 2019-12-02T18:59:05.926Z · score: 2 (1 votes) · LW · GW

Seconding Ray. This was a bunch of important hypotheses about consciousness I had never heard of.

Comment by benito on Everything I ever needed to know, I learned from World of Warcraft: Goodhart’s law · 2019-12-02T18:57:26.003Z · score: 2 (1 votes) · LW · GW

Seconding Habryka.

Comment by benito on Clarifying "AI Alignment" · 2019-12-02T18:56:51.840Z · score: 4 (2 votes) · LW · GW

Nominating this primarily for Rohin’s comment on the post, which was very illuminating.

Comment by benito on The funnel of human experience · 2019-12-02T06:00:33.811Z · score: 4 (2 votes) · LW · GW

Seconding Habryka.

Comment by benito on How do you assess the quality / reliability of a scientific study? · 2019-12-02T04:39:53.200Z · score: 5 (5 votes) · LW · GW

This question has loads of great answers, with people sharing their hard-earned insights about how to engage with modern scientific papers and make sure to get the truth out of them, so I curated it.

Comment by benito on Will AI See Sudden Progress? · 2019-12-02T04:00:18.513Z · score: 2 (1 votes) · LW · GW

Seconding Rohin. 

I think this is basically the same nomination as the post "Arguments Against Fast Takeoff", it's all one conversation, but just wanted to nominate it to be clear.

Comment by benito on Public Positions and Private Guts · 2019-12-02T02:58:49.844Z · score: 7 (3 votes) · LW · GW

Seconding Kaj. In addition to what he said, I find this post especially helpful when thinking about group epistemics.

Comment by benito on Coordination Problems in Evolution: Eigen's Paradox · 2019-12-02T02:52:55.007Z · score: 4 (2 votes) · LW · GW

This post should be considered as nominated together with its sequel, also reviewing the same book. It's been a while since I read it, but as far as I recall this was also really interesting - it continues to extend, with much concrete detail, the framework Eliezer laid out in Inadequate Equilibria, through examples in biology.

I'd really like to read someone else's review of this post, who also read the book, so I'm nominating it for review.

Comment by benito on Inadequate Equilibria vs. Governance of the Commons · 2019-12-02T01:59:40.724Z · score: 2 (1 votes) · LW · GW

This post is fascinating, and extends and ties in lots of concrete details with the framework Eliezer laid out in Inadequate Equilibria.

I'd really like to read someone else's review of this post, who also read the book, so I'm nominating it for review.

Comment by benito on Buck's Shortform · 2019-12-02T01:06:03.125Z · score: 7 (3 votes) · LW · GW

If you want, you might enjoy trying to guess what mistake I think I was making, before I spoil it for you.

Time to record my thoughts! I won't try to solve it fully, just note my reactions.

For example, the first time I went to the Hot Tubs of Berkeley, a hot tub rental place near my house, I saw a friend of mine there. I wondered how regularly he went there. Consider the hypotheses of "he goes here three times a week" and "he goes here once a month". The likelihood ratio is about 12x in favor of the former hypothesis. So if I previously was ten to one against the three-times-a-week hypothesis compared to the once-a-month hypothesis, I'd now be 12:10 = 6:5 in favor of it. This felt surprisingly high to me.

Well, firstly, I'm not sure that the likelihood ratio is 12x in favor of the former hypothesis. Perhaps likelihood of things clusters - like people either do things a lot, or they never do things. It's not clear to me that I have an even distribution of things I do twice a month, three times a month, four times a month, and so on. I'd need to think about this more.

Also, while I agree it's a significant update toward your friend being a regular there given that you saw them the one time you went, you know a lot of people, and if it's a popular place then the chances of you seeing any given friend is kinda high, even if they're all irregular visitors. Like, if each time you go you see a different friend, I think it's more likely that it's popular and lots of people go from time to time, rather than they're all going loads of times each.
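The odds arithmetic in the hot tub example can be checked with a few lines of Python. This is only a sketch of the quoted calculation: it assumes roughly four weeks per month, so that "three times a week" is about 12 visits a month, and the function and variable names are made up for illustration. It does not address the objections above about clustering or about how many friends you have.

```python
from fractions import Fraction


def update_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


# Assumption: ~4 weeks per month, so three visits a week is ~12 visits a month.
visits_per_month_frequent = 3 * 4
visits_per_month_rare = 1

# Chance of bumping into him on a single visit scales with how often he goes.
likelihood_ratio = Fraction(visits_per_month_frequent, visits_per_month_rare)

# Prior of ten to one against the three-times-a-week hypothesis.
prior_odds = Fraction(1, 10)

posterior_odds = update_odds(prior_odds, likelihood_ratio)
posterior_probability = posterior_odds / (1 + posterior_odds)

print(posterior_odds)         # 6/5, i.e. 6:5 in favor
print(posterior_probability)  # 6/11
```

Odds of 6:5 correspond to a probability of 6/11, a little over one half, which is the number that "felt surprisingly high".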

Another example: A while ago I walked through six cars of a train, which felt like an unusually long way to walk. But I realized that I'm 6x more likely to see someone who walks 6 cars than someone who walks 1.

I don't quite get what's going on here. As someone from Britain, I regularly walk through more than 6 cars of a train. The anthropics just checks out.
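The "6x more likely" claim in the train example is a length-biased sampling effect, and a toy model makes it concrete. The sketch below rests on assumptions that are not in the original: the train is treated as a loop of 12 cars to avoid edge effects, the walker starts in a uniformly random car, and an observer sits in one fixed car. All names and numbers are hypothetical.

```python
from fractions import Fraction


def chance_of_being_seen(cars_walked, train_length=12, observer_car=0):
    """Probability that a walker starting in a uniformly random car passes
    through the observer's car, on a toy circular train."""
    hits = 0
    for start in range(train_length):
        # Cars the walker is in at some point, including the starting car.
        visited = {(start + step) % train_length for step in range(cars_walked)}
        if observer_car in visited:
            hits += 1
    return Fraction(hits, train_length)


long_walk = chance_of_being_seen(6)
short_walk = chance_of_being_seen(1)

print(long_walk / short_walk)  # 6
```

Under these assumptions the chance of being seen is proportional to the number of cars walked, so someone sitting still will see six-car walkers six times as often as one-car walkers, even if both kinds of walker are equally common.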

Comment by benito on Benito's Shortform Feed · 2019-12-02T00:06:15.765Z · score: 29 (4 votes) · LW · GW

Good posts you might want to nominate in the 2018 Review

I'm on track to nominate around 30 posts from 2018, which is a lot. Here is a list of about 30 further posts I looked at that I think were pretty good but didn't make my top list, in the hopes that others who did get value out of the posts will nominate their favourites. Each post has a note I wrote down for myself about the post.

  • Reasons compute may not drive AI capabilities growth
    • I don’t know if it’s good, but I’d like it to be reviewed to find out.
  • The Principled-Intelligence Hypothesis
    • Very interesting hypothesis generation. Unless it’s clearly falsified, I’d like to see it get built on.
  • Will AI See Sudden Progress? DONE
    • I think this post should be considered paired with Paul’s almost-identical post. It’s all exactly one conversation.
  • Personal Relationships with Goodness
    • This felt like a clear analysis of an idea and coming up with some hypotheses. I don’t think the hypotheses really capture what’s going on, and most of the frames here seem like they’ve caused a lot of people to do a lot of hurt to themselves, but it seemed like progress in that conversation.
  • Are ethical asymmetries from property rights?
    • Again, another very interesting hypothesis.
  • Incorrect Hypotheses Point to Correct Observations ONE NOMINATION
    • This seems to me like it's close to an important point but not quite saying it. I don’t know if I got anything especially new from its framing, but its examples are pretty good.
  • Whose reasoning can you rely on when your own is faulty?
    • I really like the questions, and should ask them more about the people I know.
  • Inconvenience is Qualitatively Bad ONE NOMINATION
    • I think that the OP is an important idea. I think my comment on it is pretty good (and the discussion below it), though I’ve substantially changed my position since then, and should write up my new worldview once my life calms down. I don’t think I should nominate it because I’m a major part of the discussion.
  • The abruptness of nuclear weapons
    • Clearly valuable historical case, simple effect model.
  • Book Review: Pearl’s Book of Why
    • Why science has made a taboo of causality feels like a really important question to answer when figuring out how much to trust academia and how to make institutions that successfully make scientific progress, and this post suggests some interesting hypotheses.
  • Functional Institutions Are the Exception ONE NOMINATION
    • Was a long meditation on an important idea, that I’ve found valuable to read. Agree with commenter that it’s sorely lacking in examples however.
  • Strategies of Personal Growth ONE NOMINATION
    • Oli curated it, he should consider nominating and saying what he found useful. It all seemed good but I didn’t personally get much from it.
  • Preliminary Thoughts on Moral Weight DONE
    • A bunch of novel hypotheses about consciousness in different animals that I’d never heard before, which seem really useful for thinking about the topic.
  • Theories of Pain
    • I thought this was a really impressive post, going around and building simple models of lots of different theories, and giving a bit of personal experience with the practitioners of the theories. It was systematic and goal oriented and clear.
  • Clarifying “AI Alignment” DONE
    • Rohin’s comment is the best part of this post, not sure how best to nominate it.
  • Norms of Membership for Voluntary Groups
    • There’s a deep problem here of figuring out norms in novel and weird and ambiguous environments in the modern world, especially given the internet, and this post is kind of like a detailed, empirical study of some standard clusters of norms, which I think is very helpful.
  • How Old is Smallpox?
    • Central example of “Things we learned on LessWrong in 2018”. Should be revised though.
  • “Cheat to Win”: Engineering Positive Social Feedback
    • Feels like a clear example of a larger and important strategy about changing the incentives on you. Not clear how valuable the post is alone, but I like it a lot.
  • Track-Back Meditation
    • I don’t know why, but I think about this post a lot.
  • Meditations on Momentum
    • I feel like this does a lot of good intuition building work, and I think about this post from time to time in my own life. I think that Jan brought up some good points in the comments about not wanting to cause confusion about different technical concepts all being the same, so I’d like to see the examples reviewed to check they’re all about attachment effects and not conflating different effects.
  • On Exact Mathematical Formulae
    • This makes a really important point anyone learns in the study of mathematics, and I think is generally an important distinction to understand between language and reality. Just because we have words for some things doesn’t make them more real than things we don’t have words for. The point is to look at reality, not to look at the words.
  • Recommendations vs Guidelines
    • I think back to this occasionally. Seems like quite a useful distinction, and maybe we should try to encourage people making more guidelines. Maybe we should build a wiki and have a page type ‘guideline’ where people contribute to make great guidelines.
  • On the Chatham House Rule DONE
    • This is one of the first posts that impressed upon me the deeply tangled difficulties of information security, something I’ve increasingly thought a lot about in the intervening years, and expect to think about even more in the future.
  • Different types (not sizes!) of infinity
    • Some important conceptual work fundamental to mathematics. Very short and insightful. Not sure if I should allow this though, because if I do am I just allowing all high-level discussion of math to be included?
  • Expected Pain Parameters
    • It feels like useful advice and potentially a valuable observation with which to view a deeper problem. But unclear on the last one, and not sure if this post should be nominated just on the first alone.
  • Research: Rescuers During the Holocaust DONE
    • The most interesting part about this is the claim that most people who housed Jews during the Holocaust did it because the social situation made the moral decision very explicit and that they felt they only had one possible outcome, not because they were proactive moral planners. I would like to see an independent review of this info.
  • Lessons from the cold war on information hazards: why internal communication is critical DONE
    • Seems like an important historical lesson.
  • Problem Solving With Crayons and Mazes
    • I didn’t find it very useful. Oli was excited when he curated it, should poke him to consider nominating it.
  • Insights from “Strategy of Conflict”
    • Seems helpful but weird to nominate, as the book is short and this post explicitly doesn't contain all the key ideas in the book. I did learn from this that having lots of nukes is more stable than having a small number, and this has stuck with me.
  • The Bat and Ball Problem Revisited DONE
    • A curiosity-driven walk through what’s going on with the bat-and-the-ball problem by Kahneman.
  • Good Samaritans in Experiments
    • A highly opinionated and very engaging criticism of a study.
  • Hammertime Final Exam ONE NOMINATION
    • Was great, I’m not actually sure whether it fits into this review process?
  • Naming the Nameless DONE
    • Some people seemed to get a lot out of this, but I haven’t had the time to engage with it much.
    • Actually, just re-read it, and it's brilliant, and one of the best 5-10 of the year. Will nominate it myself if nobody else does.
  • How did academia ensure papers were correct in the early 20th century? DONE
    • I’m glad I put this down in writing. I found it useful myself. But others should figure out whether to nominate.
  • Competitive Markets as Distributed Backprop DONE
    • I felt great about it when I read this post last time. I’ve not given it a careful re-read, would like to see it reviewed, but I think it’s likely I’ll rediscover it’s a very helpful abstraction.

AI alignment posts you might want to nominate

[Edit: On reflection, I think that the Alignment posts that do not also have implications for human rationality aren't important to go through this review process, and we'll likely create another way to review that stuff and make it into books.]

There was also a lot of top-notch AI alignment writing, but I mostly don’t feel well-placed to nominate it. I hope others can look through and nominate selections from these.

Comment by benito on From Personal to Prison Gangs: Enforcing Prosocial Behavior · 2019-12-01T23:50:14.933Z · score: 4 (2 votes) · LW · GW

Well I messed up. I even wrote a nomination comment for this post, but it was written this year. Silly me. I'll nominate it next year instead.

Comment by benito on Optimization Amplifies · 2019-12-01T23:48:01.033Z · score: 11 (3 votes) · LW · GW

I think the simple mathematical models here are very helpful in pointing to some intuitions about being confident systems will work even with major optimisation pressure applied, and why optimisation power makes things weird. I would like to see other researchers in alignment review this post, because I don't fully trust my taste on posts like these.

Comment by benito on Two types of mathematician · 2019-12-01T23:45:26.837Z · score: 2 (1 votes) · LW · GW

This is a very valuable effort in outlining a hypothesis, and using the author’s wide-ranging taste and knowledge to pull loads of sources together. Definitely helped me think a bit about mathematics and thought, and some of my friends too. I've especially thought about that Grothendieck quote a lot.

Comment by benito on Weird question: could we see distant aliens? · 2019-12-01T23:44:02.109Z · score: 4 (2 votes) · LW · GW

This clearly fits into “Things we learned on LW in 2018”.

This needs comments to be nominated too. It would be really awesome if someone could write a straightforward distillation of the arguments that lead to consensus on this issue between many of the commenters.

Comment by benito on Making yourself small · 2019-12-01T23:40:54.308Z · score: 2 (1 votes) · LW · GW

This was a really helpful post for me. I can see now that there’s a distinction between how much status you have and how much status you’re using/spending/playing at a given time, and I figured out some important things on this axis for how I personally want to act.

Comment by benito on Explicit and Implicit Communication · 2019-12-01T23:32:30.044Z · score: 2 (1 votes) · LW · GW

I stand by what I said in my curation notice. This post has a number of excellent datapoints/examples which required good insight to put together, and paint an important picture about when making things explicit is costly and unhelpful, an important thing to understand when thinking about rationality.

(I'm not sure what a review of this post would look like. I would like it if people tried to help make the ideas clearer and more actionable.)

Comment by benito on The Steering Problem · 2019-12-01T22:50:08.051Z · score: 4 (2 votes) · LW · GW

Reading this post was the first time I felt I understood what Paul's (and many others') research was motivated by. I think about it regularly, and it comes up in conversation a fair bit.

Comment by benito on Jobs Inside the API · 2019-12-01T22:47:41.679Z · score: 4 (2 votes) · LW · GW

I think about this post often when I think about automation.

Comment by benito on Beyond Astronomical Waste · 2019-12-01T22:46:34.919Z · score: 2 (1 votes) · LW · GW

This is a post that's stayed with me since it was published. The title is especially helpful as a handle. It is a simple reference for this idea, that there are deeply confusing philosophical problems that are central to our ability to attain most of the value we care about (and that this might be a central concern when thinking about AI).

It's not been very close to areas I think about a lot, so I've not tried to build on it much, and would be interested in a review from someone who thinks in more detail about these matters, but I expect they'll agree it's a very helpful post to exist.

Comment by benito on Varieties Of Argumentative Experience · 2019-12-01T22:40:21.621Z · score: 4 (2 votes) · LW · GW

This post sums over a lot of argumentative experiences, and condenses them into an image to remember it by, which is a great way to try to help people understand communication. 

Many of Scott's posts provide a glimpse of this model, where he, say, shows why a particular sociology or medical study doesn't actually end a big debate, or responds to someone lower down the triangle by moving up a level and doing a literature review; but those are all in the context of very specific arguments, and aren't supposed to be about helping the reader look at this bigger picture. This post takes the lessons from all of those experiences and produces some high-level, general insights about how argument, communication and rationality work.

I think if you’d asked me to classify types of argument, I would’ve dived straight into the grey triangle at the top, and come back with some bad first-principles models (maybe looking a bit like my post with drawings about good communication), and I really appreciate that someone with such a varied experience of arguing on the internet did this more empirical version, with so many examples.

Comment by benito on Anti-social Punishment · 2019-12-01T22:32:35.462Z · score: 2 (1 votes) · LW · GW

An important concept, with the effect very clearly demonstrated by the study, and very helpful clarifying discussion.

I'm nominating this for the 2018 review in large part because I'd like to see an independent review of the study to check for standard reasons it would fail to replicate. I note that the author of the OP says the study authors were surprised by the result, which I think reduces my expectation that standard forms of motivated cognition produced this particular result.

Comment by benito on Useful Does Not Mean Secure · 2019-12-01T08:24:21.923Z · score: 8 (4 votes) · LW · GW

Oh this is very interesting. I'm updating from

Eliezer strongly believes that discrete jumps will happen

to

Eliezer believes you should generally not be making assumptions like "Oh I'm sure discrete jumps in capabilities won't happen", for the same reasons a security expert would not accept anything of the general form "Oh I'm sure nothing like that will ever happen" as a reasoning step in making sure a system is secure.

I mean, I guess it's obvious if you've read the security mindset dialogue, but I hadn't realised that was a central element to the capabilities gain debate.

Added: To clarify further: Eliezer has said that explicitly a few times, but only now did I realise it was potentially a deep crux of the broader disagreement between approaches. I thought it was just a helpful but not especially key example of not making assumptions about AI systems.

Comment by benito on "Just Suffer Until It Passes" · 2019-12-01T07:49:34.342Z · score: 4 (2 votes) · LW · GW

+1

I would be excited for more posts like this, where the data gave you a surprising result / generated a new hypothesis.

Comment by benito on Useful Does Not Mean Secure · 2019-12-01T07:01:15.615Z · score: 4 (2 votes) · LW · GW

Oh my, I never expected to be in the newsletter for writing an object level post about alignment. How exciting.

Comment by benito on Birth order effect found in Nobel Laureates in Physics · 2019-12-01T00:49:40.624Z · score: 6 (3 votes) · LW · GW

This feels like a pretty central example of 'things we found out on lesswrong in 2018'. Great work all round, so I'm nominating it. Next year, I'll also nominate the further work on this that came out in 2019.

Comment by benito on Historical mathematicians exhibit a birth order effect too · 2019-12-01T00:49:01.135Z · score: 4 (2 votes) · LW · GW

This feels like a pretty central example of 'things we found out on lesswrong in 2018'. Great work all round, so I'm nominating it. Next year, I'll also nominate the further work on this that came out in 2019.

Comment by benito on Specification gaming examples in AI · 2019-12-01T00:47:17.295Z · score: 6 (3 votes) · LW · GW

It's really valuable to make records like this. Insofar as it's accurate, and any reviewers can add to it, that's great. (Perhaps in the future it could be a wiki page of some sort, but for now I think it was a very valuable thing that got posted in 2018.)

Comment by benito on List of previous prediction market projects · 2019-12-01T00:45:30.477Z · score: 4 (2 votes) · LW · GW

It's really valuable to make records like this. Insofar as it's accurate, and any reviewers can add to it, that's great. (Perhaps in the future it could be a wiki page of some sort, but for now I think it was a very valuable thing that got posted in 2018.)

Comment by benito on A voting theory primer for rationalists · 2019-11-30T19:08:47.729Z · score: 2 (1 votes) · LW · GW

This is a review stub; I need more time to write full reviews.

Note that this is just a nomination; there'll be a whole month for reviews. But the nomination deadline is Monday, so get them in quick! :)