Comment by benquo on Integrity and accountability are core parts of rationality · 2019-07-17T07:32:26.715Z · score: 4 (2 votes) · LW · GW

It seems to me that the way humans acquire language pretty strongly suggests that (2) is true. (1) seems probably false, depending on what you mean by incentives, though.

Comment by benquo on Integrity and accountability are core parts of rationality · 2019-07-17T07:24:22.584Z · score: 9 (2 votes) · LW · GW

To give a concrete example, I expect math prodigies to have the easiest time solving any given math problem, but even so, I don't expect that a system that punishes the students who don't complete their assignments correctly will serve the math prodigies well. This, even if under other, totally different circumstances it's completely appropriate to compel performance of arbitrary assignments through the threat of punishment.

Comment by benquo on Integrity and accountability are core parts of rationality · 2019-07-17T07:13:55.592Z · score: 28 (7 votes) · LW · GW

Thanks for checking - I'm trying to say something pretty different.

It seems like the frame of the OP is lumping together the kind of consistency that comes from using the native architecture to model the deep structure of reality (see also Geometers, Scribes, and the structure of intelligence), and the kind of consistency that comes from trying to perform a guaranteed level of service for an outside party (see also Unreal's idea of Dependability), and an important special case of the latter is rule-following as a form of submission or blame-avoidance. These are very different mental structures, respond very differently to incentives, and learn very different things from criticism. (Nightmare of the Perfectly Principled is my most direct attempt to point to this distinction.)

People who are trying to submit or avoid blame will try to alleviate the pressure of criticism with minimal effort, in ways that aren't connected to their other beliefs. On the other hand, people with structured models will sometimes leapfrog past the critic, or jump in another direction entirely, as Benito pointed out in A Sketch of Good Communication.

If we don't distinguish between these cases, then attempts to reason about the "optimal" attitude towards integrity or accountability will end up a lumpy, unsatisfactory linear compromise between the following policy goals:

  • Helping people with structurally integrated models notice tensions in their models that they can learn from.
  • Distinguishing people with structurally integrated models from those who (at least in the relevant domain) are mostly just trying not to stick out as wrong, so we can stop listening to the second group.
  • Establishing and enforcing the norms needed to coordinate actions among equals (e.g. shared expectations about promises).
  • Compelling a complicated performance from inferiors, or avoiding punishment by superiors trying to compel a complicated performance from you.
  • Converting people without structurally integrated models into people with structurally integrated models (or vice versa).

Depending on what problem you're trying to solve, habryka's statement that "if someone changes their stated principles in an unpredictable fashion every day (or every hour), then I think most of the benefits of openly stating your principles disappear" can be almost exactly backwards.

If your principles predictably change based on your circumstances, that's reasonably likely to be a kind of adversarial optimization similar to A/B testing of communication. They don't mean their literal content, at least.

But there's plenty of point in principles consistent with learning new things fast. In that case, change represents noise, which is costly, but much less costly than messaging optimized for extraction. And of course changing principles doesn't need to imply a change in behavior to match - your new principles can and should take into account the fact that people may have committed resources based on your old stated principles.

In summary, my objection is that habryka seems to be thinking of beliefs as a special case of promises, while I think that if we're trying to succeed based on epistemic rationality, we should be modeling promises as a special case of beliefs. For more detail on that, see Bindings and Assurances.

Comment by benquo on "Rationalizing" and "Sitting Bolt Upright in Alarm." · 2019-07-17T04:46:54.543Z · score: 12 (3 votes) · LW · GW

Another thing I'd add - putting this in its own comment to help avoid any one thread blowing up in complexity:

The orientation-towards-clarity problem is at the very least strongly analogous to, and most likely actually an important special case of, the AI alignment problem.

Friendliness is strictly easier with groups of humans, since the orthogonality thesis is false for humans - if you abuse us out of our natural values you end up with stupider humans and groups. This is reason for hope about FAI relative to UFAI, but also a pretty strong reason to prioritize developing a usable decision theory and epistemology for humans over using our crappy currently-available decision theory to direct resources in the short run towards groups trying to solve the problem in full generality.

AGI, if it is ever built, will almost certainly be built - directly or indirectly - by a group of humans, and if that group is procedurally Unfriendly (as opposed to just foreign), there's no reason to expect the process to correct to FAI. For this reason, friendly group intelligence is probably necessary for solving the general problem of FAI.

Comment by benquo on "Rationalizing" and "Sitting Bolt Upright in Alarm." · 2019-07-17T04:33:16.549Z · score: 16 (3 votes) · LW · GW

This sounds really, really close. Thanks for putting in the work to produce this summary!

I think my objection to the 5 Words post fits a pattern where I've had difficulty expressing a class of objection. The literal content of the post wasn't the main problem. The main problem was the emphasis of the post, in conjunction with your other beliefs and behavior.

It seemed like the hidden second half of the core claim was "and therefore we should coordinate around simpler slogans," and not the obvious alternative conclusion "and therefore we should scale up more carefully, with an uncompromising emphasis on some aspects of quality control." (See On the Construction of Beacons for the relevant argument.)

It seemed to me like there was some motivated ambiguity on this point. The emphasis seemed to consistently recommend public behavior that was about mobilization rather than discourse, and back-channel discussions among well-connected people (including me) that felt like they were more about establishing compatibility than making intellectual progress. This, even though it seems like you explicitly agree with me that our current social coordination mechanisms are massively inadequate, in a way that (to me obviously) implies that they can't possibly solve FAI.

I felt like if I pointed this kind of thing out too explicitly, I'd just get scolded for being uncharitable. I didn't expect, however, that this scolding would be accompanied by an explanation of what specific, anticipation-constraining, alternative belief you held. I've been getting better at pointing out this pattern (e.g. my recent response to habryka) instead of just shutting down due to a preverbal recognition of it. It's very hard to write a comment like this one clearly and without extraneous material, especially of a point-scoring or whining nature. (If it were easy I'd see more people writing things like this.)

Comment by benquo on Benito's Shortform Feed · 2019-07-17T04:00:00.916Z · score: 9 (2 votes) · LW · GW

The definitional boundaries of "abuser," as Scott notes, are in large part about coordinating around whom to censure. The definition is pragmatic rather than objective.*

If the motive for the definition of "lies" is similar, then a proposal to define only conscious deception as lying is therefore a proposal to censure people who defend themselves against coercion while privately maintaining coherent beliefs, but not those who defend themselves against coercion by simply failing to maintain coherent beliefs in the first place. (For more on this, see Nightmare of the Perfectly Principled.) This amounts to waging war against the mind.

Of course, as a matter of actual fact we don't strongly censure all cases of conscious deception. In some cases (e.g. "white lies") we punish those who fail to lie, and those who call out the lie. I'm also pretty sure we don't actually distinguish between conscious deception and e.g. reflexively saying an expedient thing, when it's abundantly clear that one knows very well that the expedient thing to say is false, as Jessica pointed out here.

*It's not clear to me that this is a good kind of concept to have, even for "abuser." It seems to systematically force responses to harmful behavior to bifurcate into "this is normal and fine" and "this person must be expelled from the tribe," with little room for judgments like "this seems like an important thing for future partners to be warned about, but not relevant in other contexts." This bifurcation makes me less willing to disclose adverse info about people publicly - there are prominent members of the Bay Area Rationalist community doing deeply shitty, harmful things that I actually don't feel okay talking about beyond close friends because I expect people like Scott to try to enforce splitting behavior.

Comment by benquo on Integrity and accountability are core parts of rationality · 2019-07-16T23:47:59.736Z · score: 8 (4 votes) · LW · GW

I don't understand the relevance of your responses to my stated model. I'd like it if you tried to explain why your responses are relevant, in a way that characterizes what you think I'm saying more explicitly.

My other most recent comment tries to show what your perspective looks like to me, and what I think it's missing.

Comment by benquo on Integrity and accountability are core parts of rationality · 2019-07-16T19:54:37.296Z · score: 10 (4 votes) · LW · GW

This exchange has given me the feeling of pushing on a string, so instead of pretending that I feel like engaging on the object level will be productive, I'm going to try to explain why I don't feel that way.

It seems to me like you're trying to find an angle where our disagreement disappears. This is useful for papering over disagreements or pushing them off, which can be valuable when that reallocates attention from zero-sum conflict to shared production or trade relations. But that's not the sort of thing I'd hope for on a rationalist forum. What I'd expect there is something more like double-cruxing, trying to find the angle at which our core disagreement becomes most visible and salient.

Sentences like this seem like a strong tell to me:

> I do think that a more continuous model is accurate here, though I share at least a bit of your sense (or at least what I perceive to be your sense) of there being some discrete shift between the two different modes of thinking.

While "I think you're partly wrong, but also partly right" is a position I often hold about someone I'm arguing with, it doesn't clarify things any more than "let's agree to disagree." It can set the frame for a specific effort to articulate what exactly I think is wrong under what circumstances. What I would have hoped to see from you would have been more like:

  • If you don't see why I care about pointing out this distinction, you could just ask me why you should care.
  • If you think you know why I care but disagree, you could explain what you think I'm missing.
  • If you're unsure whether you have a good sense of the disagreement, you could try explaining how you think our points of view differ.

Comment by benquo on Integrity and accountability are core parts of rationality · 2019-07-16T13:57:13.068Z · score: 6 (3 votes) · LW · GW

This seems like a proposal to use the same kinds of postural adjustments on a group that includes anatomically complete human beings, and lumps of clay. Even if there's a continuum between the two, if what you want to produce is the former, adjustments that work for the latter are going to be a bad idea.

If someone's inconsistencies are due to an internal confusion about what's true, that's a different situation requiring a different kind of response from the situation in which those inconsistencies are due to occasionally lying when they have an incentive to avoid disclosing their true belief structure. Both are different from one in which there simply isn't an approximately coherent belief structure to be represented.

Comment by benquo on Integrity and accountability are core parts of rationality · 2019-07-16T03:54:44.689Z · score: 4 (2 votes) · LW · GW

I'm not saying that people "should try" to use their beliefs to model and act in reality.

I'm saying that some people's minds are set up such that stated beliefs are by default reports about a set of structurally integrated (and therefore logically consistent) constraints on their anticipations. Others' minds seem to be concerned with making socially desirable assertions, where apparent consistency is a desideratum. The first group is going to have no trouble at all "acting in accordance with [their] stated beliefs about the world" so long as they didn't lie when they stated their beliefs, and the sort of accountability you're talking about seems a bit silly. The second group is going to have a great deal of trouble, and accountability will at best cause them to perform consistency when others are watching, not to take initiative based on their beliefs. (Cf. Guess culture screens for trying to cooperate.)

Comment by benquo on Integrity and accountability are core parts of rationality · 2019-07-16T03:20:48.881Z · score: 2 (1 votes) · LW · GW

This post seems like it's implying that integrity is a thing you might do a little more or less of on the margin depending on incentives. But some integrity is because people are trying to use their beliefs to model and act in reality, not just to be consistent.

Comment by benquo on Integrity and accountability are core parts of rationality · 2019-07-16T03:09:48.616Z · score: 6 (3 votes) · LW · GW

This post's focus on accountability kind of implies a continuous model of integrity, and I think there's an important near-discrete component which is, whether you're trying to use beliefs to navigate the world, or just social reality. Nightmare of the Perfectly Principled explored this a bit; enforcing consistency in a way that's not about trying to use beliefs doesn't help that much. For people who are trying to use their mental structures to model the world, pressure towards inconsistency doesn't register as a temptation to be nobly avoided, it registers as abuse and gaslighting.

Comment by benquo on Reclaiming Eddie Willers · 2019-07-15T05:16:44.719Z · score: 9 (2 votes) · LW · GW

Overall I think this post would benefit from some exploration of what you think people who are disparaging loyalty are implicitly trying to do. Is there an implied strategy there? What sort of world does that strategy build? How is it different from the strategy or strategies with a place for loyalty as a virtue?

This will necessarily involve speculating about others' covert motives, which is sometimes thought to be impolite, but I think it's fair to speculate about the motives of the sort of people who level covert plausibly-deniable death threats at you such as "a Hufflepuff among Slytherins will die as surely as among snakes."

Comment by benquo on Reclaiming Eddie Willers · 2019-07-15T05:05:15.211Z · score: 12 (4 votes) · LW · GW

Thanks for engaging with this difficult subject seriously and carefully. When I talk about keeping loyalty in your core identity, part of what I'm trying to point to is a tendency to interpret criticism of particular loyalty behaviors (e.g. the depiction of Eddie Willers) as an attack on your essence as a person. Sometimes that kind of criticism really is just an attempt to lower the prestige of the loyalty drive; other times the content of the critique is just a claim that some loyalties are misplaced; and very often things are going to contain some mixture of the two, and you have some choice about what part to focus on.

It's possible that unconditionally accepting your preference for justified-loyalty as a part of you might make it easier to accept such critiques. I expect that to work best if you're also willing to believe in an integrated way that they also serve who only stand and wait, i.e. able to go a while without external validation of the loyalty trait.

Comment by benquo on Reclaiming Eddie Willers · 2019-07-15T03:14:53.809Z · score: 19 (4 votes) · LW · GW

Another way to say this is that sometimes the world doesn't deserve Eddie Willers because it can't make the proper use of him. This is very unfair - it is literally abuse, in the original meaning of the term - I'm very sad about it, and consider it a morally urgent problem.

Comment by benquo on Reclaiming Eddie Willers · 2019-07-15T03:08:33.605Z · score: 6 (3 votes) · LW · GW

More on the pressure to be loyal to something. The things that seem actively helpful now are either actively leading the effort to refactor our civilization into something more value-aligned or claiming territory for a local value-aligned agent, or participating in efforts to do one or the other. But a lot of people might not be in a position to do either, and I wish I knew how to make them feel OK just holding off on all action except what they need to do to get by. I think the best case for cults like Hare Krishna is that they help obligate-loyal people do exactly that - just hang out, for the duration. Unfortunately, it seems like the conversion is permanent, not temporary, and I'd like to have the obligate-loyal people online again once it's safe for them.

Comment by benquo on Reclaiming Eddie Willers · 2019-07-15T02:53:30.389Z · score: 20 (8 votes) · LW · GW

The problem with loyalty is that it's only as good as the decision process by which loyalty is assigned and revoked. The in-story context in which Eddie Willers stands by the railroad is one in which the remaining technological and industrial base is being cannibalized for extractive purposes at an accelerating rate. If you can't, under some circumstances, revoke loyalty, then you're cooperatebot. If I'm in a zero-sum conflict with another agent, and I have the chance to cheaply destroy a cooperatebot that's robustly under its control, it's often decision-theoretically correct for me to do so. Scorched-earth tactics work on similar principles.

A heuristic of serving the best thing available doesn't fully solve this problem - sometimes there's nothing you can practically offer loyalty to that's not actively destructive. If you're not able to withdraw your labor in such circumstances, then you're stuck causing harm.

During WWII, someone like this living in Germany ends up helping the Axis war effort, unless they're ready to actually rebel against their government (which they usually won't be; it's dangerous and not for most people). It's much better if such a person is willing to slack off, when the alternative is to cause harm. Certainly keeping the trains running on time under such circumstances would not generically be a friendly move towards people like me.

On the other hand, someone like this living in an Allied country during WWII would have been helping the Allies win, and someone like this living in a truly robustly good society is of tremendous value to everyone around them.

I seriously do think that a large portion of what our society is doing constitutes a pointless war against nothing in particular, and some of it is specifically a war against minds, so this isn't just a nitpick, it's a core objection to keeping loyalty in your core identity.

No objection to identifying with being the sort of person who is loyal and dutiful when it's the right thing to do, if you make sure to cultivate the moral courage to do otherwise when that's better.

Comment by benquo on The AI Timelines Scam · 2019-07-14T12:37:26.685Z · score: 2 (1 votes) · LW · GW

The doc Jessicata linked has page numbers but no embedded text. Can you give a page number for that one?

Unlike your other quotes, it at least seems to say what you're saying it says. But it appears to start mid-sentence, and in any case I'd like to read it in context.

Comment by benquo on The AI Timelines Scam · 2019-07-14T02:33:59.736Z · score: 4 (2 votes) · LW · GW

> If indeed Dreyfus meant to critique 1965's algorithms - which is not what I'm seeing, and certainly not what I quoted

It seems to me like that's pretty much what those quotes say - that there wasn't, at that time, algorithmic progress sufficient to produce anything like human intelligence.

Comment by benquo on "Rationalizing" and "Sitting Bolt Upright in Alarm." · 2019-07-14T01:47:34.990Z · score: 5 (2 votes) · LW · GW

It seemed to me like you were emphasizing (a), in a way that pushed to the background the difference between wishing we had a proper way to demand attention for deceptive speech that's not literally lying, and wishing we had a way to demand attention for the right response. As I tried to indicate in the parent comment, it felt more like a disagreement in tone than in explicit content.

I think this is the same implied disagreement expressed around your comment on my Sabbath post. It seems like you're thinking of each alarm as "extra," implying a need for a temporary boost in activity, while I'm modeling this particular class of alarm as suggesting that much and maybe most of one's work has effects in the wrong direction, so one should pause, and ignore a lot of object-level bids for attention until one's worked this out.

Comment by benquo on Diversify Your Friendship Portfolio · 2019-07-12T19:50:28.854Z · score: 12 (10 votes) · LW · GW

The most important claim here is that the thing that passes for friendship these days, among most people in your target audience, is best thought of not as an irreplaceable human relationship, or perhaps one of many ties to a richly embedded community, but as an asset in a portfolio.

It can be helpful to have that, but it implies a pretty awful world in a lot of other ways, and I'm very sad to see advice about how to get along in that world that doesn't even mention the possibility of trying to make things better.

Comment by benquo on "Rationalizing" and "Sitting Bolt Upright in Alarm." · 2019-07-12T19:13:44.890Z · score: 33 (7 votes) · LW · GW

Something about the tone of this post seems like it's missing an important distinction. Targeted alarm is for finding the occasional, rare bad actor. As Romeo pointed out in his comment, we suffer from alarm fatigue. The kind of alarm that needs raising for self-propagating patterns of motivated reasoning is procedural or conceptual. People are mistakenly behaving (in some contexts) as though certain information sources were reliable. This is often part of a compartmentalized pattern; in other contexts, the same people act as though, not only do they personally know, but everybody knows, that those sources are not trustworthy.

To take a simple example, I grew up in a household with a television. That means that, at various times in the day, I was exposed to messages from highly paid expert manipulators trying to persuade me to consume expensive, poor-quality, addictive foods that were likely to damage my mind and body by spiking my blood sugar and lowering my discernment. I watched these messages because they were embedded in other messages exposing me to a sort of story superstimulus with elevated levels of violence and excitement, but mostly devoid of messages from my elders about what sorts of time-tested behaviors are adaptive for the community or individual.

If you try to tell people that TV is bad for kids, they'll maybe feel vaguely guilty, but not really process this as news, because "everybody knows," and go on behaving as though this were fine. If you manage to get through to them that TV ads are Out to Get You, this might get their attention, but only by transmitting an inappropriately concentrated sense of threat - or an unproductive general paranoia.

I feel like, in the emotional vocabulary of this post, the problem is how to inform my parents that they should be scared of some particular television commercial or show, with the proper level of urgency, without making a literally false accusation. But the actual problem is that large parts of our world are saturated with this sort of thing - friends have TVs, billboards are everywhere, my parents were already acculturated to some extent by TV, and there are other less immediately obvious unfriendly acculturators like school.

The behavior I'd have wanted my parents to exhibit would probably have started with working out - with friends and community members - and with me and my sister - and first, with each other - a shared model and language for talking about the problem, before we started to do anything about it. Not to blame the proximate target and treat each case as a distinct emergency against a presumed backdrop of normality.

The bad news is that vastly powerful cultural forces are deeply unsafe. The good news is that, without any specific defenses against these, or even any clear idea of their shape, we've mostly been doing okay anyway. The bad news is that that's beginning to change.

Stopgap solutions that can be implemented immediately look less like rationing screen time, and more like celebrating a communal Sabbath with clear, traditional standards.

This is why I began my post Effective Altruism is Self-Recommending with a general account of pyramid and Ponzi schemes - not to single out Effective Altruism as especially nasty, but to explain that such schemes are not only destructive, but extremely common and often legitimated by the authorities. The response I'm trying for is more like "halt, melt, catch fire."

(Ended up cross-posting this comment as a blog post.)

Comment by benquo on Benito's Shortform Feed · 2019-07-12T05:09:23.904Z · score: 5 (2 votes) · LW · GW

I think this is true for people who've been through a modern school system, but probably not a human universal.

Comment by benquo on The AI Timelines Scam · 2019-07-12T02:10:22.339Z · score: 28 (9 votes) · LW · GW

> I don't actually know the extent to which Bernie Madoff actually was conscious that he was lying to people. What I do know is that he ran a pyramid scheme.

The eponymous Charles Ponzi had a plausible arbitrage idea backing his famous scheme; it's not unlikely that he was already in over his head (and therefore desperately trying to make himself believe he'd find some other way to make his investors whole) by the time he found out that transaction costs made the whole thing impractical.

Comment by benquo on Schism Begets Schism · 2019-07-10T20:15:24.531Z · score: 5 (7 votes) · LW · GW

It seems like the Martin Luther case is an example of open disagreement begetting schism. If LessWrong can't deal with open disagreement, what's it even doing?

Comment by benquo on Raemon's Shortform · 2019-07-09T01:18:19.140Z · score: 5 (3 votes) · LW · GW

Attention is scarce and there are lots of optimization processes going on, so if you think the future is big relative to the present, interventions that increase the optimization power serving your values are going to outperform direct interventions. This doesn't imply that we should just do infinite meta, but it does imply that the value of direct object-level improvements will nearly always be via how they affect different optimizing processes.

Comment by benquo on Raemon's Shortform · 2019-07-08T12:34:49.811Z · score: 12 (7 votes) · LW · GW

On the object level, the three levels you described are extremely important:

  • harming the ingroup
  • harming the outgroup (who you may benefit from trading with)
  • harming powerless people who don't have the ability to trade or collaborate with you

I'm basically never talking about the third thing when I talk about morality or anything like that, because I don't think we've done a decent job at the first thing. I think there's a lot of misinformation out there about how well we've done the first thing, and I think that in practice utilitarian ethical discourse tends to raise the message length of making that distinction, by implicitly denying that there's an outgroup.

I don't think ingroups should be arbitrary affiliation groups. Or, more precisely, "ingroups are arbitrary affiliation groups" is one natural supergroup which I think is doing a lot of harm, and there are other natural supergroups following different strategies, of which "righteousness/justice" is one that I think is especially important. But pretending there's no outgroup is worse than honestly trying to treat foreigners decently as foreigners who can't be counted on to trust us with arbitrary power or share our preferences or standards.

Sometimes we should be thinking about what internal norms to coordinate around (which is part of how the ingroup is defined), and sometimes we should be thinking about conflicts with other perspectives or strategies (how we treat outgroups). The Humility Argument for Honesty and Against Neglectedness Considerations are examples of an idea about what kinds of norms constitute a beneficial-to-many supergroup, while Should Effective Altruism be at war with North Korea? was an attempt to raise the visibility of the existence of outgroups, so we could think strategically about them.

Comment by benquo on Raemon's Shortform · 2019-07-08T12:30:03.400Z · score: 9 (4 votes) · LW · GW

This feels like the most direct engagement I've seen from you with what I've been trying to say. Thanks! I'm not sure how to describe the metric on which this is obviously to-the-point and trying-to-be-pin-down-able, but I want to at least flag an example where it seems like you're doing the thing.

Comment by benquo on Blatant lies are the best kind! · 2019-07-07T03:57:19.398Z · score: 9 (2 votes) · LW · GW

"this kind of behavior" = the blame machinery that gets activated when lying is mentioned, i.e. "Depending on the power dynamics of the situation, the blame can fall on the liar, or on the person calling them out."

Comment by benquo on Blatant lies are the best kind! · 2019-07-07T03:55:02.229Z · score: 5 (2 votes) · LW · GW

No, I meant it to be straightforward. Oops!

Comment by benquo on Blatant lies are the best kind! · 2019-07-07T02:29:59.791Z · score: 7 (3 votes) · LW · GW

Those feel like important surface-level points, though I'd phrase the second one a bit differently. But the underlying models used to generate those claims are more of what I wanted to get across. Here are a couple pointers to the kinds of things I think are core content I was trying to work through an example of:

  • A clearer idea of how the different kinds of simulacrum relate to each other, and how some bring others into existence.
  • The interaction between speech's denotative meaning, direct effects as an act, and indirect effects as a way of negotiating norms. (E.g. The way we argue for a point isn't just about whether a claim is true or false, but also about how reasoning works. Expressing anger that someone's violated a norm isn't just a statement about the act, but about the norm, and about the knowability of the relation between the two.)
  • There are kinds of motivated distortions of thinking that are bad, not because there is or might be a direct victim of harm, but because they change what we're doing when we're talking, in a way that makes some kinds of important coordination much harder.

Comment by benquo on Blatant lies are the best kind! · 2019-07-07T02:21:50.913Z · score: 5 (2 votes) · LW · GW

Ah, sorry, I thought that inference would be obvious by the time the reader started the second line of dialogue. Thanks for letting me know it wasn't! I feel stuck between repeating the line with Noa's name attached (which feels clunky to me), using a worse title, and the current situation.

Comment by benquo on Jimrandomh's Shortform · 2019-07-05T00:37:28.696Z · score: 12 (3 votes) · LW · GW

> In particular, I'd expect the people in bullshit jobs to have been unusually competent, smart, or powerful before they were put in the bullshit job, and this is not in fact what I think actually happens.

Moral Mazes claims that this is exactly what happens at the transition from object-level work to management - and then, once you're at the middle levels, the main traits relevant to advancement (and value as an ally) are the ones that make you good at coalitional politics, favor-trading, and a more feudal sort of loyalty exchange.

Comment by benquo on Causal Reality vs Social Reality · 2019-07-04T23:08:36.202Z · score: 26 (6 votes) · LW · GW

I think that in this context it might be helpful for me to mention that I've recently seriously considered giving up on LessWrong, not because of overt bans or censorship, but because of my impression that the nudges I do see reflect some badly misplaced priorities.

These kinds of nudges both reflect the sort of judgment that might be tested later in higher-stakes situations (say, something actually controversial enough for the right call to require a lot of social courage on the mods' part), and serve as a coordination mechanism by which people illegibly negotiate norms for later use.

I ended up deciding to contact the mods privately to see if we could double-crux on this, since "try at all" is an important thing to do before "give up" for a forum with as much talent and potential as this one. I'm only mentioning this here because I think these kinds of things tend to be handled illegibly in ways that make them easy to miss when modeling things like chilling effects.

Comment by benquo on Self-consciousness wants to make everything about itself · 2019-07-04T17:37:56.261Z · score: 2 (1 votes) · LW · GW

What's the alternative? A state in which collective culpability is zero ... while people continue to do wrong?

Comment by benquo on Self-consciousness wants to make everything about itself · 2019-07-04T12:49:57.098Z · score: 5 (3 votes) · LW · GW

Orthodox Judaism specifically claims that if enough people behave righteously enough and follow the law well enough at the same time, this will usher in a Messianic era, in which much of the liturgy and ritual obligations and customs (such as the *Ashamnu*) will be abolished. Collective responsibility, and a keen sense that we are very, very far from reliably correct behavior, is not the same as a total lack of hope.

Comment by benquo on Blatant lies are the best kind! · 2019-07-04T11:55:38.039Z · score: 6 (3 votes) · LW · GW

It wasn't an actual conversation between multiple people, this is just how it felt intuitive to try to explain the issue. When it's an actual chat transcript I say so :)

Blatant lies are the best kind!

2019-07-03T20:45:56.948Z · score: 24 (14 votes)
Comment by benquo on Causal Reality vs Social Reality · 2019-07-01T05:12:48.274Z · score: 9 (4 votes) · LW · GW

Your argument doesn't make sense unless whatever "clamoring in the streets" stands in for metaphorically is an available action to the people you're referring to. It seems to me like the vast majority of people are neither in an epistemic position where they can reasonably think that they know that there's a good chance of curing aging, nor do they have any idea how to go about causing the relevant research to happen.

They do know how to increase the salience of "boo death," but so far in the best case that seems to result in pyramids, which don't work and never could, and even then only for the richest.

Comment by benquo on Apocalypse, corrupted · 2019-06-27T14:20:42.062Z · score: 2 (1 votes) · LW · GW

How sure are you that hunter-gatherers are much closer to the edge than the typical person in our society?

A better comparison might be people in cold / food-scarce vs warm / food-abundant areas.

Comment by Benquo on [deleted post] 2019-06-26T13:53:08.362Z

> There are probably around 20 major characteristics I wish each LW user had (such as "be able to think in probabilites" and "be able to generate hypotheses for confusing phenomena"), and most of them can be improved with "regular learning and practice", and nudges, rather than overcoming weird adversarial anti-inductive dynamics.

Why would this matter at all for any purpose that might be related to the use of rivalrous goods in an environment where there's no solution to adversarial epistemics? What's your model for how that could work?

Comment by benquo on Causal Reality vs Social Reality · 2019-06-26T12:29:25.497Z · score: 25 (9 votes) · LW · GW

> They seen grandma getting sicker and sicker, suffering more and more, and they feel outrage: why have we not solved this yet?

You expect them to get angry - at whom in particular? - because grandma keeps getting older? For tens of thousands of years of human history, the only alternative to this has been substantially worse for grandma. Unless she wants to die and you're talking about euthanasia, but no additional medical research is needed for that. There is no precedent or direct empirical evidence that anything else is possible.

Maybe people are wrong for ignoring speculative arguments that anti-aging research is possible, but that's a terrible example of people being bound by social reality.

Comment by benquo on Drowning children are rare · 2019-06-25T00:39:07.623Z · score: 4 (2 votes) · LW · GW

To clarify a bit - I'm more confused about how to make the original post more clearly scope-limited, than about how to improve my commenting policy.

Evan's criticism in large part deals with the facts that there are specific possible scenarios I didn't discuss, which might make more sense of e.g. GiveWell's behavior. I think these are mostly not coherent alternatives, just differently incoherent ones that amount to changing the subject.

It's obviously not possible to discuss every expressible scenario. A fully general excuse like "maybe the Illuminati ordered them to do it as part of a secret plot," for instance, doesn't help very much, since that posits an exogenous source of complications that isn't very strongly constrained by our observations, and doesn't constrain our future anticipations very well. We always have to allow for the possibility that something very weird is going on, but I think "X or Y" is a reasonable short hand for "very likely, X or Y" in this context.

On the other hand, we can't exclude scenarios arbitrarily. It would have been unreasonable for me, on the basis of the stated cost-per-life-saved numbers, to suggest that the Gates Foundation is, for no good reason, withholding money that could save millions of lives this year, when there's a perfectly plausible alternative - that they simply don't think this amazing opportunity is real. This is especially plausible when GiveWell itself has said that its cost per life saved numbers don't refer to some specific factual claim.

"Maybe partial funding because AI" occurred to enough people that I felt the need to discuss it in the long series (which addressed all the arguments I'd heard up to that point), but ultimately it amounts to a claim that all the discourse about saving "dozens of lives" per donor is beside the point since there's a much higher-leverage thing to allocate funds to - in which case, why even engage with the claim in the first place?

Any time someone addresses a specific part of a broader issue, there will be countless such scope limitations, and they can't all be made explicit in a post of reasonable length.

Comment by benquo on Drowning children are rare · 2019-06-24T19:08:54.667Z · score: 9 (4 votes) · LW · GW

I think I can summarize my difficulties with this comment a bit better now.

(1) It's quite long, and brings up many objections that I dealt with in detail in the longer series I linked to. There will always be more excuses someone can generate that sound facially plausible if you don't think them through. One has to limit scope somehow, and I'd be happy to get specific constructive suggestions about how to do that more clearly.

(2) You're exaggerating the extent to which Open Philanthropy Project, Good Ventures, and GiveWell, have been separate organizations. The original explanation of the partial funding decision - which was a decision about how to recommend allocating Good Ventures's capital - was published under the GiveWell brand, but under Holden's name. My experience working for the organizations was broadly consistent with this. If they've since segmented more, that sounds like an improvement, but doesn't help enough with the underlying revealed preferences problem.

Comment by benquo on No, it's not The Incentives—it's you · 2019-06-23T02:02:25.571Z · score: 2 (1 votes) · LW · GW

> If you intervened on ~100 entering PhD students and made them committed to always not following the incentives where they are bad, I predict that < 10% of them will become professors -- maybe an expected 2 of them would.

And how many if you didn't intervene?

> So you can't say "why don't the academics just not follow the incentives"; any such person wouldn't have made it into academia.

How do you reconcile this with the immediately prior sentence?

Comment by Benquo on [deleted post] 2019-06-21T15:28:03.469Z

What sort of solutions might work?

Duncan's suggestion here seems like it has the right mood - treating discussion of things someone might feel attacked by as an important enough class to commit resources to, and including the point of view of the people who feel attacked. Third parties are needed in such cases. Imposing all the work on a small fixed class of moderators seems like it imposes a high burden on a few people.

One thing I've had occasion to want a couple times is something like an epistemic court. I have within the past several months felt a strong need for shared institutions that allow me to sue or be sued for being knowably wrong. Unlike state courts, I don't see any need for a body that can award damages, just one that can make judgments. Without this, if someone claims I have a blind spot, it's very hard for me to know when to actually terminate my own attempt to find it, since "no, YOU have a blind spot!" is sometimes true, but very hard to be subjectively confident of.

In any case, my intuition that courts would be helpful I think has something important in common with Duncan's intuition that more active moderation would be helpful. There's something wrong with the sort of debate club norms we have now. We're focused more on making valid arguments than finding the truth, which leaves us vulnerable to large classes of trolling.

I think there's been an implicit procedural-liberal bias to much discussion of moderation, where it's assumed that we can agree on rules in lieu of a shared perspective. But this doesn't actually work for getting to the truth, because it's vulnerable to both manufactured spurious grievances, and illegible attacks that evade the detection of legible rules, without any real mechanism for adjudicating when we want to classify conflicts as one or the other (or both, or some third thing).

A lot of why I've been skeptical of the idea of a generic forum over the last few years, is that it seems to me like people who are trying to figure something specific out - who have a perspective which in some concrete interested way wants to be made more correct - are going to have a huge advantage at filtering constructive from unconstructive comments, vs people who are trying to comply with the rules of good thinking. Cf. Something to Protect.

Comment by Benquo on [deleted post] 2019-06-21T15:16:40.489Z

> My model of Benquo (in particular after a recent comment thread) is somewhat skeptical that it's good idea to treat conflict and action asymmetrically.

I strongly believe it's wrong to apply a higher burden to criticism of calls to action (or arguments offered in that context), than to the calls to action themselves. The frame in which we're lumping everything someone feels personally attacked by together as "conflict" basically gives everyone proposing something an unprincipled veto, letting them reclassify any criticism as "conflict" by framing the criticism as an attack on them or their allies.

I agree that people have a justified expectation that criticism actually is meant as an attack, but that just means we have to solve a hard problem. If we bounce off it instead, then this isn't really a rationality site, it's just a weird social club with shared rationality-related applause lights.

Comment by Benquo on [deleted post] 2019-06-21T15:09:06.358Z

This schema seems like it has some very important gaps. As we've discussed elsewhere, there's a need for criticism that isn't mainly about one person being bad - for instance, a call to action might be based on wrong ideas or factual errors. Even if this is a call to action around which some people have built their identities or social standing, criticizing it is not intrinsically the same thing as the kind of call to conflict you defined here.

If these are in practice the same thing, then that's a huge problem.

There's another legitimate type of criticism, which is, "so and so has violated community standards," which is both a claim about their behavior and about what the community standards are and ought to be. It's not obvious that "punishment" should follow in all cases, even if they actually did violate community standards, if those standards were unclear. In any case, a step we have to pass through before enforcement - if we want to have standards at all and not just mob rule - needs to be clarifying specific cases, and it might make sense to be much more lenient in cases where the mods aren't already on board.

There's a third class of criticism that's not about being "bad" - though it overlaps a bit with the first two - which is, a specific sort of epistemic defense - pointing out a pattern of communication that is seeking to induce errors. Obviously if we can't talk about that as prominently as we can talk about any other given thing, that's a huge security vulnerability.

Comment by benquo on Reason isn't magic · 2019-06-19T01:53:16.664Z · score: 14 (5 votes) · LW · GW

Copied it over myself, thanks for the suggestion

Reason isn't magic

2019-06-18T04:04:58.390Z · score: 104 (28 votes)
Comment by benquo on No, it's not The Incentives—it's you · 2019-06-17T21:23:42.519Z · score: 9 (5 votes) · LW · GW

It seems pretty fucked up to take positive proposals at face value given that context.

Comment by benquo on No, it's not The Incentives—it's you · 2019-06-16T06:35:12.946Z · score: 12 (3 votes) · LW · GW

You're the one bringing up the question of whether someone's a bad person.

Drowning children are rare

2019-05-28T19:27:12.548Z · score: 8 (43 votes)

A War of Ants and Grasshoppers

2019-05-22T05:57:37.236Z · score: 17 (5 votes)

Towards optimal play as Villager in a mixed game

2019-05-07T05:29:50.826Z · score: 40 (12 votes)

Hierarchy and wings

2019-05-06T18:39:43.607Z · score: 26 (11 votes)

Blame games

2019-05-06T02:38:12.868Z · score: 43 (9 votes)

Should Effective Altruism be at war with North Korea?

2019-05-05T01:50:15.218Z · score: 16 (12 votes)

Totalitarian ethical systems

2019-05-03T19:35:28.800Z · score: 36 (12 votes)

Authoritarian Empiricism

2019-05-03T19:34:18.549Z · score: 40 (13 votes)

Excerpts from a larger discussion about simulacra

2019-04-10T21:27:40.700Z · score: 43 (15 votes)

Blackmailers are privateers in the war on hypocrisy

2019-03-14T08:13:12.824Z · score: 24 (17 votes)

Moral differences in mediocristan

2018-09-26T20:39:25.017Z · score: 21 (8 votes)

Against the barbell strategy

2018-09-20T15:19:08.185Z · score: 20 (19 votes)

Interpretive Labor

2018-09-05T18:36:49.566Z · score: 28 (16 votes)

Zetetic explanation

2018-08-27T00:12:14.076Z · score: 79 (43 votes)

Model-building and scapegoating

2018-07-27T16:02:46.333Z · score: 23 (7 votes)

Culture, interpretive labor, and tidying one's room

2018-07-26T20:59:52.227Z · score: 29 (13 votes)

There is a war.

2018-05-24T06:44:36.197Z · score: 52 (24 votes)

Talents

2018-05-18T20:30:01.179Z · score: 47 (12 votes)

Oops Prize update

2018-04-20T09:10:00.873Z · score: 42 (9 votes)

Humans need places

2018-04-19T19:50:01.931Z · score: 113 (28 votes)

Kidneys, trade, sacredness, and space travel

2018-03-01T05:20:01.457Z · score: 51 (13 votes)

What strange and ancient things might we find beneath the ice?

2018-01-15T10:10:01.010Z · score: 32 (12 votes)

Explicit content

2017-12-02T00:00:00.946Z · score: 14 (8 votes)

Cash transfers are not necessarily wealth transfers

2017-12-01T10:10:01.038Z · score: 110 (42 votes)

Nightmare of the Perfectly Principled

2017-11-02T09:10:00.979Z · score: 32 (8 votes)

Poets are intelligence assets

2017-10-25T03:30:01.029Z · score: 26 (9 votes)

Seeding a productive culture: a working hypothesis

2017-10-18T09:10:00.882Z · score: 28 (9 votes)

Defense against discourse

2017-10-17T09:10:01.023Z · score: 64 (21 votes)

On the construction of beacons

2017-10-16T09:10:00.866Z · score: 58 (18 votes)

Sabbath hard and go home

2017-09-27T07:49:40.482Z · score: 78 (47 votes)

Why I am not a Quaker (even though it often seems as though I should be)

2017-09-26T07:00:28.116Z · score: 61 (31 votes)

Bad intent is a disposition, not a feeling

2017-05-01T01:28:58.345Z · score: 12 (15 votes)

Actors and scribes, words and deeds

2017-04-26T05:12:29.199Z · score: 6 (8 votes)

Effective altruism is self-recommending

2017-04-21T18:37:49.111Z · score: 71 (52 votes)

An OpenAI board seat is surprisingly expensive

2017-04-19T09:05:04.032Z · score: 5 (6 votes)

OpenAI makes humanity less safe

2017-04-03T19:07:51.773Z · score: 18 (20 votes)

Against responsibility

2017-03-31T21:12:12.718Z · score: 13 (12 votes)

Dominance, care, and social touch

2017-03-29T17:53:20.967Z · score: 3 (4 votes)

The D-Squared Digest One Minute MBA – Avoiding Projects Pursued By Morons 101

2017-03-19T18:48:55.856Z · score: 1 (2 votes)

Threat erosion

2017-03-15T23:32:30.000Z · score: 1 (2 votes)

Sufficiently sincere confirmation bias is indistinguishable from science

2017-03-15T13:19:05.357Z · score: 19 (19 votes)

Bindings and assurances

2017-03-13T17:06:53.672Z · score: 1 (2 votes)

Humble Charlie

2017-02-27T19:04:37.578Z · score: 2 (3 votes)

Against neglectedness considerations

2017-02-24T21:41:52.144Z · score: 1 (2 votes)

GiveWell and the problem of partial funding

2017-02-14T10:48:38.452Z · score: 2 (3 votes)

The humility argument for honesty

2017-02-05T17:26:41.469Z · score: 4 (5 votes)

Honesty and perjury

2017-01-17T08:08:54.873Z · score: 4 (5 votes)

[LINK] EA Has A Lying Problem

2017-01-11T22:31:01.597Z · score: 13 (13 votes)