Correctly Calibrated Trust

post by habryka (habryka4) · 2023-06-24T19:48:05.702Z · LW · GW · 3 comments

This is a link post for https://forum.effectivealtruism.org/posts/KE4Ga3zHQsczooQi7/correctly-calibrated-trust

  Short version:
        [Just read the bold to get a really short version]
  Longer version:
    Part 1: What fuzzy proxies are people using and why would they be systematically overweighted?
      Getting funding from OP and LTFF
      Writing or publishing widely-read things: books, highly upvoted posts on Less Wrong / the Forum, podcasts
      Being around EA without a negative reputation
    Part 2: What to do about it
      Individual
        Ask around
      Community
        Gossip
        Transparency

Chana from the CEA Community Health team posted this to the EA Forum, where it sadly seems not to have gotten a lot of traction. I actually think it's quite an important post, so I am signal-boosting it here. On the surface it talks a lot about EA, but a lot of it also straightforwardly applies to the AI Alignment or Rationality communities, and as such is also relevant to a lot of readers on LessWrong.

Below I shamelessly copied over the whole post content (except the footnotes, since they were hard to copy-paste):

This post comes from finding out that Asya Bergal was having thoughts about this and was maybe going to write a post, thoughts I was having along similar lines, and a decision to combine energy and use the strategy fortnight as an excuse to get something out the door. A lot of this is written out of notes I took from a call with her, so she gets credit for a lot of the concrete examples and the impetus for writing a post shaped like this.

Interested in whether this resonates with people's experience!

Short version:

[Just read the bold to get a really short version]

There’s a lot of “social sense of trust” in EA, in my experience. There’s a feeling that people, organizations and projects are broadly good and reasonable (often true!) that’s based on a combination of general vibes, EA branding and a few other specific signals of approval, as well as an absence of negative signals. I think that it’s likely common to overweight those signals of approval and the absence of disapproval.

Especially post-FTX, I’d like us to be well calibrated on what the vague intuition we download from the social web is telling us, and place trust wisely.

[“Trust” here is a fuzzy and under-defined thing that I’m not going to nail down - I mean here something like a general sense that things are fine and going well]

Things like getting funding, being highly upvoted on the forum, being on podcasts, being high status and being EA-branded are fuzzy and often poor proxies for trustworthiness and of relevant people’s views on the people, projects and organizations in question^[1] [EA(p) · GW(p)].

Negative opinions (anywhere from “that person not so great” to “that organization potentially quite sketch, but I don't have any details”) are not necessarily that likely to find their way to any given person for a bunch of reasons, and we don’t have great solutions to collecting and acting on character evidence that doesn't come along with specific bad actions. It’s easy to overestimate what you would know if there’s a bad thing to know.

If it’s decision relevant or otherwise important to know how much to trust a person or organization, I think it’s a mistake to rely heavily on the above indicators, or on the “general feeling” in EA. Instead, get data if you can, and ask relevant people their actual thoughts - you might find them surprisingly out of step with what the vibe would indicate.

I’m pretty unsure what we can or should do as a community about this, but I have a few thoughts at the bottom, and having a post about it as something to point to might help.

Longer version:

I think you'll get plenty out of this if you read the headings and read more under each heading if something piques your curiosity

Part 1: What fuzzy proxies are people using and why would they be systematically overweighted?

(I don’t know how common these mistakes are, or that they apply to you, the specific reader of the post. I expect them to bite harder if you’re newer or less connected, but I also expect that it’s easy to be somewhat biased in the same directions even if you have a lot of context. I’m hoping this serves as contextualization for the former and a reminder / nudge for the latter.)

Getting funding from OP and LTFF

Seems easy to expect that if someone got funding from Open Phil or the Long Term Future Fund, that’s a reasonable signal about the value of their work or the competence or trustworthiness or other virtues of the person running it. It obviously is Bayesian evidence, but I expect this to be extremely noisy.

- These organisations engage in hits-based philanthropy - as I understand it, they don’t expect most of the grants they make to be especially valuable (but the amount and way this is true varies by funder - Linch describes them as having different brands [EA · GW])
  - A lot of funding is given speculatively, with enough hope that there will be something useful to make it worth doing, but that might not be as high of a bar as one might expect
  - (I’m not a funder, I don’t know how much a “lot” is - maybe a funder will jump in to say)
- Sometimes a funder isn’t excited about a particular research program but is excited about the skill building it would provide or excited about the person that’s running it and letting them explore something
- There might be concerns about trustworthiness or competence, but they’re outweighed by signals in the other direction, the project being low risk, or the weaknesses being irrelevant to the specific kind of work
  - [I think this also applies in the case of hiring, especially for smaller and scrappier organizations]
- My guess (though I haven’t run this claim by funders) is that, especially when funding is abundant and/or grantmaker capacity is low, there’s an explicit decision to let a lot more false positives through
- Crucially, in many cases, the general public won’t get these details, and won’t be able to distinguish between grants without these concerns and grants with them
  - I guess LTFF has gone through a couple stages, where first it didn’t do very detailed writeups, then did [EA(p) · GW(p)], and now doesn’t again^[2] [EA(p) · GW(p)]. So whether people get this information is an active choice about how to use time and resources, and we could debate the value of moving more towards a regime like the 2019 LTFF writeups, which were quite detailed (I talk about this more at the bottom.)
  - I haven’t been “around” for all the waves of this, so if I’m just wrong about what details people tend to have about grants, let me know. Skimming through Open Phil’s public grants, they do not tend to have information of this kind.
- Ways it is Bayesian evidence
  - Indeed if the concerns about a person or organization were concerning enough, that would tank a grant.
    - At least in some cases people do consider the second order effects of increasing the power, influence and resources of organizations or people when they grant to them
  - (Reminder that I’m not a grantmaker) Presumably people who do have a good track record or are more trusted find it easier to get access to grants^[3] [EA(p) · GW(p)]

But on the whole, if we’re descriptively trying to figure out how information flows in EA, I’d advise most people (up to you to figure out if you’re in this bunch), to give less trust based on grants.
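To make "Bayesian evidence, but extremely noisy" concrete, here is a minimal sketch of a single Bayes update. The function name `posterior` and every number in it are invented for illustration; they are not estimates of how any actual funder behaves.

```python
def posterior(prior: float, p_signal_if_good: float, p_signal_if_not: float) -> float:
    """Bayes' rule: probability the project is good, given that the signal was seen."""
    numerator = prior * p_signal_if_good
    evidence = numerator + (1 - prior) * p_signal_if_not
    return numerator / evidence


# Hypothetical numbers only. "Signal" here stands for "got a grant".
# Noisy signal: good projects get funded only somewhat more often than others.
noisy = posterior(prior=0.5, p_signal_if_good=0.6, p_signal_if_not=0.4)

# Sharp signal: funding almost always tracks quality.
sharp = posterior(prior=0.5, p_signal_if_good=0.9, p_signal_if_not=0.1)

print(f"noisy signal: {noisy:.2f}")  # about 0.60
print(f"sharp signal: {sharp:.2f}")  # about 0.90
```

Under these made-up numbers, the noisy signal moves a 50% prior only to about 60%, while the sharp one moves it to about 90%. Treating a grant as if it were the second kind of signal when it is closer to the first is the overweighting described above.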

Writing or publishing widely-read things: books, highly upvoted posts on Less Wrong / the Forum, podcasts

Same deal as above except maybe even noisier.

- I’ll sometimes notice that a post on the forum is highly upvoted or written by someone who’s a big deal, and assume it’s the vibe now, which then updates me that therefore the writer is representing a successful wave in EA and therefore presumably has good thoughts and judgment and presumably everyone else thinks so too. Only then to talk to other folks and realize that lots of people didn’t like it for a variety of reasons but are never going to take the time to write a comment about it. Important lesson for me!
- The Forum, like the rest of the internet, is not real life.
  - There are apparently some wild power laws of what percent of people write what percent of comments (which is common also in places like Twitter).
  - Being online or giving podcasts selects for people who are e.g. comfortable airing their thoughts in public, who have the time to do so, who are very online. What’s considered good online doesn’t track what’s good in other contexts, and it also doesn’t track what’s considered good by different pockets of people.
- Skill in something as specific as “writing / speaking publicly” can be quite uncorrelated with skill or virtue in other arenas.
  - Relatedly, it can be really valuable to hear people’s thoughts without thinking that there should be much in the way of updating on anything else about them, which is relevant for how much to update if someone is invited onto a podcast or to speak somewhere.
    - Though to what extent we think giving invites of that kind is or should be thought of as a general endorsement seems complicated and the answer isn’t zero.
- I think as with funding, people upvote posts for a variety of reasons, and definitely not just because they think it makes good and important points. It’s been long posited that EAs love to upvote criticism even when it’s bad because they want to signal that feedback and critique are welcome. I think this is sometimes true also of posts about things that are emotionally charged - the reaction to that writing isn’t going to track the overall reaction to the person or their ideas.
- I also think EA, like every other social community, has subcultures, where if everyone you talk to likes the post or the podcast, or takes a person or argument seriously, it’s easy to assume that’s representative of EA more broadly.

Being around EA without a negative reputation

Poking at this is important to me. I think it’s easy to overupdate that if you haven’t heard negative things, they’re not there. In fact, there’s a lot of reasons negative opinions won’t have made their way to you, or to the people who might benefit most from hearing about them (this last thing is something I’d like to improve but is far from a trivial problem to solve).

- It’s often costly to talk badly about things, in public or private - there’s a higher bar for saying negative things than positive ones, and not all information reaches that bar. People understandably get offended or defensive and you might end up in a protracted argument. You might seem unfair or disagreeable.
  - This goes double if you have heard gossip or get a bad vibe but don’t have evidence; you’d have to do an actual investigation to figure out what’s true, and without it it can seem like you’re smearing people/organizations unnecessarily
- And in fact there’s a real ethical issue here - it’s bad to smear people incorrectly, and it’s bad for a community to have a witch hunt feel about it. Gossip can be hugely toxic and corrosive, and make it harder to be in this community at all. But at the same time vibes have real information.
- Information cascades: if one person heard a bad thing and tells people, it can quickly become “understood” that something is bad out of proportion to the actual evidence. That makes the social vibe a worse tracker of trustworthiness and virtue directly.
  - Also, in response, people might reasonably want to say fewer bad things, which also prevents information from being integrated into the social web
  - Relatedly, the balance between “give people useful information” and “allow people to have fresh starts when appropriate” is blurry. It makes sense to call references to see if you want to work with someone, but at what point should a bad work experience stop hindering someone if they’ve improved since then and will the social graph correctly update at that point?
- A lot of negative information about people and organizations is private or confidential
- The gossip networks are really imperfect - as I said above, EA has subcultures of people who talk to each other often and it’s really easy to think “everyone knows [EA(p) · GW(p)]” something when they don’t (especially across social groups or across different hubs and geographic locations)
  - Quoting myself from that shortform:
    “I think feel very free to ask around to get these takes and see what you find - it's been a learning experience for me, for sure…I think "some smart people in EA think this is totally wrongheaded" is a good prior for basically anything going on in EA.”
- Pluralistic ignorance - everyone thinks that everyone else thinks someone or something is great so doesn’t want to speak up, but in fact that’s what everyone else thinks everyone else thinks so the fact that people have doubts or concerns doesn’t end up being said
- Ben Millwood pointed out a lot of similar things in his post The illusion of consensus about EA celebrities [EA · GW]

Another way of saying this is that the social miasma about a person or group can be pretty dumb [EA · GW] or uninformed.

Things I didn’t have enough to say about or would take more thought to flesh out but could go in this list are status, EA branding, and chains of trust, where you don't trust people but know people who trust them (or so you think) and so on.

Part 2: What to do about it

Individual

Ask around

On an individual level, I’m interested in people noticing when they’re updating on the above pieces of evidence, deciding how much they endorse that, and, when needed, looking for better sources of evidence.

For instance, at risk of stating the obvious, if you’re figuring out what job to take, it being a “cool org to work at” should be supported by asking about their theory of change (and deciding what you think about it) and talking to people who have worked there or who have context on it, or people who have talked to those people who you trust. (Among other [EA · GW] strategies [EA · GW])

When I was last applying to jobs, I asked around a bunch about the workplace and the management style of the managers (often at the encouragement of the people who I was applying to work with) and I think this was very important.

In other decision relevant contexts - like hiring, mentoring, granting - asking around and digging into what you think you know will likely give you a better sense of how much to trust than you’ll get from the overall milieu^[4] [EA(p) · GW(p)]. I’m in favor of using trust as a two-argument function [EA(p) · GW(p)], where we talk about trusting X in particular ways Y or in particular domains Z.

Spitballing, but in some cases you might want to ask the Community Health team if we have information we can share. If we do, it might be helpful; if we don’t, we can just say we either have no information or none we can share (we aim to not distinguish between these^[5] [EA(p) · GW(p)]).

Strong caveat / warning: I don’t know if this idea scales at all. People or organizations or our team might end up getting too many requests for information and will have to say no to a lot of them. I don’t know what to do about that!

(And of course you’ll have to decide for yourself how much to trust the views you’re given when you do this kind of asking around)

Community

On a community wide level, I feel extremely unsure.

Gossip

One thing we could try out is doing gossip better. Gossip has a ton of potential upside - finding out and being able to act on predatory or immoral behavior, understanding what kinds of situations you might be walking into, making a better community via norm setting. And a bunch of downsides - poorly done, a mistake can follow you around for longer than it should, it can make for a toxic and neurotic sense that you don’t know what people are saying about you, feeling constantly evaluated, even in non-professional settings, and so on.

For getting more of the pros and fewer of the cons, I like the notion of epistemically rigorous gossip, where you:

- Include a description of how you know, when you found out, when the relevant facts took place and ideally from who, so information isn’t double counted
- Include your credence in its veracity, and what you would do in the position the person you’re talking to is in
- (ideally) speak in the form of concrete observations to help the person you’re talking to make up their own mind
- (ideally) own the biases or incentives or relevant social relationships that are relevant to the gossip

I also think people should downgrade the feeling of everyone already knowing something and raise issues they think are important to the relevant people.

Transparency

A wilder move is going more Bridgewater (an investment firm famous for lots of feedback and explicit ratings of people’s abilities^[6] [EA(p) · GW(p)]) / radical transparency, where it’s just more normal to say what you think about grants, people and organizations without that being a big deal, and without people overupdating on negative things, so that we all have a fuller and more nuanced picture (“collapsing a bunch of the weird social uncertainty that makes things insane”, as Habryka puts it [EA · GW])

So many obvious downsides to the strongest version of that: it creates a high barrier for being in the community, it’s emotionally rough on people, people who would otherwise do good things aren’t interested in being rated publicly or consenting to “apply for an Open Phil grant” = “they’ll publicly say that you don’t seem that competent but your idea is good so they’re funding you”. It’s an actively bad idea if people are going to overupdate such that it’s better to just not try things or have less experience so that you don’t have negative things said about you. And anything done in public is going to be visible to people outside the community with different norms, so there’s not a lot of control over that. It also doesn’t work if the incentives are off - i.e. if people are overly punished for giving feedback or under-incentivized to be accurate.

But there's a lot that interests me about it, and it might be a good idea directionally, on the margin or in non-public settings. I’m for instance interested in the people I work with having a decent model of my strengths and weaknesses, and I’m really impressed at people more transparent about those things than I am. It’s scary but useful to be able to talk openly about those things.

Where we’re at
I and others on the Community Health team have done some thinking [EA(p) · GW(p)] about collecting fuzzier information and character evidence and acting on it appropriately. It helps to be able to ask people to call to mind concerns they have and flag them, or to be more on the lookout for serious concerns, even if it’s mostly in vibe. But overall, at the moment, it seems some combination of very time-consuming, emotionally costly, and just overall a hard problem to solve (while for instance not overupdating or overreacting or having your time misallocated to every concern you could hear). This is very much not a solved problem. If people have thoughts about it, I’m interested to hear!

3 comments

comment by romeostevensit · 2023-06-25T00:33:38.253Z · LW(p) · GW(p)

I think the issue with gossip is that it easily slippery slopes on selection effects for 'juice', that is to say, in the game of telephone, the errors that get magnified at each step are non-random but selected for licentiousness.

comment by Dagon · 2023-06-24T21:03:38.553Z · LW(p) · GW(p)

I suspect there's a too-high prior for trust here. Before you seriously consider joining a group, it's worth looking for evidence that it's well above the median in terms of respect and mental health of members.

comment by Said Achmiz (SaidAchmiz) · 2023-06-25T00:36:24.605Z · LW(p) · GW(p)

In short: trust, but verify.
