Correctly Calibrated Trust

post by habryka (habryka4) · 2023-06-24T19:48:05.702Z · LW · GW · 3 comments

This is a link post for https://forum.effectivealtruism.org/posts/KE4Ga3zHQsczooQi7/correctly-calibrated-trust

Contents

  Short version:
        [Just read the bold to get a really short version]
  Longer version:
    Part 1: What fuzzy proxies are people using and why would they be systematically overweighted?
      Getting funding from OP and LTFF
      Writing or publishing widely-read things: books, highly upvoted posts on Less Wrong / the Forum, podcasts
      Being around EA without a negative reputation
    Part 2: What to do about it
      Individual
        Ask around
      Community
        Gossip
        Transparency
3 comments

Chana from the CEA Community Health team posted this to the EA Forum, where it sadly seems not to have gotten much traction. I actually think it's quite an important post, so I am signal-boosting it here. On the surface it talks a lot about EA, but a lot of it also applies straightforwardly to the AI Alignment and Rationality communities, and as such is also relevant to a lot of readers on LessWrong.

Below I shamelessly copied over the whole post content (except the footnotes, since they were hard to copy-paste):


This post comes from finding out that Asya Bergal was having thoughts about this and was maybe going to write a post, thoughts I was having along similar lines, and a decision to combine energy and use the strategy fortnight as an excuse to get something out the door. A lot of this is written out of notes I took from a call with her, so she gets credit for a lot of the concrete examples and the impetus for writing a post shaped like this.

Interested in whether this resonates with people's experience!

Short version:

[Just read the bold to get a really short version]

There’s a lot of “social sense of trust” in EA, in my experience. There’s a feeling that people, organizations and projects are broadly good and reasonable (often true!) that’s based on a combination of general vibes, EA branding and a few other specific signals of approval, as well as an absence of negative signals. I think that it’s likely common to overweight those signals of approval and the absence of disapproval. 

Especially post-FTX, I’d like us to be well calibrated on what the vague intuition we download from the social web is telling us, and place trust wisely. 

[“Trust” here is a fuzzy and under-defined thing that I’m not going to nail down - I mean here something like a general sense that things are fine and going well]

Things like getting funding, being highly upvoted on the forum, being on podcasts, being high status and being EA-branded are fuzzy and often poor proxies for trustworthiness and of relevant people’s views on the people, projects and organizations in question[1] [EA(p) · GW(p)]. 

Negative opinions (anywhere from “that person not so great” to “that organization potentially quite sketch, but I don't have any details”) are not necessarily that likely to find their way to any given person for a bunch of reasons, and we don’t have great solutions to collecting and acting on character evidence that doesn't come along with specific bad actions. It’s easy to overestimate what you would know if there’s a bad thing to know.

If it’s decision relevant or otherwise important to know how much to trust a person or organization, I think it’s a mistake to rely heavily on the above indicators, or on the “general feeling” in EA. Instead, get data if you can, and ask relevant people their actual thoughts - you might find them surprisingly out of step with what the vibe would indicate.

I’m pretty unsure what we can or should do as a community about this, but I have a few thoughts at the bottom, and having a post about it as something to point to might help.

Longer version:

I think you'll get plenty out of this if you read the headings and read more under each heading if something piques your curiosity.

Part 1: What fuzzy proxies are people using and why would they be systematically overweighted?

(I don’t know how common these mistakes are, or that they apply to you, the specific reader of the post. I expect them to bite harder if you’re newer or less connected, but I also expect that it’s easy to be somewhat biased in the same directions even if you have a lot of context. I’m hoping this serves as contextualization for the former and a reminder / nudge for the latter.)

Getting funding from OP and LTFF

Seems easy to expect that if someone got funding from Open Phil or the Long Term Future Fund, that’s a reasonable signal about the value of their work or the competence or trustworthiness or other virtues of the person running it. It obviously is Bayesian evidence, but I expect this to be extremely noisy.

But on the whole, if we’re descriptively trying to figure out how information flows in EA, I’d advise most people (up to you to figure out if you’re in this bunch) to give less trust based on grants.
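To make "Bayesian evidence, but extremely noisy" concrete, here is a minimal sketch in Python. Every number in it is made up for illustration and none comes from the post or from any funder: it assumes a 50% prior that a project is good, then compares a noisy signal (seen for 60% of good projects and 40% of bad ones) with a sharp one (90% versus 10%). The function name `posterior` is likewise just an illustrative choice.

```python
def posterior(prior: float, p_signal_if_good: float, p_signal_if_bad: float) -> float:
    """Return P(good | signal observed) using Bayes' rule."""
    numerator = prior * p_signal_if_good
    return numerator / (numerator + (1 - prior) * p_signal_if_bad)


# All probabilities below are hypothetical, chosen only to show the shape of the update.
prior = 0.5

# A noisy signal: only slightly more likely to appear if the project is good.
noisy = posterior(prior, p_signal_if_good=0.6, p_signal_if_bad=0.4)

# A sharp signal: much more likely to appear if the project is good.
sharp = posterior(prior, p_signal_if_good=0.9, p_signal_if_bad=0.1)

print(f"after noisy signal: {noisy:.2f}")
print(f"after sharp signal: {sharp:.2f}")
```

Under these assumed numbers the noisy signal moves you from 50% to about 60%, while the sharp one moves you to about 90%. The point is only that how far to update depends on how much more often the signal shows up for good projects than for bad ones, which is exactly the quantity that is hard to know from the outside.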

Writing or publishing widely-read things: books, highly upvoted posts on Less Wrong / the Forum, podcasts

Same deal as above except maybe even noisier. 

Being around EA without a negative reputation

Poking at this is important to me. I think it’s easy to overupdate toward “if you haven’t heard negative things, they’re not there”. In fact, there are a lot of reasons negative opinions won’t have made their way to you, or to the people who might benefit most from hearing about them (this last thing is something I’d like to improve but is far from a trivial problem to solve).

Another way of saying this is that the social miasma about a person or group can be pretty dumb [EA · GW] or uninformed.

Things I didn’t have enough to say about or would take more thought to flesh out but could go in this list are status, EA branding, and chains of trust, where you don't trust people but know people who trust them (or so you think) and so on.

Part 2: What to do about it

Individual

Ask around

On an individual level, I’m interested in people noticing when they’re updating on the above pieces of evidence, deciding how much they endorse that, and, when needed, looking for better sources of evidence. 

For instance, at risk of stating the obvious, if you’re figuring out what job to take, it being a “cool org to work at” should be supported by asking about their theory of change (and deciding what you think about it) and talking to people who have worked there or who have context on it, or people who have talked to those people who you trust. (Among other [EA · GW] strategies [EA · GW].)

When I was last applying to jobs, I asked around a bunch about the workplace and the management style of the managers (often at the encouragement of the people who I was applying to work with) and I think this was very important.

In other decision relevant contexts - like hiring, mentoring, granting - asking around and digging into what you think you know will likely give you a better sense of how much to trust than you’ll get from the overall milieu[4] [EA(p) · GW(p)]. I’m in favor of using trust as a two-argument function [EA(p) · GW(p)], where we talk about trusting X in particular ways Y or in particular domains Z. 
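One way to picture "trust as a two-argument function" is the small Python sketch below. It is an illustration only: the names, domains and scores are hypothetical, and the choice to return `None` for an unknown pair is an assumption of this sketch rather than something the post specifies.

```python
from typing import Dict, Optional, Tuple

# Hypothetical data: trust is recorded per (person, domain) pair,
# not as a single number per person.
TrustTable = Dict[Tuple[str, str], float]

table: TrustTable = {
    ("Alex", "research judgment"): 0.8,
    ("Alex", "managing people"): 0.3,
}


def trust(table: TrustTable, who: str, domain: str) -> Optional[float]:
    """Look up trust in `who` for one specific `domain`.

    Returns None when there is no information for that pair, instead of
    borrowing the score from some other domain.
    """
    return table.get((who, domain))


print(trust(table, "Alex", "research judgment"))
print(trust(table, "Alex", "managing people"))
print(trust(table, "Alex", "handling money"))
```

The design choice worth noticing is the last lookup: having a high score in one domain says nothing here about a domain you have no data on, so the function reports "unknown" rather than a reassuring default.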

Spitballing, but in some cases you might want to ask the Community Health team if we have information we can share. If we do, it might be helpful; if we don’t, we can just say we either have no information or none we can share (we aim to not distinguish between these[5] [EA(p) · GW(p)]).

Strong caveat / warning: I don’t know if this idea scales at all. People or organizations or our team might end up getting too many requests for information and will have to say no to a lot of them. I don’t know what to do about that!

(And of course you’ll have to decide for yourself how much to trust the views you’re given when you do this kind of asking around)

Community

On a community wide level, I feel extremely unsure. 

Gossip

One thing we could try out is doing gossip better. Gossip has a ton of potential upside - finding out and being able to act on predatory or immoral behavior, understanding what kinds of situations you might be walking into, making a better community via norm setting. And a bunch of downsides - poorly done, a mistake can follow you around for longer than it should, it can make for a toxic and neurotic sense that you don’t know what people are saying about you, feeling constantly evaluated, even in non-professional settings, and so on. 

For getting more of the pros and fewer of the cons, I like the notion of epistemically rigorous gossip, where you:

I also think people should downgrade the feeling of everyone already knowing something and raise issues they think are important to the relevant people.

Transparency

A wilder move is going more Bridgewater (an investment firm famous for lots of feedback and explicit ratings of people’s abilities[6] [EA(p) · GW(p)]) / radical transparency, where it’s just more normal to say what you think about grants, people and organizations without that being a big deal, and without people overupdating on negative things, so that we all have a fuller and more nuanced picture (“collapsing a bunch of the weird social uncertainty that makes things insane”, as Habryka puts it [EA · GW]).

There are many obvious downsides to the strongest version of that: it creates a high barrier for being in the community, it’s emotionally rough on people, and people who would otherwise do good things aren’t interested in being rated publicly or consenting to “apply for an Open Phil grant” = “they’ll publicly say that you don’t seem that competent but your idea is good so they’re funding you”. It’s an actively bad idea if people are going to overupdate such that it’s better to just not try things or have less experience so that you don’t have negative things said about you. And anything done in public is going to be seeable by people outside the community with different norms, so there’s not a lot of control over that. It also doesn’t work if the incentives are off - i.e. if people are overly punished for giving feedback or under-incentivized to be accurate.

But there's a lot that interests me about it, and it might be a good idea directionally, on the margin or in non-public settings. I’m for instance interested in the people I work with having a decent model of my strengths and weaknesses, and I’m really impressed at people more transparent about those things than I am. It’s scary but useful to be able to talk openly about those things.

Where we’re at

I and others on the Community Health team have done some thinking [EA(p) · GW(p)] about collecting fuzzier information and character evidence and acting on it appropriately. It helps to be able to ask people to call to mind concerns they have and flag them, or to be more on the lookout for serious concerns, even if it’s mostly in vibe. But overall, at the moment, it seems some combination of very time-consuming, emotionally costly, and just overall a hard problem to solve (while for instance not overupdating or overreacting or having your time misallocated to every concern you could hear). This is very much not a solved problem. If people have thoughts about it, I’m interested to hear!

3 comments

Comments sorted by top scores.

comment by romeostevensit · 2023-06-25T00:33:38.253Z · LW(p) · GW(p)

I think the issue with gossip is that it easily slippery slopes on selection effects for 'juice', that is to say, in the game of telephone, the errors that get magnified at each step are non-random but selected for licentiousness.

comment by Dagon · 2023-06-24T21:03:38.553Z · LW(p) · GW(p)

I suspect there's a too-high prior for trust here.  Before you seriously consider joining a group, it's worth looking for evidence that it's well above the median in terms of respect and mental health of members.