What Money Cannot Buy

post by johnswentworth · 2020-02-01T20:11:05.090Z · LW · GW · 49 comments

Paul Graham:

The problem is, if you're not a hacker, you can't tell who the good hackers are. A similar problem explains why American cars are so ugly. I call it the design paradox. You might think that you could make your products beautiful just by hiring a great designer to design them. But if you yourself don't have good taste, how are you going to recognize a good designer? By definition you can't tell from his portfolio. And you can't go by the awards he's won or the jobs he's had, because in design, as in most fields, those tend to be driven by fashion and schmoozing, with actual ability a distant third. There's no way around it: you can't manage a process intended to produce beautiful things without knowing what beautiful is. American cars are ugly because American car companies are run by people with bad taste.

I don’t know how much I believe this claim about cars, but I certainly believe it about software. A startup without a technical cofounder will usually produce bad software, because someone without software engineering skills does not know how to recognize such skills in someone else. The world is full of bad-to-mediocre “software engineers” who do not produce good software. If you don’t already know a fair bit about software engineering, you will not be able to distinguish them from the people who really know what they’re doing.

Same with user interface design. I’ve worked with a CEO who was good at UI; both the process and the results were visibly superior to others I’ve worked with. But if you don’t already know what good UI design looks like, you’d have no idea - good design is largely invisible.

Yudkowsky makes the case that the same applies to security: you can’t build a secure product with novel requirements without having a security expert as a founder. The world is full of “security experts” who do not, in fact, produce secure systems - I’ve met such people. (I believe they mostly make money by helping companies visibly pretend to have made a real effort at security, which is useful in the event of a lawsuit.) If you don’t already know a fair bit about security, you will not be able to distinguish such people from the people who really know what they’re doing.

But to really drive home the point, we need to go back to 1774.

As the American Revolution was heating up, a wave of smallpox was raging on the other side of the Atlantic. An English dairy farmer named Benjamin Jesty was concerned for his wife and children. He was not concerned for himself, though - he had previously contracted cowpox. Cowpox was contracted by milking infected cows, and was well known among dairy farmers to convey immunity against smallpox.

Unfortunately, neither Jesty’s wife nor his two children had any such advantage. When smallpox began to pop up in Dorset, Jesty decided to take drastic action. He took his family to a nearby farm with a cowpox-infected cow, scratched their arms, and wiped pus from the infected cow on the scratches. Over the next few days, their arms grew somewhat inflamed and they suffered the mild symptoms of cowpox - but it quickly passed. As the wave of smallpox passed through the town, none of the three were infected. Throughout the rest of their lives, through multiple waves of smallpox, they were immune.

The same technique would be popularized twenty years later by Edward Jenner, marking the first vaccine and the beginning of modern medicine.

The same wave of smallpox which ran across England in 1774 also made its way across Europe. In May, it reached Louis XV, King of France. Despite the wealth of a major government and the talents of Europe’s most respected doctors, Louis XV died of smallpox on May 10, 1774.

The point: there is knowledge for which money cannot substitute. Even if Louis XV had offered a large monetary bounty for ways to immunize himself against the pox, he would have had no way to distinguish Benjamin Jesty from the endless crowd of snake-oil sellers and faith healers and humoral balancers. Indeed, top medical “experts” of the time would likely have warned him away from Jesty.

The general pattern: in field after field, the world is full of mediocre “experts”, and distinguishing them from the real thing requires already knowing a fair bit about the field. Money alone cannot buy expertise you don’t know how to recognize.

Now, presumably we can get around this problem by investing the time and effort to become an expert, right? Nope! Where there are snake-oil salesmen, there will also be people offering to teach their secret snake-oil recipe, so that you too can become a master snake-oil maker.

So… what can we do?

The cheapest first step is to do some basic reading on a few different viewpoints and think things through for yourself. Simply reading the “correct horse battery staple” xkcd will be sufficient to recognize a surprising number of really bad “security experts”. It probably won’t get you to a level where you can distinguish the best from the middling - I don’t think I can currently distinguish the best from the middling security experts. But it’s a start.

More generally: it’s often easier to tell which of multiple supposed experts is correct, than to figure everything out from first principles yourself. Besides looking at the object-level product, this often involves looking at incentives in the broader system - see e.g. Inadequate Equilibria. Two specific incentive-based heuristics:

That said, remember the main message: there is no full substitute for being an expert yourself. Heuristics about incentives can help, but they’re leaky filters at best.

Which brings us to the ultimate solution: try it yourself. Spend time in the field, practicing the relevant skills first-hand; see both what works and what makes sense. Collect data; run trials. See what other people suggest and test those things yourself. Directly study which things actually produce good results.

49 comments

Comments sorted by top scores.

comment by Sniffnoy · 2020-02-01T22:16:28.738Z · LW(p) · GW(p)

Given a bunch of people who disagree, some of whom are actual experts and some of whom are selling snake oil, and without expertise yourself, there are some further quick-and-dirty heuristics you can use to tell which of the two groups is which. I think basically my suggestion can be best summarized as "look at argument structure".

The real experts will likely spend a bunch of time correcting popular misconceptions, which the fakers may subscribe to. By contrast, the fakers will generally not bother "correcting" the truth to their fakery, because why would they? They're trying to sell to unreflective people who just believe the obvious-seeming thing; someone who actually bothered to read corrections to misconceptions at any point is likely too savvy to be their target audience.

Sometimes though you do get actual arguments. Fortunately, it's easier to evaluate arguments than to determine truth oneself -- of course, this is only any good if at least one of the parties is right! If everyone is wrong, heuristics like this will likely be no help. But in an experts-and-fakers situation, where one of the groups is right and the other pretty definitely wrong, you can often just use heuristics like "which side has arguments (that make some degree of sense) that the other side has no answer to (that makes any sense)?". If we grant the assumption that one of the two sides is right, then it's likely to be that one.

When you actually have a lot of back-and-forth arguing -- as you might get in politics, or, as you might get in disputes between actual experts -- the usefulness of this sort of thing can drop quickly, but if you're just trying to sort out fakers from those with actual knowledge, I think it can work pretty well. (Although honestly, in a dispute between experts, I think the "left a key argument unanswered" is still a pretty big red flag.)

Replies from: gworley, jmh, None, Vaniver
comment by Gordon Seidoh Worley (gworley) · 2020-02-03T20:08:47.597Z · LW(p) · GW(p)
The real experts will likely spend a bunch of time correcting popular misconceptions, which the fakers may subscribe to. By contrast, the fakers will generally not bother "correcting" the truth to their fakery, because why would they? They're trying to sell to unreflective people who just believe the obvious-seeming thing; someone who actually bothered to read corrections to misconceptions at any point is likely too savvy to be their target audience.

Using this as a heuristic would often backfire on you as stated, because there's a certain class of snake oil salesmen who use the conceit of correcting popular misconceptions to sell you on their own, unpopular misconceptions (and of course the product that fits them!). To me it looks like it's exploiting the same kind of psychological mechanism that powers conspiracy theories, where the world is seen as full of hidden knowledge that "they" don't want you to know because the misinformation is letting "them" get rich or whatever. And I think part of the reason this works is that it pattern matches to cases where it turned out someone who thought everyone else was wrong really was right, even if they are rare.

In short, you are more likely to be encountering a snake oil salesman than a Galileo or a Copernicus or a Darwin, so spending a lot of time "correcting" popular misconceptions is probably not a reliable signal of real competence and not fakery.

Replies from: Sniffnoy
comment by Sniffnoy · 2020-02-03T20:55:24.801Z · LW(p) · GW(p)

This is a good point (the redemption movement comes to mind as an example), but I think the cases I'm thinking of and the cases you're describing look quite different in other details. Like, the bored/annoyed expert tired of having to correct basic mistakes, vs. the salesman who wants to initiate you into a new, exciting secret. But yeah, this is only a quick-and-dirty heuristic, and even then only good for distinguishing snake oil; it might not be a good idea to put too much weight on it, and it definitely won't help you in a real dispute ("Wait, both sides are annoyed that the other is getting basic points wrong!"). As Eliezer put it -- you can't learn physics by studying psychology!

comment by jmh · 2020-02-02T14:52:24.049Z · LW(p) · GW(p)
The real experts will likely spend a bunch of time correcting popular misconceptions, which the fakers may subscribe to. By contrast, the fakers will generally not bother "correcting" the truth to their fakery, because why would they? They're trying to sell to unreflective people who just believe the obvious-seeming thing; someone who actually bothered to read corrections to misconceptions at any point is likely too savvy to be their target audience.

This seems to rely on the fakes knowing they are fakes. I agree that is a problem and that your heuristic is useful, but I think we (non-experts) are still stuck with the problem of separating the real experts from those who mistakenly think they are real experts. The latter will likely attempt to "correct" the true security approach according to their mistaken premises and solutions. We're still stuck with the problem that money doesn't get the non-expert client too far.

Now, your heuristic has clearly improved the ratio of real solutions to snake oil, and so moved the probabilities in your favor when throwing money at the problem - but I'm not sure just how far.

Replies from: ChristianKl
comment by ChristianKl · 2020-02-03T15:33:03.056Z · LW(p) · GW(p)

It seems like "real expert" is being used here in two different senses. In one sense, an expert is someone who has spent their 10,000 hours of deliberate practice and developed strong opinions, which they can articulate, about the right way to do things. That person will likely have convictions about what the public misconceptions happen to be.

In the other sense being an expert is about an ability to produce certain quality outputs.

You can tell whether a person is an expert in the first sense by seeing whether they try to correct your misconceptions and have convictions about the right way to act, or whether they just tell you what's popular to say and what you want to hear.

Replies from: jmh
comment by jmh · 2020-02-03T16:54:57.844Z · LW(p) · GW(p)

I assume that is directed at my comment, though I'm not certain. The point I am making is that even after eliminating those who "just tell you what's popular to say and what you want to hear," you still have the problem of separating the remaining experts who understand the subtleties and details as they apply to your specific needs from those who don't.

The heuristics about how they present their sales pitch are "leaky filters," as the OP notes, and I'm not entirely sure we understand how far they actually move the probabilities of getting the expert rather than the mediocre (someone who knows all the theory and terms, and even has a good idea of how they all relate, but just doesn't actually get the system as a whole - or perhaps is just too lazy to do the work).

For those pushing these specific heuristics, is there any actual data we can look at to see how effective they are?

comment by [deleted] · 2020-05-21T22:40:17.182Z · LW(p) · GW(p)

Here's another: probing into their argument structure a bit and checking if they can keep it from collapsing under its own weight.

https://www.lesswrong.com/posts/wyyfFfaRar2jEdeQK/entangled-truths-contagious-lies [LW · GW]

comment by Vaniver · 2020-02-04T21:09:36.552Z · LW(p) · GW(p)

I think basically my suggestion can be best summarized as "look at argument structure".

And how does one distinguish snake oil salesmen and real experts when it comes to identifying argument structure and what it implies?

Replies from: Sniffnoy
comment by Sniffnoy · 2020-02-05T18:53:19.983Z · LW(p) · GW(p)

What I said above. Sorry, to be clear here, by "argument structure" I don't mean the structure of the individual arguments but rather the overall argument -- what rebuts what.

(Edit: Looks like I misread the parent comment and this fails to respond to it; see below.)

Replies from: Vaniver
comment by Vaniver · 2020-02-05T23:23:43.986Z · LW(p) · GW(p)

To be clear as well, the rhetorical point underneath my question is that I don't think your heuristic is all that useful, and seems grounded in generalization from too few examples without searching for counterexamples. Rather than just attacking it directly like Gordon, I was trying to go up a meta-level, to just point at the difficulty of 'buying' methods of determining expertise, because you need to have expertise in distinguishing the market there.

(In general, when someone identifies a problem and you think you have a solution, it's useful to consider whether your solution suffers from that problem on a different meta-level; sometimes you gain from sweeping the difficulty there, and sometimes you don't.)

Replies from: Sniffnoy, None
comment by Sniffnoy · 2020-02-06T20:24:51.447Z · LW(p) · GW(p)

Oh, I see. I misread your comment then. Yes, I am assuming one already has the ability to discern the structure of an argument and doesn't need to hire someone else to do that...

comment by [deleted] · 2020-05-21T22:36:14.371Z · LW(p) · GW(p)

Probably the skill of discerning skill would be easier to learn than... every single skill you're trying to discern.

comment by cousin_it · 2020-02-02T20:45:57.650Z · LW(p) · GW(p)

This problem crops up in many places.

I think the most promising solution is something like the personalized PageRank algorithm. It's a formalization of the idea "here's some sources I already trust, so let's walk the graph starting from them and find the best sources to answer my questions". It doesn't absolve you from figuring stuff out, but acts as a force multiplier on top of that.
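For concreteness, here's a minimal sketch of personalized PageRank in Python. Everything in it - the toy endorsement graph, the seed set, the damping factor - is an illustrative assumption, not anything from the post:

```python
import numpy as np

def personalized_pagerank(adj, seeds, damping=0.85, iters=100):
    # Power-iterate personalized PageRank: random walks restart at the
    # trusted seed nodes instead of uniformly at random.
    n = adj.shape[0]
    out = adj.sum(axis=0)          # total endorsement weight each node gives out
    out[out == 0] = 1              # avoid division by zero for dangling nodes
    M = adj / out                  # column-stochastic transition matrix
    v = np.zeros(n)
    v[seeds] = 1.0 / len(seeds)    # restart distribution on the trusted seeds
    r = v.copy()
    for _ in range(iters):
        r = damping * (M @ r) + (1 - damping) * v
    return r

# Toy endorsement graph: adj[i, j] = 1 means node j endorses node i.
adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
print(personalized_pagerank(adj, seeds=[0]))  # trust scores relative to node 0
```

The only difference from vanilla PageRank is the restart distribution: instead of teleporting to a uniformly random node, the walk restarts at sources you already trust, so scores measure proximity-in-trust rather than global popularity.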

One important use case is funding of research. Today prestigious journals judge the importance of research by accepting or rejecting papers, and funders make decisions based on that (and citations, but those come later). A system without journals at all, only PageRank-like endorsements between researchers, could be cheaper and just as reliable.

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2020-02-03T00:26:02.388Z · LW(p) · GW(p)

The Web of Trust didn’t work for secure communication, and PageRank didn’t work for search [LW(p) · GW(p)]. What makes you think either or both of these things will work for research funding?

Replies from: cousin_it
comment by cousin_it · 2020-02-03T08:21:24.170Z · LW(p) · GW(p)

Yeah, the reference to web of trust wasn't really important, I've edited it out. As for PageRank, AFAIK it works fine for recommendation systems. You do need another layer to prevent link farms and other abuse, but since research is a gated community with ethical standards, that should be easier than on the web.

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2020-02-03T09:06:37.522Z · LW(p) · GW(p)

I think to properly combat the factors that make PageRank not work, we need to broaden our analysis. Saying it’s “link farms and other abuse” doesn’t quite get to the heart of the matter—what needs to be prevented is adversarial activity, i.e., concerted efforts to exploit (and thus undermine) the system.

Now, you say “research is a gated community with ethical standards”, and that’s… true to some extent, yes… but are you sure it’s true enough, for this purpose? And would it remain true, if such a system were implemented? (Consider, in other words, that switching to a PageRank-esque system for allocating funding would create clear incentives for adversarial action, where currently there are none!)

Replies from: cousin_it, Kaj_Sotala, mr-hire
comment by cousin_it · 2020-02-03T10:40:31.254Z · LW(p) · GW(p)

would create clear incentives for adversarial action, where currently there are none

Well, citation farms already exist, so we know roughly how many people are willing to do stuff like that. I still think the personalized PageRank algorithm (aka PageRank with priors, maybe initialized with a bunch of trustworthy researchers) is a good fit for solving this problem.

Replies from: philh
comment by philh · 2020-02-07T15:26:11.120Z · LW(p) · GW(p)

citation farms already exist, so we know roughly how many people are willing to do stuff like that.

To be precise, we have a lower bound.

comment by Kaj_Sotala · 2020-02-03T11:32:36.232Z · LW(p) · GW(p)

Google Scholar seems to recommend new papers to me based on, I think, works that I have cited in my own previous publications. The recommendations seem about as decent as feels fair to expect from our current level of AI.

comment by Matt Goldenberg (mr-hire) · 2020-02-04T23:06:19.593Z · LW(p) · GW(p)

One of the issues with PageRank is that it produces a universal ranking. If you do something like personalized PageRank, the issues with adversarial activity are much reduced.

comment by Randini · 2020-02-11T22:34:34.209Z · LW(p) · GW(p)

I've seen a lot of "American cars are ugly" sentiments lately, and I think those people may simply be failing to see that US automakers are optimizing within their design constraints.

Consider: Maximizing fuel economy while maintaining size/power means aggressively chasing aerodynamics, but there are hard limits on possible shapes when you must also preserve passenger comfort, visibility, and storage space. Collision safety itself is a major driver towards "boxification."

Think of a beautiful car, and it's probably deeply deficient in one or more of those criteria.

comment by adamShimi · 2020-02-07T10:48:04.501Z · LW(p) · GW(p)

This is interesting, because in theoretical computer science, at least, verifying something is conjectured to be easier than creating it: take the P vs NP question, where almost all complexity theorists conjecture that P is not equal to NP. That is to say, some problems in NP (problems for which we can verify a solution in polynomial time) are conjectured not to be in P (problems for which we can find a solution in polynomial time).
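A toy illustration of that asymmetry, using subset-sum with made-up numbers: checking a claimed solution takes a single pass, while finding one from scratch may require searching exponentially many subsets.

```python
from itertools import combinations

def verify(nums, target, certificate):
    # Checking a claimed solution is fast: one pass over the certificate.
    return all(c in nums for c in certificate) and sum(certificate) == target

def find(nums, target):
    # Finding a solution from scratch: worst case, try all 2^n subsets.
    for r in range(len(nums) + 1):
        for subset in combinations(nums, r):
            if sum(subset) == target:
                return list(subset)
    return None

nums = [3, 34, 4, 12, 5, 2]
print(find(nums, 9))            # exponential-time search: [4, 5]
print(verify(nums, 9, [4, 5]))  # linear-time check: True
```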

On the other hand, your examples hint at cases where verifying something (the quality of the product for example) is almost as hard as creating this thing (building a quality product).

Not sure if this adds anything to the conversation, but I found the connection surprising.

Replies from: johnswentworth
comment by johnswentworth · 2020-02-07T16:45:40.263Z · LW(p) · GW(p)

In CS, there are some problems whose answer is easier to verify than to create. The same is certainly true in the world in general - there are many objectives whose completion we can easily verify, and those are well-suited to outsourcing. But even in CS, there are also (believed to be) problems whose answer is hard to verify.

But the answer being hard to verify is different from a proof being hard to verify - perhaps the right analogy is not NP, but IP or some variant thereof.

This line of reasoning does suggest some interesting real-world strategies - in particular, we know that MIP = NEXPTIME, so quizzing multiple alleged experts in parallel (without allowing them to coordinate answers) could be useful. Although that's still not quite analogous, since IP and MIP aren't about distinguishing real from fake experts - just true from false claims.

Replies from: adamShimi
comment by adamShimi · 2020-02-08T16:07:38.745Z · LW(p) · GW(p)

The existence of problems whose answers are hard to verify does not entail that this verification is harder than finding the answer itself. Do you have examples of the latter case? Intuitively, it seems akin to comparing any (deterministic) complexity class with its non-deterministic version: any problem solvable in the former is verifiable in the latter, by dropping the proof and just solving the problem.

For the difference between verifying a proof and an answer, I agree that interactive protocols are more appropriate for the discussion we're having. Even if interactive protocols are not about distinguishing between different experts, they might serve this point indirectly by verifying the beauty of a car design or the security of a system. That is, we could (in theory) use interactive proofs to get convinced with good probability of the quality of a candidate-expert's output.

Replies from: johnswentworth
comment by johnswentworth · 2020-02-08T17:21:04.244Z · LW(p) · GW(p)
The existence of problems whose answers are hard to verify does not entail that this verification is harder than finding the answer itself.

That's not quite the relevant question. The point of hiring an expert is that it's easier to outsource the answer-finding to the expert than to do it oneself; the relevant question is whether there are problems for which verification is not any easier than finding the answer. That's what I mean by "hard to verify" - questions for which we can't verify the answer any faster than we can find the answer.

I thought some more about the IP analogy yesterday. In many cases, the analogy just doesn't work - verifying claims about the real world (i.e. "I've never heard of a milkmaid who had cowpox later getting smallpox") or about human aesthetic tastes (i.e. "this car is ugly") is fundamentally different from verifying a computation; we can verify a computation without needing to go look at anything in the physical world. It does seem like there are probably use-cases for which the analogy works well enough to plausibly adopt IP-reduction algorithms to real-world expert-verification, but I do not currently have a clear example of such a use-case.

Replies from: adamShimi, Ericf
comment by adamShimi · 2020-02-09T15:22:26.903Z · LW(p) · GW(p)

Okay, so we agree that it's improbable (at least for decision problems) for verifying an answer to be harder than finding it. What you care about are cases where verification is strictly easier, as is conjectured for example for NP (where verification is polynomial, but finding an answer is believed not to be).

For IP, if we only want to verify a real-world property, I actually have a simple example I give in my intro complexity theory lectures. Imagine that you are color-blind (precisely: a specific red and a specific green look exactly the same to you). If I have two balls, perfectly similar except that one is green and the other is red, I can convince you that these balls are of different colors. It is basically the interactive protocol for graph non-isomorphism: you flip a coin and, depending on the result, exchange the balls without me seeing it. If I can tell whether you exchanged the balls over a sufficient number of rounds, then you should be convinced that I can actually tell them apart.
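A quick toy simulation of that protocol (the round count and coin flips are illustrative): a prover who can really see the colors answers every round correctly, while a color-blind faker survives each round only by luck.

```python
import random

def run_protocol(prover_can_distinguish, rounds=20):
    # The verifier secretly swaps (or not) two otherwise-identical balls;
    # the prover must say whether a swap happened.
    for _ in range(rounds):
        swapped = random.random() < 0.5          # verifier's secret coin flip
        if prover_can_distinguish:
            answer = swapped                     # genuine prover tracks the colors
        else:
            answer = random.random() < 0.5       # color-blind faker must guess
        if answer != swapped:
            return False                         # one wrong answer exposes the faker
    return True                                  # faker survives with prob 2^-rounds

print(run_protocol(True))   # real expert: always passes
print(run_protocol(False))  # faker: passes with probability ~1e-6
```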

Of course this is not necessarily applicable to questions like tastes. Moreover, it is a protocol for showing that I can distinguish between the balls; it does not show why.

Replies from: johnswentworth
comment by johnswentworth · 2020-02-09T17:28:46.374Z · LW(p) · GW(p)

That is an awesome example, thank you!

It does still require some manipulation ability - we have to be able to experimentally intervene (at reasonable expense). That doesn't open up all possibilities, but it's at least a very large space. I'll have to chew on it some more.

comment by Ericf · 2020-02-13T03:54:57.745Z · LW(p) · GW(p)

It can be very easy to come up with answers; the hard part is often verifying. Example 1: Fermat's Last Theorem - one man spent a short time stating the claim, and hundreds spent decades verifying it. Example 2: solve for x: floor(37/4.657 + 3^9) = x. Proposed solution (this took me 4 seconds to write down): x = 1,456,299. It will probably take you longer than 4 seconds to verify or disprove my solution.

comment by Vaniver · 2021-12-23T16:02:53.423Z · LW(p) · GW(p)

I think this post labels an important facet of the world, and skillfully paints it with examples without growing overlong. I liked it, and think it would make a good addition to the book.

There's a thing I find sort of fascinating about it from an evaluative perspective, which is that... it really doesn't stand on its own, and can't, as it's grounded in the external world, in webs of deference and trust. Paul Graham makes a claim about taste; do you trust Paul Graham's taste enough to believe it? It's a post about expertise that warns about snake oil salesmen, while possibly being snake oil itself. How can you check? "there is no full substitute for being an expert yourself."

And so in a way it seems like the whole rationalist culture, rendered in miniature: money is less powerful than science, and the true science is found in carefully considered personal experience and the whispers of truth around the internet, more than the halls of academia.

comment by Hooman Habibi (hooman-habibi) · 2020-02-05T21:19:16.466Z · LW(p) · GW(p)

I'll try to give an isolated example to show how difficult the problem of "knowing" and "persuasion" is in the personal domain.

Let us look at this question:

Is living next to power lines dangerous?

It is scientifically established that non-ionizing electromagnetic radiation is not carcinogenic (it does not lead to cancer or other disease) - for example, the electromagnetic radiation from power lines and cell phones.

There was research decades ago that linked living near power lines to blood cancer. This research, however, was later shown to be invalid.

I am an electrical engineer by education and my wife is a physics graduate, so both of us should be able to follow the reasoning fairly well, as long as it concerns electromagnetics and not biology.

My wife, however, is opposed to buying a house within visible distance of high-voltage power lines, and whatever I did, I could not persuade her that this is harmless. She is not alone; I have heard the same from many highly educated engineers around me. They just prefer to stay on the safe side, which might be the wrong call, and they always combine it with the argument that 'there are things that we don't know'. This is even reflected in the market price of houses near power lines.

Now, how can you prove that living near power lines is safe and, more importantly, persuade someone else? Can you run your own tests? Can you follow the health of people living near power lines? Unless that is your full-time job, this would be impossible.

When I google the question I get a boxed search result from a "snake oil seller" feeding on the fears of people:

Hundreds of studies worldwide have shown that living next to high voltage power lines and other parts of the power transmission network increases your risk of cancer and other health problems. The closer you are the more you are bombarded with dangerous EMFs www.safespaceprotection.com › emf-health-risks › emf-health-effects

So much praise should go to the power of PageRank and other algorithms in Google search for bringing this up. I am certain that the majority of people won't go further than the boxed results.

Now, this seems like a trivial and not so important example. But we are just following the same line of reasoning for many more decisions in everyday life.

Replies from: jmh, habryka4
comment by jmh · 2020-02-08T21:44:00.013Z · LW(p) · GW(p)

Well, doesn't that create an opportunity for those who are confident that there is no risk? Buy the house cheap, and likely collect a windfall profit in later years as more people come to accept that there is no risk - meanwhile investing the price differential, enabling an earlier or better-resourced retirement.

(I understand that doesn't solve the problem with your wife but....)

comment by habryka (habryka4) · 2020-02-05T21:19:06.033Z · LW(p) · GW(p)

Edit note: Cleaned up your formatting a bunch.

comment by Matt Goldenberg (mr-hire) · 2020-02-05T02:23:32.997Z · LW(p) · GW(p)

Thanks for writing this. I think a lot of what's awful about the world is related to this issue of common knowledge about who is good at what, and who would do well where. I think solving this problem is high priority and spent a few years working on it before my startup failed.

I think any solution not only has to solve that problem, but also has to deal with the fact that anything which can tell you who is actually competent must contend with existing power structures that don't want it known who is competent.

comment by Ben Pace (Benito) · 2020-02-03T23:52:58.235Z · LW(p) · GW(p)

This seems like a very important point to me, I'm glad it's been written down clearly and concretely. Curated.

comment by Vanessa Kosoy (vanessa-kosoy) · 2020-02-02T00:59:07.991Z · LW(p) · GW(p)

One method of solving the problem is looking at empirical performance on objective metrics. For example, we can test different healing methods using RCTs and see which actually work. Or, if "beauty" is defined as "something most people consider beautiful", we can compare designers by measuring how their designs are rated by large groups of people. Of course, if such evidence is not already available, then producing it is usually expensive. But, this is something money can buy, at least in principle. Then again, it requires the arbiter to at least understand how empirical evidence works. Louis XV, to eir misfortune, did not know about RCTs.

Replies from: JesperO, Pattern
comment by JesperO · 2022-02-15T01:47:27.560Z · LW(p) · GW(p)

Agree that empirical performance is a very important way to assess experts.

Unfortunately, it can be tricky. In the RCT example, you need expertise to be able to evaluate the RCT: it's not just about knowing that RCTs exist; you'd also need to be able to, e.g., avoid p-hacking, file-drawer effects, and other methodological issues - especially in a high-stakes adversarial landscape like national politics. Joe Biden himself doesn't have enough expertise to assess empirical performance using RCTs, and it's unclear whether even any of his advisors do.

comment by Pattern · 2020-02-04T07:33:22.759Z · LW(p) · GW(p)
Louis XV, to eir misfortune, did not know about RCTs.

But if snake oil was poisonous, and it was known that anyone who ingests it dies immediately afterward, then this information would be of value to Louis XV, in ignoring/banning at least one sort of pseudo-physician.

comment by Benquo · 2023-07-04T14:39:31.152Z · LW(p) · GW(p)

Experts should be able to regularly win bets against nonexperts, and generally-valued, generally-scarce commodities like money should be able to incentivize people to make such bets. If you don't know how to construct an experiment to distinguish experts from nonexperts, you probably do not have a clear idea what it is you want an expert on; and if you don't have any idea what you are trying to buy, it's unclear what it would even mean to intend to buy it.
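As a toy sketch of why repeated scored bets separate experts from guessers (all the numbers here are made up for illustration), consider forecasters scored by a proper scoring rule:

```python
import numpy as np

def brier_score(probs, outcomes):
    # Mean squared error of probabilistic forecasts; lower is better.
    # A proper scoring rule: honest, well-calibrated forecasts win in expectation.
    return float(np.mean((np.asarray(probs) - np.asarray(outcomes, dtype=float)) ** 2))

rng = np.random.default_rng(1)
true_p = rng.uniform(0, 1, 200)                     # real base rates of 200 events
outcomes = rng.uniform(0, 1, 200) < true_p          # what actually happened
expert = brier_score(true_p, outcomes)              # knows the base rates
guesser = brier_score(np.full(200, 0.5), outcomes)  # always says 50/50
print(expert, guesser)  # expert reliably scores lower (better)
```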

Abstract example: 16th Century "math duels," in which expert algebraists competed to solve quantitative problems.

Concrete example: LockPickingLawyer, who demonstrates on YouTube how to easily defeat many commercially popular locks.

If I needed to physically secure a closure against potentially expert adversaries, I'd defer to LockPickingLawyer, and if I had the need and budget to do so at scale, I'd try to hire them. Obviously I'd be vulnerable to principal-agent problems, and if the marginal value of resolving those were good enough I'd look for an expert on that as well. However, it would be cheaper and probably almost as effective to simply avoid trying to squeeze LockPickingLawyer financially, and instead pay something they're happy with.

Replies from: johnswentworth
comment by johnswentworth · 2023-07-05T15:51:43.095Z · LW(p) · GW(p)

I expect such a policy to produce the opposite of good results. By default, it will Goodhart on legible metrics rather than whatever's actually valuable. For instance, in the case of physical security, a lockpicker can very legibly demonstrate how easy it is to beat commercial locks - but this is potentially a distraction, if locks aren't the main weak point of one's physical security. Similarly with math duels: they're very externally legible, but they're potentially a distraction from more important skills like e.g. the ability to find high-value problems to work on.

Replies from: Benquo
comment by Benquo · 2023-07-07T14:31:37.993Z · LW(p) · GW(p)

I agree that streetlamp effects are a problem.  I think you are imagining a different usage than I am.  I was imagining deferring to LockPickingLawyer about locks so that I could only spend about 5 minutes on that part of the problem, and spend whatever time I saved on other problems, including other aspects of securing an enclosure.  If I had $100M and didn't have friends I already trusted to do this sort of thing, offering them $20k to red-team a building might be worth it if I were worried about perimeter security; the same mindset that notices you can defeat some locks by banging them hard seems like it would have a good chance at noticing other simple ways to defeat barriers e.g. "this window can be opened trivially from the outside without triggering any alarms".

Holding math duels to the standard of finding high-value problems to work on just seems nuts; I meant them as an existence proof of ways to measure possession of highly abstract theoretical secrets.  If someone wanted to learn deep math, and barely knew what math was, they could do a lot worse than hiring someone who won a lot of math duels (or ideally whoever taught that person) as their first tutor, and then using that knowledge to find someone better.  If you wanted to subsidize work on deep math, you might do well to ask a tournament-winner (or Fields medalist) whose non-tournament work they respect.

I went through a similar process in learning how to use my own body better: qigong seemed in principle worth learning, but I didn't have a good way to distinguish real knowledge (if any) from bullshit.  When I read Joshua Waitzkin's The Art of Learning, I discovered that the closely related "Martial" Art of Tai Chi had a tournament that revealed relative levels of skill - and that William C. C. Chen, the man who'd taught tournament-winner Waitzkin, was still teaching in NYC.  So I started learning from him, and very slowly improved my posture and balance. Eventually one of the other students invited me to a group that practiced on Sunday mornings in Chinatown's Columbus Park, and when I went, I had just enough skill to immediately recognize the man teaching there as someone who had deep knowledge and was very good at teaching it, and I started learning much faster, in ways that generalized much better to other domains of life.  This isn't the only search method I used - recommendations from high-discernment friends also led to people who were able to help me - but it's one that's relatively easy to reproduce.

comment by RedMan · 2020-02-03T18:39:29.273Z · LW(p) · GW(p)

Anonymity helps.

By just being known as rich or occupying a prominent position, you will always attract people who want a piece, and will try to figure out what it is that you need in a friend or subcontractor and attempt to provide it, often extremely successfully. I mean, as Eliezer has said (paraphrasing, hopefully faithfully), the kinds of people you find at 'high status' conventions are just a better class of people than the hoi polloi.

With a degree of anonymity, it becomes somewhat straightforward to search for things like the farmer's cowpox cure, because professional purveyors of things to the wealthy do not waste their time crafting pitches for nobodies.

But then, you also have the separate problem as a nobody that 'somebodies' do not return your calls.

comment by CTVKenney · 2021-02-21T14:29:34.825Z · LW(p) · GW(p)

Money should be able to guarantee that, over several periods of play, you perform not-too-much-worse than an actual expert. Here is a paper about an idealized CS version of this problem: https://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15859-f11/www/notes/lecture16.pdf
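The core idea in that literature is the weighted majority / multiplicative weights algorithm. Here's a minimal sketch (the toy expert pool and learning rate are illustrative assumptions): follow a weighted vote of self-proclaimed experts and shrink your trust in whoever errs, and your total mistakes are provably not much worse than the single best expert's.

```python
import numpy as np

def weighted_majority(predictions, outcomes, eta=0.5):
    # predictions[t, i]: expert i's yes/no call in round t.
    w = np.ones(predictions.shape[1])
    mistakes = 0
    for preds, outcome in zip(predictions, outcomes):
        guess = np.dot(w, preds) >= w.sum() / 2         # weighted vote
        mistakes += int(guess != outcome)
        w *= np.where(preds != outcome, 1 - eta, 1.0)   # penalize wrong experts
    return mistakes, w

# Toy run: expert 0 is always right; experts 1 and 2 guess at random.
rng = np.random.default_rng(0)
outcomes = rng.integers(0, 2, size=100).astype(bool)
predictions = np.stack([outcomes,
                        rng.integers(0, 2, 100).astype(bool),
                        rng.integers(0, 2, 100).astype(bool)], axis=1)
mistakes, weights = weighted_majority(predictions, outcomes)
print(mistakes, weights.round(3))  # few mistakes; weight concentrates on expert 0
```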

Replies from: johnswentworth
comment by johnswentworth · 2021-02-21T17:11:36.577Z · LW(p) · GW(p)

Cool piece!

I don't think it's particularly relevant to the problems this post is talking about, since things like "how do we evaluate success?" or "what questions should we even be asking?" are core to the problem; we usually don't have lots of feedback cycles with clear, easy-to-evaluate outcomes. (The cases where we do have lots of feedback cycles with clear, easy-to-evaluate outcomes tend to be the "easy cases" for expert evaluation, and those methods you linked are great examples of how to handle the problem in those cases.)

Drawing from some of the examples:

  • Evaluating software engineers is hard because, unless you're already an expert, you can't just look at the code or the product. The things which separate the good from the bad mostly involve long-term costs of maintenance and extensibility.
  • Evaluating product designers is hard because, unless you're already an expert, you won't consciously notice the things which matter most in a design. You'd need to e.g. a/b test designs on a fairly large user base, and even then you need to be careful about asking the right questions to avoid Goodharting.
  • In the smallpox case, the invention of clinical trials was exactly what gave us lots of clear, easy-to-evaluate feedback on whether things work. Louis XV only got one shot, and he didn't have data on hand from prior tests.
comment by leggi · 2020-02-05T04:06:03.805Z · LW(p) · GW(p)
some basic reading on a few different viewpoints and think things through for yourself
try it yourself. Spend time in the field, practicing the relevant skills first-hand; see both what works and what makes sense. Collect data; run trials. See what other people suggest and test those things yourself. Directly study which things actually produce good results.

Excellent advice. A bit of research, some thought, get some experience, assess results.

A little surprised that it's not standard practice, so it's good you've written this post.


Indeed, top medical “experts” of the time would likely have warned him away from Jesty.

Speculation? I am being picky, though; it's a well-written post, but I can imagine other scenarios.


there is knowledge for which money cannot substitute

I've just made this post [LW · GW] to help share some important knowledge. Try it for yourself.

comment by adamShimi · 2020-02-07T10:53:39.032Z · LW(p) · GW(p)

Also, I am pretty sure that the xkcd example is wrong. Mathematically, the entropy of the second password should be lower, because we can guess the next letters of the words from dictionary analysis, or even from the frequencies of next letters in a language like English. And practically, dictionary attacks are pretty much built for breaking passwords like the latter.

The standard for root passwords in my local community is more along the lines of finding a very long sentence and taking letters from each word (the first letter in the easiest scheme, but it can get harder) to build a long password that is both hard to guess and relatively easy to remember.

Replies from: gjm
comment by gjm · 2020-02-07T13:37:43.184Z · LW(p) · GW(p)

I'm pretty sure you're wrong about the xkcd example.

He doesn't just look at the number of characters in the four words. He reckons 11 bits of entropy per word and doesn't operate at the letter level at all. If those words were picked at random from a list of ~2000 words then the entropy estimate is correct.
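For concreteness, the arithmetic behind the 11-bits-per-word figure (assuming a 2048-word list, an illustrative round number):

```python
import math

words_in_list = 2048                       # ~2^11 common words
bits_per_word = math.log2(words_in_list)   # 11.0 bits of entropy per word
print(4 * bits_per_word)                   # 44.0 bits for four random words
```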

I don't know where he actually got those words from. Maybe he just pulled them out of his head, in which case the effective entropy might be higher or lower. To get a bit of a handle on this, I found something on the internet that claims to be a list of the 2000ish most common English words (the actual figure is 2265, as it happens) and

  • checked whether the xkcd words are in the list ("correct" and "horse" are, "battery" and "staple" aren't)
  • generated some quadruples of random words from the list to see whether they feel stranger than the xkcd set (which, if true, would suggest that maybe he picked his by a process with less real entropy than picking words at random from a set of 2000). I got: result lie variety work; fail previously anything weakness; experienced understand relative efficiency; ear recognize list shower; classroom inflation space refrigerator. These feel to me about as strange as the xkcd set.

So I'm pretty willing to believe that the xkcd words really do have ~11 independent bits of entropy each.

In your local community's procedure, I worry about "finding a very long sentence". Making it up, or finding it somewhere else? The total number of sentences in already-existing English text is probably quite a lot less than 2^44, and I bet there isn't that much entropy in your choice of which letter to pick from each word.

Replies from: adamShimi
comment by adamShimi · 2020-02-07T14:07:05.164Z · LW(p) · GW(p)

You're right. I was thinking on the level of letters, but the fact that he gives the same number of bits of entropy to four quite different words should have alerted me. And with around 2000 common words to choose from, the entropy is indeed around 11 bits per word.

Thanks for the correction!

(For our local passwords, the sentences tend to be created from scratch, to avoid basic dictionary attacks, and they tend to be complex and full of puns. But you might be right about the entropy loss in this case.)

comment by M. Y. Zuo · 2023-05-15T16:31:08.405Z · LW(p) · GW(p)

Deriving from first principles is always correct but much harder. 

But theoretically it's possible for someone to be an expert in nothing except the fundamental principles - electromagnetism, the weak and strong nuclear forces, gravity, etc. - and derive everything in logical sequence from vibrating electrons on up.

Though this definitely would require a very unusual amount of competence. 

For a real-world complex software program like Excel, figuring this out would be worthy of a Nobel Prize.