Posts

Overcoming Algorithm Aversion: People Will Use Imperfect Algorithms If They Can (Even Slightly) Modify Them 2017-05-22T18:31:44.750Z
Algorithmic tacit collusion 2017-05-07T14:57:46.639Z
Stuart Ritchie reviews Keith Stanovich's book "The rationality quotient: Toward a test of rational thinking" 2017-01-11T11:51:53.972Z
Social effects of algorithms that accurately identify human behaviour and traits 2016-05-14T10:48:27.159Z
Hedge drift and advanced motte-and-bailey 2016-05-01T14:45:08.023Z
Sleepwalk bias, self-defeating predictions and existential risk 2016-04-22T18:31:32.480Z
Identifying bias. A Bayesian analysis of suspicious agreement between beliefs and values. 2016-01-31T11:29:05.276Z
Does the Internet lead to good ideas spreading quicker? 2015-10-28T22:30:50.026Z
ClearerThinking's Fact-Checking 2.0 2015-10-22T21:16:58.544Z
[Link] Tetlock on the power of precise predictions to counter political polarization 2015-10-04T15:19:32.558Z
Matching donation funds and the problem of illusory matching 2015-09-18T20:05:46.098Z
Political Debiasing and the Political Bias Test 2015-09-11T19:04:49.452Z
Pro-Con-lists of arguments and onesidedness points 2015-08-21T14:15:36.306Z
Opinion piece on the Swedish Network for Evidence-Based Policy 2015-06-09T21:13:09.327Z
Could auto-generated troll scores reduce Twitter and Facebook harassments? 2015-04-30T14:05:45.848Z
[Link] Algorithm aversion 2015-02-27T19:26:43.647Z
The Argument from Crisis and Pessimism Bias 2014-11-11T20:25:44.734Z
Reverse engineering of belief structures 2014-08-26T18:00:31.094Z
Three methods of attaining change 2014-08-16T15:38:45.743Z
Multiple Factor Explanations Should Not Appear One-Sided 2014-08-07T14:10:00.504Z
Separating university education from grading 2014-07-03T17:23:57.027Z
The End of Bullshit at the hands of Critical Rationalism 2014-06-04T18:44:29.801Z
Book review: The Reputation Society. Part II 2014-05-14T10:16:34.380Z
Book Review: The Reputation Society. Part I 2014-05-14T10:13:19.826Z
The Rationality Wars 2014-02-27T17:08:45.470Z
Private currency to generate funds for effective altruism 2014-02-14T00:00:05.931Z
Productivity as a function of ability in theoretical fields 2014-01-26T13:16:15.873Z
Do we underuse the genetic heuristic? 2014-01-22T17:37:26.608Z
Division of cognitive labour in accordance with researchers' ability 2014-01-16T09:28:02.920Z

Comments

Comment by Stefan_Schubert on The "public debate" about AI is confusing for the general public and for policymakers because it is a three-sided debate · 2023-08-03T09:49:55.789Z · LW · GW

Yeah, I think so. But since those people generally find AI less important (there's both less of an upside and less of a downside), they tend to participate less in the debate. Hence there's a bit of a selection effect hiding those people.

There are some people who arguably are in that corner who do participate in the debate, though - e.g. Robin Hanson. (He thinks some sort of AI will eventually be enormously important, but that the near-term effects, while significant, will not be at the level people on the right side think).

Looking at the 2x2 I posted, I wonder if you could call the lower left corner something relating to "non-existential risks". That seems to capture their views. It might be hard to come up with a catchy term, though.

The upper left corner could maybe be called "sceptics".

Comment by Stefan_Schubert on The "public debate" about AI is confusing for the general public and for policymakers because it is a three-sided debate · 2023-08-02T18:27:51.895Z · LW · GW

Not exactly what you're asking for, but maybe a 2x2 could be food for thought. 

Comment by Stefan_Schubert on The "public debate" about AI is confusing for the general public and for policymakers because it is a three-sided debate · 2023-08-02T06:17:44.206Z · LW · GW

Realist and pragmatist don't seem like the best choices of terms, since they pre-judge the issue a bit in the direction of that view.

Comment by Stefan_Schubert on AI psychology should ground the theories of AI consciousness and inform human-AI ethical interaction design · 2023-01-08T19:03:52.326Z · LW · GW

Thanks.

I think psychologists-scientists should have unusually good imaginations about the potential inner workings of other minds, which many ML engineers probably lack.

That's not clear to me, given that AI systems are so unlike human minds. 

Comment by Stefan_Schubert on AI psychology should ground the theories of AI consciousness and inform human-AI ethical interaction design · 2023-01-08T12:14:16.372Z · LW · GW

tell your fellow psychologist (or zoopsychologist) about this, maybe they will be incentivised to make a switch and do some ground-laying work in the field of AI psychology

Do you believe that (conventional) psychologists would be especially good at what you call AI psychology, and if so, why? I guess other skills (e.g. knowledge of AI systems) could be important.

Comment by Stefan_Schubert on Let’s think about slowing down AI · 2022-12-26T20:36:26.053Z · LW · GW

I think that's exactly right.

Comment by Stefan_Schubert on Let’s think about slowing down AI · 2022-12-26T10:16:40.813Z · LW · GW

I think that could be valuable.

It might be worth testing quite carefully for robustness - to ask multiple different questions probing the same issue, and see whether responses converge. My sense is that people's stated opinions about risks from artificial intelligence, and existential risks more generally, could vary substantially depending on framing. Most haven't thought a lot about these issues, which likely contributes. I think a problem with some studies on these issues is that researchers over-generalise from highly framing-dependent survey responses.

Comment by Stefan_Schubert on The next decades might be wild · 2022-12-22T12:45:28.789Z · LW · GW

I wrote an extended comment in a blog post.

Summary:

Summing up, I disagree with Hobbhahn on three points.

  1. I think the public would be more worried about harm that AI systems cause than he assumes.
  2. I think that economic incentives aren’t quite as powerful as he thinks they are, and I think that governments are relatively stronger than he thinks.
  3. He argues that governments’ response will be very misdirected, and I don’t quite buy his arguments.

Note that 1 and 2/3 seem quite different: 1 is about how much people will worry about AI harms, whereas 2 and 3 are about the relative power of companies/economic incentives and governments, and government competency. It’s notable that Hobbhahn is more pessimistic on both of those relatively independent axes.

Comment by Stefan_Schubert on For every choice of AGI difficulty, conditioning on gradual take-off implies shorter timelines. · 2022-04-22T11:53:26.053Z · LW · GW

Another way to frame this, then, is that "For any choice of AI difficulty, faster pre-takeoff growth rates imply shorter timelines."

I agree. Notably, that sounds more like a conceptual and almost trivial claim.

I think that the original claims sound deeper than they are because they slide between a true but trivial interpretation and a non-trivial interpretation that may not be generally true.

Comment by Stefan_Schubert on For every choice of AGI difficulty, conditioning on gradual take-off implies shorter timelines. · 2022-04-21T11:04:09.383Z · LW · GW

Thanks.

My argument involved scenarios with fast take-off and short time-lines. There is a clarificatory part of the post that discusses the converse case, of a gradual take-off and long time-lines:

Is it inconsistent, then, to think both that take-off will be gradual and timelines will be long? No – people who hold this view probably do so because they think that marginal improvements in AI capabilities are hard. This belief implies both a gradual take-off and long timelines.

Maybe a related clarification could be made about the fast take-off/short time-line combination.

However, this claim also confuses me a bit:

No – people who hold this view probably do so because they think that marginal improvements in AI capabilities are hard. This belief implies both a gradual take-off and long timelines.

The main claim in the post is that gradual take-off implies shorter time-lines. But here the author seems to say that according to the view "that marginal improvements in AI capabilities are hard", gradual take-off and longer timelines correlate. And the author seems to suggest that that's a plausible view (though empirically it may be false). I'm not quite sure how to interpret this combination of claims.

Comment by Stefan_Schubert on For every choice of AGI difficulty, conditioning on gradual take-off implies shorter timelines. · 2022-04-21T09:28:41.118Z · LW · GW

For every choice of AGI difficulty, conditioning on gradual take-off implies shorter timelines.

What would you say about the following argument?

  • Suppose that we get AGI tomorrow because of a fast take-off. If so, timelines will be extremely short.
  • If we instead suppose that take-off will be gradual, then it seems impossible for timelines to be that short.
  • So in this scenario - this choice of AGI difficulty - conditioning on gradual take-off doesn't seem to imply shorter timelines.
  • So that's a counterexample to the claim that for every choice of AGI difficulty, conditioning on gradual take-off implies shorter timelines.

I'm not sure whether it does justice to your reasoning, but if so, I'd be interested to learn where it goes wrong.

Comment by Stefan_Schubert on Beyond fire alarms: freeing the groupstruck · 2021-10-08T17:09:28.202Z · LW · GW
Comment by Stefan_Schubert on Is there a name for the theory that "There will be fast takeoff in real-world capabilities because almost everything is AGI-complete"? · 2021-10-08T03:09:24.471Z · LW · GW

Holden Karnofsky defends this view in his latest blog post.

I think it’s too quick to think of technological unemployment as the next problem we’ll be dealing with, and wilder issues as being much further down the line. By the time (or even before) we have AI that can truly replace every facet of what low-skill humans do, the “wild sci-fi” AI impacts could be the bigger concern.

Comment by Stefan_Schubert on Is there a name for the theory that "There will be fast takeoff in real-world capabilities because almost everything is AGI-complete"? · 2021-09-03T01:35:14.303Z · LW · GW

A related view is that less advanced/more narrow AI will be able to do a fair number of tasks, but not enough to create widespread technological unemployment until very late, when very advanced AI quite quickly causes lots of people to be unemployed.

One consideration is how long it will take for people to actually start using new AI systems (it tends to take some time for new technologies to be widely used). I think that some have speculated that that time lag may be shortened as AI becomes more advanced (as AI becomes involved in the deployment of other AI systems).

Comment by Stefan_Schubert on The Death of Behavioral Economics · 2021-08-31T12:23:07.823Z · LW · GW

Scott Alexander has written an in-depth article about Hreha's article:

The article itself mostly just urges behavioral economists to do better, which is always good advice for everyone. But as usual, it’s the inflammatory title that’s gone viral. I think a strong interpretation of behavioral economics as dead or debunked is unjustified.

See also Alex Imas's and Chris Blattman's criticisms of Hreha (on Twitter).

Comment by Stefan_Schubert on A revolution in philosophy: the rise of conceptual engineering · 2021-08-19T12:29:38.449Z · LW · GW

I think that though there's been a welcome surge of interest in conceptual engineering in recent years, the basic idea has been around for quite some time (though under different names). In particular, Carnap argued as early as the 1940s and 1950s that we should "explicate" rather than "analyse" concepts. In other words, we shouldn't just try to explain the meaning of pre-existing concepts, but should develop new and more useful concepts that partially replace the old ones.

Carnap’s understanding of explication was influenced by Karl Menger’s conception of the methodological role of definitions in mathematics, exemplified by Menger’s own explicative definition of dimension in topology.
...
Explication in Carnap’s sense is the replacement of a somewhat unclear and inexact concept C, the explicandum, by a new, clearer, and more exact concept, the explicatum.

See also Logical Foundations of Probability, pp. 3-20.

Comment by Stefan_Schubert on is scope insensitivity really a brain error? · 2020-09-29T10:52:07.831Z · LW · GW

The link doesn't seem to work.

Comment by Stefan_Schubert on Moral public goods · 2020-09-20T10:26:02.954Z · LW · GW

Potentially relevant new paper:

The logic of universalization guides moral judgment
To explain why an action is wrong, we sometimes say: “What if everybody did that?” In other words, even if a single person’s behavior is harmless, that behavior may be wrong if it would be harmful once universalized. We formalize the process of universalization in a computational model, test its quantitative predictions in studies of human moral judgment, and distinguish it from alternative models. We show that adults spontaneously make moral judgments consistent with the logic of universalization, and report comparable patterns of judgment in children. We conclude that alongside other well-characterized mechanisms of moral judgment, such as outcome-based and rule-based thinking, the logic of universalizing holds an important place in our moral minds.
Comment by Stefan_Schubert on Rationality is about pattern recognition, not reasoning · 2020-06-18T11:15:56.189Z · LW · GW

A new paper may give some support to arguments in this post:

The smart intuitor: Cognitive capacity predicts intuitive rather than deliberate thinking
Cognitive capacity is commonly assumed to predict performance in classic reasoning tasks because people higher in cognitive capacity are believed to be better at deliberately correcting biasing erroneous intuitions. However, recent findings suggest that there can also be a positive correlation between cognitive capacity and correct intuitive thinking. Here we present results from 2 studies that directly contrasted whether cognitive capacity is more predictive of having correct intuitions or successful deliberate correction of an incorrect intuition. We used a two-response paradigm in which people were required to give a fast intuitive response under time pressure and cognitive load and afterwards were given the time to deliberate. We used a direction-of-change analysis to check whether correct responses were generated intuitively or whether they resulted from deliberate correction (i.e., an initial incorrect-to-correct final response change). Results showed that although cognitive capacity was associated with the correction tendency (overall r = .13) it primarily predicted correct intuitive responding (overall r = .42). These findings force us to rethink the nature of sound reasoning and the role of cognitive capacity in reasoning. Rather than being good at deliberately correcting erroneous intuitions, smart reasoners simply seem to have more accurate intuitions.
Comment by Stefan_Schubert on Coronavirus as a test-run for X-risks · 2020-06-14T12:25:07.448Z · LW · GW

An economist friend said in a discussion about sleepwalk bias on 9 March:

In the case of COVID, this led me to think that there will not be that much mortality in most rich countries, but only due to drastic measures.

The rest of the discussion may also be of interest; e.g. note his comment that "in economics, I think we often err on the other side -- people fully incorporate the future in many models."

Comment by Stefan_Schubert on Coronavirus as a test-run for X-risks · 2020-06-14T10:36:48.410Z · LW · GW

I agree people often underestimate policy and behavioural responses to disaster. I called this "sleepwalk bias" - the tacit assumption that people will sleepwalk into disaster to a greater extent than is plausible.

Jon Elster talks about "the younger sibling syndrome":

A French philosopher, Maurice Merleau-Ponty, said that our spontaneous tendency is to view other people as ‘‘younger siblings.’’ We do not easily impute to others the same capacity for deliberation and reflection that introspection tells us that we possess ourselves, nor for that matter our inner turmoil, doubts, and anguishes. The idea of viewing others as being just as strategic and calculating as we are ourselves does not seem to come naturally.
Comment by Stefan_Schubert on The case for C19 being widespread · 2020-03-28T12:00:04.132Z · LW · GW

Thanks, Lukas. I only saw this now. I made a more substantive comment elsewhere in this thread. Lodi is not a village, it's a province with 230K inhabitants, as are Cremona (360K) and Bergamo (1.11M). (Though note that all these names are also names of the central town in these provinces.)

Comment by Stefan_Schubert on The case for C19 being widespread · 2020-03-28T11:57:49.154Z · LW · GW

In the province of Lodi (part of Lombardy), 388 people were reported to have died of Covid-19 by 27 March. Lodi has a population of 230,000, meaning that 0.17% of _the population_ of Lodi has died. Given that hardly everyone has been infected, the IFR must be higher.

The same source reports that in the province of Cremona (also part of Lombardy), 455 people had died of Covid-19 by 27 March. Cremona has a population of 360,000, meaning that 0.126% of the population of Cremona has died, according to official data.
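
To make the arithmetic explicit (a minimal sketch; the point is just that deaths divided by population is a lower bound on the IFR, since not everyone has been infected):

```python
# IFR = deaths / infected, and infected <= population,
# so deaths / population is a lower bound on the IFR.
for province, deaths, population in [("Lodi", 388, 230_000), ("Cremona", 455, 360_000)]:
    share_dead = deaths / population
    print(f"{province}: {share_dead:.3%} of the population has died (lower bound on IFR)")
```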

Note also that there are reports of substantial under-reporting of deaths in the Bergamo province. Some reports estimate that the true death rates in some areas may be as high as 1%. However, those reports are highly uncertain, and they may be outliers.

https://www.facebook.com/stefan.schubert.3954/posts/1369053463295040

Comment by Stefan_Schubert on Rational vs Reasonable · 2020-02-10T13:12:14.352Z · LW · GW

Here is a new empirical paper on folk conceptions of rationality and reasonableness:

Normative theories of judgment either focus on rationality (decontextualized preference maximization) or reasonableness (pragmatic balance of preferences and socially conscious norms). Despite centuries of work on these concepts, a critical question appears overlooked: How do people’s intuitions and behavior align with the concepts of rationality from game theory and reasonableness from legal scholarship? We show that laypeople view rationality as abstract and preference maximizing, simultaneously viewing reasonableness as sensitive to social context, as evidenced in spontaneous descriptions, social perceptions, and linguistic analyses of cultural products (news, soap operas, legal opinions, and Google books). Further, experiments among North Americans and Pakistani bankers, street merchants, and samples engaging in exchange (versus market) economy show that rationality and reasonableness lead people to different conclusions about what constitutes good judgment in Dictator Games, Commons Dilemma, and Prisoner’s Dilemma: Lay rationality is reductionist and instrumental, whereas reasonableness integrates preferences with particulars and moral concerns.
Comment by Stefan_Schubert on Moral public goods · 2020-01-26T23:56:55.461Z · LW · GW

Thanks, this is interesting. I'm trying to understand your ideas. Please let me know if I represent them correctly.

It seems to me that at the start, you're saying:

1. People often have strong selfish preferences and weak altruistic preferences.

2. There are many situations where people could gain more utility through engaging in moral agreements or moral trade - where everyone promises to take some altruistic action conditional on everyone else doing the same. That is because the altruistic utility they gain more than makes up for the selfish utility they lose.

These claims in themselves seem compatible with "altruism being about consequentialism".

To conclude that that's not the case, it seems that one has to add something like the following point. I'm not sure whether that's actually what you mean, but in any case, it seems like a reasonable idea.

3. Fairness considerations loom large in our intuitive moral psychology: we feel very strongly about the principle that everyone should do and have their fair share, hate being suckers, are willing to altruistically punish free-riders, etc.

It's known from dictator game studies, prisoner's dilemma studies, tragedy-of-the-commons studies, and similar research that people have such fairness-oriented dispositions (though there may be disagreements about details). Those dispositions help us solve collective action problems, and make us provide for public goods.

So in those experiments, people aren't always choosing the action that would maximise their selfish interests in a one-off game. Instead they choose, e.g. to punish free-riders, even at a selfish cost.

Similarly, when people are trying to satisfy their altruistic interests (which is what you discuss), they aren't choosing the actions that, at least on the face of it (setting indirect effects of norm-setting, etc, aside), maximally satisfy their altruistic interests. Instead they take considerations of fairness and norms into account - e.g. they may contribute in contexts where others are contributing, but not in contexts where others aren't. In that sense, they aren't (act)-consequentialists, but rather do their fair share of worthwhile projects/slot into norms they find appropriate, etc.

Comment by Stefan_Schubert on Have epistemic conditions always been this bad? · 2020-01-26T14:08:53.659Z · LW · GW

I think this is a kind of question where our intuitions are quite weak and we need empirical studies to know. It is very easy to get annoyed with poor epistemics and to conclude, in exasperation, that things must have got worse. But since people normally don't remember or know well what things were like 30 years ago or so, we can't really trust those conclusions.

One way to test this would be to fact-check and argument-check (cf. https://www.lesswrong.com/posts/k54agm83CLt3Sb85t/clearerthinking-s-fact-checking-2-0 ) opinion pieces and election debates from different eras, and compare their relative quality. That doesn't seem insurmountably difficult. But of course it doesn't capture all aspects of our epistemic culture.

One could also look at features that one may suspect are correlated with poor epistemics, like political polarisation. On that, a recent paper gives evidence that the US has indeed become more polarised, but five out of the other nine studied OECD countries rather had become less polarised.

https://www.brown.edu/Research/Shapiro/pdfs/cross-polar.pdf

Comment by Stefan_Schubert on Robin Hanson on the futurist focus on AI · 2019-11-15T00:26:41.298Z · LW · GW
How about a book that has a whole bunch of other scenarios, one of which is AI risk which takes one chapter out of 20, and 19 other chapters on other scenarios?

It would be interesting if you went into more detail on how long-termists should allocate their resources at some point; what proportion of resources should go into which scenarios, etc. (I know that you've written a bit on such themes.)

Unrelatedly, it would be interesting to see some research on the supposed "crying wolf effect"; maybe with regards to other risks. I'm not sure that effect is as strong as one might think at first glance.

Comment by Stefan_Schubert on Robin Hanson on the futurist focus on AI · 2019-11-14T20:11:05.435Z · LW · GW

Associate professor, not assistant professor.

Comment by Stefan_Schubert on Is there a definitive intro to punishing non-punishers? · 2019-11-01T00:01:18.850Z · LW · GW
One of those concepts is the idea that we evolved to "punish the non-punishers", in order to ensure the costs of social punishment are shared by everyone.

Before thinking of how to present this idea, I would study carefully whether it's true. I understand there is some disagreement regarding the origins of third-party punishment. There is a big literature on this. I won't discuss it in detail, but here are some examples of perspectives which deviate from that taken in the quoted passage.

Joe Henrich writes:

This only makes sense as cultural evolution. Not much third party punishment in many small-scale societies.

So in Henrich's view, we didn't even (biologically) evolve to punish wrong-doers (as third parties), let alone non-punishers. Third-party punishment is a result of cultural, not biological, evolution, in his view.

Another potentially relevant paper, by Tooby, Cosmides, and others:

A common explanation is that third-party punishment exists to maintain a cooperative society. We tested a different explanation: Third-party punishment results from a deterrence psychology for defending personal interests. Because humans evolved in small-scale, face-to-face social worlds, the mind infers that mistreatment of a third party predicts later mistreatment of oneself.

Another paper by Pedersen, Kurzban and McCullough argues that the case for altruistic punishment is overstated.

Here, we searched for evidence of altruistic punishment in an experiment that precluded these artefacts. In so doing, we found that victims of unfairness punished transgressors, whereas witnesses of unfairness did not. Furthermore, witnesses’ emotional reactions to unfairness were characterized by envy of the unfair individual's selfish gains rather than by moralistic anger towards the unfair behaviour. In a second experiment run independently in two separate samples, we found that previous evidence for altruistic punishment plausibly resulted from affective forecasting error—that is, limitations on humans’ abilities to accurately simulate how they would feel in hypothetical situations. Together, these findings suggest that the case for altruistic punishment in humans—a view that has gained increasing attention in the biological and social sciences—has been overstated.
Comment by Stefan_Schubert on How do you assess the quality / reliability of a scientific study? · 2019-10-30T10:17:31.680Z · LW · GW

A recent paper developed a statistical model for predicting whether papers would replicate.

We have derived an automated, data-driven method for predicting replicability of experiments. The method uses machine learning to discover which features of studies predict the strength of actual replications. Even with our fairly small data set, the model can forecast replication results with substantial accuracy — around 70%. Predictive accuracy is sensitive to the variables that are used, in interesting ways. The statistical features (p-value and effect size) of the original experiment are the most predictive. However, the accuracy of the model is also increased by variables such as the nature of the finding (an interaction, compared to a main effect), number of authors, paper length and the lack of performance incentives. All those variables are associated with a reduction in the predicted chance of replicability.
...
The first result is that one variable that is predictive of poor replicability is whether central tests describe interactions between variables or (single-variable) main effects. Only eight of 41 interaction effect studies replicated, while 48 of the 90 other studies did.
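
A minimal sketch, with made-up data and hypothetical feature names (this is not the authors' actual pipeline), of the kind of model described - study-level features used to predict whether a finding replicates:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy data; the paper's features include the original p-value, effect size,
# whether the central test was an interaction, number of authors, paper length, etc.
df = pd.DataFrame({
    "p_value":        [0.001, 0.040, 0.030, 0.002, 0.049, 0.010],
    "effect_size":    [0.60, 0.15, 0.20, 0.55, 0.10, 0.35],
    "is_interaction": [0, 1, 1, 0, 1, 0],
    "n_authors":      [3, 7, 6, 2, 8, 4],
    "replicated":     [1, 0, 0, 1, 0, 1],
})

X, y = df.drop(columns="replicated"), df["replicated"]
model = LogisticRegression(max_iter=1000).fit(X, y)
print(model.predict_proba(X)[:, 1])  # predicted replication probabilities
```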

Another, unrelated, thing is that authors often make inflated interpretations of their studies (in the abstract, the general discussion section, etc). Whereas there is a lot of criticism of p-hacking and other related practices pertaining to the studies themselves, there is less scrutiny of how authors interpret their results (in part that's understandable, since what counts as a dodgy interpretation is more subjective). Hence when you read the methods and results sections it's good to think about whether you'd make the same high-level interpretation of the results as the authors.

Comment by Stefan_Schubert on Two explanations for variation in human abilities · 2019-10-26T00:52:25.215Z · LW · GW

One aspect may be that the issues we discuss and try to solve are often at the limit of human capabilities. Some people are way better at solving them than others, and since those issues are so often in the spotlight, it looks like the less able are totally incompetent. But actually, they're not; it's just that the issues they are able to solve aren't discussed.

Cf. https://www.lesswrong.com/posts/e84qrSoooAHfHXhbi/productivity-as-a-function-of-ability-in-theoretical-fields

Comment by Stefan_Schubert on What Comes After Epistemic Spot Checks? · 2019-10-23T16:55:42.887Z · LW · GW
On first blush this looks like a success story, but it’s not. I was only able to catch the mistake because I had a bunch of background knowledge about the state of the world. If I didn’t already know mid-millenium China was better than Europe at almost everything (and I remember a time when I didn’t), I could easily have drawn the wrong conclusion about that claim. And following a procedure that would catch issues like this every time would take much more time than ESCs currently get.

Re this particular point, I guess one thing you might be able to do is to check arguments, as opposed to statements of facts. Sometimes, one can evaluate whether arguments are valid even when one isn't too knowledgeable about the particular topic. I previously did some work on argument-checking of political debates. (Though the rationale for that wasn't that argument-checking can require less knowledge than fact-checking, but rather that fact-checking of political debates already exists, whereas argument-checking does not).

I never did any systematic epistemic spot checks, but if a book contains a lot of arguments that appear fallacious or sketchy, I usually stop reading it. I guess that's related.

Comment by Stefan_Schubert on Replace judges with Keynesian beauty contests? · 2019-10-08T12:02:56.782Z · LW · GW

Thanks for this. In principle, you could use KBCs for any kind of evaluation, including evaluation of products, texts (essay grading, application letters, life plans, etc), pictures (which of my pictures is the best?), etc. The judicial system is very high-stakes and probably highly resistant to reform, whereas some of the contexts I list are much lower-stakes. It might be better to try out KBCs in such a low-stakes context (I'm not sure which one would be best). I don't know to what extent KBCs have been tested for these kinds of purposes (it was some time since I looked into these issues, and I've forgotten a bit). That would be good to look into.

One possible issue that one would have to overcome is explicit collusion among subsets of raters. Another is, as you say, that people might converge on some salient characteristics that are easily observable but don't track what you're interested in (this could at least in some cases be seen as a form of "tacit collusion").

My impression is that collusion is a serious problem for ratings or recommender systems (which KBCs can be seen as a type of) in general. As a rule of thumb, people might be more inclined to engage in collusion when the stakes are higher.

To prevent that, one option would be to have a small number of known trustworthy experts, who also make evaluations which function as a sort of spot check. Disagreement with those experts could be heavily penalised, especially if there are signs that the disagreement is due to (either tacit or explicit) collusion. But in the end, any anti-collusion measure needs to be tested empirically.

Relatedly, once people have a history of ratings, you may want to give disproportionate weights to those with a strong track record. Such epistocratic systems can be more efficient than democratic systems. See Thirteen Theorems in Search of the Truth.
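
As a rough illustration (a minimal sketch with made-up numbers; "track record" here just stands in for agreement with earlier expert spot checks), track-record weights could enter the aggregation like this:

```python
import numpy as np

# ratings[i, j]: rater i's score for item j
ratings = np.array([
    [4.0, 2.0, 5.0],
    [4.5, 1.5, 4.5],
    [1.0, 5.0, 2.0],  # a rater whose scores diverge sharply (possible collusion)
])

# Hypothetical track-record scores, e.g. past agreement with expert spot checks
track_record = np.array([0.9, 0.8, 0.2])

weights = track_record / track_record.sum()
consensus = weights @ ratings  # weighted average score per item
print(consensus)
```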

KBCs can also be seen as a kind of prediction contest, where you're trying to predict other people's judgements. Hence there might be synergies with other forms of work on predictions.

Comment by Stefan_Schubert on Occam's Razor: In need of sharpening? · 2019-08-05T00:34:11.074Z · LW · GW

There is a substantial philosophical literature on Occam's Razor and related issues:

https://plato.stanford.edu/entries/simplicity/

Comment by Stefan_Schubert on Hedge drift and advanced motte-and-bailey · 2019-06-14T22:57:52.639Z · LW · GW

Yes, a new paper confirms this.

The association between quality measures of medical university press releases and their corresponding news stories—Important information missing
Comment by Stefan_Schubert on Say Wrong Things · 2019-05-30T11:31:28.652Z · LW · GW

Agreed; those are important considerations. In general, I think a risk for rationalists is changing their behaviour on complex and important matters based on individual arguments which, while they appear plausible, don't give the full picture. Cf. Chesterton's fence, naive rationalism, etc.


Comment by Stefan_Schubert on Reality has a surprising amount of detail · 2017-05-16T07:56:05.326Z · LW · GW

This was already posted a few links down.

Comment by Stefan_Schubert on OpenAI makes humanity less safe · 2017-04-06T15:57:33.424Z · LW · GW

One interesting aspect of posts like this is that they can, to some extent, be (felicitously) self-defeating.

Comment by Stefan_Schubert on Open thread, Oct. 03 - Oct. 09, 2016 · 2016-10-05T17:01:23.637Z · LW · GW

As Bastian Stern has pointed out to me, people often mix up pro tanto considerations with all-things-considered judgements - usually by interpreting what is merely intended to be a pro tanto consideration as an all-things-considered judgement. Is there a name for this fallacy? It seems both dangerous and common, so it should have a name.

Comment by Stefan_Schubert on Open Thread May 9 - May 15 2016 · 2016-05-11T13:39:09.175Z · LW · GW

Thanks Ryan, that's helpful. Yes, I'm not sure one would be able to do something that has the right combination of accuracy, interestingness and low-cost at present.

Comment by Stefan_Schubert on Open Thread May 9 - May 15 2016 · 2016-05-10T17:08:53.193Z · LW · GW

Sure, I guess my question was whether you'd think that it'd be possible to do this in a way that would resonate with readers. Would they find the estimates of quality, or level of postmodernism, intuitively plausible?

My hunch was that the classification would primarily be based on patterns of word use, but you're right that it would probably be fruitful to look at patterns of citations as well.

Comment by Stefan_Schubert on Open Thread May 9 - May 15 2016 · 2016-05-10T10:26:20.772Z · LW · GW

deleted

Comment by Stefan_Schubert on Hedge drift and advanced motte-and-bailey · 2016-05-02T09:55:06.803Z · LW · GW

Good points. I agree that what you write within parentheses is a potential problem. Indeed, it is a problem for many kinds of far-reaching norms on altruistic behaviour where compliance is hard to observe: they might handicap conscientious people relative to less conscientious people to such an extent that the norms do more harm than good.

I also agree that individualistic solutions to collective problems have a chequered record. The point of 1)-3) was rather to indicate how you potentially could reduce hedge drift, given that you want to do that. To get scientists and others to want to reduce hedge drift is probably a harder problem.

In conversation, Ben Levinstein suggested that it is partly the editors' role to frame articles in a way such that hedge drift doesn't occur. There is something to that, though it is of course also true that editors often have incentives to encourage hedge drift as well.

Comment by Stefan_Schubert on Sleepwalk bias, self-defeating predictions and existential risk · 2016-04-24T12:07:04.920Z · LW · GW

Thanks. My claim is somewhat different, though. Adams says that "whenever humanity can see a slow-moving disaster coming, we find a way to avoid it". This is an all-things-considered claim. My claim is rather that sleepwalk bias is a pro-tanto consideration indicating that we're too pessimistic about future disasters (perhaps especially slow-moving ones). I'm not claiming that we never sleepwalk into a disaster. Indeed, there might be stronger countervailing considerations, which if true would mean that all things considered we are too optimistic about existential risk.

Comment by Stefan_Schubert on Sleepwalk bias, self-defeating predictions and existential risk · 2016-04-24T11:59:47.835Z · LW · GW

It is not quite clear to me whether you are here just talking about instances of sleepwalking, or whether you are also talking about a predictive error indicating anti-sleepwalking bias: i.e. that they wrongly predicted that the relevant actors would act, yet they sleepwalked into a disaster.

Also, my claim is not that sleepwalking never occurs, but that people on average seem to think that it happens more often than it actually does.

Comment by Stefan_Schubert on Open Thread April 11 - April 17, 2016 · 2016-04-17T10:19:41.730Z · LW · GW

Open Phil gives $500,000 to Tetlock's research.

Comment by Stefan_Schubert on The Sally-Anne fallacy · 2016-04-15T08:55:58.564Z · LW · GW

Great post. Another issue is why B doesn't believe Y in spite of believing X and in spite of A believing that X implies Y. Some mechanisms:

a) B rejects that X implies Y, for reasons that are good or bad, or somewhere in between. (Last case: reasonable disagreement.)

b) B hasn't even considered whether X implies Y. (Is not logically omniscient.)

c) Y only follows from X given some additional premises Z, which B either rejects (for reasons that are good or bad or somewhere in between) or hasn't entertained. (What Tyrrell McAllister wrote.)

d) B is confused over the meaning of X, and hence is confused over what X implies. (The dialect case.)

Comment by Stefan_Schubert on Open Thread March 21 - March 27, 2016 · 2016-03-22T20:45:20.908Z · LW · GW

Thanks a lot! Yes, super-useful.

Comment by Stefan_Schubert on Open Thread March 21 - March 27, 2016 · 2016-03-22T14:14:30.956Z · LW · GW

I have a maths question. Suppose that we are scoring n individuals on their performance in an area where there is significant uncertainty. We are categorizing them into a low number of categories, say 4. Effectively we're thereby saying that for the purposes of our scoring, everyone with the same score performs equally well. Suppose that we say that this means that all individuals with the same score get assigned the mean actual performance of the individuals with that score. For instance, if there were three people who got the highest score, and their performances were 8, 12 and 13 units, the assigned performance is 11 units.

Now suppose that we want our scoring system to minimise information loss, so that the assigned performance is on average as close as possible to the actual performance. The question is: how do we achieve this? Specifically, how large a proportion of all individuals should fall into each category, and how does that depend on the performance distribution?

It would seem that if performance is linearly increasing as we go from low to high performers, then all categories should have the same number of individuals, whereas if the increase is exponential, then the higher categories should have a smaller number of individuals. Is there a theorem that proves this, and which exactly specifies how large the categories should be for a given shape of the curve? Thanks.
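
For what it's worth, here is a minimal brute-force sketch of the problem as I understand it (assuming squared error as the measure of information loss; all names and example data are just illustrative). Running it on a linear and an exponential performance profile is one way to check the conjecture above:

```python
import itertools
import numpy as np

def best_split(performance, k=4):
    """Brute-force search for the split of sorted performance values into k
    contiguous categories that minimises the total squared error between each
    value and its category mean (the assigned performance)."""
    x = np.sort(np.asarray(performance, dtype=float))
    n = len(x)

    def sse(seg):
        return ((seg - seg.mean()) ** 2).sum()

    best = (np.inf, None)
    for cuts in itertools.combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        cost = sum(sse(x[a:b]) for a, b in zip(bounds[:-1], bounds[1:]))
        if cost < best[0]:
            best = (cost, bounds)
    cost, bounds = best
    sizes = [b - a for a, b in zip(bounds[:-1], bounds[1:])]
    return cost, sizes

# Compare a linear and an exponential performance profile
for name, data in [("linear", np.arange(1, 21)),
                   ("exponential", 1.3 ** np.arange(20))]:
    cost, sizes = best_split(data, k=4)
    print(name, "category sizes:", sizes, "total squared error:", round(cost, 2))
```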

Comment by Stefan_Schubert on Does the Internet lead to good ideas spreading quicker? · 2015-10-29T11:09:29.113Z · LW · GW

Great comment. Thanks!

Basically, rapid communication gives people too much choice. They choose things comfortably similar to what they know. Isolation is needed to allow new things to gain an audience before they're stomped out by the dominant things.

This is an interesting idea, reminiscent of, e.g., Lakatos's views in the philosophy of science. He argued that we shouldn't let new theories be discarded too quickly, just because they seem to have some things going against them. Only if their main tenets prove to be unfeasible should we discard them.

I think premature convergence does occur regarding the spread of ideas (memes), too (though it obviously varies). I do think, for instance, that what you describe in music has to a certain extent happened in analytic philosophy. In the early 20th century, several "scientific" approaches to philosophy developed in, e.g., Cambridge, Vienna and Uppsala. Today, the higher pace of communication leads to more convergence.