Comment by unnamed on The Real Rules Have No Exceptions · 2021-01-25T10:06:19.306Z · LW · GW

It seems like the core thing that this post is doing is treating the concept of "rule" as fundamental. 

If you have a general rule plus some exceptions, then obviously that "general rule" isn't the real process that is determining the results. And noticing that (obvious once you look at it) fact can be a useful insight/reframing.

The core claim that this post is putting forward, IMO, is that you should think of that "real process" as being a rule, and aim to give it the virtues of good rules such as being simple, explicit, stable, and legitimate (having legible justifications).

An alternative approach is to step outside of the "rules" framework and get in touch with what the rule is for - what preferences/values/strategy/patterns/structures/relationships/etc. it serves. Once you're in touch with that purpose, then you can think about both the current case, and what will become of the "general rule", in that light. This could end up with an explicitly reformulated rule, or not.

It seems like treating the "real process" as a rule is more fitting in some cases than others, a better fit for some people's style of thinking than for other people's, and also something that a person could choose to aim for more or less.

I think I'd find it easier to think through this topic if there was a long, diverse list of brief examples.

Comment by unnamed on Building up to an Internal Family Systems model · 2021-01-25T08:25:36.284Z · LW · GW

The back-and-forth (here and elsewhere) between Kaj & pjeby was an unusually good, rich, productive discussion, and it would be cool if the book could capture some of that. Not sure how feasible that is, given the sprawling nature of the discussion.

Comment by unnamed on Dishonest Update Reporting · 2021-01-24T06:18:59.997Z · LW · GW

This post seems to me to be misunderstanding a major piece of Paul's "sluggish updating" post, and clashing with Paul's post in ways that aren't explicit.

The core of Paul's post, as I understood it, is that incentive landscapes often reward people for changing their stated views too gradually in response to new arguments/evidence, and Paul thinks he has often observed this behavioral pattern which he called "sluggish updating." Paul illustrated this incentive landscape through a story involving Alice and Bob, where Bob is thinking through his optimal strategy, since that's a convenient way to describe incentive landscapes. But that kind of intentional strategic thinking isn't how the incentives typically manifest themselves in behavior, in Paul's view (e.g., "I expect this to result in unconscious bias rather than conscious misrepresentation. I suspect this incentive significantly distorts the beliefs of many reasonable people on important questions"). This post by Zvi misunderstands this as Paul describing the processes that go on inside the heads of actual Bobs. This loses track of the important distinction (which is the subject of multiple other LW Review nominees) between the rewards that shape an agent's behavior and the agent's intentions. It also sweeps much of the disagreement between Paul & Zvi's posts under the rug.

A few related ways the views in the two posts clash:

This post by Zvi focuses on dishonesty, while Paul suggests that unconsciously distorted beliefs are the typical case. This could be because Zvi disagrees with Paul and thinks that dishonesty is the typical case. Or it could be that Zvi is using the word "dishonest" broadly - he mostly agrees with Paul about what happens in people's heads, but applies the "dishonesty" frame in places where Paul wouldn't. Or maybe Zvi is just choosing to focus on the dishonest subset of cases. Or some combination of these.

Zvi focuses on cases where Bob is going to the extreme in following these incentives, optimizing heavily for it and propagating it into his thinking. "This is a world where all one cares about is how one is evaluated, and lying and deceiving others is free as long as you’re not caught." "Bob’s optimal strategy is full anti-epistemology." Paul seems especially interested in cases where pretty reasonable people (with some pretty good features in their epistemics, motivations, and incentives) still sometimes succumb to these incentives for sluggishness. Again, it's unclear how much of this is due to Zvi & Paul having different beliefs about the typical case and how much is about choosing to focus on different subsets of cases (or which cases to treat as central for model-building).

Paul's post is written from a perspective of 'Good epistemics don't happen by default', where thinking well as an individual involves noticing places where your mental processes haven't been aimed towards accurate beliefs and trying to do better, and social epistemics are an extension of that at the group level. Zvi's post is written from a perspective of 'catching cheaters', where good social epistemics is about noticing ways that people are something-like-lying to you, and trying to stop that from happening.

Zvi treats Bob as an adversary. Paul treats him as a potential ally (or as a state that you or I or anyone could find oneself in), and mentions "gaining awareness" of the sluggishness as one way for an individual to counter it.

Related to all of this, the terminology clashes (as I mentioned in a comment). I'd like to say a simple sentence like "Paul sees [?sluggishness?] as mainly due to [?unconscious processes?], Zvi as mainly due to [?dishonest update reporting?]" but I'm not sure what terms go in the blanks.

The "fire Bob" recommendation depends a lot on how you're looking at the problem space / which part of the problem space you're looking at. If it's just a recommendation for a narrow set of cases then I think it wouldn't apply to most of the cases that Paul was talking about in his "Observations in the wild", but if it's meant to apply more widely then that could get messy in ways that interact with the clashes I've described.

The other proposed solutions seem less central to these two posts, and to the clash between Paul & Zvi's perspectives.

I think there is something interesting in the contrast between Paul & Zvi's perspectives, but this post didn't work as a way to shine light on that contrast. It focuses on a different part of the problem space, while bringing in bits from Paul's post in ways that make it seem like it's engaging with Paul's perspective more than it actually does and make it confusing to look at both perspectives side by side.

Comment by unnamed on Coherent decisions imply consistent utilities · 2021-01-15T00:22:56.762Z · LW · GW

Sounds like the thing that is typically called "regret aversion".

Comment by unnamed on Covid 1/14: To Launch a Thousand Shipments · 2021-01-14T22:53:53.551Z · LW · GW

Crunching some numbers in a copy of the spreadsheet... Zvi's predictions are better than the naive model of assuming next week's numbers will be the same as this week's numbers.

Biggest improvement over the null model for predicting deaths (mean squared error is 47% as big), smallest improvement for positive test % (MSE 80% as big), in between for number of tests (MSE 67% as big).

Although if I instead look at the predicted weekly change and compare it to the actual change that week, all three sets of predictions are roughly equally accurate with correlations (predicted change vs. actual change) between .52 and .58.
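A rough sketch of the two comparisons described above. The weekly numbers are made up for illustration (they are not the spreadsheet's data), and the function names are my own:

```python
from statistics import mean

def mse(predicted, actual):
    """Mean squared error between two equal-length sequences."""
    return mean((p - a) ** 2 for p, a in zip(predicted, actual))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical weekly values (NOT the real spreadsheet data).
actual = [100, 110, 125, 120, 140, 150]
# forecast[i] is the prediction made in week i for week i + 1 (hypothetical).
forecast = [108, 120, 126, 130, 148]

this_week = actual[:-1]
next_week = actual[1:]

# Null model: next week's number equals this week's number.
# A ratio below 1 means the forecasts beat the null model.
mse_ratio = mse(forecast, next_week) / mse(this_week, next_week)

# Second comparison: predicted weekly change vs. actual weekly change.
predicted_change = [f - t for f, t in zip(forecast, this_week)]
actual_change = [n - t for n, t in zip(next_week, this_week)]
change_corr = pearson(predicted_change, actual_change)
```

The MSE ratio rewards getting the level right, while the correlation only looks at whether predicted changes track actual changes, which is why the two can rank the three sets of predictions differently.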

Comment by unnamed on Covid 1/14: To Launch a Thousand Shipments · 2021-01-14T20:10:59.682Z · LW · GW

When I read this bit:

Only 37% of all distributed doses have been given

I wondered how that would look translated into a delay. The number of doses given through January 13 equals the number of doses that had been distributed __ days earlier.

I see from the graph titled "The US COVID-19 Vaccine Shortfall" (and introduced with "We can start with how it’s gone so far:") that the answer is about 17 days.

This seems like a more natural framing - it matches the process of why many doses haven't been given yet, and it seems likely to be more stable as we project the curves forward over many weeks (and less dependent on the shape of the 'doses distributed' curve).

So now I'm wondering if the delay (now 17 days) is likely to get smaller over time, or larger, or stay about the same.
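The delay framing can be sketched as a small calculation. The series below are hypothetical cumulative totals (say, millions of doses per day), not the actual CDC figures, and `delay_in_days` is just an illustrative name:

```python
def delay_in_days(distributed, given):
    """Days between the latest date and the first date on which cumulative
    doses distributed reached the latest cumulative doses given.

    Both arguments are cumulative daily totals of equal length, indexed by day.
    Returns None if distribution never reached that level.
    """
    latest_given = given[-1]
    last_day = len(distributed) - 1
    for day, total in enumerate(distributed):
        if total >= latest_given:
            return last_day - day
    return None

# Hypothetical cumulative series (NOT real data).
distributed = [0, 2, 5, 9, 14, 20, 27]
given = [0, 0, 1, 2, 4, 6, 9]

delay = delay_in_days(distributed, given)  # distribution hit 9 on day 3, so the delay is 3 days
```

Running this for each date in turn would give a time series of the delay, which is one way to check whether it is shrinking, growing, or holding steady.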

Comment by unnamed on Dishonest Update Reporting · 2021-01-14T02:58:43.611Z · LW · GW

Seems like the terminology is still not settled well. 

There's a general thing which can be divided into two more specific things.

General Thing: The information points to 50%, the incentive landscape points to 70%, Bob says "70%".

Specific Thing 1: The information points to 50%, the incentive landscape points to 70%, Bob believes 50% and says "70%".

Specific Thing 2: The information points to 50%, the incentive landscape points to 70%, Bob believes and says "70%".

There are three Things and just two names, so the terminology is at least incomplete.

"Dishonest update reporting" sounds like the name of Specific Thing 1.

In Paul's post "sluggish updating" referred to the General Thing, but Dagon's argument here is that "sluggish updating" should only refer to Specific Thing 2. So there's ambiguity.

It seems most important to have a good name for the General Thing. And that's maybe the one that's nameless? Perhaps "sluggish update reporting", which can happen either because the updating is sluggish or because the reporting is sluggish/dishonest. Or "sluggish social updating"? Or something related to lightness? Or maybe "sluggish updating" is ok despite Dagon's concerns (e.g. a meteorologist updating their forecast could refer to changes that they make to the forecast that they present to the world).

Comment by unnamed on Any examples of people analyzing/critiquing scientific studies or papers? · 2021-01-13T23:46:25.822Z · LW · GW

A couple things that are maybe not exactly what you're looking for but are nearby and probably somewhat useful:

The blog Data Colada (example, example2)

Elizabeth's "epistemic spot check" series (example)

Comment by unnamed on Any examples of people analyzing/critiquing scientific studies or papers? · 2021-01-13T23:44:28.501Z · LW · GW

Here is a thing I wrote 10 years ago assessing an N-back study. That's an easy-for-me-to-remember example, where I also remember that the writeup comes pretty close to reflecting how I was thinking through things as I was looking at the paper.

Comment by unnamed on Johannes Kepler, Sun Worshipper · 2021-01-11T20:56:35.948Z · LW · GW

Well, the sun being the only object in our solar system that emits light is evidence for it being at the center.

It seems likely that there's something special about whichever body is in the center of the solar system. A lot of astronomers thought the Earth was special for being made of rock & water, and that this was related to the Earth being at the center, but they just conjectured that Mars & Venus & the other planets were made of something else. Whereas Kepler had much more direct observations about the sun's unique luminosity.

Aristarchus had a heliocentric model of the solar system in Ancient Greece, apparently motivated in large part by the fact that the sun was the largest object in the solar system.

In hindsight, we know that luminosity and size relative to neighbors are both highly correlated with being at the center of a solar system, with Aristarchus's size thing having a tighter causal relationship with centrality.

Comment by unnamed on 100 Tips for a Better Life · 2021-01-05T10:45:00.493Z · LW · GW

Those public health official examples seem unrelated to tip #59 ("Those who generate anxiety in you and promise that they have the solution are grifters.").

I took hermanc1 to be pointing to how, in Feb-Mar 2020, the people who were saying scary sounding stuff (like using the word "pandemic") and proposing things to do about it were the ones who had insights and were telling it straight. Meanwhile many other people were calling those people out for "fearmongering" or spinning things to downplay the risk in order to prevent panic.

There are grifters who try to generate anxiety so they can sell you something. And also the world contains problems, and noticing problems can induce anxiety, and searching for & sharing (partial) solutions to problems is good. Maybe a sophisticated way of following tip #59 can distinguish between those, but the naive way of doing it can run into trouble and fail to see the smoke.

Comment by unnamed on Covid 12/24: We’re F***ed, It’s Over · 2020-12-30T03:35:18.802Z · LW · GW

Back in March, there was a lot of concern that uncontrolled spread would overwhelm the medical system and some hope that delay would improve the standard of care. Do we have good estimates now of those two effects? They could influence IFR estimates by a fair amount.

Also, my understanding is that the number of infections could've shot way past herd immunity levels. Herd immunity is just the point at which the number of active infections starts declining rather than increasing, and if there are lots of active infections at that time then they can spread to much of the remaining people before dwindling.
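The overshoot point can be illustrated with a minimal SIR simulation. This is a sketch under strong assumptions (a homogeneous-mixing population, no behavior change, and hypothetical parameter values), not an estimate for covid:

```python
def simulate_sir(r0, infectious_days=5.0, initial_infected=1e-4, dt=0.1, days=1000):
    """Euler-step a basic SIR model and return the fraction ever infected.

    s and i are the susceptible and currently-infectious fractions.
    """
    gamma = 1.0 / infectious_days   # recovery rate per day
    beta = r0 * gamma               # transmission rate per day
    s, i = 1.0 - initial_infected, initial_infected
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
    return 1.0 - s

r0 = 2.5  # hypothetical basic reproduction number
herd_immunity_threshold = 1.0 - 1.0 / r0   # 60% for r0 = 2.5
ever_infected = simulate_sir(r0)
```

In this toy model the epidemic peaks when the immune fraction crosses the threshold, but the infections that are active at that moment keep spreading while they dwindle, so `ever_infected` ends up well above `herd_immunity_threshold`.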

Comment by unnamed on Morality as "Coordination", vs "Do-Gooding" · 2020-12-29T06:47:24.113Z · LW · GW

I've had similar thoughts; the working title that I jotted down at some point is "Two Aspects of Morality: Do-Gooding and Coordination." A quick summary of those thoughts:

Do-gooding is about seeing some worlds as better than others, and steering towards the better ones. Consequentialism, basically. A widely held view is that what makes some worlds better than others is how good they are for the beings in those worlds, and so people often contrast do-gooding with selfishness because do-gooding requires recognizing that the world is full of moral patients.

Coordination is about recognizing that the world is full of other agents, who are trying to steer towards (at least somewhat) different worlds. It's about finding ways to arrange the efforts of many agents so that they add up to more than the sum of their parts, rather than less. In other words, try for: many agents combine their efforts to get to worlds that are better (according to each agent) than the world that that agent would have reached without working together. And try to avoid: agents stepping on each other's toes, devoting lots of their efforts to undoing what other agents have done, or otherwise undermining each other's efforts. Related: game theory, Moloch, decision theory, contractualism.

These both seem like aspects of morality because:

  • "moral emotions", "moral intuitions", and other places where people use words like "moral" arise from both sorts of situations
  • both aspects involve some deep structure related to being an agent in the world, neither seems like just messy implementation details for the other
  • a person who is trying to cultivate virtues or become a more effective agent will work on both

Comment by unnamed on Great minds might not think alike · 2020-12-26T21:12:17.469Z · LW · GW

Related to section I: Dunning, Meyerowitz, & Holzberg (1989) Ambiguity and self-evaluation: The role of idiosyncratic trait definitions in self-serving assessments of ability. From the abstract:

When people are asked to compare their abilities to those of their peers, they predominantly provide self-serving assessments that appear objectively indefensible. This article proposes that such assessments occur because the meaning of most characteristics is ambiguous, which allows people to use self-serving trait definitions when providing self-evaluations. Studies 1 and 2 revealed that people provide self-serving assessments to the extent that the trait is ambiguous, that is, to the extent that it can describe a wide variety of behaviors.

Comment by unnamed on DanielFilan's Shortform Feed · 2020-12-23T04:27:57.495Z · LW · GW

It seems clear that we want politicians to honestly talk about what they're intending to do with the policies that they're actively trying to change (especially if they have a reasonable chance of enacting new policies before the next election). That's how voters can know what they're getting.

It's less obvious how this should apply to their views on things which aren't going to be enacted into policy. Three lines of thinking that point in the direction of maybe it's good for politicians to keep quiet about (many of) their unpopular views:

It can be hard for listeners to tell how likely the policy is to be enacted, or how actively the politician will try to make it happen. I guess it's hard to fit into 5 words? e.g. I saw a list of politicians' "broken promises" on one of the fact checking sites, which was full of examples where the politician said they were in favor of something and then it didn't get enacted, and the fact checkers deemed that sufficient to count it as a broken promise. This can lead to voters putting too little weight on the things that they're actually electing the politician to do, e.g. local politics seems less functional if local politicians focus on talking about their views on national issues that they have no control over.

Another issue is that it's cheap talk. The incentive structure / feedback loops seem terrible for politicians talking about things unrelated to the policies they're enacting or blocking. Might be more functional to have a political system where politicians mostly talk about things that are more closely related to their actions, so that their words have meaning that voters can see.

Also, you can think of politicians' speech as attempted persuasion. You could think of voters as picking a person to go around advocating for the voters' hard-to-enact views (as well as to implement policies for the voters' feasible-to-enact views). So it seems like it could be reasonable for voters to say "I think X is bad, so I'm not going to vote for you if you go around advocating for X", and for a politician who personally favors X but doesn't talk about it to be successfully representing those voters.

Comment by unnamed on Fusion and Equivocation in Korzybski's General Semantics · 2020-12-21T08:58:38.890Z · LW · GW

You can think of growth mindset as a deidentification, basically identical to that example of Anna the student except done by Anna about herself rather than by her teacher. "Yet" is a wedge that gets you to separate your concept of you-so-far from your concept of future you. "I'm bad at X" sneaks in an equivocation to imply "now and always."

Comment by unnamed on Motive Ambiguity · 2020-12-20T04:50:48.591Z · LW · GW

I notice that many of these examples involve something like vice signalling - the person is destroying value in order to demonstrate that they have a quality which I (and most LWers) consider to be undesirable. It seems bad for the middle manager, politician, and start-up founder to aim for the shallow thing that they're prioritizing. And then they take the extra step of destroying something I do value in order to accentuate that. It's a combination that feels real icky.

The romantic dinner and the handmade gift examples don't have that feature. And those two cases feel more ambiguous - I can imagine versions of these where it seems good that the person is doing these things, and versions where it seems bad. I can picture a friend telling me "I took my partner out for their birthday to a restaurant that I don't really care for, but they just adore" and it being a heartwarming story, where it seems like something good is happening for my friend and their relationship.

Katja's recent post on Opposite Attractions points to one thing that seems good about taking your spouse to a restaurant that only they love - your spouse's life is full of things that you both like, and perhaps starved of certain styles of things that they like and you don't, and they could be getting something out of drawing from that latter category even if there's some sense in which they don't like it any more than a thing in the "you both like it" category. And there's something good about them getting some of those things within the relationship, of having the ground that the relationship covers not be limited to the intersection of "the things you like" and "the things your spouse likes" - your relationship mostly takes advantage of that part of the territory but sometimes it's good to explore other parts of it together. And I could imagine you bringing an attitude to the meal where you're tuned in to your spouse's experience, trying to take pleasure in how much they enjoy the meal, rather than being focused on your own food. And (this is the part where paying a cost to resolve motive ambiguity comes in directly) going to a restaurant that they love and you don't like seems like it can help set the context for this kind of thing - putting the information in common knowledge between you two that this is a special occasion, and what sort of special occasion it's trying to be. It seems harder to hit some of these notes in a context where both people love the food.

(There are also versions of the one-sided romantic dinner which seem worse, and good relationships where this version doesn't fit or isn't necessary.)

Comment by unnamed on Writing tools for tabooing? · 2020-12-13T20:07:52.176Z · LW · GW

Can you tell your spellchecker that they're not words?

Comment by unnamed on Number-guessing protocol? · 2020-12-07T21:33:51.157Z · LW · GW

If you're just doing this occasionally without recordkeeping, then it seems convenient to have the game result in "winners" rather than a more fine-grained score. But it could be fine to sometimes have multiple winners, or zero winners. Here's a simple protocol that does that:

The person who asks the question also defines what counts as "winning". e.g. "What's the value of such-and-such? Can anybody get it within 10%?" Then everyone guesses simultaneously, and all the people whose guesses are within 10% of the true value are "winners".

("Simultaneous" guessing can mean that first everyone comes up with their guess in their head, and then they take turns saying them out loud while on the honor system to not change their guess.)

Slightly more complicated, the asker could propose 2 standards of winning. "When did X happen? Grand prize if you guess the exact year, honorable mention if you get it within 5 years." Then if anyone guesses the exact year they're the big winner(s) and the people who get it within 5 years get the lesser glow of "honorable mention". And if no one guesses the exact year then the people who get it within 5 years feel more like winners.

If you continue farther in this direction you could get to one of Ericf's proposals. I think my version has lower barriers to entry, while Ericf's version could work better among people who use it regularly.
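The two-standard version of my protocol can be sketched as follows. The names, guesses, and the `judge_guesses` function are all made up for illustration:

```python
def judge_guesses(guesses, true_value, grand_tolerance, mention_tolerance):
    """Split guessers into grand-prize winners and honorable mentions.

    Tolerances are absolute distances from the true value. Either list
    may be empty or contain several names.
    """
    grand, mentions = [], []
    for name, guess in guesses.items():
        error = abs(guess - true_value)
        if error <= grand_tolerance:
            grand.append(name)
        elif error <= mention_tolerance:
            mentions.append(name)
    return grand, mentions

# "When did X happen? Grand prize for the exact year, honorable mention within 5 years."
guesses = {"Ana": 1969, "Ben": 1972, "Cy": 1980}  # hypothetical guesses
grand, mentions = judge_guesses(
    guesses, true_value=1969, grand_tolerance=0, mention_tolerance=5
)
```

For the single-standard "within 10%" version, one could pass `0.10 * abs(true_value)` as both tolerances, so that everyone inside the band is simply a winner.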

Comment by unnamed on Real-Life Examples of Prediction Systems Interfering with the Real World (Predict-O-Matic Problems) · 2020-12-04T00:44:17.925Z · LW · GW

Note that Trump got around 63M votes in 2016, and around 71M in 2020, whereas Democrats got 66M and 75M respectively.

The 2020 results are 81M-74M with some votes still left to count. 75M-71M might have been the margin a few weeks ago when there were still a bunch more not-yet-counted votes.

Comment by unnamed on Covid 12/3: Land of Confusion · 2020-12-03T22:46:11.751Z · LW · GW

Two minor corrections on the Denver Broncos section:

For whatever reason, Denver was told this weekend that the show had to go on, despite all four of its quarterbacks being ruled out due to contact tracing from their primary quarterback. No masks had been worn. 

If you think that represents gross incompetence and they should have held their backup backup backup quarterback in reserve like a designated survivor if they had no fifth option, you’d be right, but they did not think about that at the time.

The quarterback who got covid was their 3rd stringer (I think); he definitely wasn't their primary quarterback. Lock is their starter, Driskel is the one who got sick, and I believe their depth chart went Lock-Rypien-Driskel-Bortles. This is a downside of having 4 quarterbacks rather than 2 or 3 - more vectors into the quarterback room.

Also, the Broncos presumably did at least come across the idea of having a designated survivor reserve QB. They just decided not to do it. Some NFL teams have been keeping one of their quarterbacks apart - there have been news articles all year about the Buffalo Bills using Jake Fromm as their quarantined quarterback - and that news must've reached Denver.

Comment by unnamed on Maybe Lying Can't Exist?! · 2020-11-15T02:23:07.803Z · LW · GW

Here's a toy example which should make it clearer that the probability assigned to the true state is not the only relevant update.

Let's say that a seeker is searching for something, and doesn't know whether it is in the north, east, south, or west. If the object is in the north, then it is best for the seeker to go towards it (north), worst for the seeker to go directly away from it (south), and intermediate for them to go perpendicular to it (east or west). The seeker meets a witness who knows where the thing is. The majority (2/3) of witnesses want to help the seeker find it and the rest (1/3) want to hinder the seeker's search. And they have common knowledge of all of this.

In this case, the witness can essentially just direct the seeker's search - if the witness says "it's north" then the seeker goes north, since 2/3 of witnesses are honest. So if it's north and the witness wants to hinder the seeker, they can just say "it's south". This seems clearly deceptive - it's hindering the seeker's search as much as possible by messing up their beliefs. But pointing them south does actually lead to a right-direction update on the true state of affairs, with p(north) increasing from 1/4 (the base rate) to 1/3 (the proportion of witnesses who aim to hinder). It's still a successful deception because it increases p(south) from 1/4 to 2/3, and that dominates the seeker's choice.
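The update in this toy example can be checked with a short Bayes calculation. One assumption is made explicit here that the description leaves implicit: hindering witnesses always name the direction opposite to the true one.

```python
from fractions import Fraction

directions = ["north", "east", "south", "west"]
opposite = {"north": "south", "south": "north", "east": "west", "west": "east"}

prior = {d: Fraction(1, 4) for d in directions}  # base rate
p_helper = Fraction(2, 3)                        # share of witnesses who want to help

def likelihood(report, truth):
    """P(witness says `report` | the object is at `truth`)."""
    p = Fraction(0)
    if report == truth:
        p += p_helper          # helpers name the true direction
    if report == opposite[truth]:
        p += 1 - p_helper      # assumption: hinderers name the opposite direction
    return p

def posterior(report):
    """Seeker's beliefs after hearing `report`, by Bayes' rule."""
    unnormalized = {d: prior[d] * likelihood(report, d) for d in directions}
    total = sum(unnormalized.values())
    return {d: v / total for d, v in unnormalized.items()}

post = posterior("south")
# post["north"] == Fraction(1, 3) and post["south"] == Fraction(2, 3)
```

So when the object is in the north and the witness says "it's south", p(north) rises from 1/4 to 1/3 while p(south) rises from 1/4 to 2/3, and the seeker heads south.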

Comment by unnamed on Maybe Lying Can't Exist?! · 2020-11-15T02:08:40.690Z · LW · GW

There are simpler examples where identifying deception seems more straightforward. e.g., If a non-venomous snake takes on the same coloration as a venomous snake, this is intended to increase others' estimates of p(venomous) and reduce their estimates of p(not venomous), which is a straightforward update in the wrong direction. 

In the first attempt at a definition of deceptive signalling, it seems like a mistake to only look at the probability assigned to the true state ("causing the receiver to update its probability distribution to be less accurate (operationalized as the logarithm of the probability it assigns to the true state)"). Actions are based on their full probability distribution, not just the probability assigned to the true state. In the firefly example, P. rey is updating in the right direction on p(predator) (and on p(nothing)), but in the wrong direction on p(mate). And their upward update on p(mate) seems to be what's driving the predator's choice of signal. Some signs of this:

The predator mimicked the signal that the mates were using, when it could have caused a larger correct update to p(predator) and reversed the incorrect update to p(mate) by choosing any other signal. Also, P. redator chose the option that maximized the prey's chances of approaching it, and the prey avoids locations when p(predator) is sufficiently high. If we model the prey as acting according to a utility function, the signal caused the prey to update its expected utility estimate in the wrong direction by causing it to update one of its probabilities in the wrong direction (equivalently: the prey updated the weighted average of its probabilities in the wrong direction, where the weights are based on the relevant utilities). We could also imagine hypothetical scenarios, like if the predator was magically capable of directly altering the prey's probability estimates rather than being limited to changing its own behavior and allowing the prey to update. 

Comment by unnamed on Signalling & Simulacra Level 3 · 2020-11-15T01:42:18.940Z · LW · GW

I think part of the story is that language is compositional. If someone utters the words "maroon hexagon", you can make a large update in favor of a specific hypothesis even if you haven't previously seen a maroon hexagon, or heard those words together, or judged there to be anything special about that hypothesis. "Maroon" has been sufficiently linked to a specific narrow range of colors, and "hexagon" to a specific shape, so you get to put those inferences together without needing additional coordination with the speaker.

This seems related to the denotation/connotation distinction, where compositional inferences are (typically?) denotations. Although the distinction seems kind of fuzzy, as it seems that connotations can (gradually?) become denotations over time, e.g. "goodbye" to mean that a departure is imminent, or an image of a red octagon to mean "stop" (although I'd say that the words "red octagon" still only have the connotation of "stop"). And "We should get together more often" is interesting because the inferences you can draw from it aren't that related to the inferences you typically draw from the phrases "get together" and "more often".

Comment by unnamed on CFAR Participant Handbook now available to all · 2020-11-10T01:29:24.385Z · LW · GW

Hello! I'm wondering if I can translate your book into Russian?

I'm not going to monetize it, and of course I will give the credits.

Yes, you can translate it. Just make it clear that the original content in English is from CFAR, and the translation into Russian is something that you've done independently.

-Dan from CFAR

Comment by unnamed on Incentive Problems With Current Forecasting Competitions. · 2020-11-09T20:57:41.007Z · LW · GW

There's a similar challenge in sports with evaluating athletes' performance. Some pieces of what happens there:

There are many different metrics to summarize/evaluate a player's performance rather than just one score (e.g., see all the tables here). Many are designed to evaluate a particular aspect of a player's performance rather than how well they did overall, and there are also multiple attempts to create a single comprehensive overall rating. Over the past decade or two there have been a bunch of improvements in this, with more and better metrics, including metrics that incorporate different sources of information.

There are common features of different stats that people who follow the analytics are aware of, such as whether they're volume stats (number of successes) or efficiency stats (number of successes per attempt). Some metrics attempt to adjust for factors that aren't under the player's control which can influence the numbers, such as the quality of the opponent, the quality of the player's teammates, the environment of the game (weather & stadium), various sources of randomness, and whether the play happened in "garbage time" (when the game was already basically decided).

Payment is based on negotiations with the people who benefit from the player's performance (their team's owners) rather than being directly dependent on their stats. Their stats do play into the decision, as do other things such as close examinations of particular plays that they made.

The awards for individual performance that people care about the most (e.g., All-Star teams, MVP awards, Hall of Fame) are based on voting (by various pools of voters) rather than being directly based on the statistics. Though again, they're influenced by the statistics and tend to line up pretty closely with the statistics.

The achievements that people care about the most (e.g., winning games & championships) are team achievements rather than individual achievements. In a typical league there might be 30 teams which each have 20 players, and there's a mix of competitiveness between teams and cooperativeness within a team.

Seems like forecasting might benefit from copying some parts of this. For example, instead of having one leaderboard with an overall forecasting score, have several leaderboards for different ways of evaluating forecasts, along with tables where you can compare forecasters on a bunch of them at once and a page for each forecaster where you can see all their metrics, how they rank, and maybe some other stuff like links to their most upvoted comments.
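As a rough sketch of what a multi-metric comparison table could look like, here is one volume stat (number of resolved forecasts) next to two accuracy stats (mean Brier score and mean log score). The forecaster names and their forecasts are made up for illustration, and the choice of these three metrics is an assumption rather than a proposal for any particular platform.

```python
import math

def forecaster_metrics(forecasts):
    """Summarize a list of (probability, outcome) pairs, with outcome 0 or 1.

    Probabilities must be strictly between 0 and 1 so the log score is finite.
    """
    n = len(forecasts)
    # Brier score: mean squared error of the stated probability (lower is better).
    brier = sum((p - o) ** 2 for p, o in forecasts) / n
    # Log score: mean log of the probability given to what happened (higher is better).
    log_score = sum(math.log(p if o == 1 else 1.0 - p) for p, o in forecasts) / n
    return {"n_forecasts": n, "brier": brier, "log_score": log_score}

# Hypothetical forecasters and resolved binary questions.
history = {
    "alice": [(0.9, 1), (0.2, 0), (0.7, 1), (0.6, 0)],
    "bob": [(0.55, 1), (0.45, 0)],
}

print(f"{'forecaster':<12}{'n':>4}{'brier':>9}{'log':>9}")
for name, forecasts in history.items():
    m = forecaster_metrics(forecasts)
    print(f"{name:<12}{m['n_forecasts']:>4}{m['brier']:>9.3f}{m['log_score']:>9.3f}")
```

Putting the volume column next to the accuracy columns makes it visible when a good-looking score rests on very few forecasts.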

Comment by unnamed on Open & Welcome Thread – October 2020 · 2020-10-31T21:05:44.363Z · LW · GW

AI Camera Ruins Soccer Game For Fans After Mistaking Referee's Bald Head For Ball

Comment by unnamed on Bet On Biden · 2020-10-18T08:07:02.049Z · LW · GW

It looks like this includes the fees you pay to PredictIt, but not the taxes you pay to the government.

Comment by unnamed on Bet On Biden · 2020-10-18T03:08:21.069Z · LW · GW

Do you have an estimate of expected profit per $100 bet (for a few of the most plausible scenarios)?

My impression is that PredictIt is +EV if you make lots of not-too-correlated bets so that your losses can offset your wins (though maybe not by enough to be worth the time & effort), but it's generally -EV (or at best barely +EV) if you deposit to make a one-off bet where you have to pay fees & taxes on your winnings (and don't get any tax benefit from your losses).
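A minimal sketch of the one-off-bet arithmetic, under stated assumptions: a 10% fee on profits and a 5% fee on withdrawals (assumed to be PredictIt's fee schedule), a flat tax rate applied to net winnings, and no tax benefit from losses. The 25% tax rate and the example prices and probabilities are purely illustrative.

```python
def expected_profit(price, p_win, stake=100.0, profit_fee=0.10,
                    withdrawal_fee=0.05, tax_rate=0.0):
    """Expected net profit in dollars from depositing `stake`, buying "yes"
    shares at `price` (between 0 and 1), and withdrawing the whole balance.

    Assumptions: the profit fee applies only to winnings, the withdrawal fee
    applies to the full balance withdrawn, and tax applies to net winnings
    with no deduction for a losing bet.
    """
    shares = stake / price
    gross_profit = shares * (1.0 - price)            # each winning share pays $1
    profit_after_fee = gross_profit * (1.0 - profit_fee)
    withdrawn = (stake + profit_after_fee) * (1.0 - withdrawal_fee)
    net_if_win = withdrawn - stake
    net_if_win -= max(net_if_win, 0.0) * tax_rate    # tax on net winnings only
    net_if_lose = -stake                             # stake is gone, nothing to withdraw
    return p_win * net_if_win + (1.0 - p_win) * net_if_lose

# Buying at 60 cents while believing the true probability is 70%.
print(expected_profit(0.60, 0.70))                   # fees only
print(expected_profit(0.60, 0.70, tax_rate=0.25))    # fees plus illustrative tax
```

With these inputs the bet is worth roughly +$6.40 per $100 after fees and roughly -$2.70 once the illustrative tax is included, so a 10-point edge is not enough to make this one-off bet profitable.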

Related: Limits of Current US Prediction Markets (PredictIt Case Study)

Comment by unnamed on Babble challenge: 50 ways of sending something to the moon · 2020-10-02T00:56:59.967Z · LW · GW

Here's a doc, since I haven't figured out how to do spoilers.

Comment by unnamed on is scope insensitivity really a brain error? · 2020-09-29T21:01:44.612Z · LW · GW

One way to look at this is to pick questions where you're really sure that the two versions of the question should have different answers. For example, questions where the answer is a probability rather than a subjective value. One study some years ago asked some people for the probability that Assad's regime would fall in the next 3 months, and others for the probability that Assad's regime would fall in the next 6 months. As described in the book Superforecasting, non-superforecasters gave essentially identical answers to these two questions (40% and 41%, respectively). So it seems like they were making some sort of error by not taking into account the size of the duration. (Superforecasters gave different answers, 15% and 24%, which did take the duration into account pretty well.)
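One way to see roughly how far apart the two answers "should" be is to assume a constant hazard rate, i.e. that the regime is equally likely to fall in any given month if it has survived so far. That assumption is mine, for illustration only; it is not something the study or the book relies on.

```python
def extend_probability(p_short, short_months, long_months):
    """Probability of the event within `long_months`, given a probability
    `p_short` of it happening within `short_months`, assuming a constant
    hazard rate over the whole period."""
    survival_short = 1.0 - p_short
    return 1.0 - survival_short ** (long_months / short_months)

# 3-month answers reported in the comment above.
for label, p_3_months in [("non-superforecasters", 0.40), ("superforecasters", 0.15)]:
    p_6_months = extend_probability(p_3_months, 3, 6)
    print(f"{label}: 3 months {p_3_months:.0%}, implied 6 months {p_6_months:.0%}")
```

Under that assumption a 40% answer for 3 months implies about 64% for 6 months (versus the 41% actually given), while 15% implies about 28% (versus the 24% actually given).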

Comment by unnamed on Comparative Advantage is Not About Trade · 2020-09-22T20:03:28.370Z · LW · GW

I think of comparative advantage & specialization as features of production. People producing the things that they have comparative advantage at puts society on the pareto frontier in terms of the amount of each good that is produced.

I haven't been thinking of this as a theorem, but I think it could go something like: there are n people and m goods and person i will produce p*f(i,j) units of good j if they devote p fraction of their time to producing good j, and each person uses 100% of their time producing goods. Then if you want to describe the pareto frontier that maximizes the amount of goods produced, it involves each person producing a good where they have a favorable ratio of how much of that good they can produce vs. how much of other goods-being-produced they can produce.
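A small numerical sketch of that setup, with a made-up productivity table f(i,j) for two people and two goods. Person 0 is better at producing both goods in absolute terms, but person 1 gives up less of good 0 per unit of good 1. The numbers are hypothetical.

```python
def total_output(f, allocation):
    """Total units of each good produced.

    f[i][j] is how much of good j person i makes working full time on it;
    allocation[i][j] is the fraction of person i's time spent on good j
    (each row should sum to 1).
    """
    n_people, n_goods = len(f), len(f[0])
    return [sum(allocation[i][j] * f[i][j] for i in range(n_people))
            for j in range(n_goods)]

# Hypothetical productivities: rows are people, columns are goods.
f = [[10, 10],   # person 0: 10 of good 0 or 10 of good 1
     [2, 8]]     # person 1: 2 of good 0 or 8 of good 1

# Each person specializes where their ratio is favorable.
specialized = total_output(f, [[1, 0], [0, 1]])
# Same amount of good 1, but produced by the "wrong" person.
misallocated = total_output(f, [[0.2, 0.8], [1, 0]])

print("specialized: ", specialized)
print("misallocated:", misallocated)
```

Both allocations yield 8 units of good 1, but the specialized one yields 10 units of good 0 instead of 4, so the second allocation is strictly inside the frontier.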

Comment by unnamed on What's the CFAR position on how the workbook can be used? · 2020-09-12T03:11:13.934Z · LW · GW

(This is Dan from CFAR)

Yep, you're definitely free to run a reading group on the handbook.

You can basically just treat it like any other book. CFAR made the handbook as a supplement to our workshops, and we put it out there so that other people can see what's in it and make their own calls about what else to do with it.

Comment by unnamed on The Four Children of the Seder as the Simulacra Levels · 2020-09-09T00:59:02.090Z · LW · GW

I guess I'm still confused about the basics of simulacrum levels, because I'm not sure what level those sentences are on. e.g., "Please pass the potatoes" is intended to have the consequence of causing someone to pass the potatoes, rather than attempting to accurately describe the world, which (I think) matches how people have been describing level 2. But also it seems concrete and grounded, rather than involving a distortion of reality. So maybe it is level 1? Or not in the hierarchy at all?

Comment by unnamed on Escalation Outside the System · 2020-09-09T00:52:07.852Z · LW · GW

Related post by hilzoy.

Its opening section is the part that's least related, so you could skip it and begin with this part:

Back in 1983, I sat in on a conference on women and social change. There were fascinating people from all over the world, women who had been doing extraordinary things in their own countries, and who had gathered together to talk it through; and I got to be a fly on the wall.
During this conference, there was a recurring disagreement about the role of violence in fighting deeply unjust regimes.
Comment by unnamed on Can Social Dynamics Explain Conjunction Fallacy Experimental Results? · 2020-08-14T20:24:53.040Z · LW · GW

The social dynamics that you point to in your John-Linda anecdote seem to depend on the fact that John knows what happened with Linda. This suggests that these social dynamics would not apply to questions about the future, where the question was coming from someone who couldn't know what was going to happen.

Some studies have looked for the conjunction fallacy in predictions about the future, and they've found it there too. One example which was mentioned in the post that you linked is the forecast about a breakdown of US-Soviet relations. Here's a more detailed description of the study from an earlier post in that sequence:

Another experiment from Tversky and Kahneman (1983) was conducted at the Second International Congress on Forecasting in July of 1982.  The experimental subjects were 115 professional analysts, employed by industry, universities, or research institutes.  Two different experimental groups were respectively asked to rate the probability of two different statements, each group seeing only one statement:
1. "A complete suspension of diplomatic relations between the USA and the Soviet Union, sometime in 1983."
2. "A Russian invasion of Poland, and a complete suspension of diplomatic relations between the USA and the Soviet Union, sometime in 1983."
Estimates of probability were low for both statements, but significantly lower for the first group than the second (p < .01 by Mann-Whitney).  Since each experimental group only saw one statement, there is no possibility that the first group interpreted (1) to mean "suspension but no invasion".
Comment by unnamed on A Personal (Interim) COVID-19 Postmortem · 2020-06-26T20:30:50.835Z · LW · GW
It seems clear that mask wearing reduces spread somewhat, but note that this is because of reducing spread from infectious individuals, especially pre-symptomatic and asymptomatic people, not protecting mask wearers. The early skepticism was in part based on the assumption, which in March seemed to have been shared by both promoters and skeptics, that the benefits were that masks were individually protective, rather than that they helped population-level spread reduction.

The early *arguments* I saw were mainly about whether masks meaningfully reduced the wearer's chances of getting infected. But it was already conventional wisdom that masks did meaningfully reduce the wearer's chances of infecting others, people just weren't taking the next step of arguing for general mask use on these grounds. For example, the early March CDC recommendation (linked in the anti-CDC LW post) was:

CDC does not recommend that people who are well wear a facemask to protect themselves from respiratory diseases, including COVID-19.
Facemasks should be used by people who show symptoms of COVID-19 to help prevent the spread of the disease to others. The use of facemasks is also crucial for health workers and people who are taking care of someone in close settings (at home or in a health care facility).

By mid March, there were organized efforts to increase mask use on the grounds that it reduced the wearer's chances of infecting others. The Czech government (which mandated mask use on March 19) and the #Masks4All campaign were the most prominent ones that I saw - both encouraged people to make their own cloth masks and used the slogan "My mask protects you, your mask protects me" (they may also have talked about some risk-reduction benefits for the wearer). A quick search turns up this March 14 video (in Czech, with English closed captioning available) as the earliest source I could quickly find clearly making this case for widespread mask use.

Comment by unnamed on SlateStarCodex deleted because NYT wants to dox Scott · 2020-06-23T23:44:45.441Z · LW · GW

This reminds me of the time that Slate published hilzoy's real name, in 2009.

I think what happened there is that the Slate author was following journalistic customs of using real names and didn't realize that hilzoy wanted to stay pseudonymous online, and hilzoy had been even less vigilant than Scott about keeping her real name unfindable. And then once the article had been published, hilzoy's request to remove her name ran into Slate's policy of never changing published articles unless they contain a factual error, and this was not a factual error. (It's possible that the author also had some adversarial motives for publishing the name - it did happen in the context of a disagreement between her and hilzoy - but I don't know of any clear or direct evidence for that.)

So the main storyline here might be about the media having its own customs and not much caring about what happens to the people that they cover. The press does not hate you, nor does it love you, but you are made out of stories which it can tell to its audience. I'm not sure what implications (if any) this has about what to do now.

Comment by unnamed on Using the Quantified Self paradigma for COVID-19 · 2020-06-18T04:37:30.731Z · LW · GW

May 28: WVU Rockefeller Neuroscience Institute announces capability to predict COVID-19 related symptoms up to three days in advance using Oura rings

June 16: NBA restart plan includes using Oura rings to catch COVID-19 symptoms

Comment by unnamed on Everyday Lessons from High-Dimensional Optimization · 2020-06-14T21:37:59.440Z · LW · GW

The distance between the n-dimensional points (0,0,...,0) and (1,1,...,1) is sqrt(n). So if you move sqrt(n) units along that diagonal, you move 1 unit along the dimension that matters. Or if you move 1 unit along the diagonal, you move 1/sqrt(n) units along that dimension. 1/sqrt(n) efficiency.

If you instead move 1 unit in a random direction then sometimes you'll move more than that and sometimes you'll move less, but I figured that was unimportant enough on net to leave it O(1/sqrt(n)).
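The diagonal calculation above can be checked numerically with a few lines; this is just the arithmetic from the comment, with the dimension counts chosen for illustration.

```python
import math

def diagonal_step_component(n, step=1.0):
    """Movement along any single axis when taking `step` units along the
    main diagonal from (0,...,0) toward (1,...,1) in n dimensions."""
    diagonal_length = math.sqrt(n)   # distance between the two corners
    return step / diagonal_length

for n in (1, 4, 100, 10_000):
    print(f"n={n}: {diagonal_step_component(n):.4f} units along each axis")
```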

Comment by unnamed on Everyday Lessons from High-Dimensional Optimization · 2020-06-08T01:16:52.552Z · LW · GW

Seems like some changes are more like Euclidean distance while others are more like turning a single knob. If I go visit my cousin for a week and a bunch of aspects of my lifestyle shift towards his, that is more Euclidean than if I change my lifestyle by adding a new habit of jogging each morning. (Although both are in between the extremes of pure Euclidean or purely a single knob - you could think of it in terms of the dimensionality of the subspace that you're moving in.)

And something similar can apply to work habits, thinking styles, etc.

Comment by unnamed on Everyday Lessons from High-Dimensional Optimization · 2020-06-07T02:02:46.343Z · LW · GW
On the other hand, if we’re designing a bridge and there’s one particular strut which fails under load, then we’d randomly try changing hundreds or thousands of other struts before we tried changing the one which is failing.

This bridge example seems to be using a different algorithm than the e coli movement. The e coli moves in a random direction while the bridge adjustments always happen in the direction of a basis vector.

If you were altering the bridge in the same way that e coli moves, then every change to the bridge would alter that one particular strut by a little bit (in addition to altering every other aspect of the bridge).

Whereas if e coli moved in the same way that you describe the hypothetical bridge design, then it would only move purely along a single coordinate (such as from (0,0,0,0,0) to (0,0,0,1,0)) rather than in a random direction.

My intuition is that the efficiency of the bridge algorithm is O(1/n) and the e coli algorithm is O(1/sqrt(n)).

Which suggests that if you're doing randomish exploration, you should try to shake things up and move in a bunch of dimensions at once rather than just moving along a single identified dimension.
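Here is a rough simulation of that intuition. It measures the average distance moved along the one dimension that matters per unit-length step, for steps along a randomly chosen basis vector (the bridge-style algorithm) versus steps in a uniformly random direction (the e coli-style algorithm). Treating "efficiency" as this per-step movement is my simplification, and the function names are made up for the sketch.

```python
import math
import random

def random_unit_vector(n, rng):
    """Uniformly random direction in n dimensions (normalized Gaussian vector)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def mean_progress(n, trials=2000, seed=0):
    """Average absolute movement along axis 0 per unit step, returned as
    (basis_vector_steps, random_direction_steps)."""
    rng = random.Random(seed)
    key_axis = 0
    basis_total = 0.0
    random_total = 0.0
    for _ in range(trials):
        # Bridge-style: change one randomly chosen coordinate by 1 unit.
        if rng.randrange(n) == key_axis:
            basis_total += 1.0
        # E coli-style: move 1 unit in a random direction.
        random_total += abs(random_unit_vector(n, rng)[key_axis])
    return basis_total / trials, random_total / trials

for n in (4, 25, 100):
    basis, rand = mean_progress(n)
    print(f"n={n}: basis steps {basis:.3f} (1/n = {1 / n:.3f}), "
          f"random direction {rand:.3f} (1/sqrt(n) = {1 / math.sqrt(n):.3f})")
```

The basis-vector average should track 1/n, and the random-direction average should come out as a constant fraction of 1/sqrt(n), consistent with O(1/sqrt(n)).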

Comment by unnamed on What are objects that have made your life better? · 2020-05-21T22:08:56.925Z · LW · GW

Some lists that people have made of products that they use & recommend:

Sam Bowman, 2017

Sam Bowman, 2019

Robert Wiblin, 2019

Arden Koehler, 2019

Rosie Campbell, 2019

Comment by unnamed on What are objects that have made your life better? · 2020-05-21T21:46:39.500Z · LW · GW

The Time Timer Audible Countdown Timer.

This is the timer that I like to use when working, e.g. if I decide "alright, I'm going to spend the next half hour working on this thing." It is a visual timer, where the fraction of the circle that is red tells you what fraction of an hour is left. Ignore its bizarre name - its best feature is that it is completely inaudible.

Features that I like:
- it counts down silently, without any ticking
- I can (and do) set it to end silently, without any alarm sound
- it is easy to tell at a glance about how much time is left
- it is quick & straightforward to set the timer, without any button pressing
- it is a physical object rather than a program on a computing device

Features that it lacks which some people might miss:
- you can't choose a nice sound for the alarm, either it's silent or there's the one kinda annoying alarm sound
- it is not a program on your computing device, but rather a separate object you need to have with you
- it can't be set to more than an hour
- it can't be set precisely

Comment by unnamed on Book Review: Narconomics · 2020-05-03T05:13:58.398Z · LW · GW

The economic argument seems wrong in the "Burning coca leaves won’t win the war" section.

The total amount of a good that consumers buy must be less than or equal to the amount that is produced (and not destroyed). So if enough of the crop gets destroyed, then less of it will get consumed. And that'll happen regardless of whether the suppliers are in a competitive market or monopsony or threaten people with guns.

I framed this in terms of quantities rather than prices because the argument seems more straightforward this way. Also, it seems like reducing the quantity sold is more directly related to what anti-drug folks care about than raising the price. Also, the street price for US consumers would presumably go up if the availability went down, since the people who sell drugs to consumers would be able to make more profit by raising their prices.

If there are problems with the economic argument in the post, that doesn't necessarily mean the conclusion is wrong. "Burning lots of coca crops will have little to no effect on the price or quantity of cocaine in the US" does seem plausible, mainly because producers can just grow a lot more coca leaves than they need. Producers can predict in advance that lots of their crop might get destroyed (or their product lost in transit or similar), and growing coca leaves is not that expensive relative to their operation, so they can add a lot of slack by growing more than they need. (This doesn't depend on monopsony or violence.)

Comment by unnamed on The One Mistake Rule · 2020-04-14T09:08:38.033Z · LW · GW

One obviously mistaken model that I got a lot of use out of during a stretch of Feb-Mar is the one where the cumulative number of coronavirus infections in a region doubles every n days (for some globally fixed, unknown value of n).

This model has ridiculous implications if you extend it forward for a few months, as well as various other flaws. I was aware of those ridiculous implications and some of those other flaws, and used it anyways for several days before trying to find less flawed models.
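To make the "ridiculous implications" concrete, here is the model as a one-line function. The starting count of 100 infections, the 3-day doubling time, and the rough world population figure are illustrative assumptions, not estimates from the comment.

```python
def cumulative_infections(initial, doubling_days, days):
    """Cumulative infections after `days`, if the count doubles every
    `doubling_days` days without ever slowing down."""
    return initial * 2 ** (days / doubling_days)

WORLD_POPULATION = 7.8e9  # rough 2020 figure, for comparison only

for days in (7, 30, 60, 90):
    total = cumulative_infections(initial=100, doubling_days=3, days=days)
    note = " (more than everyone on Earth)" if total > WORLD_POPULATION else ""
    print(f"day {days}: {total:,.0f}{note}")
```

With these inputs the model gives plausible numbers over a few weeks and exceeds the world's population in under three months, which is the sense in which it is obviously mistaken yet still useful over short horizons.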

I'm glad that I did, since it helped me have a better grasp of the situation and be more prepared for what was coming. And I don't think it would've made much difference at the time if I'd learned more about SEIR models and so on.

It's unclear how examples like this are supposed to fit with the One Mistake Rule or the exceptions in the last paragraph.

Comment by unnamed on The One Mistake Rule · 2020-04-10T21:20:12.055Z · LW · GW

This seems important.

Another feature of competitive markets is that "not betting" is always available as a safe default option. Maybe that means waiting to bet until some unknown future date when your models are good enough, maybe it means never betting in that market. In many other contexts (like responding to covid-19) there is no safe default option.

Comment by unnamed on Has LessWrong been a good early alarm bell for the pandemic? · 2020-04-04T03:55:49.340Z · LW · GW

In the broader rationality/EA community there was also a Siderea post on Jan 30 and an 80K podcast on Feb 3 (along with a followup podcast on Feb 14).

These two, plus Matthew Barnett's late Jan EA Forum post (which you linked), are the three examples I recall which look most like early visible public alarms from the rationality/EA community.

Other writing was less visible (e.g., on Twitter, Facebook, or Metaculus), less alarm-like (discussions of some aspect of what was happening rather than a call to attention), or later (like the putanumonit Seeing the Smoke post on Feb 27).

Comment by unnamed on Has LessWrong been a good early alarm bell for the pandemic? · 2020-04-03T22:34:03.635Z · LW · GW

I think this post is giving the stock market too much credit.

I'd date the start of the stock market fall as February 24 rather than February 20. The S&P close on Feb 20 & Feb 21 was roughly the same as it had been over the previous couple weeks, and higher than the close on Feb 7, 5, 4, or 3. The first notable dip happened on February 24th; that was the first day that set a low for the month of Feb 2020 (and Feb 25 was the first day that set a low for calendar year 2020).

Also, that was just the start of the crash. The stock market continued falling sharply and erratically for a couple more weeks, and didn't get within 10% of its current level until March 12th (2.5 weeks after it started its fall on Feb 24).

Comment by unnamed on April Fools: Announcing LessWrong 3.0 – Now in VR! · 2020-04-01T08:50:51.188Z · LW · GW

This is now my favorite way to read HPMOR. I love the Star Wars feel.