## Posts

Multitudinous outside views 2020-08-18T06:21:47.566Z
Update more slowly! 2020-07-13T07:10:50.164Z
A Personal (Interim) COVID-19 Postmortem 2020-06-25T18:10:40.885Z
Market-shaping approaches to accelerate COVID-19 response: a role for option-based guarantees? 2020-04-27T22:43:26.034Z
Potential High-Leverage and Inexpensive Mitigations (which are still feasible) for Pandemics 2020-03-09T06:59:19.610Z
Ineffective Response to COVID-19 and Risk Compensation 2020-03-08T09:21:55.888Z
Updating a Complex Mental Model - An Applied Election Odds Example 2019-11-28T09:29:56.753Z
Theater Tickets, Sleeping Pills, and the Idiosyncrasies of Delegated Risk Management 2019-10-30T10:33:16.240Z
Divergence on Evidence Due to Differing Priors - A Political Case Study 2019-09-16T11:01:11.341Z
Hackable Rewards as a Safety Valve? 2019-09-10T10:33:40.238Z
What Programming Language Characteristics Would Allow Provably Safe AI? 2019-08-28T10:46:32.643Z
Schelling Fences versus Marginal Thinking 2019-05-22T10:22:32.213Z
Values Weren't Complex, Once. 2018-11-25T09:17:02.207Z
Oversight of Unsafe Systems via Dynamic Safety Envelopes 2018-11-23T08:37:30.401Z
Collaboration-by-Design versus Emergent Collaboration 2018-11-18T07:22:16.340Z
Multi-Agent Overoptimization, and Embedded Agent World Models 2018-11-08T20:33:00.499Z
Policy Beats Morality 2018-10-17T06:39:40.398Z
(Some?) Possible Multi-Agent Goodhart Interactions 2018-09-22T17:48:22.356Z
Lotuses and Loot Boxes 2018-05-17T00:21:12.583Z
Non-Adversarial Goodhart and AI Risks 2018-03-27T01:39:30.539Z
Evidence as Rhetoric — Normative or Positive? 2017-12-06T17:38:05.033Z
A Short Explanation of Blame and Causation 2017-09-18T17:43:34.571Z
Prescientific Organizational Theory (Ribbonfarm) 2017-02-22T23:00:41.273Z
A Quick Confidence Heuristic; Implicitly Leveraging "The Wisdom of Crowds" 2017-02-10T00:54:41.394Z
A Cruciverbalist’s Introduction to Bayesian reasoning 2017-01-12T20:43:48.928Z
Map:Territory::Uncertainty::Randomness – but that doesn’t matter, value of information does. 2016-01-22T19:12:17.946Z
Meetup : Finding Effective Altruism with Biased Inputs on Options - LA Rationality Weekly Meetup 2016-01-14T05:31:20.472Z
Perceptual Entropy and Frozen Estimates 2015-06-03T19:27:31.074Z
Meetup : Group Decision Making (the good, the bad, and the confusion of welfare economics) 2013-04-30T16:18:04.955Z

Comment by davidmanheim on Eight claims about multi-agent AGI safety · 2021-01-13T18:17:20.373Z · LW · GW

My point was that deception will almost certainly outperform honesty/cooperation when AI is interacting with humans, and on reflection, it seems likely to do so even when interacting with other AIs by default, because there is no group selection pressure.

Comment by davidmanheim on A vastly faster vaccine rollout · 2021-01-13T16:08:52.154Z · LW · GW

One key limitation for vaccines is supply, as others have noted. That certainly doesn't explain everything, but it does explain a lot.

This obstacle was, of course, completely foreseeable, and we proposed a simple way to deal with the problem, which we presented to policymakers and even posted on LessWrong, by the end of April.

Thus begins our story.

Unfortunately, we couldn't get UK policymakers on board when we discussed it, and the US was doing "warp speed" and Congress wasn't going to allocate money for a new idea.

We were told that in general policymakers wanted an idea published / peer reviewed before they'd take the idea more seriously, so we submitted a paper. At this point, as a bonus, Preprints.org refused to put the preprint online. (No, really. And they wouldn't explain.)

We submitted it as a paper to Vaccine on May 20th, and they sent it for review. We got it back mid-June, did revisions, and resubmitted in early July; then the journal changed its mind and said "your paper does not appear to conduct original research, thus it does not fit the criteria." After we emailed to ask what they were doing, they relented and said we could cut the length in half and re-submit it as an opinion piece.

We went elsewhere, to a newer, open access, non-blinded review journal, and it was finally online in October, fully published: https://f1000research.com/articles/9-1154

Comment by davidmanheim on A vastly faster vaccine rollout · 2021-01-13T15:52:58.580Z · LW · GW

Which is why we proposed exactly that, at the end of April: https://www.lesswrong.com/posts/uXb4gcDP2fgBPcMHJ/market-shaping-approaches-to-accelerate-covid-19-response-a

Now published: https://f1000research.com/articles/9-1154

Unfortunately, we couldn't get policymakers on board.

Comment by davidmanheim on The Case for a Journal of AI Alignment · 2021-01-10T07:42:36.871Z · LW · GW

In the spirit of open peer review, here are a few thoughts:

First, overall, I was convinced during earlier discussions that this is a bad idea - not because of costs, but because the idea lacks real benefits, and itself will not serve the necessary functions. Also see this earlier proposal (with no comments). There are already outlets that allow robust peer review, and the field is not well served by moving away from the current CS / ML dynamic of arXiv papers and presentations at conferences, which allow for more rapid iteration and collaboration / building on work than traditional journals - which are often a year or more out of date as of when they appear. However, if this were done, I would strongly suggest doing it as an arXiv overlay journal, rather than a traditional structure.

One key drawback you didn't note is that allowing AI safety further insulation from mainstream AI work could further isolate it. It also likely makes it harder for AI-safety researchers to have mainstream academic careers, since narrow journals don't help on most of the academic prestige metrics.

Two more minor disagreements. First, about the claim that "If JAA existed, it would be a great place to send someone who wanted a general overview of the field" - I would disagree; in-field journals are rarely as good a source as textbooks or non-technical overviews. Second, the idea that a journal would provide deeper, more specific, and better review than Alignment Forum discussions and current informal discussions seems farfetched, given my experience publishing in journals specific to a narrow area, like Health Security, compared to my experience getting feedback on AI safety ideas.

Comment by davidmanheim on Eight claims about multi-agent AGI safety · 2021-01-10T07:30:58.260Z · LW · GW

"Honesty, too, arose that way. So I'm not sure whether (say) a system trained to answer questions in such a way that the humans watching it give reward would be more or less likely to be deceptive."

I think it is mistaken. (Or perhaps I don't understand a key claim / assumption.)

Honesty evolved as a group dynamic, where it was beneficial for the group to have ways for individuals to honestly commit, or make lying expensive in some way. That cooperative pressure dynamic does not exist when a single agent is "evolving" on its own in an effectively static environment of humans. It does exist in a co-evolutionary multi-agent dynamic - so there is at least some reason for optimism within a multi-agent group, rather than between computational agents and humans - but the conditions for cooperation versus competition seem at least somewhat fragile.

Comment by davidmanheim on Eight claims about multi-agent AGI safety · 2021-01-10T07:24:22.651Z · LW · GW

Strongly agree that it's unclear that these failures would be detected.
For discussion and examples, see my paper here: https://www.mdpi.com/2504-2289/3/2/21/htm

Comment by davidmanheim on Eight claims about multi-agent AGI safety · 2021-01-10T07:21:59.095Z · LW · GW

Another possible argument is that we can't tell when multiple AIs are failing or subverting each other.
Each agent pursuing its own goals in a multi-agent environment is intrinsically manipulative, and when agents are manipulating one another, it happens in ways that we do not know how to detect or consider. This is somewhat different from when they manipulate humans, where we have a clear idea of what does and does not qualify as harmful manipulation.

Comment by davidmanheim on Vanessa Kosoy's Shortform · 2021-01-05T07:58:53.365Z · LW · GW

re: #5, that doesn't seem to claim that we can infer U given their actions, which is what the impossibility of deducing preferences is actually claiming. That is, assuming 5, we still cannot show that there isn't some  such that .

(And as pointed out elsewhere, it isn't Stuart's thesis, it's a well known and basic result in the decision theory / economics / philosophy literature.)

Comment by davidmanheim on Reason as memetic immune disorder · 2021-01-03T09:12:42.859Z · LW · GW

Those aren't actually how Orthodox Jews interpret the rules, or apply them nowadays. Tassels are only on very specific articles of clothing, which are hidden under people's shirts; I'm not even sure what "tying money to yourself" is about; adulterers are only stoned if the Temple stands, and only under nearly-impossible-to-satisfy conditions; trees less than five years old are only considered a biblical problem in Israel, and if you're unsure, the fruit is allowed in the rest of the world; and the ritual purity laws don't apply in general, because everyone is assumed to be contaminated anyways.

Comment by davidmanheim on What evidence will tell us about the new strain? How are you updating? · 2020-12-30T08:39:42.019Z · LW · GW

'Has been spotted in' isn't much to work with.

Agreed, but note that the US just found its first case, and it is community-acquired, plus we aren't doing anything to stop importation, so I'm assuming it's everywhere already, and just starting the exponential phase.

(Note that I cannot find good public data for spread within the UK, which would be the key way to update about the strain.)

Comment by davidmanheim on Unconscious Economics · 2020-12-25T08:37:21.798Z · LW · GW

I'll note that in my policy PhD program, the points made in the post were well-understood, and even discussed briefly as background in econ classes - despite the fact that we covered a lot of the same econ material you did. That mostly goes to show the difference between policy and "pure" econ, but still.

Comment by davidmanheim on A Personal (Interim) COVID-19 Postmortem · 2020-12-14T10:47:03.741Z · LW · GW

First, in that comment, I wasn't arguing that quarantines aren't helpful. I said that the parentheses make the claim false; "In epidemiology it is a basic fact in the 101 textbook that slowing long distance transmission (using quarantines / travel restrictions) is very important." You seem to agree that this is the received wisdom.

And I agree that we should have done border closures earlier, but I would note that the simple counterfactual world, where people in general ignore epidemiologists more often, is far worse than our world in many ways. I think a world where border closures could be done at the drop of a hat would be worse in other ways as well. You can argue, correctly, that only doing closures when actually necessary is better, but I don't think breaking down the norm of not banning travel would be a net benefit. (See: Chesterton's fence, and for a concrete example, see China's ongoing internal and external travel restrictions, and how that enables concentration camps in Xinjiang.)

"In my view I have downgraded the profession's scientific credibility"

I agree with you that the current failure should make you downgrade your opinion of experts somewhat. But see above about what I think of ignoring epidemiologists more often in general.

"it seems foregone conclusion that if one has limited test&trace capability, limiting introduction of new infectious cases will be helpful for the available capacity to contain new clusters"

Agreed, but there was no reason to have limited test and trace resources. More recent articles confirm that we could have done symptomatic tracing - loss of smell, coughing, etc - and isolation of just those cases, and shut down transmission completely without any testing. Shutting down borders helps, especially without sufficient tests, but it should not have been needed.

"Other mind-boggling decisions by epidemiological elite here in Finland..."

I can't comment on Finland specifically, but think that your local elite was probably less unanimous at the time, and the international consensus was different as well.

"if we think clusters have become an uncontrolled epidemic, we will just cease all tracing and other similar efforts"

Yes, if spread grows too large, tracing + quarantines is in fact not worthwhile, and shutdowns will be cheaper. (You can play with a basic DE model and put costs on tracing to convince yourself why this is true.)
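A toy model makes this concrete (my own illustration, not the actual analysis - the parameters and the linear cost-of-tracing assumption are made up): a discrete-time SIR-style simulation where tracing blocks a fraction of transmission, but its workload scales with the number of new cases, while a shutdown has a roughly fixed daily cost.

```python
# Toy discrete-time SIR with contact tracing (illustrative parameters only).
# Tracing blocks a fraction of transmission, but the tracing workload is
# proportional to new cases - so at high prevalence, a shutdown with a
# fixed daily cost becomes the cheaper option.

def simulate(days, beta, gamma, trace_eff, pop, i0):
    s, i = pop - i0, float(i0)
    total_traces = 0.0
    for _ in range(days):
        new_inf = beta * s * i / pop * (1 - trace_eff)
        s -= new_inf
        i += new_inf - gamma * i
        total_traces += new_inf  # each new case triggers a round of tracing
    return total_traces

# Same disease, same tracing program, different starting prevalence:
low = simulate(days=60, beta=0.3, gamma=0.1, trace_eff=0.5, pop=1_000_000, i0=100)
high = simulate(days=60, beta=0.3, gamma=0.1, trace_eff=0.5, pop=1_000_000, i0=100_000)
# Tracing workload (and thus cost) scales with incidence, while a
# shutdown's daily cost is roughly fixed - hence the crossover.
```

Running both scenarios shows the tracing workload at high prevalence dwarfing the low-prevalence one, which is the crossover the comment points at.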

And yes, removing all restrictions does lead to a rebound and worse spread later. Just look at the US.

Comment by davidmanheim on Real-Life Examples of Prediction Systems Interfering with the Real World (Predict-O-Matic Problems) · 2020-12-14T09:11:51.997Z · LW · GW

That's a good point. For some of the questions, that's a reasonable criticism, but as GJ Inc. becomes increasingly based on client-driven questions, it's a less viable strategy.

Comment by davidmanheim on Real-Life Examples of Prediction Systems Interfering with the Real World (Predict-O-Matic Problems) · 2020-12-14T09:10:29.986Z · LW · GW

Yes - GJO isn't actually quite doing superforecasting as the book describes - for example, it's not team-based.

Comment by davidmanheim on Market-shaping approaches to accelerate COVID-19 response: a role for option-based guarantees? · 2020-12-14T08:52:13.947Z · LW · GW

See the original twitter thread for a slightly earlier version of this idea, a video presentation to Foresight on the topic, as well as their extensive report that discussed that presentation, and the new paper for a more polished version.

Comment by davidmanheim on Real-Life Examples of Prediction Systems Interfering with the Real World (Predict-O-Matic Problems) · 2020-12-08T09:19:15.660Z · LW · GW

Thankfully, this scam is far less viable now that people can google the writers of these predictions.

And there was always the simple defense of not trusting stock picks from people who aren't very wealthy themselves and aren't already managing other people's money successfully in public view.

Comment by davidmanheim on Real-Life Examples of Prediction Systems Interfering with the Real World (Predict-O-Matic Problems) · 2020-12-08T09:16:35.385Z · LW · GW

I think this is exactly what most pundits do, and it's well known that correct predictions are reputation makers.

The problem is that making more than one correct but still low-probability prediction is incredibly unlikely, since you multiply two small numbers. This functions as a very strong filter. And you don't need to carefully vet track records to see when someone loudly gets it wrong, so as we see, most pundits stop making clear and non-consensus predictions once they start making money as pundits.
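As a back-of-envelope illustration of that filter (the 5% hit rate and the pundit count are made-up numbers):

```python
# Multiplying two small probabilities: the chance of being right on two
# independent long-shot calls collapses quickly.
p_single = 0.05                      # one contrarian prediction pays off
p_twice = p_single ** 2              # both pay off: 0.25%

# Even among many loud pundits, very few survive two rounds by luck alone:
pundits = 1_000
expected_two_time_winners = pundits * p_twice   # about 2.5
```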

Comment by davidmanheim on Real-Life Examples of Prediction Systems Interfering with the Real World (Predict-O-Matic Problems) · 2020-12-08T09:13:07.252Z · LW · GW

Thanks - this is super-helpful! And after looking briefly, a citation for the above example is here.

Comment by davidmanheim on Real-Life Examples of Prediction Systems Interfering with the Real World (Predict-O-Matic Problems) · 2020-12-07T14:24:38.023Z · LW · GW

Another possible example:

If we view markets as prediction systems, there is a great example of a self-fulfilling prophecy in the form of the Black-Scholes option pricing model. Before its publication, option prices were quite random and could be almost anywhere. Once a (supposedly) normative model for prices was available, people's willingness to trade converged to those prices fairly quickly.

(This simplifies slightly, because part of the B-S model was arbitrage, which allowed markets to reinforce these "correct" prices, but it's a useful example of when a prediction can stabilize the system.)
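For concreteness, here is the standard Black-Scholes call-price formula the example refers to (a textbook sketch; the input values below are arbitrary):

```python
from math import exp, log, sqrt
from statistics import NormalDist

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call option."""
    N = NormalDist().cdf
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * N(d1) - K * exp(-r * T) * N(d2)

# Once most traders quote from the same formula, disagreement - and with
# it, price dispersion - shrinks toward a single model price:
price = bs_call(S=100, K=100, T=1.0, r=0.05, sigma=0.2)  # ~10.45
```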

Comment by davidmanheim on Real-Life Examples of Prediction Systems Interfering with the Real World (Predict-O-Matic Problems) · 2020-12-07T14:20:29.459Z · LW · GW

"Superforecasters learning to choose easier questions"

Just wanted to note that it's not easier questions, per se, it's ones where you have a marginal advantage due to information or skill asymmetry. And because it's a competition, at least sometimes, you have an incentive to predict on questions that are being ignored as well. There are definitely fewer people forecasting more intrinsically uncertain questions, but since participants get scored with the superforecaster median for questions they don't answer, that's a resource allocation question, rather than the system interfering with the real world. We see this happening broadly when prediction scoring systems don't match incentives, but I've discussed that elsewhere, and there was a recent LW post on the point as well.

Mostly, this type of interference runs from real-world goals to predictions, rather than the reverse. We do see some interference in prediction markets aimed at changing real-world outcomes in the first half of the 20th century: "The newspapers periodically contained charges that the partisans were manipulating the reported betting odds to create a bandwagon effect." (Rhode and Strumpf, 2003)

Comment by davidmanheim on Anti-EMH Evidence (and a plea for help) · 2020-12-07T13:21:22.863Z · LW · GW

Two securities (symbols to come later as this is still being actively traded) are supposed to give the same dividend stream. The company's official website states that they are meant to be economically equivalent.

I don't know exactly what is happening here, and this seems like a strange situation - but there are cases where different securities with identical rights to dividends have different value because they have different voting rights or similar, with implications for pricing if there are rumors of corporate takeovers, for example. Similarly, the fact that they are intended to be equivalent isn't necessarily a binding requirement, and I imagine that  if large investors decide to preferentially buy one class, they could push for changes.

Comment by davidmanheim on Moloch Hasn’t Won · 2020-12-02T12:46:27.926Z · LW · GW

I have repeatedly thought back to and referenced this series of posts, which improved my mental models for how people engage within corporations.

Comment by davidmanheim on Prize: Interesting Examples of Evaluations · 2020-11-30T05:22:09.348Z · LW · GW

One key factor in metrics is how the number relates to the meaning. We'd prefer metrics that have scales which are meaningful to the users, not arbitrary. I really liked one example I saw recently.

In discussing this point in a paper entitled "Arbitrary metrics in psychology," Blanton and Jaccard (doi:10.1037/0003-066X.61.1.27) first point out that Likert scales are not so useful. They then discuss the (in)famous IAT test, where the scale is a direct measurement of the quantity of interest, but note that: "The metric of milliseconds, however, is arbitrary when it is used to measure the magnitude of an attitudinal preference." Therefore, when thinking about degree of racial bias, "researchers and practitioners should refrain from making such diagnoses until the metric of the IAT can be made less arbitrary and until a compelling empirical case can be made for the diagnostic criteria used." They go on to discuss norming measures and looking at variance - but the base measure being used is not meaningful, so any transformation is of dubious value.

Going beyond that paper, looking at the broader literature on biases, we can come up with harder-to-measure but more meaningful measures of bias. Using the probability of hiring someone based on racially-coded names might be a more meaningful indicator - but probability is also not a clear indicator, and the use of names as a proxy obscures some key details about whether the measurement is class-based versus racial. It's also not clear how big an effect a difference in probability makes, despite being directly meaningful.

A very directly meaningful measure of bias that is even easier to interpret is dollars. It is immediately meaningful: if a person pays a different amount for identical service, that is a meaningful indicator of not only the existence but the magnitude of a bias. Of course, evidence of pay differentials is a very indirect and complex question, but there are better ways of getting the same information in less problematic contexts. Evidence can still be direct: how much someone bids for watches photographed on a black or a white person's wrist is a much more direct and useful way to understand how much bias is being displayed.

Comment by davidmanheim on Should we postpone AGI until we reach safety? · 2020-11-24T21:20:20.837Z · LW · GW

The idea that most people who can't do technical AI alignment are therefore able to do effective work in public policy or motivating public change seems unsupported by anything you've said. And a key problem with "raising awareness" as a method of risk reduction is that it's rife with infohazard concerns. For example, if we're really worried about a country seizing a decisive strategic advantage via AGI, that indicates that countries should be much more motivated to pursue AGI.

And I don't think that within the realm of international agreements and pursuit of AI regulation, postponement is neglected, at least relative to tractability, and policy for AI regulation is certainly an area of active research.

Comment by davidmanheim on AGI Predictions · 2020-11-22T20:45:56.133Z · LW · GW

"Will > 50% of AGI researchers agree with safety concerns by 2030?"

From my research, I think they mostly already do, they just use different framings, and care about different time frames.

Comment by davidmanheim on Embedded Interactive Predictions on LessWrong · 2020-11-22T20:40:43.216Z · LW · GW

Strong +1 to this suggestion, at least as an option that people can set.

Comment by davidmanheim on Should we postpone AGI until we reach safety? · 2020-11-22T20:37:49.413Z · LW · GW

I don't think this type of comment is appropriate or needed. (It was funny, but still not a good thing to post.)

Comment by davidmanheim on Should we postpone AGI until we reach safety? · 2020-11-22T20:34:21.145Z · LW · GW

On the first argument, I replied that I think a non-AGI safety group could do this, and therefore not hurt the principally unrelated AGI safety efforts. Such a group could even call for reduction of existential risk in general, further decoupling the two efforts.

It sounds like you are suggesting that someone somewhere should do this. Who, and how? Until a specific idea is put forward, all I can say is that pausing AGI would be good, since misaligned AGI would be bad. I don't know how you'd do it, but choosing between the two worlds, the one without misaligned AGI seems likely to be better.

But in my mind, the proposal falls apart as soon as you ask who this group is, and whether this hypothetical group has any leverage or any arguments that would convince people who are not already convinced. If the answer is yes, why do we need this new group to do this, and would we be better off using this leverage to increase the resources and effort put into AI safety?

Comment by davidmanheim on Incentive Problems With Current Forecasting Competitions. · 2020-11-11T10:14:42.316Z · LW · GW

This is tractable in sports because there are millions of dollars on the line for each player. In most contexts, negotiating and running a market for talent doesn't work as well, and it's better to use simple metrics despite all the very important problems with poorly aligned metrics. (Of course, the better solution is to design better metrics; https://mpra.ub.uni-muenchen.de/98288/ )

Comment by davidmanheim on Incentive Problems With Current Forecasting Competitions. · 2020-11-11T10:10:09.821Z · LW · GW

This is great, and it deals with a few points I didn't, but here's my tweetstorm from the beginning of last year about the distortion of scoring rules alone:

If you're interested in probability scoring rules, here's a somewhat technical and nit-picking tweetstorm about why proper scoring for predictions and supposedly "incentive compatible" scoring systems often aren't actually a good idea.

First, some background. Scoring rules are how we "score" predictions - decide how good they are. Proper scoring rules are ones where a predictor's score is maximized when they give their true best guess. Wikipedia explains; en.wikipedia.org/wiki/Scoring_r…

A typical improper scoring rule is the "better side of even" rule, where every time your highest probability is assigned to the actual outcome, you get credit. In that case, people have no reason to report probabilities correctly - just pick a most likely outcome and say 100%.

There are many proper scoring rules. Examples include logarithmic scoring, where your score is the log of the probability assigned to the correct answer, and Brier score, which is the mean squared error. de Finetti et al. lay out the details here; link.springer.com/chapter/10.100…
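To make the contrast concrete, here is a minimal sketch (my own illustration, not from the thread) of the improper "better side of even" rule next to the two proper rules, comparing the expected score of an honest 70% report against an exaggerated 99% report:

```python
from math import log

def better_side_of_even(p, outcome):
    # Improper: full credit whenever your >50% side occurs.
    return 1.0 if (p >= 0.5) == bool(outcome) else 0.0

def log_score(p, outcome):
    # Proper: log of the probability assigned to what happened (higher is better).
    return log(p if outcome else 1.0 - p)

def neg_brier(p, outcome):
    # Proper: negative squared error (higher is better).
    return -(p - outcome) ** 2

def expected(rule, reported, true_p):
    """Expected score of reporting `reported` when your true belief is `true_p`."""
    return true_p * rule(reported, 1) + (1.0 - true_p) * rule(reported, 0)

honest, exaggerated, belief = 0.7, 0.99, 0.7
# Under the improper rule, exaggerating to near-certainty costs nothing...
same = expected(better_side_of_even, honest, belief) == expected(better_side_of_even, exaggerated, belief)
# ...while both proper rules strictly reward the honest report.
```

This is exactly the "just pick a most likely outcome and say 100%" failure: the improper rule is flat in the reported probability on each side of 50%, while the proper rules peak at the true belief.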

These scoring rules are all fine as long as people's ONLY incentive is to get a good score.

In fact, in situations where we use quantitative rules, this is rarely the case. Simple scoring rules don't account for this problem. So what kind of misaligned incentives exist?

Bad places to use proper scoring rules #1 - In many forecasting applications, like tournaments, there is a prestige factor in doing well without a corresponding penalty for doing badly. In that case, proper scoring rules incentivise "risk taking" in predictions, not honesty.

Bad places to use proper scoring rules #2 - In machine learning, scoring rules are used for training models that make probabilistic predictions. If predictions are then used to make decisions that have asymmetric payoffs for different types of mistakes, it's misaligned.

Bad places to use proper scoring rules #3 - Any time you want the forecasters to have the option to say "answer unknown." If this is important - and it usually is - proper scoring rules can disincentivise or over-incentivise not guessing, depending on how that option is treated.

Using a metric that isn't aligned with incentives is bad. (If you want to hear more, follow me. I can't shut up about it.)

Carvalho discusses how proper scoring is misused; https://viterbi-web.usc.edu/~shaddin/cs699fa17/docs/Carvalho16.pdf

Anyways, this paper shows a bit of how to do better; https://pubsonline.informs.org/doi/abs/10.1287/deca.1110.0216

Fin.

Comment by davidmanheim on Industrial literacy · 2020-10-06T20:45:55.572Z · LW · GW

That all makes sense - I'm less certain that there is a reachable global maximum that is a Pareto improvement in terms of inputs over the current system. That is, I expect any improvement to require more of some critical resource - human time, capital investment, or land.

Comment by davidmanheim on Industrial literacy · 2020-10-05T09:49:46.464Z · LW · GW

No, the claim as written is true - agriculture will ruin soil over time, which has happened in recent scientific memory in certain places in Africa. And if you look at the biblical description of parts of the middle east, it's clear that desertification had taken a tremendous toll over the past couple thousand years. That's not because of fertilizer usage, it's because agriculture is about extracting food and moving it elsewhere, usually interrupting the cycle of nutrients, which happens organically otherwise. Obviously, natural habitats don't do this in the same way, because the varieties of plants shift over time, fauna is involved, etc.

Comment by davidmanheim on Industrial literacy · 2020-10-05T09:42:20.746Z · LW · GW

Yes, in the modern world, where babies are seen as precious, that is true. It clearly wasn't as big a deal when infant mortality was very high.

Comment by davidmanheim on Industrial literacy · 2020-10-05T09:40:12.636Z · LW · GW

This is disingenuous, I think. Of course they don't exist at the necessary scale yet, because the market is small. If the market grew, and was profitable, scaling would be possible. Rare earths aren't rare enough to be a real constraint, we'd just need to mine more of them.  The only thing needed would be to make more of things we know how to make. (And no, that wouldn't happen, because the new tech being developed would get developed far faster, and used instead.)

Comment by davidmanheim on Industrial literacy · 2020-10-05T09:35:34.758Z · LW · GW

This isn't critiquing the claim, though. Yes, there are alternatives that are available, but those alternatives - multi-cropping, integrating livestock, etc. are more labor intensive, and will produce less over the short term. And I'm very skeptical that the maximum is only local - have you found evidence that you can use land more efficiently, while keeping labor minimal, and produce more? Because if you did, there's a multi-billion dollar market for doing that. Does that make the alternatives useless, or bad ideas? Not at all, and I agree that changes are likely necessary for long-term stability - unless other technological advances obviate the need for them. But we can't pretend that staying at the maximum isn't effectively necessary.

Comment by davidmanheim on Expansive translations: considerations and possibilities · 2020-10-05T04:51:19.140Z · LW · GW

Agreed - and this reminds me of the observation that all of physics is contained in a single pebble: with enough understanding, you could infer all of physics from close observation of quantum effects, find gravity at a very small scale if you had sensitive enough instruments, know much of natural history (like the fact that Earth has running water that made the stone smooth), and know that it must be in a universe more than a certain age, given its composition. With enough detail, any facet of a story requires effectively unlimited detail to fully understand.

And that makes it clear that we don't intend for every translation to be of unlimited depth - but the depth of the translation matters, and we trade off between depth of translation and accuracy. Translating Sherbet Lemon as Lemon Sorbet probably reflects a lack of understanding and an overly direct, literal-but-incorrect rendering, while translating it as Krembo might be a reasonable choice because of the context, but is not at all a literal translation.

Comment by davidmanheim on Expansive translations: considerations and possibilities · 2020-09-30T11:22:31.534Z · LW · GW

As the post notes, inferential distance relates to differing worldviews and life experiences. This was written to an audience that mostly understands what inferential distance has to do with different worldviews - how would you explain it to a different audience?

Well, a typical translation doesn't try to bridge the gap between languages, it just picks something on the far side of the gap that seems similar to the one on the near side. But that leaves something out.

An example of this is in translations of Harry Potter, where Dumbledore's password is translated into a local sweet. The UK version has "Sherbet Lemon" while the US version has "Lemon Drop." Are these the same? I assumed so, but actually it seems the UK version has a "fizzy sweet powder" on the inside. In Danish and Swedish, it's (mis?)translated as lemon ice cream - which isn't the same at all. And in Hebrew, it's translated as Krembo, which doesn't even get close to translating the meaning correctly - it's supposed to be an "equivalent children's dessert" - but the translation simply doesn't work, because you can't carry a Krembo around in your pocket, since it would melt. Does this matter? Well, there's a difference between a kindly old wizard who carries around a sucking candy and one who carries around a kind-of-big marshmallow dessert. But that's beside the point - you don't translate the life experience that growing up eating sherbet lemons gives you, you find an analogue.

The only way to translate a specific word or term accurately may be to provide so much background that the original point is buried, and the only way to translate an idea is to find an analogue that the reader already understands. And that's why translation is impossible - but we do it anyways, and just accept that the results are fuzzy equivalents, and that worldviews are different enough that fully bridging the gap is impossible.

Comment by davidmanheim on Puzzle Games · 2020-09-29T13:16:28.349Z · LW · GW

Tier 3, I think: Hoplite, on Android.

The free game is basically a roguelike, but with full information on each level and only a little bit of strategy in choosing which abilities to pick. The Challenge mode, available in the paid version for $3, has a lot more straight puzzles.

Comment by davidmanheim on Stripping Away the Protections · 2020-09-23T21:17:17.157Z · LW · GW

To what extent are these dynamics the inevitable result of large organizations? I want to note that I've previously argued that much of this dynamic is significantly forced by structure - but not in this context, and I'm thinking about how much or little of that argument applies here. (I'll need to see what you say in later posts in the series.)

Comment by davidmanheim on Needed: AI infohazard policy · 2020-09-22T15:54:52.723Z · LW · GW

I think there needs to be individual decisionmaking (on the part of both organizations and individual researchers, especially in light of the unilateralists' curse), alongside a much broader discussion about how the world should handle unsafe machine learning and more advanced AI.

I very much don't think it would be wise for the AI safety community to debate and come up with shared, semi-public guidelines for, essentially, what to withhold from the broader public, without input from the wider ML / AI research community, who are impacted and whose work is a big part of what we are discussing. That community needs to be engaged in any such discussions.

Comment by davidmanheim on Needed: AI infohazard policy · 2020-09-22T04:40:09.201Z · LW · GW

There are some intermediate options available instead of just "full secret" or "full publish" - and I haven't seen anyone mention that. OpenAI's phased release of GPT-2 seems like a clear example of exactly this. And there is a forthcoming paper looking at the internal deliberations around this from Toby Shevlane, in addition to his extant work on the question of how disclosure potentially affects misuse.

Comment by davidmanheim on Needed: AI infohazard policy · 2020-09-22T04:34:24.546Z · LW · GW

The first thing I would note is that stakeholders need to be involved in making any guidelines, and that pushing for guidelines from the outside is unhelpful, if not harmful, since it pushes participants to be defensive about their work.

There is also an extensive literature discussing the general issue of information dissemination hazards and the issues of regulation in other domains, such as nuclear weapons technology, biological and chemical weapons, and similar. There is also a fair amount of ongoing work synthesizing this literature and its implications for AI. Some of it is even on this site. For example, see: https://www.lesswrong.com/posts/RY9XYoqPeMc8W8zbH/mapping-downside-risks-and-information-hazards and https://www.lesswrong.com/posts/6ur8vDX6ApAXrRN3t/information-hazards-why-you-should-care-and-what-you-can-do

So there is tons of discussion about this already, and there is plenty you should read on the topic - I suspect you can start with the paper that provided the name for your post, and continue with sections of GovAI's research agenda.

Comment by davidmanheim on Are aircraft carriers super vulnerable in a modern war? · 2020-09-21T10:58:02.220Z · LW · GW

Noting that this is correct, but incomplete. Carriers are very important for force projection even in near-peer engagements, since otherwise you likely can't get your planes to where you need them. The question that matters here is who wins the area-denial / anti-aircraft battle - i.e., can drones and similar systems get close enough to sink anything? This is the critical question anyway, since your carriers and planes are useless if you can't get close enough. And this isn't my area, but my very preliminary impression is that AA/AD makes aerial combat fairly defense-dominant.

Comment by davidmanheim on What's the best overview of common Micromorts? · 2020-09-04T08:57:32.990Z · LW · GW

"Could someone write a LW style book review of the Norm Chronicles?"

Endorsed.

Comment by davidmanheim on Reflections on AI Timelines Forecasting Thread · 2020-09-03T06:04:37.643Z · LW · GW

"A good next step would be to create more consensus on the most productive interpretation for AGI timeline predictions."

Strongly agree with this. I don't think the numbers are meaningful, since AGI could mean anything from "a CAIS system-of-systems that can be used to replace most menial jobs with greater than 50% success," to "a system that can do any one of the majority of current jobs given an economically viable (<$10m) amount of direct training and supervision," to "a system that can do everything any human is able to do at least as well as that human, based only on available data and observation, without any direct training or feedback, for no marginal cost."

Comment by davidmanheim on What's the best overview of common Micromorts? · 2020-09-03T05:49:34.223Z · LW · GW

Scott's answer is a good one - you should read "The Norm Chronicles." But I think the question has a problem. Micromorts are a time-agnostic measure of dying, and the problem is that most risks you take don't actually translate well into micromorts.

Smoking a cigarette, which reduces your life expectancy by around 30 seconds, translates into either zero micromorts or one, depending on how you set up the question. Increasing your relative risk of dying from cancer in 30 years isn't really the same as playing Russian roulette with a 1-million-chamber gun. Similarly, a healthy 25-year-old getting COVID faces about a 70-micromort risk from direct COVID mortality. But that number ignores the risks of chronic fatigue, later complications, or reduced life expectancy (all of which we simply don't know enough to quantify well).
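The cigarette ambiguity can be made concrete with a rough sketch. A micromort is a one-in-a-million chance of death, so converting a life-expectancy loss into micromorts requires choosing a convention, and picking a remaining lifespan. The numbers below (other than the 30-second figure above) are my own illustrative assumptions:

```python
# Illustrative sketch: the micromort value of the same cigarette depends
# entirely on the conversion convention and on whose life it is.

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def micromorts_from_acute_risk(p_death):
    """Direct conversion: a probability of immediate death, in micromorts."""
    return p_death * 1_000_000


def micromorts_from_life_expectancy_loss(seconds_lost, remaining_years):
    """One possible convention: treat the expected-life loss as a
    fractional 'death' spread over the person's remaining lifespan.
    (That this is only one of several defensible conventions is the point.)"""
    fraction_of_life = seconds_lost / (remaining_years * SECONDS_PER_YEAR)
    return micromorts_from_acute_risk(fraction_of_life)


# Russian roulette with a million-chamber gun: exactly 1 micromort.
print(micromorts_from_acute_risk(1 / 1_000_000))

# One cigarette (~30 seconds lost), assuming ~50 remaining years:
print(micromorts_from_life_expectancy_loss(30, 50))  # roughly 0.02

# The same cigarette for someone with ~5 remaining years:
print(micromorts_from_life_expectancy_loss(30, 5))  # roughly 0.19
```

Under this convention the cigarette is a fraction of a micromort, and the answer changes tenfold with the smoker's age - which is exactly why the single metric doesn't settle the question.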

The answer that health economists have to this question is the QALY, which has its own drawbacks. For example, QALYs can't uniformly measure the risks of Russian roulette, since the risk depends on the age and quality of life of the player.

What we're left with is that the actual question we want answered has a couple more dimensions than a single metric can capture - and as I have mentioned once or twice elsewhere, reductive. metrics. have various. problems.

Comment by davidmanheim on nostalgebraist: Recursive Goodhart's Law · 2020-08-28T07:46:51.037Z · LW · GW

I'd agree with the epistemic warning ;)

I don't think the model is useful, since it's non-predictive. And we have good reasons to think that human brains are actually incoherent. Which means I'm skeptical that there is something useful to find by fitting a complex model to find a coherent fit for an incoherent system.

Comment by davidmanheim on Multitudinous outside views · 2020-08-28T07:44:23.185Z · LW · GW

I don't think that weights are the right answer - not that they aren't better than nothing, but as the Tesla case shows, the actual answer is having a useful model with which to apply reference classes. For example, once you have a model of stock prices as random walks, the useful priors are over the volatility rather than price, or rather, the difference between implied options volatility and post-hoc realized volatility for the stock, and other similar stocks. (And if your model is stochastic volatility with jumps, you want priors over the inputs to that.) At that point, you can usefully use the reference classes, and which one to use isn't nearly as critical.
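A minimal sketch of the volatility-as-reference-class idea: under a random-walk model, the comparable quantity across stocks is (annualized) volatility, not price, so that's where priors belong. The simulated returns, the seed, and the implied-volatility figure below are all hypothetical:

```python
# Hedged sketch: compute realized volatility from (simulated) daily log
# returns, then compare it to an assumed options-implied volatility.
import math
import random


def realized_vol(log_returns, periods_per_year=252):
    """Annualized realized volatility from a series of log returns."""
    n = len(log_returns)
    mean = sum(log_returns) / n
    var = sum((r - mean) ** 2 for r in log_returns) / (n - 1)
    return math.sqrt(var * periods_per_year)


random.seed(0)
true_annual_vol = 0.60  # assumption: a high-volatility stock
daily_sigma = true_annual_vol / math.sqrt(252)
returns = [random.gauss(0, daily_sigma) for _ in range(252)]

# Assumption: the options market prices in extra volatility.
implied_vol = 0.75
print(f"realized: {realized_vol(returns):.2f}, implied: {implied_vol:.2f}")
```

The reference-class question then becomes "how large is the implied-minus-realized gap for this stock and similar stocks?" rather than "what do stocks like this one do?", which is a much better-posed comparison.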

In general, I strongly expect that in "difficult" domains, causal understanding combined with outside view and reference classes will outperform simply using "better" reference classes naively.

Comment by davidmanheim on nostalgebraist: Recursive Goodhart's Law · 2020-08-27T10:44:06.941Z · LW · GW

I talked about this in terms of "underspecified goals" - often, the true goal doesn't clearly exist, and may not be coherent. Until that's fixed, the problem isn't really Goodhart, it's just sucking at deciding what you want.

I'm thinking of a young kid in a candy store who has $1, and wants everything, and can't get it. What metric for choosing what to purchase will make them happy? Answer: There isn't one. What they want is too unclear for them to be happy. So I can tell you in advance that they're going to have a tantrum later about wanting to have done something else no matter what happens now. That's not because they picked the wrong goal, it's because their desires aren't coherent.

Comment by davidmanheim on nostalgebraist: Recursive Goodhart's Law · 2020-08-27T10:36:49.366Z · LW · GW

Strongly agree - and Goodhart's law is at least 4 things. Though I'd note that anti-inductive behavior / metric gaming is hard to separate from goal mis-specification, for exactly the reasons outlined in the post.

But saying there is a goal too complex to be understandable and legible implies that it's really complex, but coherent. I don't think that's the case for individuals, and I'm certain it isn't true of groups. (Arrow's theorem, etc.)