## Posts

SDM's Shortform 2020-07-23T14:53:52.568Z
Modelling Continuous Progress 2020-06-23T18:06:47.474Z
Coronavirus as a test-run for X-risks 2020-06-13T21:00:13.859Z
Will AI undergo discontinuous progress? 2020-02-21T22:16:59.424Z
The Value Definition Problem 2019-11-18T19:56:43.271Z

Comment by SDM on Why did the UK switch to a 12 week dosing schedule for COVID-19 vaccines? · 2021-06-22T12:10:18.940Z · LW · GW

Dominic Cummings (who is a keen LW reader and agrees with Zvi's most cynical takes about the nature of government) is likely a major factor, although he was gone by the time the first-doses-first issue arose. The overall success of the UK's vaccine procurement can be credited to the Vaccine Taskforce, an ad-hoc organization set up to be exempt from many of the usual rules, partly due to his influence and that of Patrick Vallance, the UK's chief scientific advisor. That way of thinking may well have leaked into other decisions about vaccine prioritization, and Vallance certainly was involved in the first-doses-first decision. See this from Cummings' blog:

This is why there was no serious vaccine plan — i.e spending billions on concurrent (rather than the normal sequential) creation/manufacturing/distribution etc — until after the switch to Plan B. I spoke to Vallance on 15 March about a ‘Manhattan Project’ for vaccines out of Hancock’s grip but it was delayed by the chaotic shift from Plan A to lockdown then the PM’s near-death. In April Vallance, the Cabinet Secretary and I told the PM to create the Vaccine Taskforce, sideline Hancock, and shift commercial support from DHSC to BEIS. He agreed, this happened, the Chancellor supplied the cash. On 10 May I told officials that the VTF needed a) a much bigger budget, b) a completely different approach to DHSC’s, which had been mired in the usual processes, so it could develop concurrent plans, and c) that Bingham needed the authority to make financial decisions herself without clearance from Hancock.

(I see the success of the UK vaccine taskforce - its ability to weigh somewhat appropriately the costs and benefits involved and the enormous value of vaccinations - as a good example of how institution design is the key issue that most needs fixing. Have an efficient, streamlined taskforce, and you can still get things done in government.)

Other differences that may be relevant: this UK Government arguably has much more slack than the US under Biden or Trump. The UK's system gives very broad powers to the executive as long as they have a majority in parliament; this government is relatively popular due to the perception that it followed through on getting Brexit done; and we were in the middle of an emergency when that delay decision was authorized. Also, vaccine hesitancy is significantly lower in the UK than the US, and therefore fear of vaccine hesitancy by policymakers (which seemed to be driving the CDC's intransigence) is also significantly lower.

Comment by SDM on Pros and cons of working on near-term technical AI safety and assurance · 2021-06-19T13:44:03.419Z · LW · GW

It depends somewhat on what you mean by 'near term interpretability' - if you apply that term to research into, for example, improving the stability of and ability to access the 'inner world models' held by large opaque language models like GPT-3, then there's a strong argument that ML-based 'interpretability' research might be one of the best ways of directly working on alignment research:

https://www.alignmentforum.org/posts/29QmG4bQDFtAzSmpv/an-141-the-case-for-practicing-alignment-work-on-gpt-3-and

And see this discussion for more,

Evan Hubinger: +1 I continue to think that language model transparency research is the single most valuable current research direction within the class of standard ML research, for similar reasons to what Eliezer said above.

Ajeya Cotra: Thanks! I'm also excited about language model transparency, and would love to find ways to make it more tractable as a research statement / organizing question for a field. I'm not personally excited about the connotations of transparency because it evokes the neuroscience-y interpretability tools, which don't feel scalable to situations when we don't get the concepts the model is using, and I'm very interested in finding slogans to keep researchers focused on the superhuman stuff.

So language model transparency/interpretability tools might be useful on the basis of pro 2) and also 1) to some extent, because they will help build tools for interpreting TAI systems and also help align them ahead of time.

1. Most importantly, the more we align systems ahead of time, the more likely that researchers will be able to put thought and consideration into new issues like treacherous turns, rather than spending all their time putting out fires.

2. We can build practical know-how and infrastructure for alignment techniques like learning from human feedback.

3. As the world gets progressively faster and crazier, we’ll have better AI assistants helping us to navigate the world.

4. It improves our chances of discovering or verifying a long-term or “full” alignment solution.

Comment by SDM on Taboo "Outside View" · 2021-06-19T13:28:23.349Z · LW · GW

The way I understand it is that 'outside view' is relative, and basically means 'relying on more reference class forecasting / less gears-level modelling than whatever the current topic of discussion is relying on'. So if we're discussing a gears-level model of how a computer chip works in the context of whether we'll ever get a 10 OOM improvement in computing power, bringing up Moore's law and general trends would be using an 'outside view'.

If we're talking about very broad trend extrapolation, then the inside view is already not very gears-level. So suppose someone says GWP is improving hyperbolically so we'll hit a singularity in the next century. An outside view correction to that would be 'well for x and y reasons we're very unlikely a priori to be living at the hinge of history so we should lower our credence in that trend extrapolation'.
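As a side note on the trend-extrapolation step: a hyperbolic trend really does imply a finite-time singularity, which a quick sketch makes concrete (illustrative numbers only, not a fitted GWP model; the quadratic form dG/dt = a·G² is an assumption chosen for illustration):

```python
# Toy illustration of why a hyperbolic trend implies a finite-time
# singularity, unlike an exponential trend. All numbers are made up;
# this is not a fitted model of GWP.

def blowup_time(G0, a):
    """For dG/dt = a * G**2 the solution is G(t) = G0 / (1 - a*G0*t),
    which diverges at t* = 1 / (a * G0)."""
    return 1.0 / (a * G0)

def simulate(G0, a, dt, t_end):
    """Euler integration of dG/dt = a * G**2 up to t_end."""
    G, t = G0, 0.0
    while t < t_end:
        G += a * G * G * dt
        t += dt
    return G

t_star = blowup_time(G0=1.0, a=0.01)  # analytic singularity at t = 100
# Just before t*, the numerical solution has already blown far past
# what an exponential trend would give over the same horizon.
G_near = simulate(G0=1.0, a=0.01, dt=0.001, t_end=99.0)
```

The point is only that the naive extrapolation has a built-in blow-up date; the outside-view correction is about how much credence to put in that extrapolation in the first place.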

So someone bringing up broad priors or the anti-weirdness heuristic when we're talking about extrapolating trends would be moving to a 'more outside' view. Someone bringing up a trend when we're talking about a specific model would be using an 'outside view'. In each case, you're sort of zooming out to rely on a wider selection of (potentially less relevant) evidence than you were before.

Note that what I'm trying to do here isn't to counter your claim that the term isn't useful anymore, but just to try and see what meaning the broad sense of the term might have, and this is the best I've come up with. Since what you mean by 'outside view' shifts depending on context, it's probably best to name the specific thing you mean by it in each context, but there is still a unifying theme among the different ideas.

Comment by SDM on Open and Welcome Thread – June 2021 · 2021-06-07T15:19:16.562Z · LW · GW

Everyone says the Culture novels are the best example of an AI utopia, and even though it's a cliché to mention the Culture, it's a cliché for a good reason. Don't start with Consider Phlebas (the first one), but otherwise just dive in. My other recommendation is the Commonwealth Saga by Peter F Hamilton and the later Void Trilogy - it's not on the same level of writing quality as the Culture, although still a great story, but it depicts an arguably superior world to that of the Culture - with more unequivocal support of life extension and transhumanism.

The Commonwealth has effective immortality; a few downsides of it are even noticeable (their culture and politics is a bit more stagnant than we might like), but there's never any doubt at all that it's worth it, and it's barely commented on in the story. The latter-day Void Trilogy Commonwealth is probably the closest a work of published fiction has come to depicting a true eudaemonic utopia that lacks the problems of the Culture.

Comment by SDM on Looking for reasoned discussion on Geert Vanden Bossche's ideas? · 2021-06-06T21:23:02.920Z · LW · GW

According to this article (https://www.deplatformdisease.com/blog/addressing-geert-vanden-bossches-claims), the key claim of (2) - that natural antibodies are superior to vaccine antibodies and permanently replaced by them - is just wrong ('absolute unvarnished nonsense' was the quote). One or the other is right, and we just need someone who actually knows immunology to tell us:

Principally, antibodies against SARS-CoV-2 could be of value if they are neutralizing. Bossche presents no evidence to support that natural IgM is neutralizing (rather than just binding) SARS-CoV-2.

Comment by SDM on Covid 6/3: No News is Good News · 2021-06-04T14:16:05.294Z · LW · GW

We have had quite significant news accumulating this week: the Delta (Indian) variant of COVID-19 has been shown to be (central estimate) 2.5 times deadlier than existing strains, and the central estimate of its increased transmissibility is down a bit, but still 50-70% on top of B.1.1.7 (just imagine if we'd had to deal with Delta in early 2020!), though with a fairly small vaccine escape. This thread gives a good summary of the modelling and likely consequences for the UK, and it also more or less applies to most countries with high vaccination rates like the US.

For the US/UK this situation is not too concerning, certainly not compared to March or December 2020, and if restrictions are held where they are now the R_t will likely soon go under 1 as vaccination rates increase. However, there absolutely can be a large exit wave that could push up hospitalizations and, in the reasonable worst case, lead to another lockdown. Also, the outlook for the rest of the world is significantly worse than it looked just a month ago thanks to this variant - see this by Zeynep Tufekci.

The data is preliminary, and I really hope that the final estimate ends up as low as possible. But coupled with what we are observing in India and in Nepal, where it is rampant, I fear that the variant is a genuine threat.

In practical terms, to put it bluntly, it means that the odds that the pandemic will end because enough people have immunity via getting infected rather than being vaccinated just went way up.

Effective Altruists may want to look to India-like oxygen interventions in other countries over the next couple of months.

Comment by SDM on What will 2040 probably look like assuming no singularity? · 2021-05-16T22:46:45.120Z · LW · GW

You cover most of the interesting possibilities on the military technology front, but one thing you don't mention that might matter, especially considering the recent near-breakdowns of some nuclear weapons treaties (e.g. New START), is the further proliferation of nuclear weapons. This includes fourth-generation nuclear weapons like nuclear shaped-charge warheads, pure fusion and sub-kiloton devices, and tactical nuclear weapons - and more countries fitting cruise missiles or drones with nuclear capability, which might be a destabilising factor. If laser technology is sufficiently developed, we may also see other forms of directed-energy weapons becoming more common, such as electron beam weapons or electrolasers.

Comment by SDM on Covid 5/13: Moving On · 2021-05-14T13:22:27.574Z · LW · GW

For anyone reading, please consider following in Vitalik's footsteps and donating to the GiveIndia Oxygen fundraiser, which likely beats GiveWell's top charities in terms of life-years saved per dollar.

One of the more positive signs that I've seen in recent times, is that well-informed elite opinion (going by, for example, the Economist editorials) has started to shift towards scepticism of institutions and a recognition of how badly they've failed. Among the people who matter for policymaking, the scale of the failure has not been swept under the rug. See here:

We believe that Mr Biden is wrong. A waiver may signal that his administration cares about the world, but it is at best an empty gesture and at worst a cynical one.

...

Economists’ central estimate for the direct value of a course is 2,900—if you include factors like long covid and the effect of impaired education, the total is much bigger.

This strikes me as the sort of remark I'd expect to see in one of these comment threads, which has to be a good sign. In that same issue, we also saw the first serious attempt I've seen to calculate the total death toll of Covid, accounting for all reporting biases, throughout the world. The Economist was the only publication I've seen that didn't parrot the almost-meaningless official death toll figures. The true answer is, of course, horrifying: between 7.1m and 12.7m dead, with a central estimate of 10.2m - this unfortunately means that we ended up with the worst-case scenario I imagined back in late February. Moreover, we appear to currently be at the deadliest point of the entire pandemic.

Comment by SDM on What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) · 2021-04-11T14:12:23.334Z · LW · GW

Great post! I'm glad someone has outlined in clear terms what these failures look like, rather than the nebulous 'multiagent misalignment', as it lets us start on a path to clarifying what (if any) new mitigations or technical research are needed.

The agent-agnostic perspective is a very good innovation for thinking about these problems - the line between agentive and non-agentive behaviour is often not clear, and it's not like there is a principled metaphysical distinction between the two (e.g. Dennett and the Intentional Stance). Currently, big corporations can be weakly modelled this way and individual humans are fully agentive, but Transformative AI will bring up a whole spectrum of more and less agentive things that will fill up the rest of this spectrum.

There is a sense in which, if the outcome is something catastrophic, there must have been misalignment, and if there was misalignment then in some sense at least some individual agents were misaligned.
Specifically, the systems in your Production Web weren't intent-aligned because they weren't doing what we wanted them to do, and were at least partly deceiving us. Assuming this is the case, 'multipolar failure' requires some subset of intent misalignment. But it's a special subset because it involves different kinds of failures to the ones we normally talk about. It seems like you're identifying some dimensions of intent alignment as those most likely to be neglected because they're the hardest to catch, or because there will be economic incentives to ensure AI isn't aligned in that way, rather than saying that there is some sense in which the transformative AI in the production web scenario is 'fully aligned' but still produces an existential catastrophe.

I think that the difference between your Production Web and Paul Christiano's subtle creeping Outer Alignment failure scenario is just semantic - you say that the AIs involved are aligned in some relevant sense while Christiano says they are misaligned. The further question then becomes: how clear is the distinction between multiagent alignment and 'all of alignment except multiagent alignment'?

This is the part where your claim of 'problems before solutions' actually does become an issue - given that the systems going wrong in Production Web aren't intent-aligned (I think you'd agree with this), at a high level the overall problem is the same in single and multiagent scenarios. So for it to be clear that there is a separate multiagent problem to be solved, we have to have some reason to expect that the solutions currently intended to solve single agent intent alignment aren't adequate, and that extra research aimed at examining the behaviour of AI, e.g. in game theoretic situations, or computational social choice research, is required to avert these particular examples of misalignment.
A related point - as with single agent misalignment, the fast scenarios seem more certain to occur, given their preconditions, than the slow scenarios. A certain amount of stupidity and lack of coordination persisting for a while is required in all the slow scenarios, like the systems involved in Production Web being allowed to proliferate and be used more and more even if an opportunity to coordinate and shut the systems down exists and there are reasons to do so. There isn't an exact historical analogy for that type of stupidity so far, though a few things come close (e.g. the covid response, the leadup to WW2, the Cuban missile crisis). As with single agent fast takeoff scenarios, in the fast stories there is a key 'treacherous turn' moment where the systems suddenly go wrong, which requires much less lack of coordination to be plausible than the slow Production Web scenarios.

Therefore, multipolar failure is less dangerous if takeoff is slower, but the difference in risk between slow vs fast takeoff for multipolar failure is unfortunately a lot smaller than the slow vs fast risk difference for single agent failure (where the danger is minimal if takeoff is slow enough). So multiagent failures seem like they would be the dominant risk factor if takeoff is sufficiently slow.

Comment by SDM on SDM's Shortform · 2021-03-30T00:04:30.952Z · LW · GW

Yes, it's very oversimplified - in this case 'capability' just refers to whatever enables RSI, and we assume that it's a single dimension. Of course, it isn't, but we assume that the capability can be modelled this way as a very rough approximation.
Physical limits are another thing the model doesn't cover - you're right to point out that in the intelligence explosion/full RSI scenarios the graph goes vertical only for a time, until some limit is hit.

Comment by SDM on SDM's Shortform · 2021-03-29T18:38:48.724Z · LW · GW

# Update to 'Modelling Continuous Progress'

I made an attempt to model intelligence explosion dynamics in this post, by attempting to make the very oversimplified exponential-returns-to-exponentially-increasing-intelligence model used by Bostrom and Yudkowsky slightly less oversimplified.

This post tries to build on a simplified mathematical model of takeoff which was first put forward by Eliezer Yudkowsky and then refined by Bostrom in Superintelligence, modifying it to account for the different assumptions behind continuous, fast progress as opposed to discontinuous progress. As far as I can tell, few people have touched these sorts of simple models since the early 2010s, and no-one has tried to formalize how newer notions of continuous takeoff fit into them. I find that it is surprisingly easy to accommodate continuous progress and that the results are intuitive and fit with what has already been said qualitatively about continuous progress. The page includes python code for the model.

This post doesn't capture all the views of takeoff - in particular it doesn't capture the non-hyperbolic faster growth mode scenario, where marginal intelligence improvements are exponentially increasingly difficult, and therefore we get a (continuous or discontinuous switch to a) new exponential growth mode rather than runaway hyperbolic growth. But I think that by modifying the f(I) function that determines how RSI capability varies with intelligence we can incorporate such views.
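As a rough illustration of what modifying f(I) can do, here is my own toy sketch (not the actual code from the post; the logistic form of f and the constants c, k and I_rsi are assumptions for illustration). A smooth step in RSI capability whose sharpness is controlled by d interpolates between a gradual and an abrupt switch to a single faster exponential growth mode:

```python
import math

def simulate(d, I0=1.0, I_rsi=5.0, c=0.1, k=0.3, dt=0.01, T=80.0):
    """Euler-integrate a toy takeoff model.

    dI/dt = c*I + f(I)*k*I, where f(I) is a logistic step centred on
    I_rsi whose sharpness is controlled by d: small d gives a gradual
    (continuous) onset of recursive self-improvement, large d gives an
    abrupt, discontinuous-looking switch to the faster growth mode.
    Because f saturates at 1, the result is a new, faster exponential
    growth mode rather than hyperbolic runaway.
    """
    I = I0
    traj = [I]
    for _ in range(int(T / dt)):
        f = 1.0 / (1.0 + math.exp(-d * (I - I_rsi)))  # RSI capability in [0, 1]
        I += (c * I + f * k * I) * dt
        traj.append(I)
    return traj

baseline = simulate(d=0.5, k=0.0)  # no RSI term: single exponential
gradual = simulate(d=0.5)          # continuous switch to faster mode
sharp = simulate(d=20.0)           # abrupt switch to faster mode
```

Capping f differently (or letting it grow without bound) recovers the hyperbolic intelligence-explosion scenarios, so the single parameter d does the work of interpolating between the continuous and discontinuous pictures.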
In the context of the exponential model given in the post, that would correspond to an f(I) function which would result in a continuous switch (with abruptness determined by the size of d) to a single faster exponential growth mode. But I think the model still roughly captures the intuition behind scenarios that involve either a continuous or a discontinuous step to an intelligence explosion.

Given the model assumptions, we see how the different scenarios look in practice: if we plot potential AI capability over time, we can see how no new growth mode (brown) vs a new growth mode (all the rest), the presence of an intelligence explosion (red and orange) vs not (green and purple), and the presence of a discontinuity (red and purple) vs not (orange and green) affect the takeoff trajectory.

Comment by SDM on My research methodology · 2021-03-29T18:12:50.597Z · LW · GW

Is a bridge falling down the moment you finish building it an extreme and somewhat strange failure mode? In the space of all possible bridge designs, surely not. Most bridge designs fall over. But in the real world, you could win money all day betting that bridges won't collapse the moment they're finished.

I'm not saying this is an exact analogy for AGI alignment - there are lots of specific technical reasons to expect that alignment is not like bridge building, and that there are reasons why the approaches we're likely to try will break on us suddenly in ways we can't fix as we go - treacherous turns, inner misalignment or reactions to distributional shift. It's just that there are different answers to the question of what's the default outcome depending on whether you're asking what to expect abstractly or in the context of how things are in fact done.

Instrumental Convergence plus a specific potential failure mode (like e.g. we won't pay sufficient attention to out-of-distribution robustness) is like saying 'you know the vast majority of physically possible bridge designs fall over straight away, and also there's a giant crack in that load-bearing concrete pillar over there'. If for some reason your colleague has a mental block around the idea that a bridge could in principle fall down, then the first part is needed (hence why IC is important for presentations of AGI risk, because lots of people have wildly wrong intuitions about the nature of AI or intelligence), but otherwise IC doesn't do much to help the case for expecting catastrophic misalignment and isn't enough to establish that failure is a default outcome.

It seems like your reason for saying that catastrophic misalignment can't be considered an abnormal or extreme failure mode comes down to this pre-technical-detail Instrumental Convergence thesis - that IC by itself gives us a significant reason to worry, even if we all agree that IC is not the whole story.

this seems a bizarre way to describe something that we agree is the default result of optimizing for almost anything (eg paperclips).

= 'because strongly optimizing for almost anything leads to catastrophe via IC, we can't call catastrophic misalignment a bizarre outcome'? Maybe it's just a subtle difference in emphasis without a real difference in expectation/world model, but I think there is an important need to clarify the difference between 'IC alone raises an issue that might not be obvious but doesn't give us a strong reason to expect a catastrophe' and 'IC alone suggests a catastrophe even though it's not the whole story' - and the first of these is a more accurate way of viewing the role of IC in establishing the likelihood of catastrophic misalignment.
Ben Garfinkel argues for the first of these and against the second, in his objection to the 'classic' formulation of instrumental convergence/orthogonality - that these are just 'measure based' arguments which identify that a majority of possible AI designs with some agentive properties and large-scale goals will optimize in malign ways, rather than establishing that we're actually likely to build such agents.

Comment by SDM on Mathematical Models of Progress? · 2021-02-16T15:47:42.268Z · LW · GW

I made an attempt to model intelligence explosion dynamics in this post, by attempting to make the very oversimplified exponential-returns-to-exponentially-increasing-intelligence model used by Bostrom and Yudkowsky slightly less oversimplified.

This post tries to build on a simplified mathematical model of takeoff which was first put forward by Eliezer Yudkowsky and then refined by Bostrom in Superintelligence, modifying it to account for the different assumptions behind continuous, fast progress as opposed to discontinuous progress. As far as I can tell, few people have touched these sorts of simple models since the early 2010s, and no-one has tried to formalize how newer notions of continuous takeoff fit into them. I find that it is surprisingly easy to accommodate continuous progress and that the results are intuitive and fit with what has already been said qualitatively about continuous progress. The page includes python code for the model.

This post doesn't capture all the views of takeoff - in particular it doesn't capture the non-hyperbolic faster growth mode scenario, where marginal intelligence improvements are exponentially increasingly difficult and therefore we get a (continuous or discontinuous switch to a) new exponential growth mode rather than runaway hyperbolic growth. But I think that by modifying the f(I) function that determines how RSI capability varies with intelligence we can incorporate such views.
(In the context of the exponential model given in the post, that would correspond to an f(I) function which would result in a continuous switch (with abruptness determined by the size of d) to a single faster exponential growth mode.) But I think the model still roughly captures the intuition behind scenarios that involve either a continuous or a discontinuous step to an intelligence explosion.

Comment by SDM on The Meaning That Immortality Gives to Life · 2021-02-16T12:21:47.180Z · LW · GW

Modern literature about immortality is written primarily by authors who expect to die, and their grapes are accordingly sour.

This is still just as true as when this essay was written, I think - even the Culture had its human citizens mostly choosing to die after a time...

to the extent that I eventually decided: if you want something done properly, do it yourself.

But there are exceptions - the best example of published popular fiction that has immortality as a basic fact of life is the Commonwealth Saga by Peter F Hamilton and the later Void Trilogy (the first couple of books were out in 2007). The Commonwealth has effective immortality; a few downsides of it are even noticeable (their culture and politics is a bit more stagnant than we might like), but there's never any doubt at all that it's worth it, and it's barely commented on in the story.

In truth, I suspect that if people were immortal, they would not think overmuch about the meaning that immortality gives to life.

(Incidentally, the latter-day Void Trilogy Commonwealth is probably the closest a work of published fiction has come to depicting a true eudaimonic utopia that lacks the problems of the Culture.)

I wonder if there's been any harder-to-detect shift in how immortality is portrayed in fiction since 2007? Is it still as rare now as then to depict it as a bad thing?
Comment by SDM on Covid 2/11: As Expected · 2021-02-12T12:05:38.568Z · LW · GW

The UK vaccine rollout is considered a success, and by the standards of other results, it is indeed a success. This interview explains how they did it, which was essentially ‘make deals with companies and pay them money in exchange for doses of vaccines.’

A piece of this story you may find interesting (as an example of a government minister making a decision based on object-level physical considerations): multiple reports say Matt Hancock, the UK's Health Secretary, made the decision to insist on over-ordering vaccines because he saw the movie Contagion and was shocked into viscerally realising how important a speedy rollout was.

https://www.economist.com/britain/2021/02/06/after-a-shaky-start-matt-hancock-has-got-the-big-calls-right

It might just be a nice piece of PR, but even if that's the case it's still a good metaphor for how object-level physical considerations can intrude into government decision making.

Comment by SDM on Review of Soft Takeoff Can Still Lead to DSA · 2021-02-06T16:08:26.501Z · LW · GW

I agree with your argument about the likelihood of DSA being higher compared to previous accelerations, due to society not being able to speed up as fast as the technology.

This is sorta what I had in mind with my original argument for DSA; I was thinking that leaks/spying/etc. would not speed up nearly as fast as the relevant AI tech speeds up.

Your post on 'against GDP as a metric' argues more forcefully for the same thing that I was arguing for: that 'the economic doubling time' stops being so meaningful - technological progress speeds up abruptly, but other kinds of progress that adapt to tech progress have more of a lag before the increased technological progress also affects them?
So we're on the same page there that it's not likely that 'the economic doubling time' captures everything that's going on all that well, which leads to another problem - how do we predict what level of capability is necessary for a transformative AI to obtain a DSA (or reach the PONR for a DSA)?

I notice that in your post you don't propose an alternative metric to GDP, which is fair enough, since most of your arguments seem to lead to the conclusion that it's almost impossibly difficult to predict in advance what level of advantage over the rest of the world, in which areas, is actually needed to conquer the world, since we seem to be able to analogize persuasion tools, or conquistador-analogues who had relatively small tech advantages, to the AGI situation.

I think that there is still a useful role for raw economic power measurements, in that they provide a sort of upper bound on how much capability difference is needed to conquer the world. If an AGI acquires resources equivalent to controlling >50% of the world's entire GDP, it can probably take over the world if it goes for the maximally brute-force approach of just using direct military force. Presumably the PONR for that situation would be a while before then, but at least we know that an advantage of a certain size would be big enough given no assumptions about the effectiveness of unproven technologies of persuasion or manipulation or specific vulnerabilities in human civilization. So we can use our estimate of how the doubling time may increase, anchor on that gap, and estimate down based on how soon we think the PONR is, or how many 'cheat' pathways that don't involve economic growth there are.
The whole idea of using brute economic advantage as an upper-limit 'anchor' I got from Ajeya's post about using biological anchors to forecast what's required for TAI - if we could find a reasonable lower bound for the amount of advantage needed to attain DSA, we could do the same kind of estimated distribution between them. We would just need a lower limit - maybe there's a way of estimating it based on the upper limit of human ability, since we know no actually existing human has used persuasion to take over the world, but as you point out they've come relatively close.

I realize that's not a great method, but is there any better alternative, given that this is a situation we've never encountered before, for trying to predict what level of capability is necessary for DSA? Or perhaps you just think that anchoring your prior estimate based on economic power advantage as an upper bound is so misleading it's worse than having a completely ignorant prior. In that case, we might have to say that there are just so many unprecedented ways that a transformative AI could obtain a DSA that we can have no idea in advance what capability is needed, which doesn't feel quite right to me.

Comment by SDM on Ten Causes of Mazedom · 2021-01-18T13:35:19.544Z · LW · GW

Finally got round to reading your sequence, and it looks like we disagree a lot less than I thought, since your first three causes are exactly what I was arguing for in my reply:

This is probably the crux. I don't think we tend to go to higher simulacra levels now, compared to decades ago. I think it's always been quite prevalent, and has been roughly constant through history. While signalling explanations definitely tell us a lot about particular failings, they can't explain the reason things are worse now in certain ways, compared to before. The difference isn't because of the perennial problem of pervasive signalling. It has more to do with economic stagnation and not enough state capacity.
These flaws mean useful action gets replaced by useless action, and allow more room for wasteful signalling. As one point in favour of this model, I think it's worth noting that the historical comparisons aren't ever to us actually succeeding at dealing with pandemics in the past, but to things like "WWII-style" efforts - i.e. thinking that if we could just do x as well as we once did y then things would have been a lot better. This implies that if you made an institution analogous to e.g. the weapons researchers of WW2 and the governments that funded them, or NASA in the 1960s, without copy-pasting 1940s/1960s society wholesale, the outcome would have been better. To me that suggests it's institution design that's the culprit, not this more ethereal value drift or increase in overall simulacra levels. I think you'd agree with most of that, except that you see a much more significant causal role for cultural factors like increased fragility and social atomisation. There is pretty solid evidence for both being real problems - Jon Haidt presents the best case for taking these seriously - although it's not as definitive as you make out (e.g. 
suicide rates are basically a random walk), and your explanation for how they lead to institutional problems is reasonable, but I wonder if they are even needed as explanations when your first three causes are so strong and obvious. Essentially I see your big list like this:

Main drivers:
• Cause 1: More Real Need For Large Organizations (includes decreasing low-hanging fruit)
• Cause 2: Laws and Regulations Favor Large Organizations
• Cause 3: Less Disruption of Existing Organizations
• Cause 5: Rent Seeking is More Widespread and Seen as Legitimate

Real but more minor:
• Cause 4: Increased Demand for Illusion of Safety and Security
• Cause 8: Atomization and the Delegitimization of Human Social Needs
• Cause 7: Ignorance
• Cause 9: Educational System
• Cause 10: Vicious Cycle

No idea, but should look into:
• Cause 6: Big Data, Machine Learning and Internet Economics

Essentially my view is that if you directly addressed the main drivers with large legal or institutional changes, the other causes of mazedom wouldn't fight back. I believe that the 'obvious legible institutional risks first' view is in line with what others who've written on this problem, like Tyler Cowen or Sam Bowman, think, but it's a fairly minor disagreement, since most of your proposed fixes are on the institutional side of things anyway. Also, the preface is very important - these are some of the only trends that seem to be going the wrong way consistently in developed countries for a while now, and they're exactly the forces you'd expect to be hardest to resist. The world is better for people than it was back then. There are many things that have improved. This is not one of them.

Comment by SDM on Review of Soft Takeoff Can Still Lead to DSA · 2021-01-10T20:29:34.382Z · LW · GW

Currently the most plausible doom scenario in my mind is maybe a version of Paul’s Type II failure. (If this is surprising to you, reread it while asking yourself what terms like “correlated automation failure” are euphemisms for.) 
This is interesting, and I'd like to see you expand on this. Incidentally, I agree with the statement, but I can imagine both more and less explosive, catastrophic versions of 'correlated automation failure'. On the one hand it makes me think of things like transportation and electricity going haywire; on the other, it could fit a scenario where a collection of powerful AI systems simultaneously and intentionally wipe out humanity.

Clock-time leads shrink automatically as the pace of innovation speeds up, because if everyone is innovating 10x faster, then you need 10x as many hoarded ideas to have an N-year lead.

What if, as a general fact, some kinds of progress (the technological kinds more closely correlated with AI) are just much more susceptible to speed-up? I.e. what if 'the economic doubling time' stops being so meaningful - technological progress speeds up abruptly, but other kinds of progress that adapt to tech progress have more of a lag before the increased technological progress also affects them? In that case, if the parts of overall progress that affect the likelihood of leaks, theft and spying aren't sped up by as much as the rate of actual technology progress, the likelihood of DSA could rise to be quite high compared to previous accelerations, where the speed-up was gradual enough to allow society to 'speed up' the same way. In other words - it becomes easier to hoard more and more ideas if the ability to hoard ideas is roughly constant but the pace of progress increases. Since a lot of these 'technologies' for facilitating leaks and spying are more in the social realm, this seems plausible. 
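The quoted claim - that a 10x innovation speed-up requires 10x as many hoarded ideas to keep the same clock-time lead - can be made concrete with a toy calculation. This is just an illustrative sketch; the idea counts and rates are invented numbers, not figures from either post:

```python
# Toy model: if the world produces `ideas_per_year` strategically relevant
# ideas, an N-year clock-time lead corresponds to hoarding N * rate ideas.

def hoard_needed_for_lead(ideas_per_year: float, lead_years: float) -> float:
    """Ideas a leader must hoard to stay `lead_years` ahead of followers."""
    return ideas_per_year * lead_years

# At a slow pace, a 3-year lead means hoarding 3 years' worth of ideas...
slow = hoard_needed_for_lead(ideas_per_year=100, lead_years=3)   # 300 ideas

# ...but if innovation speeds up 10x while the ability to hoard (i.e. to
# prevent leaks, theft and spying) stays fixed, the same clock-time lead
# requires 10x as many hoarded ideas.
fast = hoard_needed_for_lead(ideas_per_year=1000, lead_years=3)  # 3000 ideas

assert fast / slow == 10
```

This is exactly the asymmetry in the comment above: if leak-and-spy 'technology' is social and roughly constant, the hoard a leader can sustain stays fixed in idea-count while the required hoard grows with the pace of progress.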
But if you need to generate more ideas, this might just mean that if you have a very large initial lead, you can turn it into a DSA, which you still seem to agree with:

• Even if takeoff takes several years it could be unevenly distributed such that (for example) 30% of the strategically relevant research progress happens in a single corporation. I think 30% of the strategically relevant research happening in a single corporation at the beginning of a multi-year takeoff would probably be enough for DSA.

Comment by SDM on Fourth Wave Covid Toy Modeling · 2021-01-10T10:37:49.241Z · LW · GW

I meant 'based on what you've said about Zvi's model', i.e. Nostalgebraist says Zvi's model implies Rt never goes below 1 - if you look at the plot he produced, Rt is always above 1 given Zvi's assumptions, which the London data falsified.

Comment by SDM on Fourth Wave Covid Toy Modeling · 2021-01-09T19:33:11.765Z · LW · GW

• It seems better to first propose a model we know can match past data, and then add a tuning term/effect for "pandemic fatigue" for future prediction. To get a sense of scale, here is one of the plots from my notebook: https://64.media.tumblr.com/823e3a2f55bd8d1edb385be17cd546c7/673bfeb02b591235-2b/s640x960/64515d7016eeb578e6d9c45020ce1722cbb6af59.png The colored points show historical data on R vs. the 6-period average, with color indicating the date.

Thanks for actually plotting historical Rt vs infection rates!

Whereas, it seems more natural to take (3) as evidence that (1) was wrong.

In my own comment, I also identified the control system model of any kind of proportionality of Rt to infections as a problem. Based on my own observations of behaviour and government response, the MNM hypothesis seems more likely (governments hitting the panic button as imminent death approaches, i.e. as hospitals begin to be overwhelmed) than a response that ramps up in proportion to recent infections. I think that explains the tight oscillations. 
I'd say the dominant contributor to control systems is something like a step function at a particular level near where hospitals are overwhelmed, and individual responses proportionate to exact levels of infection are a lesser part of it. You could maybe operationalize this by looking at past hospitalization rates, fitting a logistic curve to them at the 'overwhelmed' threshold, and seeing if that predicts Rt. I think it would do pretty well.

This tight control was a surprise and is hard to reproduce in a model, but if our model doesn't reproduce it, we will go on being surprised by the same thing that surprised us before.

My own predictions are essentially based on continuing to expect the 'tight control' to continue somehow, i.e. cases flattening out or declining a bit at a very high level after a large swing upwards. It looks like (the subsequent couple of days' data seem to confirm this) Rt is currently just below 1 in London - which would outright falsify any model that claims Rt never goes below 1 for any amount of infection with the new variant, given our control system response - which, according to your graph, the infections exponential model does predict. If you ran this model on the past, what would it predict? Based on what you've said, Rt never goes below one, so there would be a huge first wave with a rapid rise up to partial herd immunity over weeks, based on your diagram. That's the exact same predictive error that was made last year. I note - outside view - that this is very similar to the predictive mistake made last February/March with old Covid-19 - many around here were practically certain we were bound for an immediate (in a month or two) enormous herd immunity overshoot.

Comment by SDM on Eight claims about multi-agent AGI safety · 2021-01-07T19:48:32.210Z · LW · GW

Humans have skills and motivations (such as deception, manipulation and power-hungriness) which would be dangerous in AGIs. 
It seems plausible that the development of many of these traits was driven by competition with other humans, and that AGIs trained to answer questions or do other limited-scope tasks would be safer and less goal-directed. I briefly make this argument here.

Note that he claims that this may be true even if single/single alignment is solved, and all AGIs involved are aligned to their respective users. It strikes me as interesting that much of the existing work that's been done on multiagent training, such as it is, focusses on just examining the behaviour of artificial agents in social dilemmas. The thinking seems to be - and this was also suggested in ARCHES - that it's useful just for exploratory purposes to try to characterise how and whether RL agents cooperate in social dilemmas, what mechanism designs and what agent designs promote what types of cooperation, and whether there are any general trends in terms of what kinds of multiagent failures RL tends to fall into. For example, it's generally known that regular RL tends to fail to cooperate in social dilemmas: 'Unfortunately, selfish MARL agents typically fail when faced with social dilemmas'. From ARCHES:

One approach to this research area is to continually examine social dilemmas through the lens of whatever is the leading AI development paradigm in a given year or decade, and attempt to classify interesting behaviors as they emerge. This approach might be viewed as analogous to developing “transparency for multi-agent systems”: first develop interesting multi-agent systems, and then try to understand them.

There seems to be an implicit assumption here that something very important and unique to multiagent situations would be uncovered - by analogy to things like the flash crash. It's not clear to me that we've examined the intersection of RL and social dilemmas enough to notice this if it were true, and I think that's the major justification for working on this area. 
Comment by SDM on Fourth Wave Covid Toy Modeling · 2021-01-07T14:11:35.278Z · LW · GW

One thing that you didn't account for - the method of directly scaling the Rt by the multiple on the R0 (which seems to be around 1.55) is only a rough estimate of how much the Rt will increase when the effective Rt is lowered in a particular situation. It could be almost arbitrarily wrong - intuitively, if the hairdressers are closed, that prevents 100% of transmission in hairdressers, no matter how much higher the R0 of the virus is. For this reason, the actual epidemiological models (there aren't any for the US for the new variant, only some for the UK) have some more complicated way of predicting the effect of control measures. This from Imperial College:

We quantified the transmission advantage of the VOC relative to non-VOC lineages in two ways: as an additive increase in R that ranged between 0.4 and 0.7, and alternatively as a multiplicative increase in R that ranged between a 50% and 75% advantage. We were not able to distinguish between these two approaches in goodness-of-fit, and either is plausible mechanistically. A multiplicative transmission advantage would be expected if transmissibility had increased in all settings and individuals, while an additive advantage might reflect increases in transmissibility in specific subpopulations or contexts.

The multiplicative 'increased transmissibility' estimate will therefore tend to underestimate the effect of control measures. The actual paper did some complicated Bayesian regression to try to figure out which model of Rt change worked best, and couldn't. Measures like ventilation, physical distancing when you do decide to meet up, and mask use will be more multiplicative in how the new variant diminishes their effect. The parts of the behaviour response that involve people just not deciding to meet up or do things in the first place, and anything involving mandatory closures of schools, bars etc. 
will be less multiplicative. I believe this is borne out in the early data. Lockdown 1 in the UK took Rt down to 0.6. The naive 'multiplicative' estimate would say that's sufficient for the new variant: Rt = 0.93. The second lockdown took Rt down to 0.8, which would be totally insufficient. You'd need the Rt for the old variant of covid down to 0.64 on the naive multiplicative estimate - almost what was achieved in March. I have a hard time believing it was anywhere near that low in the Tier 4 regions around Christmas. But the data that's come in so far seems to indicate that Tier 4 + schools closed has either levelled off or caused slow declines in infections in those regions where they were applied. First, the random infection survey - London and the South East are in decline and the East of England has levelled off (page 3). The UK's symptom study, which uses a totally different methodology, confirms some levelling off and declines in those regions - page 6. It's early days, but clearly Rt is very near 1, and likely below 1 in London. The Financial Times cottoned on to this a few days late, but no-one else seems to have noticed. I think this indicates a bunch of things - mainly that infections caused by the new variant can and will be stabilized or even reduced by lockdown measures which people are willing to obey. It's not impossible if it's already happening.

To start, let’s also ignore phase shifts like overloading hospitals, and ignore fatigue on the hopes that vaccines coming soon will cancel it out, although there’s an argument that in practice some people do the opposite.

I agree with ignoring fatigue, but ignoring phase shifts? If it were me, I'd model the entire control system response as a phase shift, with the level for the switch in reactions set near the hospital-overwhelm level - at least on the policy side, there seems to be an abrupt reaction specifically to the hospital overloading question. 
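The naive 'multiplicative' arithmetic above is simple enough to sketch directly. Assuming the ~1.55 transmissibility multiplier from the comment (which, as argued, will tend to overstate the variant's effect under real control measures):

```python
# Naive 'multiplicative' scaling: multiply a lockdown's achieved old-variant
# Rt by the new variant's assumed transmissibility advantage (~1.55).

MULTIPLIER = 1.55  # assumed multiplicative advantage of the new variant

def new_variant_rt(old_rt: float, multiplier: float = MULTIPLIER) -> float:
    """Naively scale an achieved Rt up for the more transmissible variant."""
    return old_rt * multiplier

# Lockdown 1 (old-variant Rt ~0.6) would still just suppress the new variant:
assert round(new_variant_rt(0.6), 2) == 0.93
# The second lockdown (old-variant Rt ~0.8) would not:
assert new_variant_rt(0.8) > 1
# Old-variant Rt needed to keep the new variant below 1 on this estimate:
threshold = 1 / MULTIPLIER  # ~0.645, matching the ~0.64 figure above
```

The whole dispute in this comment is about whether `MULTIPLIER` really applies uniformly: closures and people staying home act more additively, so this scaling is a pessimistic bound rather than a prediction.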
The British government pushed the panic button a few days ago in response to that and called a full national lockdown. I'd say the dominant contributor to control systems is something like a step function at a particular level near where hospitals are overwhelmed, and individual responses proportionate to exact levels of infection are a lesser part of it. I think the model of the control system as a continuous response is wrong, and a phased all-or-nothing response on the government side of things, plus taking into account non-multiplicative effects on the Rt, would produce overall very different results - namely that a colossal overshoot of herd immunity in a mere few weeks is probably not happening. I note - outside view - that this is very similar to the predictive mistake made last February/March with old Covid-19 - many around here were practically certain we were bound for an immediate (in a month or two) enormous herd immunity overshoot.

Comment by SDM on Covid 12/31: Meet the New Year · 2021-01-05T11:40:03.554Z · LW · GW

Many of the same thoughts were in my mind when I linked that study on the previous post.

----

IMO, it would help clarify arguments about the "control system" a lot to write down the ideas in some quantitative form. ... This tells you nothing about the maximum power of my heating system. In colder temperatures, it'd need to work harder, and at some low enough temperature T, it wouldn't be able to sustain 70F inside. But we can't tell what that cutoff T is until we reach it. "The indoor temperature right now oscillates around 70F" doesn't tell you anything about T. 
I agree, and in fact the main point I was getting at with my initial comment is that in the two areas I talked about - namely the control system and the overall explanation for failure - there's an unfortunate tendency to toss out quantitative arguments or even detailed models of the world and instead resort to intuitions and qualitative arguments - and then it has a tendency to turn into a referendum on your personal opinions about human nature and the human condition, which isn't that useful for predicting anything. You can see this in how the predictions panned out - as was pointed out by some anonymous commenter, control system 'running out of power' arguments generally haven't been that predictively accurate when it comes to these questions. The rule-of-thumb that I've used - the Morituri Nolumus Mori effect - has fared somewhat better than the 'control system will run out of steam sooner or later' rule-of-thumb, both when I wrote that post and since. The MNM tends to predict last-minute myopic decisions that mostly avoid the worst outcomes, while the 'out of steam' explanation led people to predict that social distancing would mostly be over by now. But neither is a proper quantitative model. In terms of actually giving some quantitative rigour to this question - it's not easy. I made an effort in my old post, by noting that how far a society can stray from a control-system equilibrium is indicated by how low it managed to get Rt - but the 'gold standard' is to just work off model projections trained on already existing data, like I tried to do. As to the second question - the overall explanation - there is some data to work off of, but not much. We know that preexisting measures of state capacity don't predict covid response effectiveness, which along with other evidence suggests the 'institutional sclerosis' hypothesis I referred to in my original post. 
Once again, I think that a clear mechanism - 'institutional sclerosis as part of the great stagnation' - is a much better starting point for unravelling all this than the 'simulacra levels are higher now' perspective that I see a lot around here. That claim is too abstract to easily falsify, or to derive genuine in-advance predictions from.

Comment by SDM on Covid 12/31: Meet the New Year · 2021-01-01T14:07:59.917Z · LW · GW

I live in Southern England and so have a fair bit of personal investment in all this, but I'll try to be objective. My first reaction, upon reading the LSHTM paper that you referred to, is 'we can no longer win, but we can lose less' - i.e. we are all headed for herd immunity one way or another by mid-year, but we can still do a lot to protect people. That would have been my headline - it's over for suppression and elimination, but 'it's over' isn't quite right. Your initial reaction was different:

Are We F***ed? Is it Over? Yeah, probably. Sure looks like it. The twin central points last were that we were probably facing a much more infectious strain (70%), and that if we are fucked in this way, then it is effectively already over in the sense that our prevention efforts would be in vain. The baseline scenario remains, in my mind, that the variant takes over some time in early spring, the control system kicks in as a function of hospitalizations and deaths so with several weeks lag, and likely it runs out of power before it stabilizes things at all, and we blow past herd immunity relatively quickly combining that with our vaccination efforts.

You give multiple reasons to expect this, all of which make complete sense - lockdown fatigue, the inefficiency of prevention, lags in control systems, control systems can't compensate, etc. I could give similar reasons to expect the alternative - mainly that the MNM predicts the extreme strength of control systems, and that it looks like many places in Europe/Australia did take Rt down to 0.6 or even below! 
But luckily, none of that is necessary. This preprint model via the LessWrong thread has a confidence interval for increased infectiousness of 50%-74%. I would encourage everyone to look at the scenarios in this paper, since they neatly explain exactly what we're facing and mean we don't have to rely on guesstimate models and inference about behaviour changes. This model is likely highly robust - it successfully predicted the course of the UK's previous lockdown, with whatever compliance we had then. They simply updated it by putting in the increased infectiousness of the new variant. Since that last lockdown was very recent, compliance isn't going to be wildly different, weather was cold during the previous lockdown, schools were open etc. The estimate for the increase in R given in this paper seems to be the same as that given by other groups, e.g. Imperial College. So what does the paper imply? Essentially, a Level 4 lockdown (median estimate) flattens out case growth, but with schools closed a L4 lockdown causes cases to decline a bit (page 10). 10x-ing the vaccination rate from 200,000 to 2 million reduces the overall number of deaths by more than half (page 11). And they only model a one-month lockdown, but that still makes a significant difference to overall deaths (page 11). We managed 500k vaccinations the first week, and it dropped a bit the second week, but with first-doses-first and the Oxford/AZ vaccine it should increase again and land somewhere between those two scenarios. Who knows where? For the US, the fundamental situation may look like the first model - no lockdowns at all, so have a look. (Also of note is that the peak demand on the medical system, even in the bad scenarios with a level 4 lockdown and schools open, is less than 1.5x what was seen during the first peak. 
That's certainly enough to boost the IFR and could be described as 'healthcare system collapse', since it means surge capacity being used and healthcare workers being wildly overstretched, but to my mind 'collapse' refers to demand that exceeds supply by many multiples, such that most people can't get any proper care at all - as was talked about in late Feb/early March.) (Edit: the level of accuracy of the LSHTM model should become clear in a week or two.) The nature of our situation now is such that every day of delay and every extra vaccinated person makes us incrementally better off. This is a simpler situation than before - before, we had the option of suppression, which is all-or-nothing - either you get R under 1 or you don't. The race condition that we're in now, where short lockdowns that temporarily hold off the virus buy us useful time, and speeding up vaccination increases herd immunity and decreases deaths and slackens the burden on the medical system, is a straightforward fight by comparison. You just do whatever you can to beat it back and vaccinate as fast as you can. Now, I don't think you really disagree with me here, except about some minor factual details (I reckon your pre-existing intuitions about what a 'Level 4 lockdown' would be capable of doing are different to mine), and you mention the extreme urgency of speeding up vaccine rollout often:

We also have a vaccination crisis. With the new strain coming, getting as many people vaccinated as fast as possible becomes that much more important. ... With the more reasonable version of this being “we really really really should do everything to speed up our vaccinations, everyone, and to focus them on those most likely to die of Covid-19.” That’s certainly part of the correct answer, and likely the most important one for us as a group.

But if I were writing this, my loud headline message would not have been 'It's over', because none of this is over - many decisions still matter. 
It's only 'over' for the possibility of long term suppression. ***** There's also the much broader point - the 'what, precisely, is wrong with us' question. This is very interesting and complex and deserves a long discussion of its own. I might write one at some point. I'm just giving some initial thoughts here, partly a very delayed response to your reply to me 2 weeks ago (https://www.lesswrong.com/posts/Rvzdi8RS9Bda5aLt2/covid-12-17-the-first-dose?commentId=QvYbhxS2DL4GDB6hF). I think we have a hard-to-place disagreement about some of the ultimate causes of our coronavirus failures. We got a shout-out in Shtetl-Optimized, as he offers his “crackpot theory” that if we were a functional civilization we might have acted like one and vaccinated everyone a while ago ... I think almost everyone on earth could have, and should have, already been vaccinated by now. I think a faster, “WWII-style” approach would’ve saved millions of lives, prevented economic destruction, and carried negligible risks compared to its benefits. I think this will be clear to future generations, who’ll write PhD theses exploring how it was possible that we invented multiple effective covid vaccines in mere days or weeks He's totally right on the facts, of course. The question is what to blame. I think our disagreement here, as revealed in our last discussion, is interesting. The first order answer is institutional sclerosis, inability to properly do expected value reasoning and respond rapidly to new evidence. We all agree on that and all see the problem. You said to me, And I agree that if government is determined to prevent useful private action (e.g. "We have 2020 values")... Implying, as you've said elsewhere, that the malaise has a deeper source. 
When I said "2020 values" I referred to our overall greater valuation of human life, while you took it to refer to our tendency to interfere with private action - something you clearly think is deeply connected to the values we (individuals and governments) hold today. I see a long term shift towards a greater valuation of life that has been mostly positive, and some other cause producing a terrible outcome from coronavirus in western countries, and you see a value shift towards higher S levels that has caused the bad outcomes from coronavirus and other bad things. Unlike Robin Hanson, though, you aren't recommending we attempt to tell people to go off and have different values - you're simply noting that you think our tendency to make larger sacrifices is a mistake. "...even when the trade-offs are similar, which ties into my view that simulacra and maze levels are higher, with a larger role played by fear of motive ambiguity." This is probably the crux. I don't think we tend to go to higher simulacra levels now, compared to decades ago. I think it's always been quite prevalent, and has been roughly constant through history. While signalling explanations definitely tell us a lot about particular failings, they can't explain the reason things are worse now in certain ways, compared to before. The difference isn't because of the perennial problem of pervasive signalling. It has more to do with economic stagnation and not enough state capacity. These flaws mean useful action gets replaced by useless action, and allow more room for wasteful signalling. As one point in favour of this model, I think it's worth noting that the historical comparisons aren't ever to us actually succeeding at dealing with pandemics in the past, but to things like "WWII-style" efforts - i.e. thinking that if we could just do x as well as we once did y then things would have been a lot better. This implies that if you made an institution analogous to e.g. 
the weapons researchers of WW2 and the governments that funded them, or NASA in the 1960s, without copy-pasting 1940s/1960s society wholesale, the outcome would have been better. To me that suggests it's institution design that's the culprit, not this more ethereal value drift or increase in overall simulacra levels. There are other independent reasons to think the value shift has been mostly good, ones I talked about in my last post. As a corollary, I also think that your mistaken predictions in the past - that we'd give up on suppression, or that the control system would fizzle out - are related to this. If you think we operate at higher S levels than in the past, you'd be more inclined to think we'll sooner or later sleepwalk into a disaster. If you think there is a strong, consistent S1 drag away from disaster, as I argued way back here, you'd expect strong control system effects that seem surprisingly immune to 'fatigue'.

Comment by SDM on New SARS-CoV-2 variant · 2020-12-22T00:00:04.400Z · LW · GW

Update: this from Public Health England explicitly says Rt increases by 0.57, https://twitter.com/DevanSinha/status/1341132723105230848?s=20

"We find that Rt increases by 0.57 [95%CI: 0.25-1.25] when we use a fixed effect model for each area. Using a random effect model for each area gives an estimated additive effect of 0.74 [95%CI: 0.44- 1.29]. an area with an Rt of 0.8 without the new variant would have an Rt of 1.32 [95%CI:1.19-1.50] if only the VOC was present."

But for R: if it's 0.6, not 0.8, and the ratio is fixed, then another March-style lockdown in the UK would give R = 0.6 * (1.32/0.8) = 0.99.

Comment by SDM on New SARS-CoV-2 variant · 2020-12-21T20:54:51.473Z · LW · GW

EDIT: doubling time would go from 17 days to 4 days (!) with the above change of numbers. This doesn't fit given what is currently observed. The doubling time for the new strain does appear to be around 6-7 days. And the doubling time for London overall is currently 6 days. 
If, as you say, the 'mitigated Rt +0.66' and '+71% growth rate' figures are inconsistent with each other, then perhaps the second is mistaken, and +71% means that the Rt is 71% higher, not the case growth rate - which is vaguely consistent with the 'Rt is 58% higher' estimate from the absolute increase. Or "71% higher daily growth rate" could be right and the +0.66 could be referring to the R0, as you say. This does appear to have been summarized as 'the new strain is 71% more infectious' in many places, and many people have apparently inferred the R0 is >50% higher - hopefully we're wrong.

Computer modelling of the viral spread suggests the new variant could be 70 per cent more transmissible. The modelling shows it may raise the R value of the virus — the average number of people to whom someone with Covid-19 passes the infection — by at least 0.4,

I think this is what happens when people don't show their work. So either 'R number' is actually referring to R0 and not Rt, or 'growth rate' isn't referring to the daily growth rate but to the Rt/R0. I agree that the first is more plausible. All I'll say is that a lot of people are assuming the 70% figure, or something close to it, is a direct multiplier to the Rt, including major news organizations like the Times and the FT. But I think you're probably right, and the R0 is more like 15% larger, not 58/70% higher. EDIT: New info from PHE seems to contradict this, https://t.co/r6GOyXFDjh?amp=1

Comment by SDM on New SARS-CoV-2 variant · 2020-12-21T20:22:49.908Z · LW · GW

EDIT: PHE has seemingly confirmed the higher estimate for change in R, ~65%. https://t.co/r6GOyXFDjh?amp=1

What, uh, does the "71% higher growth rate" mean

TLDR: I think that it's probably barely 15% more infectious and the math of spread near equilibrium amplifies things. 
I admit that I have not read all available documents in detail, but I presume that what they said means something like "if ancestor has a doubling time of X, then variant is estimated as having a doubling time of X/(1+0.71) = 0.58X". In the meeting minutes, the R-value (Rt) was estimated to have increased by 0.39 to 0.93, the central estimate being +0.66 - 'an absolute increase in the R-value of between 0.39 to 0.93'. Then we see 'the growth rate is 71% higher than other variants'. You're right that this is referring to the case growth rate - they're saying the daily increase is 1.71 times higher, possibly? I'm going to estimate the relative difference in Rt of the 2 strains from the absolute difference they provided - the relative difference in Rt (Rt(new covid now)/Rt(old covid now)) in the same region should, I think, be the factor that tells us how much more infectious the new strain is. We need to know what the pre-existing, current Rt of just the old strain of covid-19 is. The current central estimate for covid in the UK overall is 1.15. This guess was that the 'old covid' Rt was 1.13. So 0.66 + 1.13 = 1.79 (Rt of new covid now), and 1.79/1.13 = 1.58, which implies that the Rt of the new covid is currently 58% higher than the old, which should be a constant factor, unless I'm missing something fundamental. (For what it's worth, the Rt in London, where the new strain makes up the majority of cases, is close to that 1.79 value.) So, the Rt and the R0 of the new covid is 58% higher - that would make the R0 somewhere around 4.5-5. Something like that rough conclusion was also reached e.g. here or here or here or here or here, with discussion of 'what if the R0 was over 5' or '70% more infectious' or 'Western-style lockdown will not suppress' (though they may be confusing the daily growth rate with the R0). This estimate from different data said the Rt was 1.66/1.13 = 1.47, i.e. 47% higher, which is close-ish to the 58% estimate. 
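The back-of-envelope above is just a ratio of Rt values, so it can be checked in a few lines. The 1.13 'old covid' Rt is the comment's guess, and 0.66 is the quoted central estimate for the additive increase; nothing here is new data:

```python
# Convert an additive Rt increase into a relative (multiplicative)
# transmissibility advantage, as in the comment above.

old_variant_rt = 1.13      # guessed current Rt of the old strain (UK)
additive_increase = 0.66   # central estimate of the absolute Rt increase
new_variant_rt = old_variant_rt + additive_increase  # 1.79

advantage = new_variant_rt / old_variant_rt
assert round(advantage, 2) == 1.58   # i.e. ~58% more transmissible

# The other dataset's figure gives the 'close-ish' 47% estimate:
alt_advantage = 1.66 / 1.13
assert round(alt_advantage, 2) == 1.47
```

Note the caveat already flagged in the comment: treating this ratio as a constant factor assumes a purely multiplicative advantage, which the Imperial College analysis could not distinguish from an additive one.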
I may have made a mistake somewhere here, and those sources may have made the same mistake, but this seems inconsistent with your estimate that the new covid is 15% more infectious, i.e. that the Rt and R0 are 15% higher, not 58% higher. This seems like a hugely consequential question. If the Rt of the new strain is more than ~66% larger than the Rt of the old strain, then March-style lockdowns which reduced Rt to 0.6 will not work, and the covid endgame will turn into a bloody managed retreat, delaying the spread and flattening the curve for as long as possible while we try to vaccinate as many people as possible. Of course, we should just go faster regardless:

> Second, we do have vaccines and so in any plausible model faster viral spread implies a faster timetable for vaccine approval and distribution. And it implies we should have been faster to begin with. If you used to say "we were just slow enough," you now have to revise that opinion and believe that greater speed is called for, both prospectively and looking backwards.

In any plausible model. If you are right, then this is just a minor step up in difficulty. Tom Chivers agrees with you that this is an 'amber light'; Metaculus seems undecided (the probability of a UK second wave worse than the first increased by 20% to 42% when this news appeared), and some of the forecasters seem to agree with you or be uncertain.

Comment by SDM on Covid 12/17: The First Dose · 2020-12-18T19:19:39.528Z · LW · GW

> On the economic front, we would have had to choose either to actually suppress the virus, in which case we get much better outcomes all around, or to accept that the virus couldn't be stopped, *which also produces better economic outcomes*. Our technological advancement gave us the choice to make massively larger Sacrifices to the Gods rather than deal with the situation. And as we all know, choices are bad.
> We also are, in my model, much more inclined to make such sacrifices now than we were in the past,

So, by 'Sacrifices to the Gods' I assume you're referring to the entirety of our suppression spending - because it's not all been wasted money, even if a large part of it has. In other places you use that phrase to refer specifically to ineffective preventative measures.

'We also are, in my model, much more inclined to make such sacrifices now than we were in the past' - this is a very important point that I'm glad you recognise. There has been a shift in values such that we (as individuals, as well as governments) are guaranteed to take the option of attempting to avoid getting the virus and sacrificing the economy to a greater degree than in 1919, or 1350, because our society values human life and safety differently. And realistically, if we'd approached this with pre-2020 values and pre-2020 technology, we'd have 'chosen' to let the disease spread and suffered a great deal of death and destruction - but that option is no longer open to us. For better, as I think, or for worse, as you think.

You can do the abstract cost-benefit calculation about whether the other harms of our prevention efforts have caused more damage than the disease, but it won't tell you anything about whether the act of getting governments to stop lockdowns and suppression measures will be better or worse than having them keep trying. Robin Hanson directly confuses these two in his argument that we are over-preventing covid:

> We see variations in both kinds of policy across space and time, due both to private and government choices, all of which seem modestly influenceable by intellectuals like Caplan, Cowen, and I... But we should also consider the very real possibility that the political and policy worlds aren't very capable of listening to our advice about which particular policies are more effective than others.
> They may well mostly just hear us say "more" or "less", such as seems to happen in medical and education spending debates.

Here Hanson is equivocating between (correctly) identifying the entire cost of COVID-19 prevention as due to 'both private and government choices' and then focussing on just 'the political and policy worlds' in response to whether we should argue for less prevention. The claim (which may or may not be true) that 'we overall are over-preventing covid relative to the abstract alternative where we don't' gets equated to 'therefore telling people to overall reduce spending on covid prevention will be beneficial on cost-benefit terms'. Telling governments to spend less money is much more likely to work than ordering people to have different values. So making governments spend less on covid prevention diminishes their more effective preventative actions while doing very little about the source of most of the covid prevention spending (individual action). Like-for-like comparisons where values are similar but policy is different (like Sweden and its neighbours) make it clear that, given the underlying values we have, which lead to the behaviours that we have observed this year, the imperative 'prevent covid less' leads to outcomes that are across the board worse.

> Or consider Sweden, which had a relatively non-panicky Covid messaging, no matter what you think of their substantive policies. Sweden didn't do any better on the gdp front, and the country had pretty typical adverse mobility reactions. (NB: These are the data that you don't see the "overreaction" critics engage with — at all. And there is more where this came from.) How about Brazil? While they did some local lockdowns, they have a denialist president, a weak overall response, and a population used to a high degree of risk. The country still saw a gdp plunge and lots of collateral damage.
> You might ponder this graph, causality is tricky and the "at what margin" question is trickier yet, but it certainly does not support what Bryan is claiming about the relevant trade-offs.

So, with the firm understanding that, given the values we have and the behaviour patterns we will inevitably adopt, telling people to prevent the pandemic less is worse economically and worse in terms of deaths, we can then ask the further, more abstract question that you ask - what if our values were different? That is, what if the option of letting the virus rip were available to us because we were actually capable of taking it? I wanted to put that disclaimer in because discussing whether we have developed the right societal values is irrelevant for policy decisions going forward - but still important for other reasons. I'd be quite concerned if our value drift over the last century or so were revealed as overall maladaptive, but it's important to talk about the fact that this is the question at stake when we ask if society is over-preventing covid. I am not asking whether lockdowns or suppression are worth it now - they are.

You seem to think that our values should be different; that it's at least plausible that signalling is leading us astray and causing us to overvalue the direct damage of covid, like lives lost, in place of concern for overall damage. Unlike Robin Hanson, though, you aren't recommending we attempt to tell people to go off and have different values - you're simply noting that you think our tendency to make larger sacrifices is a mistake.

> ...even when the trade-offs are similar, which ties into my view that simulacra and maze levels are higher, with a larger role played by fear of motive ambiguity. We might have been willing to do challenge trials or other actual experiments, and have had a much better handle on things quicker on many levels.
There are two issues here. One is that it's not at all clear whether the initial cost-benefit calculation about over-prevention is even correct. You don't claim to know if we are over-preventing in this abstract sense (compared to us having different values and individually not avoiding catching the disease), and the evidence that we are over-preventing comes from a twitter poll of Bryan Caplan's extremely libertarian-inclined followers, whom he told to try as hard as possible to be objective in assessing pandemic costs because he asked them what 'the average American' would value (Come on!!). Tyler Cowen briefly alludes to how woolly the numbers are here: 'I don't agree with Bryan's numbers, but the more important point is one of logic'.

The second issue is whether our change in values is an aberration caused by runaway signalling or reflects a legitimate, correct valuation of human life. Now, the fact that a lot of our prevention spending has been wasteful counts in favour of the signalling explanation, but on the other hand there's a ton of evidence that we, in the past, in general, valued life too little. (There's also the point that this seems like exactly a case where a signalling explanation is hard to falsify, an issue I talked about here. I worry that there is a tendency to adopt self-justifying signalling explanations, where an internally complicated signalling explanation that's hard to distinguish from a simpler 'lying' explanation gets accepted, not because it's a better explanation overall but just because it has a ready answer to any objections. If 'social cognition has been the main focus of Rationality' is true, then we need to be careful to avoid overusing such explanations.)
Stefan Schubert has explained how this can end up happening. I think the correct story is that the value shift has been both good and bad - valuing human life more strongly has been good, but along with that it has become more valuable to credibly fake valuing human life, which has been bad.

Comment by SDM on Commentary on AGI Safety from First Principles · 2020-11-25T16:28:27.402Z · LW · GW

Yeah - this is a case where how exactly the transition goes seems to make a very big difference. If it's a fast transition to a singleton, altering the goals of the initial AI is going to be super influential. But if it's that there are many generations of AIs that over time become the larger majority of the economy, and then just control everything, predictably altering how that goes seems a lot harder, at least.

Comparing the entirety of the Bostrom/Yudkowsky singleton intelligence-explosion scenario to the slower, more spread-out scenario, it's not clear that it's easier to predictably alter the course of the future in the first compared to the second. In the first, assuming you successfully set the goals of the singleton, the hard part is over and the future can be steered easily, because there are, by definition, no more coordination problems to deal with. But also in the first, a superintelligent AGI could explode on us out of nowhere with little warning and a 'randomly rolled utility function', so the amount of coordination we'd need pre-intelligence-explosion might be very large. In the second, slower scenario, there are still ways to influence the development of AI - aside from massive global coordination and legislation, there may well be decision points where two developmental paths are comparable in terms of short-term usefulness but one is much better than the other in terms of alignment or the value of the long-term future.
Stuart Russell's claim that we need to replace 'the standard model' of AI development is one such example - if he's right, a concerted push now by a few researchers could alter how nearly all future AI systems are developed, for the better. So different conditions have to be met for it to be possible to predictably alter the future long in advance on the slow transition model (multiple plausible AI development paths that could be universally adopted and have ethically different outcomes) compared to the fast transition model (the ability to anticipate when and where the intelligence explosion will arrive and do all the necessary alignment work in time), but it's not obvious to me that one is easier to meet than the other.

> For this reason, I think it's unlikely there will be a very clearly distinct "takeoff period" that warrants special attention compared to surrounding periods. I think the period AI systems can, at least in aggregate, finally do all the stuff that people can do might be relatively distinct and critical -- but, if progress in different cognitive domains is sufficiently lumpy, this point could be reached well after the point where we intuitively regard lots of AI systems as on the whole "superintelligent."

This might be another case (like 'the AI's utility function') where we should just retire the term as meaningless, but I think that 'takeoff' isn't always a strictly defined interval, especially if we're towards the medium-slow end. The start of the takeoff has a precise meaning only if you believe that recursive self-improvement is an all-or-nothing property. In this graph from a post of mine, the light blue curve has an obvious start to the takeoff, where the gradient discontinuously changes, but what about the yellow line?
There clearly is a takeoff, in that progress becomes very rapid, but there's no obvious start point - yet there is still a period, very different from our current period, that is reached in a relatively short space of time. So: not 'very clearly distinct', but still 'warrants special attention'. At this point I think it's easier to just discard the terminology altogether.

> For some agents, it's reasonable to describe them as having goals. For others, it isn't. Some of those goals are dangerous. Some aren't.

Daniel Dennett's intentional stance is either a good analogy for the problem of "can't define what has a utility function" or just a rewording of the same issue. Dennett's original formulation doesn't discuss different types of AI systems or utility functions, ranging in 'explicit goal-directedness' all the way from expected-minimax game players to deep RL to purely random agents, but instead discusses physical systems ranging from thermostats up to humans. Either way, if you agree with Dennett's formulation of the intentional stance, I think you'd also agree that it doesn't make much sense to speak of 'the utility function' as necessarily well-defined.

Comment by SDM on Covid 11/19: Don't Do Stupid Things · 2020-11-20T18:42:48.555Z · LW · GW

> Much of Europe went into strict lockdown. I was and am still skeptical that they were right to keep schools open, but it was a real attempt that clearly was capable of working, and it seems to be working. The new American restrictions are not a real attempt, and have no chance of working.

The way I understand it is that 'being effective' is making an efficient choice taking into account asymmetric risk, the value of information, and the long-run trade-offs. This involves things like harsh early lockdowns, throwing endless money at contact tracing, and strict enforcement of isolation. Think Taiwan, South Korea.
Then 'trying' is adopting policies that have a reasonably good chance of working, but not having a plan if they don't work, not erring on the side of caution or taking into account asymmetric risk when you adopt the policies, and not responding to new evidence quickly. The schools thing is a perfect example - closing has costs, while keeping schools open makes the lockdown less effective and therefore longer, and it wasn't overwhelmingly clear that schools had to close to bring R under 1, so leaving them open was deemed good enough. Partially funding tracing efforts, waiting until there's visibly no other choice and then calling a strict lockdown - that's 'trying'. Think the UK and France. And then you have 'trying to try', which you explain in detail.

> Dolly Parton helped fund the Moderna vaccine. Neat. No idea why anyone needed to do that, but still. Neat.

It's reassuring to know that if the administrative state and the pharmaceutical industry fail, we have Dolly Parton.

Comment by SDM on Some AI research areas and their relevance to existential safety · 2020-11-20T18:22:22.622Z · LW · GW

> That said, I remain interested in more clarity on what you see as the biggest risks with these multi/multi approaches that could be addressed with technical research.

A (though not necessarily the most important) reason to think technical research into computational social choice might be useful is that examining specifically the behaviour of RL agents from a computational social choice perspective might alert us to ways in which coordination with future TAI might be similar or different to the existing coordination problems we face.
> (i) make direct improvements in the relevant institutions, in a way that anticipates the changes brought about by AI but will most likely not look like AI research,

It seems premature to say, in advance of actually seeing what such research uncovers, whether the relevant mechanisms and governance improvements are exactly the same as the improvements we need for good governance generally, or different. Suppose examining the behaviour of current RL agents in social dilemmas leads to a general result, which in turn leads us to conclude there's a disproportionate chance that future TAI will coordinate in some damaging way that we can resolve with a particular new regulation. It's always possible to say that solving the single/single alignment problem will prevent anything like that from happening in the first place, but why put all your hopes on plan A when plan B is relatively neglected?

Comment by SDM on Some AI research areas and their relevance to existential safety · 2020-11-20T18:10:34.056Z · LW · GW

Thanks for this long and very detailed post!

> The MARL projects with the greatest potential to help are probably those that find ways to achieve cooperation between decentrally trained agents in a competitive task environment, because of its potential to minimize destructive conflicts between fleets of AI systems that cause collateral damage to humanity. That said, even this area of research risks making it easier for fleets of machines to cooperate and/or collude at the exclusion of humans, increasing the risk of humans becoming gradually disenfranchised and perhaps replaced entirely by machines that are better and faster at cooperation than humans.
In ARCHES, you mention that just examining the multiagent behaviour of RL systems (or other systems that work as toy/small-scale examples of what future transformative AI might look like) might enable us to get ahead of potential multiagent risks, or at least try to predict how transformative AI might behave in multiagent settings. The way you describe it in ARCHES, the research would be purely exploratory:

> One approach to this research area is to continually examine social dilemmas through the lens of whatever is the leading AI development paradigm in a given year or decade, and attempt to classify interesting behaviors as they emerge. This approach might be viewed as analogous to developing "transparency for multi-agent systems": first develop interesting multi-agent systems, and then try to understand them.

But what you're suggesting in this post - 'those that find ways to achieve cooperation between decentrally trained agents in a competitive task environment' - sounds like combining computational social choice research with multiagent RL: examining the behaviour of RL agents in social dilemmas and trying to design mechanisms that work to produce the kind of behaviour we want. To do that, you'd need insights from social choice theory. There is some existing research on this, but it's sparse and very exploratory. My current research is attempting to build on the second of these. As far as I can tell, that's more or less it in terms of examining RL agents in social dilemmas, so there may well be a lot of low-hanging fruit and interesting discoveries to be made. If the research is specifically about finding ways of achieving cooperation in multiagent systems by choosing the correct (e.g. voting) mechanism, is that not also computational social choice research, and therefore of higher priority by your metric?
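For readers unfamiliar with the term, the kind of social dilemma at issue can be sketched with the simplest textbook case, a one-shot prisoner's dilemma, where independently-optimizing agents defect even though mutual cooperation pays both more. The payoff values below are the standard textbook ones, not from any particular paper:

```python
# Minimal social-dilemma sketch: a one-shot prisoner's dilemma.
# payoffs[(my_move, their_move)] -> my payoff; "C" = cooperate, "D" = defect.
payoffs = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def best_response(their_move):
    """Move maximising my payoff against a fixed move by the other agent."""
    return max("CD", key=lambda my: payoffs[(my, their_move)])

# Defection is a dominant strategy: the best response to either move is "D"...
assert best_response("C") == "D" and best_response("D") == "D"
# ...yet mutual defection pays each agent 1, versus 3 for mutual cooperation.
print(payoffs[("D", "D")], payoffs[("C", "C")])  # 1 3
```

Mechanism-design research of the sort described above amounts to changing the payoffs or the interaction structure (repetition, contracts, voting) so that the dominant-strategy analysis no longer points to mutual defection.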
> In short, computational social choice research will be necessary to legitimize and fulfill governance demands for technology companies (automated and human-run companies alike) to ensure AI technologies are beneficial to and controllable by human society. ... CSC neglect: As mentioned above, I think CSC is still far from ready to fulfill governance demands at the ever-increasing speed and scale that will be needed to ensure existential safety in the wake of "the alignment revolution".

Comment by SDM on The 300-year journey to the covid vaccine · 2020-11-10T13:03:45.396Z · LW · GW

> The remedies for all our diseases will be discovered long after we are dead; and the world will be made a fit place to live in, after the death of most of those by whose exertions it will have been made so. It is to be hoped that those who live in those days will look back with sympathy to their known and unknown benefactors.

— John Stuart Mill, diary entry for 15 April 1854

Comment by SDM on AGI safety from first principles: Goals and Agency · 2020-11-02T18:00:56.649Z · LW · GW

> Furthermore, we should take seriously the possibility that superintelligent AGIs might be even less focused than humans are on achieving large-scale goals. We can imagine them possessing final goals which don't incentivise the pursuit of power, such as deontological goals, or small-scale goals. ... My underlying argument is that agency is not just an emergent property of highly intelligent systems, but rather a set of capabilities which need to be developed during training, and which won't arise without selection for it.

Was this line of argument inspired by Ben Garfinkel's objection to the 'classic' formulation of instrumental convergence/orthogonality - that these are 'measure based' arguments which just identify that a majority of possible agents with some agentive properties and large-scale goals will optimize in malign ways, rather than establishing that we're actually likely to build such agents?
It seems like you're identifying the same additional step that Ben identified, and that I argued could be satisfied - that we need a plausible reason why we would build an agentive AI with large-scale goals. And the same applies for 'instrumental convergence' - the observation that most possible goals, especially simple goals, imply a tendency to produce extreme outcomes when ruthlessly maximised:

> A system that is optimizing a function of n variables, where the objective depends on a subset of size k<n, will often set the remaining unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable.

We could see this as marking out a potential danger - a large number of possible mind-designs produce very bad outcomes if implemented. The fact that such designs exist 'weakly suggests' (Ben's words) that AGI poses an existential risk, since we might build them. If we add in other premises that imply we are likely to (accidentally or deliberately) build such systems, the argument becomes stronger. But usually the classic arguments simply note instrumental convergence and assume we're 'shooting into the dark' in the space of all possible minds, because they take the abstract statement about possible minds to be speaking directly about the physical world. There are specific reasons to think this might occur (e.g. mesa-optimisation, or sufficiently fast progress preventing us from course-correcting if there is even a small initial divergence), but those are the reasons that combine with instrumental convergence to produce a concrete risk, and they have to be argued for separately.

Comment by SDM on SDM's Shortform · 2020-10-30T17:04:06.134Z · LW · GW

I think that the notion of Simulacra Levels is both useful and important, especially when we incorporate Harry Frankfurt's idea of bullshit. Harry Frankfurt's On Bullshit seems relevant here.
I think it's worth trying to incorporate Frankfurt's definition as well, as it is quite widely known - see e.g. this video. If you were to do so, I think you would say that on Frankfurt's definition, Level 1 tells the truth, Level 2 lies, Level 3 bullshits about physical facts but will lie or tell the truth about things in the social realm (e.g. others' motives, your own affiliation), and Level 4 always bullshits. How do we distinguish lying from bullshit?

I worry that there is a tendency to adopt self-justifying signalling explanations, where an internally complicated signalling explanation that's hard to distinguish from a simpler 'lying' explanation gets accepted, not because it's a better explanation overall but just because it has a ready answer to any objections. If 'Social cognition has been the main focus of Rationality' is true, then we need to be careful to avoid overusing such explanations. Stefan Schubert explains how this can end up happening:

> ... It seems to me that it's pretty common that signalling explanations are unsatisfactory. They're often logically complex, and it's tricky to identify exactly what evidence is needed to demonstrate them. And yet even unsatisfactory signalling explanations are often popular, especially with a certain crowd. It feels like you're removing the scales from our eyes; like you're letting us see our true selves, warts and all. And I worry that this feels a bit too good to some: that they forget about checking the details of how the signalling explanations are supposed to work. Thus they devise just-so stories, or fall for them. This sort of signalling paradigm also has an in-built self-defence, in that critics are suspected of hypocrisy or naïveté.
> They lack the intellectual honesty that you need to see the world for what it really is, the thinking goes.

Comment by SDM on "Scaling Laws for Autoregressive Generative Modeling", Henighan et al 2020 {OA} · 2020-10-30T14:28:12.882Z · LW · GW

It may well be a crux. An efficient 'tree search', or a similar goal-directed wrapper around a GPT-based system that can play a role in real-world open-ended planning (presumably planning for an agent to be effecting outcomes in the real world via its text generation), would have to cover continuous action spaces and possible states containing unknown and shifting sets of possible actions (unlike the discrete and, relative to the real universe, small action space of Go, which is perfect for a tree search), running (or approximating running) millions of primitive steps (individual text generations and exchanges) into the future (for long-term planning towards e.g. a multi-decade goal, like humans are capable of). That sounds like a problem that's at least as hard as a language-model 'success probability predictor' GPT-N (probably with reward-modelling help, so it can optimize for a specific goal with its text generation). Though such a system would still be highly transformative, if it was human-level at prediction. To clarify, this is Transformative, not 'Radically Transformative' - transformative like nuclear power/weapons, not like a new Industrial Revolution or an intelligence explosion.

> I would expect tree search powered by GPT-6 to be probably pretty agentic.

I could imagine (if you found a domain with a fairly constrained set of actions and states, but one that involved text prediction somehow) that you could get agentic behaviour out of a tree search like the ones we currently have + GPT-N + an RL wrapper around the GPT-N. That might well be quite transformative - I could imagine it being very good for persuasion, for example.
Comment by SDM on Open & Welcome Thread – October 2020 · 2020-10-30T13:45:47.316Z · LW · GW

I don't know Wei Dai's specific reasons for having such a high level of concern, but I suspect that they are similar to the arguments given by the historian Niall Ferguson in this debate with Yascha Mounk on how dangerous 'cancel culture' is. Ferguson likes to try to forecast social and cultural trends years in advance, and thinks that he sees a Cultural-Revolution-like trend growing unchecked. Ferguson doesn't give an upper bound on how bad he thinks things could get, but he thinks 'worse than McCarthyism' is reasonable to expect over the next few years, because he thinks that 'cancel culture' has broader cultural support and might also gain hard power in institutions.

Now - I am more willing to credit such worries than I was a year ago, but there's a vast gulf between a trend being concerning and expecting another Cultural Revolution. It feels too much like a direct linear-extrapolation fallacy - 'things have become worse over the last year; imagine if that keeps on happening for the next six years!' I wasn't expecting a lot of what happened over the last eight months in the US on the 'cancel culture' side, but I think that a huge amount of this is due to a temporary, Trump-, Covid- and Recession-related heating up of the political discourse, not a durable shift in soft power or people's opinions. I think the opinion polls back this up. If I'm right that this will all cool down, we'll know in another year or so. I also think that Yascha's arguments in that debate, about the need for relatively unchecked hard institutional power to get a Cultural-Revolution-like outcome, are really worth considering. I don't see any realistic path to that level of hard, governmental power, at enough levels, being held by any group in the US.
Comment by SDM on "Scaling Laws for Autoregressive Generative Modeling", Henighan et al 2020 {OA} · 2020-10-29T20:44:59.458Z · LW · GW

I think that it could plausibly be quite transformative in a TAI sense and occur over the next ten years, so perhaps we don't have all that much of a disagreement on that point. I also think (just because we don't have an especially clear idea of how modular intelligence is) that it could be quite uniform, and a text predictor could surprise us with humanlike planning.

> Maybe the text predictor by itself wouldn't be an agent, but the text predictor could be re-trained as an agent fairly easily, or combined into a larger system that uses tree search or something and thus is an agent.

This maybe reflects a difference in intuition about how difficult agentive behaviour is to reach, rather than about language understanding. I would expect a simple tree search algorithm powered by GPT-6 to be... a model with humanlike language comprehension and incredibly dumb agentive behaviour, and I expect that it wouldn't be able to leverage the 'intelligence' of the language model in any significant way, because I see that as a separate problem requiring separate, difficult work. But I could be wrong. I think there is a potential bias in that human-like language understanding and agentive behaviour have always gone together in human beings - we have no idea what a human-level language model that wasn't human-level intelligent would be like. Since we can't imagine it, we tend to default to imagining a human-in-a-box. I'm trying to correct for this bias by imagining that it might be quite different.

Comment by SDM on Covid Covid Covid Covid Covid 10/29: All We Ever Talk About · 2020-10-29T20:18:28.926Z · LW · GW

> If you are keeping schools open in light of the graphs above, and think you are not giving up, I don't even know how to respond.
I think the French lockdown probably won't work without school closures, and this will probably be noticed soon, when the data comes through establishing that it doesn't work. I think it's extremely dumb not to close schools, given that the risk of closing vs not closing at this point is extremely asymmetric. But this isn't 'giving up' knowingly (and I infer that you're suggesting Macron may be trying to show that he is trying while actually giving up) - this is simply Macron and his cabinet not intuitively understanding asymmetric risk, and not realizing that it's much better to do far more than what was sufficient, compared to doing something that just stands an okay chance of being sufficient to suppress, in order to avoid costs later.

I think that there is a current tendency - and I see it in some of your statements about the beliefs of the 'doom patrol' - to use signalling explanations almost everywhere, and sometimes that shades into accepting a lower burden of proof, even if the explanation doesn't quite fit. For example, the European experience over the summer is mostly a story of a hideous but predictable failure to understand the asymmetric risk and costs of opening up and of investing less in tracing, testing and enforcement. Signalling plays a role in explaining this irrationality, certainly, but as I explained in last week's comment, wedging everything into a box of 'signalling explanations' doesn't always work. Maybe it makes more sense in the US, where the coronavirus response has been much more politicised. Stefan Schubert has a great blog post on this tendency:

> It seems to me that it's pretty common that signalling explanations are unsatisfactory. They're often logically complex, and it's tricky to identify exactly what evidence is needed to demonstrate them. And yet even unsatisfactory signalling explanations are often popular, especially with a certain crowd.
It feels like you’re removing the scales from our eyes; like you’re letting us see our true selves, warts and all. And I worry that this feels a bit too good to some: that they forget about checking the details of how the signalling explanations are supposed to work. Thus they devise just-so stories, or fall for them. This sort of signalling paradigm also has an in-built self-defence, in that critics are suspected of hypocrisy or naïveté. They lack the intellectual honesty that you need to see the world for what it really is, the thinking goes.

I think that a few of your explanations fall into this category.

They’re pushing the line that even after both of you have an effective vaccine you still need to socially distance.

Isn't this... true? An effective vaccine will take time to distribute (best guess 25 million doses by early next spring), and there will be a long period where we're approaching herd immunity and the risk is steadily decreasing as more people become immune. Fauci is probably worried about people risk-compensating during this interval, so he's trying to emphasise that a vaccine won't be perfectly protective and might take a while, maybe exaggerating both claims while not outright lying. I agree that this type of thinking can shade into doom-mongering and sometimes outright lying about how long vaccines might take, but this seems like solidly consequentialist lying to promote social distancing (SL 2), not bullshitting (SL 3). Maybe they've gotten the behavioural response wrong, and it's much better to be truthful, clear and give people reasonable hope (I think it is), but that's a difference in strategy, not pure SL3 bullshit. Why are you so confident that it's the latter?

I don’t think this is something being said in order to influence behavior, or even to influence beliefs. That is not the mindset we are dealing with at this point. It’s not about truth. It’s not about consequentialism.
We have left not only simulacra level 1 but also simulacra level 2 fully behind. It’s about systems that instinctively and continuously pull in the direction of more fear, more doom, more warnings, because that is what is rewarded and high status and respectable and serious and so on, whereas giving people hope of any kind is the opposite. That’s all this is.

That's a bold claim to make about someone with a history like Fauci's, and since 'the priority with first vaccinations is to prevent symptoms, and preventing infection is a bonus' is actually true, if misleading, I don't think it's warranted. This just sounds exactly like generic public health messaging aimed at getting people to wear masks now by making them not focus on the prospect of a vaccine. Plus it might even be important to know, especially when you consider that vaccination will happen slowly, and Fauci doesn't want people to risk-compensate after some people around them have been vaccinated but they haven't been. I don't think Fauci is thinking beyond saying whatever he needs to say to drive up mask compliance right now, which is SL 2.

Your explanation - that Dr Fauci has lost track of whether or not vaccines actually prevent infection - might be true, but it strikes me as weird and confusing, something you'd expect of a more visibly disordered person, and the kind of thing you'd need more evidence for than what he said in that little clip. I think those explanations absolutely have their place, especially for explaining some horrible public health messaging by some politicians, public-facing experts and most of the media, but this particular example is an overuse of signalling explanations of the kind argued against in the article I linked above. At the very least, the SL2 consequentialist-lying explanation is simpler and has a plausible story behind it, so I don't know why you'd go for the less clear SL3 explanation with apparent certainty.
Essentially, Europe chose to declare victory and leave home without eradication, and the problem returned, first slowly, now all at once, as it was bound to do without precautions.

We did take plenty of precautions; they were just wholly inadequate relative to the potential damage of a second wave. A lot of this was not understanding the asymmetric risk. Most of Europe had precautions that might work, testing and tracing systems that were catching some of the infected, and various shifting rules about social distancing, and it was at least unclear whether they would be sufficient. I can't speak about other countries, but people in the UK were intellectually extremely nervous about the reopening, and most people consistently polled saying it was too soon to reopen. For a while it worked - including in July, when a brief increase in the UK was reversed successfully. The number of people I see around me wearing masks has been increasing steadily ever since the start of the pandemic. So it was easy and convenient to say 'it's a risk worth taking, it's worked out so far', at least for a while - even though any sane calculation of the risks should have said we ought to have invested vastly more than we did in testing, tracing, enforcement, supported isolation etc., even while things looked like they were under control.

Not that giving up is obviously the wrong thing to do! But that does not seem to be Macron’s plan. ... We are going to lock you down if you misbehave, so if you misbehave all you’re doing is locking yourself down.

She’s right, of course, that things will keep getting worse until we change the trajectory and make them start getting better, but no, the interventions to regain control are exactly the same either way. You either get R below 1, or you don’t.

Except that the more it got out of control first, the more voluntary adjustments you’ll see, and the more people will be immune - so the more out of control it gets, the easier it is to control later.

...
And also the longer you wait, the longer you have to spend with stricter measures.

The measures don't need to be stricter unless you can't tolerate as long with high infection rates, in which case you need infection rates to go down much faster. I don't know if it makes me and Tyler Cowen and most epidemiologists part of the 'doom patrol' to say that the more you wait, the longer the interval you'll need of either voluntary behaviour change to avoid infection or lockdown. (Note that I'm not denying that there are such doomers. Some of the things you mention, like people explicitly denying that coronavirus treatment has made the disease less deadly and left hospitals much better able to cope, aren't really things in Europe or the UK - I was amazed to learn people in the US are claiming things that insane - but we have our own fools demanding pointless sacrifices: witness the recent ban Wales put on buying 'nonessential goods' within supermarkets.)

If by 'giving up' you mean 'not changing the government-mandated measures currently on offer to be more like a lockdown', then given the situation France is in right now, it seems undeniably the wrong thing to do to rely on voluntary behaviour changes and hope that there's no spike that overwhelms hospitals (again, asymmetric risk!) - worse for the economy, for lives, and certainly for other knock-on effects like hospital overloading. A lot of estimations of the marginal cost of suppression measures completely miss the point that the costs and benefits just don't separate out neatly, as I argue here. Tyler Cowen:

I think back to when I was 12 or 13, and asked to play the Avalon Hill board game Blitzkrieg. Now, as the name might indicate, you win Blitzkrieg by being very aggressive. My first real game was with a guy named Tim Rice, at the Westwood Chess Club, and he just crushed me, literally blitzing me off the board.
I had made the mistake of approaching Blitzkrieg like chess, setting up my forces for various future careful maneuvers. I was back on my heels before I knew what had happened. Due to its potential for exponential growth, Covid-19 is more like Blitzkrieg than it is like chess. You are either winning or losing (badly), and you would prefer to be winning. A good response is about trying to leap over into that winning space, and then staying there. If you find that current prevention is failing a cost-benefit test, that doesn’t mean the answer is less prevention, which might fail a cost-benefit test all the more, due to the power of the non-local virus multiplication properties to shut down your economy and also take lives and instill fear. You still need to come up with a way of beating Covid back.

'Giving up' is not actually giving up. At least in Europe, given the state of public behaviour and opinion about the virus, 'giving up' just means Sweden's 'voluntary suppression' in practice. There is no outcome where we uniformly line up to variolate ourselves and smoothly approach herd immunity. The people who try to work out the costs and benefits of 'lockdowns' are making a meaningless false comparison between 'normal economy' and 'lockdown'.

First and foremost, the declaration does not present the most important point right now, which is to say October 2020: By the middle of next year, and quite possibly sooner, the world will be in a much better position to combat Covid-19. The arrival of some mix of vaccines and therapeutics will improve the situation, so it makes sense to shift cases and infection risks into the future while being somewhat protective now. To allow large numbers of people today to die of Covid, in wealthy countries, is akin to charging the hill and taking casualties two days before the end of World War I. ... What exactly does the word “allow” mean in this context?
Again the passivity is evident, as if humans should just line up in the proper order of virus exposure and submit to nature’s will. How about instead we channel our inner Ayn Rand and stress the role of human agency? Something like: “Herd immunity will come from a combination of exposure to the virus through natural infection and the widespread use of vaccines. Here are some ways to maximize the role of vaccines in that process.”

In that sense, as things stand, there is no “normal” to be found. An attempt to pursue it would most likely lead to panic over the numbers of cases and hospitalizations, and would almost certainly make a second lockdown more likely. There is no ideal of liberty at the end of the tunnel here. In Europe, we will have more lockdowns.

I'm not making the claim that this is what we should do, or that it's what's best for the economy given the dreadful situation we've landed ourselves in, or that it's what we'll almost certainly end up doing given political realities - though I think these are all true. What I'm saying is that, whether (almost certainly) by governments caving to political pressure or (if they hold out endlessly, like Sweden) by voluntary behaviour change, we'll shut down the economy in an attempt to avoid catching the virus. Anything else is inconceivable and would require lemming-like behaviour from politicians and ordinary people. So, given that it's going to happen, would you rather it be chaotic, late and uncoordinated, or sharper, earlier and hopefully shorter? If we're talking about government policy, there really isn't all that much compromise to be had between the marginal costs of lockdowns and the economy if you're currently in the middle of a sufficiently rapid acceleration.
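The 'longer you wait' point in this thread is just exponential growth and decay arithmetic. A minimal sketch, where the generation time and R values are my own illustrative assumptions, not figures from the post:

```python
import math

def generations_between(p0, p1, r):
    """Generations of transmission needed to move prevalence from p0 to p1
    when each generation multiplies case counts by r."""
    return math.log(p1 / p0) / math.log(r)

g = 5.0  # assumed generation time in days

# Waiting while cases grow 32x at R = 1.4...
wait = generations_between(100, 3200, 1.4) * g
# ...requires a noticeably longer return trip at a suppression R of 0.8,
# because 0.8 is proportionally closer to 1 than 1.4 is.
recover = generations_between(3200, 100, 0.8) * g

print(f"~{wait:.0f} days of growth, ~{recover:.0f} days of decline to undo it")
```

Pushing R further below 1 (stricter measures) shortens the return trip; tolerating a longer decline lets the measures stay milder - which is the tradeoff the comment describes.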
Comment by SDM on "Scaling Laws for Autoregressive Generative Modeling", Henighan et al 2020 {OA} · 2020-10-29T18:58:56.093Z · LW · GW I'm still a bit puzzled by the link between human level on text prediction and 'human level' unconditionally - if I recall our near-bet during the forecasting tournament, our major disagreement was on whether direct scaling of GPT like systems takes us near to AGI. I often think that (because we don't have direct experience with any verbal intelligences in capability between GPT-3 and human brains) we're often impoverished when trying to think about such intelligences. I imagine that a GPT-6 that is almost 'human level on text prediction' could still be extremely deficient in other areas - it would be very weird to converse with, maybe like an amnesiac or confabulator that's very articulate and with good short-term memory. If language models scale to near-human performance but the other milestones don't fall in the process, and my initial claim is right, that gives us very transformative AI but not AGI. I think that the situation would look something like this: If GPT-N reaches par-human: discovering new action sets managing its own mental activity (?) cumulative learning human-like language comprehension perception and object recognition efficient search over known facts So there would be 2 (maybe 3?) breakthroughs remaining. It seems like you think just scaling up a GPT will also resolve those other milestones, rather than just giving us human-like language comprehension. 
Whereas if I'm right, and also those curves do extrapolate, what we would get at the end would be an excellent text generator - but it wouldn't be an agent, wouldn't be capable of long-term planning, and couldn't be accurately described as having a utility function over the states of the external world. I don't see any reason why trivial extensions of GPT would be able to do those things either, since they seem like problems that are just as hard as human-like language comprehension. GPT seems to be making some progress on cumulative learning too, though it might need some RL-based help with that, but none at all on managing mental activity for long-term planning or discovering new action sets.

Comment by SDM on Security Mindset and Takeoff Speeds · 2020-10-29T18:47:16.938Z · LW · GW

In terms of inferences about deceptive alignment, it might be useful to go back to the one and only current example we have where someone with somewhat relevant knowledge was led to wonder whether deception had taken place - GPT-3 balancing brackets. I don't know if anyone ever claimed Eliezer's $1000 bounty, but the top-level comment on that thread at least convinces me that it's unlikely that GPT-3 via AI Dungeon was being deceptive, even though Eliezer thought there was a real possibility that it was.

Now, this doesn't prove all that much, but one thing it does suggest is that on current MIRI-like views about how likely deception is, the threshold for uncertainty about deception is set far too low. That suggests your people at OpenSoft might well be right in their assumption.

Comment by SDM on Have the lockdowns been worth it? · 2020-10-20T13:45:48.275Z · LW · GW
• How long do we expect to have to wait for a vaccine or much more effective treatment?

I can't think of a better source on this than the Good Judgment project's COVID-19 recovery dashboard.

• How does the economic and related damage vary for voluntary vs involuntary suppression?

This is incredibly complicated and country-specific and dependent on all sorts of factors but maybe this graph from the Financial Times is a good place to start, it tells us how things have gone so far.

• How does the total number and spread of infections vary for voluntary vs involuntary suppression?

This is even harder than the previous question. All we can say for sure is 'it was possible to get R<1 in Sweden in the spring with less stringent measures'. If you consider that Sweden suffered considerably more death than its comparable neighbours, then you can project that the initial surge in deaths in badly-hit locked-down countries like the UK could have been much higher with voluntary measures, but how much higher is difficult to assess. I think that between-country comparisons are almost useless in these situations.

This is also where accounting for coronavirus deaths and debilitations comes into play. 'Anti-lockdown' arguments sometimes focus on the fact that even in badly-hit countries, the excess death figures have been in the rough range of +10%, (though with around 11 years of life lost). There are ways of describing this that make it seem 'not so bad' or 'not worth shutting the country down for', by e.g. comparing it to deaths from the other leading causes of death, like heart disease. This assumes there's a direct tradeoff where we can 'carry on as normal' while accepting those deaths and avoid the economic damage, but there is no such tradeoff to be made. There's just the choice as to which way you place the additional nudges of law and public messaging on top of a trajectory you're largely committed to by individual behaviour changes.

And if you do try to make the impossible, hypothetical 'tradeoff economy and lives' comparison between 'normal behaviour no matter what' and virus suppression, then the number of excess deaths to use for comparison isn't the number we in fact suffered, but far higher, given the IFR of 0.5-1%, it's on the order of +100% excess deaths (600,000 in the UK and 2 million in the US).
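The back-of-envelope calculation behind those figures is just population × attack rate × IFR. A sketch with rounded population numbers and attack rates that are my own illustrative assumptions from the plausible range, not figures from the comment:

```python
def unmitigated_deaths(population, attack_rate, ifr):
    """Deaths if the epidemic runs to a given attack rate at a given IFR."""
    return population * attack_rate * ifr

# Rounded UK population; attack rate and IFR spans are assumptions.
uk_low = unmitigated_deaths(67e6, 0.7, 0.005)   # lower-end scenario
uk_high = unmitigated_deaths(67e6, 0.9, 0.01)   # upper-end scenario
print(f"UK unmitigated deaths: ~{uk_low:,.0f} to ~{uk_high:,.0f}")
```

The upper end of this range lands near the ~600,000 figure quoted above for the UK; the same calculation with a US-sized population gives numbers on the order of 2 million.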

But again, such a comparison isn't useful, as it's not a policy that could be enacted or adopted, in fact it would probably require huge state coercion to force people to return to 'normal life'.

The basic point that it wouldn't be worth sacrificing everything to reduce excess deaths by 10% and save a million life-years is true, but that point is turned into a motte-and-bailey, where the motte is that there exists a level of damage at which a particular suppression measure (full lockdowns) is no longer worth it, and the bailey is that in all the situations we are in now most suppression measures are not worth it.

• To what degree do weaker legally mandated measures earlier spare us from stronger legally mandated measures (or greater economic damage from voluntary behaviour change) later?

This raises the difficult question of how much to take into account panic over overwhelmed hospitals and rising cases. Tyler Cowen:

In that sense, as things stand, there is no “normal” to be found. An attempt to pursue it would most likely lead to panic over the numbers of cases and hospitalizations, and would almost certainly make a second lockdown more likely.

Comment by SDM on The Treacherous Path to Rationality · 2020-10-19T17:07:22.990Z · LW · GW

The Rationality community was never particularly focused on medicine or epidemiology. And yet, we basically got everything about COVID-19 right and did so months ahead of the majority of government officials, journalists, and supposed experts.

...

We started discussing the virus and raising the alarm in private back in January. By late February, as American health officials were almost unanimously downplaying the threat, we wrote posts on taking the disease seriously, buying masks, and preparing for quarantine.

...

The rationalists pwned COVID

This isn't true. We did see it coming more clearly than most of the governmental authorities and certainly were ahead of public risk communication, but we were on average fairly similar or even a bit behind the actual domain experts.

This article summarizes interviews with epidemiologists on when they first realized COVID-19 was going to be a huge catastrophe and how they reacted. The dates range from January 15th, with the majority in mid-late February. See also this tweet from late February, from a modeller working on the UK's SAGE, confirming he thought uncontrolled spread was taking place.

I have an email dated 27 Feb 2020 replying to a colleague: "My thoughts on Covid-19 - pandemic is very likely." It was such a dry, intellectual statement, and I remember feeling incredulous that I could write those words with such ease and certainty while feeling total uncertainty and fear about how this could play out.

...

Two moments stand out for me. One was in the first week of February, when I saw early signals that there could be substantial transmission before people show symptoms. Despite hopes of rapid containment, it was clear contact tracing alone would struggle to contain outbreaks

...

On January 23, I was at an NIH meeting related to potential pandemic pathogen research. Everyone had heard the news [that Wuhan had been locked down] and was beginning to discuss whether this would be a big deal. Over the next several weeks the concern turned more grave.

I believe February 27th was the same day as 'Seeing the Smoke', when it became accepted wisdom around here that coronavirus would be a huge catastrophe. Feb 27th was a day before I said I thought this would be a test-run for existential risk. And late January, we were in the same position as the NIH of 'beginning to discuss whether this would be a big deal' without certainty. The crucial difference was understanding the asymmetric risk - A failure, but not of prediction.

So why didn't the domain experts do anything if so? I've been reading the book Rage by Bob Woodward, which includes interviews with Fauci and other US officials from January and February. From as early as the end of December, there was a constant emphasis on how demanding strict measures early would be 'useless' and achieve nothing!

I'm growing to think that a lot of health experts had an implicit understanding that the systems around them in the west were not equipped to carry out their best plans of action. In other words, they saw the smoke under the door, decided that if they yelled 'fire' before it had filled up the room nobody would believe them and then decided to wait a bit before yelling 'fire'. But since we weren't trying to produce government policy, we weren't subject to the same limitations.

Comment by SDM on Have the lockdowns been worth it? · 2020-10-19T16:43:04.491Z · LW · GW

An important consideration is that the 'thing that the US, UK and China have been doing, and what Sweden didn’t', may not refer to anything. There are two meanings of 'lockdowns have not been worth it' - 'allow the natural herd immunity to happen and carry on as normal, accepting the direct health damage while saving the economy' or 'we shouldn't adopt legally mandatory measures to attempt to suppress the virus and instead adopt voluntary measures to attempt to suppress the virus'. The latter of these is the only correct way to interpret 'thing Sweden did that the other countries didn't'. The first of these is basically a thought-experiment, not a possible state of affairs, because people won't carry on as usual. So it can't be used for cost-benefit comparisons.

In terms of behaviour, there is far more similarity between what the US and Sweden 'did' than what the US and China 'did'. Tyler Cowen has written several articles emphasising exactly this point. What Sweden 'did' was an uncoordinated, voluntary attempt at the same policy that China, Germany, the UK and the US attempted with varying levels of seriousness - social distancing to reduce the R effectively below 1, suppressing the epidemic. This thread summarizes the 'voluntary suppression' that countries like Sweden ended up with. Tyler Cowen writes an article attempting to 'right the wrong question':

"The most compassionate approach that balances the risks and benefits of reaching herd immunity, is to allow those who are at minimal risk of death to live their lives normally to build up immunity to the virus through natural infection, while better protecting those who are at highest risk. We call this Focused Protection."

What exactly does the word “allow” mean in this context? Again the passivity is evident, as if humans should just line up in the proper order of virus exposure and submit to nature’s will. How about instead we channel our inner Ayn Rand and stress the role of human agency? Something like: “Herd immunity will come from a combination of exposure to the virus through natural infection and the widespread use of vaccines. Here are some ways to maximize the role of vaccines in that process.”

So, the question cannot be "should we allow the natural herd immunity to happen and carry on as normal, accepting the direct health damage while protecting the economy" - that is not actually a possible state of affairs given human behaviour. We can ask whether a better overall outcome is achieved with legally required measures to attempt suppression, rather than an uncoordinated attempt at suppression, but since people will not carry on as normal we can't ask 'has the economic/knock-on cost of lockdowns been worth the lives saved' without being very clear that the counterfactual may not be all that different.

The most important considerations have to be,

• How long do we expect to have to wait for a vaccine or much more effective treatment? If not long, then any weaker suppression is 'akin to charging the hill and taking casualties two days before the end of World War I'. If a long time, then we must recognise that in e.g. the US a slow grind up to herd immunity through infection will eventually occur anyway.
• How does the economic and related damage vary for voluntary vs involuntary suppression? The example of Sweden compared to its neighbours is illustrative here.
• How does the total number and spread of infections vary for voluntary vs involuntary suppression? You can't rerun history for a given country with vs without legally mandated suppression measures.
• To what degree do weaker legally mandated measures earlier spare us from stronger legally mandated measures (or greater economic damage from voluntary behaviour change) later?
• Edit: Tyler Cowen released another article arguing for a new consideration that I didn't list - what reference class to place coronavirus in: 'external attack on the nation' or 'regular cause of death'. For fairly clear rule-utilitarian/deontological reasons, governments should care more about defending their citizens from e.g. wars and terrorist attacks than about random accidents that kill similar numbers of people. I also think this is a key disagreement between pro- and anti-'lockdown' positions.

To emphasise this last point, although it falls under 'questioning the question': the focus on lockdowns can be counterproductive when there are vastly more cost-effective measures that could have been attempted by countries like the UK that had very low caseloads through the summer - like funding enforcement of and support for isolation, better contact tracing, mask enforcement, and keeping events outdoors. These may fall under some people's definition of 'lockdown', since some of them are legally mandatory social distancing, but their costs and benefits are wildly different from those of stay-at-home orders. Scepticism of 'lockdowns' needs to specify which measures it is sceptical of.

Comment by SDM on Covid 10/15: Playtime is Over · 2020-10-17T15:13:25.876Z · LW · GW

The other group claims their goal is to save lives while preventing economic disaster. In practice, they act as if their goal was to destroy as much economic and social value as possible in the name of the pandemic as a Sacrifice to the Gods, and to pile maximum blame upon those who do not go along with this plan, while doing their best to slow down or block solutions that might solve the pandemic without sufficiently destroying economic or social value.

There are less cynical ways to view countermeasures that go too far. I'd compare it, especially early on, to many of us developing mild OCD because of how terrifying things were - compliance was also very high early on.

they act as if their goal was to destroy as much economic and social value as possible in the name of the pandemic as a Sacrifice to the Gods

...

they act as if their goal was to have everyone ignore the pandemic, actively flouting all precautions

A lot of the response in Europe/UK has not looked like this, or like your opposite side but it still hasn't been very good.

The UK/Europe response has been more like an inefficient, clumsy attempt to strike a 'balance' between mitigation and saving the economy, while showing no understanding of how to make good tradeoffs - e.g. opening the universities while banning small gatherings. It looks more like an attempt to do all the 'good' things at once for the economy and health and get the reputational/mood-affiliation benefits of both. E.g. in the UK in summer, after suppressing the virus hard and at great cost, we half-funded the tracing and isolation infrastructure, ignored that compliance was low, and gave subsidies to people eating out at pubs and restaurants; now we might be employing incredibly costly lockdown measures again, when we could have fully squashed the virus with a bit of extra effort in the summer when numbers were almost zero - and that's the same story as most of Europe.

That's more a failure to understand/respond to opportunity costs than either of the failures you describe, though it has aspects of both. It doesn't look like they were acting with the goal of getting people to adhere to the costliest measures possible, though - witness the reluctance to reimpose restrictions now.

The pandemic has enough physical-world, simulacra-level-1 impact on people to steer most ordinary people’s individual physical actions towards what seem to them like useful ones that preserve economic and social value while minimizing health risks. And it manages to impose some amount of similar restrictions on the collective and rhetorical actions.

This is the part that I like to emphasise, and the reason that we're still bound for a better outcome than most March predictions implied is because of a decent level of public awareness of risk imposing a brake on the very worst outcomes - the Morituri Nolumus Mori. Many of us didn't properly anticipate how much physical reality would end up hemming in our actions, as I explained in that post.

That doesn’t mean equivalence between sides, let alone equivalence of individuals. But until the basic dynamics are understood, one can’t reasonably predict what will happen next.

This is also worth emphasising. In general, though not in the examples you mention from e.g. California, going too hard works better than going too soft because there just is no pure 'let it rip' option - there's a choice between coordinated and uncoordinated suppression. It looks like voluntary behaviour has (in Europe and the US) mattered relatively more than expected. Countries that relied on voluntary behaviour change like Sweden didn't have the feared uncontrolled spread but also didn't do that well - they ended up with a policy of effective ‘voluntary suppression’ with a slightly different tradeoff – economic damage slightly less than others, activity reduction slower and more chaotic, more deaths. This was essentially a collective choice by the Swedish people despite their government.

that’s probably not true, and probably not true sooner rather than later. Immunity and testing continue to increase, our treatments continue to improve, and vaccines are probably on their way on a timescale of months. Despite the best efforts of both camps, it would greatly surprise me if we are not past the halfway point.

The initial estimates said that 40-50% infected is a reasonable lower bound for when weak mitigation plus partial herd immunity would end the pandemic naturally. I think that's still true. So, it would all have been 'worth it', in pure death terms, if significantly fewer than that many people end up catching coronavirus before much better treatments or vaccines end the epidemic by other means. Last time I checked that's still likely.
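That 40-50% lower bound matches the textbook homogeneous-mixing threshold 1 − 1/R, applied to an effective R of roughly 1.7-2 under weak mitigation. A minimal sketch (the R values are illustrative assumptions, not figures from the comment):

```python
def herd_immunity_threshold(r: float) -> float:
    """Fraction immune at which R_effective drops below 1, assuming
    homogeneous mixing; 0 if the epidemic dies out on its own."""
    return max(0.0, 1 - 1 / r)

for r in (1.7, 2.0, 2.5):
    print(f"R = {r}: threshold = {herd_immunity_threshold(r):.0%}")
# R = 1.7 gives ~41%, R = 2.0 gives 50%
```

Heterogeneous mixing and behaviour change can lower the effective threshold, which is one reason to treat 40-50% as a rough lower bound under weak mitigation rather than a hard prediction.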

Comment by SDM on A voting theory primer for rationalists · 2020-08-31T16:51:54.777Z · LW · GW
You seem to be comparing Arrow's theorem to Lord Vetinari, implying that both are undisputed sovereigns?

It was a joke about how if you take Arrow's theorem literally, the fairest 'voting method' (at least among ranked voting methods), the only rule which produces a definite transitive preference ranking and which meets the unanimity and independence conditions is 'one man, one vote', i.e. dictatorship.

And frankly, I think that the model used in the paper bears very little relationship to any political reality I know of. I've never seen a group of voters who believe "I would love it if any two of these three laws pass, but I would hate it if all three of them passed or none of them passed" for any set of laws that are seriously proposed and argued-for.

Such a situation doesn't seem all that far-fetched to me. Suppose there are three different stimulus bills on offer, and you want some stimulus spending but also care about rising national debt. You might not care which particular bills pass, but you do want some stimulus money, and you don't want all of them to pass because you think the debt would rise too high - so you decide you want exactly 2 of the 3 to pass. That said, I think the methods introduced in that paper might be most useful not for modelling the outcomes of voting systems, but for attempts to align an AI to multiple people's preferences.
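The 'exactly two of three' preference can be written down directly. A quick sketch (Python, with made-up bill names and a toy 0/1 utility, not the paper's actual model) shows why it can't be decomposed into independent per-bill approvals - the kind of interdependence across proposals that motivates the paper's setup:

```python
from itertools import chain, combinations

bills = ["stimulus_A", "stimulus_B", "stimulus_C"]

def utility(passed):
    """A voter who wants exactly two of the three bills to pass:
    some stimulus, but not so much that the debt rises too far."""
    return 1 if len(passed) == 2 else 0

# Enumerate all 8 possible outcomes (subsets of bills that pass).
outcomes = chain.from_iterable(combinations(bills, k) for k in range(4))
best = [set(o) for o in outcomes if utility(o) == 1]
print(best)  # the three two-bill combinations

# No additive per-bill utilities u(A), u(B), u(C) can reproduce this:
# each pair summing to 1 forces u(A) = u(B) = u(C) = 1/2, which would
# make the all-three outcome worth 3/2 instead of 0. The preference is
# irreducibly about combinations, not individual bills.
```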

Comment by SDM on Forecasting Thread: AI Timelines · 2020-08-29T11:11:04.240Z · LW · GW

I'll take that bet! If I do lose, I'll be far too excited/terrified/dead to worry in any case.

Comment by SDM on Covid 8/27: The Fall of the CDC · 2020-08-28T11:32:12.947Z · LW · GW
I’m still periodically scared in an existential or civilization-is-collapsing-in-general kind of way, but not in a ‘the economy is about to collapse’ or ‘millions of Americans are about to die’ kind of way.
I’m not sure whether this is progress.

It definitely is progress. If we were in the latter situation, there would be nothing at all to do except hope you personally don't die, whereas in the former there's a chance for things to get better - if we learn the lesson.

By strange coincidence, it's exactly 6 months since I wrote this, and I think it's important to remember just how dire the subjective future seemed at the end of February - that (subjectively, anyway) could have happened, but didn't.

Comment by SDM on SDM's Shortform · 2020-08-28T10:50:18.165Z · LW · GW
The tl;dr is that instead of thinking of ethics as a single unified domain where "population ethics" is just a straightforward extension of "normal ethics," you split "ethics" into a bunch of different subcategories:
Preference utilitarianism as an underdetermined but universal morality
"What is my life goal?" as the existentialist question we have to answer for why we get up in the morning
"What's a particularly moral or altruistic thing to do with the future lightcone?" as an optional subquestion of "What is my life goal?" – of interest to people who want to make their life goals particularly altruistically meaningful

This is very interesting - I recall from our earlier conversation that you said you might expect some areas of agreement, just not on axiology:

(I say elements because realism is not all-or-nothing - there could be an objective 'core' to ethics, maybe axiology, and much ethics could be built on top of such a realist core - that even seems like the most natural reading of the evidence, if the evidence is that there is convergence only on a limited subset of questions.)

I also agree with that, except that I think axiology is the one place where I'm most confident that there's no convergence. :)
Maybe my anti-realism is best described as "some moral facts exist (in a weak sense as far as other realist proposals go), but morality is underdetermined."

This may seem like an odd question, but are you possibly a normative realist, just not a full-fledged moral realist? What I didn't say in that bracket was that 'maybe axiology' wasn't my only guess about what the objective, normative facts at the core of ethics could be.

Following Singer in The Expanding Circle, I also think that some impartiality rule that leads to preference utilitarianism, maybe analogous to the anonymity rule in social choice, could be one of the normatively correct rules that ethics has to follow - but that if convergence among ethical views doesn't occur, the final answer might be underdetermined. This seems to be exactly the same as your view, so maybe we disagree less than it initially seemed.

In my attempted classification (of whether you accept convergence and/or irreducible normativity), I think you'd be somewhere between 1 and 3. I did say that those views might be on a spectrum depending on which areas of Normativity overall you accept, but I didn't consider splitting up ethics into specific subdomains, each of which might have convergence or not:

Depending on which of the arguments you accept, there are four basic options. These are extremes of a spectrum, as while the Normativity argument is all-or-nothing, the Convergence argument can come by degrees for different types of normative claims (epistemic, practical and moral).

Assuming that it is possible to cleanly separate population ethics from 'preference utilitarianism', it is consistent, though quite counterintuitive, to demand reflective coherence in our non-population ethical views but allow whatever we want in population ethics (this would be view 1 for most ethics but view 3 for population ethics).

(This still strikes me as exactly what we'd expect to see halfway to reaching convergence - the weirder and newer subdomain of ethics still has no agreement, while we have reached greater agreement on questions we've been working on for longer.)

It sounds like you're contrasting my statement from The Case for SFE ("fit all one’s moral intuitions into an overarching theory based solely on intuitively appealing axioms") with "arbitrarily halting the search for coherence" / giving up on ethics playing a role in decision-making. But those are not the only two options: you can have some universal moral principles, but leave a lot of population ethics underdetermined.

Your case for SFE was intended to defend a view of population ethics - that there is an asymmetry between suffering and happiness. If we've decided that 'population ethics' is to remain undetermined - that is, we adopt view 3 for population ethics - what is your argument (that SFE is an intuitively appealing explanation for many of our moral intuitions) meant to achieve? Can't I simply declare that my intuitions say differently, and then we have nothing more to discuss, if we already know we're going to leave population ethics undetermined?

Comment by SDM on Forecasting Thread: AI Timelines · 2020-08-26T14:35:28.173Z · LW · GW

The 'progress will be continuous' argument, as applied to our near future, does depend on my other assumptions - mainly that the breakthroughs on that list are separable, so that agentive behaviour and long-term planning won't drop out of a larger GPT by themselves and can't be considered part of just 'improving language model accuracy'.

We currently have partial progress on human-level language comprehension, a bit on cumulative learning, but near zero on managing mental activity for long term planning, so if we were to suddenly reach human level on long-term planning in the next 5 years, that would probably involve a discontinuity, which I don't think is very likely for the reasons given here.

If language models scale to near-human performance but the other milestones don't fall in the process, and my initial claim is right, that gives us very transformative AI but not AGI. I think that the situation would look something like this:

If GPT-N reaches par-human:

discovering new action sets
managing its own mental activity
(?) cumulative learning
human-like language comprehension
perception and object recognition
efficient search over known facts

So there would be 2 (maybe 3?) breakthroughs remaining. It seems like you think just scaling up a GPT will also resolve those other milestones, rather than just giving us human-like language comprehension. Whereas if I'm right, and those curves do extrapolate, what we would get at the end would be an excellent text generator - but it wouldn't be an agent, wouldn't be capable of long-term planning, and couldn't be accurately described as having a utility function over the states of the external world. I don't see any reason why trivial extensions of GPT would be able to do those things either, since they seem like problems that are just as hard as human-like language comprehension. GPT also seems to be making some progress on cumulative learning (though it might need some RL-based help with that), but none at all on managing mental activity for long-term planning or on discovering new action sets.

As an additional argument, admittedly from authority - Stuart Russell also clearly sees human-like language comprehension as only one of several really hard and independent problems that need to be solved.

A humanlike GPT-N would certainly be a huge leap into a realm of AI we don't know much about, so we could be surprised and discover that agentive behaviour and a utility function over states of the external world spontaneously appear in a good enough language model. But that argument has to be made, and for us to reach AGI in the next five years you need both that argument to hold and GPT to keep scaling - I don't see the conjunction of those two as that likely. It seems as though your argument rests solely on whether GPT scales or not, when there's also this other conceptual premise that's much harder to justify.

I'm also not sure if I've seen anyone make the argument that GPT-N will also give us these specific breakthroughs - but if you have reasons that GPT scaling would solve all the remaining barriers to AGI, I'd be interested to hear it. Note that this isn't the same as just pointing out how impressive the results scaling up GPT could be - Gwern's piece here, for example, seems to be arguing for a scenario more like what I've envisaged, where GPT-N ends up a key piece of some future AGI but just provides some of the background 'world model':

Models like GPT-3 suggest that large unsupervised models will be vital components of future DL systems, as they can be ‘plugged into’ systems to immediately provide understanding of the world, humans, natural language, and reasoning.

If GPT does scale, and we get human-like language comprehension in 2025, that will mean we're moving up that list much faster, which in turn suggests that there might not be a large number of additional discoveries required to make the other breakthroughs - they might also occur within the Deep Learning paradigm, and relatively soon. If this happens, I think there's a reasonable chance that when we do build an AGI, a big part of its internals will look like a GPT, as Gwern suggested - but by then we're already long past simply scaling up existing systems.

Alternatively, perhaps you're not including agentive behaviour in your definition of AGI - but a par-human text generator for most tasks that isn't capable of discovering new action sets or managing its own mental activity is, I think, a 'mere' transformative AI and not a genuine AGI.