Mostly it's just the conjunction of so many events. Even if each battle has a 99% chance of success (which seems high), the chance to win 291 battles in a row is 5.4%.
tag on On the limits of idealized values
Ok. I'm glad you noticed, in the linked post, that utilitarianism doesn't have a coherent model of obligation.
lincolnquirk on Two non-obvious lessons from microcovid.org
Thanks for writing this, and for writing the software! Microcovid was quite impactful in my own life and the software is shockingly thoughtful — you put so many tiny little details in to help me decide (oh, it’s a taxi ride and I don’t think the driver is going to wear a mask, but I can open the window…)
I think this must be a result of your & your team’s hard work talking to nonrationalists, as you note, but I also think you must have really good product instincts. Just talking to people is not, in my view, enough to produce a product that thoughtful; you also have to figure out what to do with the data. Nice work :)
christiankl on How do you deal with people cargo culting COVID-19 defense?
I am curious, what does the data say?
Generally, there are a lot of questions where we don't have good data.
The question of whether to require FFP-2 masks is not one of comparing an FFP-2 mask to no mask. The point of FFP-2 masks is that they can seal so that all the air gets filtered, which a surgical mask can't. What's strange is requiring people to wear FFP-2 masks (not allowing any passenger who wears just a surgical mask on public transportation) while doing nothing to help them wear the masks in a way that seals.
yimbygeorge on How do you deal with people cargo culting COVID-19 defense?
I am curious, what does the data say? Is wearing even a poorly fitting mask better at preventing you from spreading covid compared to not wearing a mask at all?
christiankl on Unbundling Humans, or, Unbundling Human Creation.
When it comes to SexTech, it's unclear to what extent it increases the amount of reproductive sex that people have. It's not in Tinder's interest for people to marry, have kids and remove themselves from the dating market.
Japan is a country where "most married men were too busy or tired from work to have sex". Given that Japanese workers also are not working in a way that leads to much economic growth, their problem isn't just about childcare being expensive or time-consuming. It's that too much human capital goes into busywork.
The key reason why childcare needs a lot of time from parents is that attitudes towards what a parent should provide for their children have changed. When faced with parents spending exponentially more time on childcare over the years, saying "How about the state does the childcare?" seems to avoid the core issue.
The hard issue is understanding which activities actually need to be done or are useful.
neel-nanda-1 on Empirical Observations of Objective Robustness Failures
This seems like really great work, nice job! I'd be excited to see more empirical work around inner alignment.
One of the things I really like about this work is the cute videos that clearly demonstrate 'this agent is doing dumb stuff because its objective is non-robust'. Have you considered putting shorter clips of some of the best bits on YouTube, or making GIFs? (E.g., a 5-10 second clip of the CoinRun agent during train, followed by a 5-10 second clip of the CoinRun agent during test.) It seemed that one of the major strengths of the CoastRunners clip was how easily shareable and funny it was, and I could imagine this research getting more exposure if it's easier to share highlights. I found the Google Drive pretty hard to navigate.
neel-nanda-1 on Irrational Modesty
Seconded, that line really hit home for me.
adamshimi on Environmental Structure Can Cause Instrumental Convergence
Sorry for the awkwardness (this comment was difficult to write). But I think it is important that people in the AI alignment community publish these sorts of thoughts. Obviously, I can be wrong about all of this.
Despite disagreeing with you, I'm glad that you published this comment, and I agree that airing disagreements is really important for the research community.
In particular, I don't think the paper provides a simple description for the set of MDPs that the main claim in the abstract applies to ("We prove that for most prior beliefs one might have about the agent's reward function […], one should expect optimal policies to seek power in these environments."). Nor do I think that the paper justifies the relevance of that set of MDPs. (Why is it useful to prove things about it?)
There's a sense in which I agree with you: AFAIK, there is no formal statement of the set of MDPs with the structural properties that Alex studies here. That doesn't mean it isn't relatively easy to state:
The first set of MDPs is quite restrictive (because you need an exact injection), which is why IIRC Alex extends the results to the sets of RSDs, which capture a far larger class of MDPs. Intuitively, this is the class of MDPs such that some action leads to more infinite-horizon behaviors than another for the same state. I personally find this class quite intuitive, and I also feel like it captures many real-world situations where we worry about power and instrumental convergence.
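As a rough illustration of that intuition (not the paper's formal definition of RSDs), one can picture a deterministic toy MDP as a graph and count the distinct cycles reachable after each action, treating each cycle as one possible infinite-horizon behavior. The graph, the state names, and the `reachable_cycles` helper below are all invented for this sketch.

```python
# Toy deterministic MDP, invented for illustration: each state maps to the
# states it can move to in one step (one action per successor).
SUCCESSORS = {
    "start": ["hub", "dead_end"],
    "hub": ["loop_a", "loop_b"],
    "loop_a": ["loop_a"],
    "loop_b": ["loop_b"],
    "dead_end": ["dead_end"],
}


def reachable_cycles(successors, state, path=()):
    """Return the set of simple cycles reachable from `state`.

    Assumption of this sketch: a cycle stands in for one long-run behavior.
    Each cycle is stored as a tuple rotated to start at its smallest state,
    so the same cycle found via different paths is counted once.
    """
    if state in path:
        cycle = path[path.index(state):]
        k = cycle.index(min(cycle))
        return {cycle[k:] + cycle[:k]}
    found = set()
    for nxt in successors[state]:
        found |= reachable_cycles(successors, nxt, path + (state,))
    return found


# Number of long-run behaviors still available after each action at "start".
options = {
    nxt: len(reachable_cycles(SUCCESSORS, nxt)) for nxt in SUCCESSORS["start"]
}
print(options)
```

In this toy graph, moving to `hub` keeps two cycles available while moving to `dead_end` keeps only one, which is the kind of asymmetry between actions the paragraph above describes.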
Also, there may be a misconception that this paper formalizes the instrumental convergence thesis. That seems wrong, i.e. the paper does not seem to claim that several convergent instrumental values can be identified. The only convergent instrumental value that the paper attempts to address AFAICT is self-preservation (avoiding terminal states).
Once again, I agree in part with the statement that the paper doesn't IIRC explicitly discuss different convergent instrumental goals. On the other hand, the paper explicitly says that it focuses on a special case of the instrumental convergence thesis.
An action is instrumental to an objective when it helps achieve that objective. Some actions are instrumental to many objectives, making them robustly instrumental. The claim that power-seeking is robustly instrumental is a specific instance of the instrumental convergence thesis:
Several instrumental values can be identified which are convergent in the sense that their attainment would increase the chances of the agent’s goal being realized for a wide range of final goals and a wide range of situations, implying that these instrumental values are likely to be pursued by a broad spectrum of situated intelligent agents [Bostrom, 2014].
That being said, you just made me want to look more into how well power-seeking captures different convergent instrumental goals from Omohundro's paper, so thanks for that. :)
romeostevensit on ELI12: how do libertarians want wages to work?
The basic idea is that, without government forcing out competition via monopoly, the market provides arbitration services.