## Posts

SoerenMind's Shortform 2021-06-11T20:19:14.580Z
FHI paper published in Science: interventions against COVID-19 2020-12-16T21:19:00.441Z
How to do remote co-working 2020-05-08T19:38:11.623Z
How important are model sizes to your timeline predictions? 2019-09-05T17:34:14.742Z
What are some good examples of gaming that is hard to detect? 2019-05-16T16:10:38.333Z
Any rebuttals of Christiano and AI Impacts on takeoff speeds? 2019-04-21T20:39:51.076Z
Some intuition on why consciousness seems subjective 2018-07-27T22:37:44.587Z
Updating towards the simulation hypothesis because you think about AI 2016-03-05T22:23:49.424Z
Working at MIRI: An interview with Malo Bourgon 2015-11-01T12:54:58.841Z
Meetup : 'The Most Good Good You Can Do' (Effective Altruism meetup) 2015-05-14T18:32:18.446Z
Meetup : Utrecht- Brainstorm and ethics discussion at the Film Café 2014-05-19T20:49:07.529Z
Meetup : Utrecht - Social discussion at the Film Café 2014-05-12T13:10:07.746Z
Meetup : Utrecht 2014-04-20T10:14:21.859Z
Meetup : Utrecht: Behavioural economics, game theory... 2014-04-07T13:54:49.079Z
Meetup : Utrecht: More on effective altruism 2014-03-27T00:40:37.720Z
Meetup : Utrecht: Famine, Affluence and Morality 2014-03-16T19:56:44.267Z
Meetup : Utrecht: Effective Altruism 2014-03-03T19:55:11.665Z

Comment by SoerenMind on Prefer the British Style of Quotation Mark Punctuation over the American · 2021-09-13T16:45:49.313Z · LW · GW

One thing I dislike about the 'punctuation outside quotes' view is that it treats "!" and "?" differently than a full stop.

"This is an exclamation"!
"Is this a question"?

Seems less natural to me than:

"This is an exclamation!"
"Is this a question?"

I think I have this intuition because it is part of the quote that it is an exclamation or a question.

Comment by SoerenMind on What 2026 looks like (Daniel's Median Future) · 2021-08-16T13:06:12.476Z · LW · GW

Yes I completely agree. My point is that the fine-tuned version didn't have better final coding performance than the version trained only on code. I also agree that fine-tuning will probably improve performance on the specific tasks we fine-tune on.

Comment by SoerenMind on What 2026 looks like (Daniel's Median Future) · 2021-08-11T16:10:39.702Z · LW · GW

Most importantly I expect them to be fine-tuned on various things (perhaps you can bundle this under "higher-quality data"). Think of how Codex and Copilot are much better than vanilla GPT-3 at coding. That's the power of fine-tuning / data quality.

Fine-tuning GPT-3 on code had little benefit compared to training from scratch:

Surprisingly, we did not observe improvements when starting from a pre-trained language model, possibly because the finetuning dataset is so large. Nevertheless, models fine-tuned from GPT converge more quickly, so we apply this strategy for all subsequent experiments.

I wouldn't categorize Codex under "benefits of fine-tuning/data quality" but under "benefits of specialization". That's because GPT-3 is trained on little code whereas Codex is trained only on code. (And the Codex paper didn't work on data quality more than the GPT-3 paper.)

Comment by SoerenMind on What 2026 looks like (Daniel's Median Future) · 2021-08-11T10:11:34.057Z · LW · GW

# 2023

The multimodal transformers are now even bigger; the biggest are about half a trillion parameters [...] The hype is insane now

This part surprised me. Half a trillion is only 3x bigger than GPT-3. Do you expect this to make a big difference? (Perhaps in combination with better data?). I wouldn't, given that GPT-3 was >100x bigger than GPT-2.

Maybe you're expecting multimodality to help? It's possible, but worth keeping in mind that according to some rumors, Google's multimodal model already has on the order of 100B parameters.

On the other hand, I do expect more than half a trillion parameters by 2023 as this seems possible financially, and compatible with existing supercomputers and distributed training setups.

Comment by SoerenMind on Why not more small, intense research teams? · 2021-08-06T09:18:15.000Z · LW · GW

In my experience, this worked extremely well. But that was thanks to really good management and coordination which would've been hard in other groups I used to be part of.

Comment by SoerenMind on What made the UK COVID-19 case count drop? · 2021-08-04T15:57:39.118Z · LW · GW

This wouldn't explain the recent reduction in R because Delta has already been dominant for a while.

Comment by SoerenMind on What made the UK COVID-19 case count drop? · 2021-08-04T15:56:43.232Z · LW · GW

The R0 of Delta is ca. 2x the R0 of the Wuhan strain and this doubles the effect of new immunity on R.

In fact, the ONS data gives me that ~7% of Scotland had Delta so that's a reduction in R of R0*7% = 6*7% = 0.42 just from very recent and sudden natural immunity.
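As a rough sketch of the arithmetic above, assuming the simple linear approximation that the drop in R equals R0 times the newly immune fraction. The function name is hypothetical, and the Wuhan-strain value of 3 is an assumption derived from "ca. 2x" and Delta's value of 6:

```python
def reduction_in_r(r0, newly_immune_fraction):
    """Approximate drop in R when a fraction of the population becomes immune.

    Assumes the linear approximation delta_R = R0 * newly_immune_fraction.
    """
    return r0 * newly_immune_fraction


R0_DELTA = 6.0       # value used in the comment above
R0_WUHAN = 3.0       # assumption: half of Delta's R0, per "ca. 2x"
NEWLY_IMMUNE = 0.07  # ~7% of Scotland recently infected (ONS figure cited above)

delta_drop = reduction_in_r(R0_DELTA, NEWLY_IMMUNE)
wuhan_drop = reduction_in_r(R0_WUHAN, NEWLY_IMMUNE)

print(f"Delta: R falls by about {delta_drop:.2f}")
print(f"Wuhan strain: R falls by about {wuhan_drop:.2f}")
```

Under this approximation the same amount of new immunity removes twice as much from R for Delta as it would have for the original strain.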

That's not [edited: forgot to say "not"] enough to explain everything, but there are more factors:

1) Heterogeneous immunity: the first people to become immune are often high-risk people who go to superspreader events etc.

2) Vaccinations also went up. E.g. if 5% of Scotland got vaccinated in the relevant period, and that gives a 50% protection against being infected or infecting others (conditional on being infected), that's another reduction in R of ca. 6*0.05 =  0.18.

3) Cases were rising and that usually leads to behavior changes like staying at home, cancelling events, and doing more LFD tests at home.

Comment by SoerenMind on How should my timelines influence my career choice? · 2021-08-04T15:38:05.497Z · LW · GW

Another heuristic is to choose the option where you're most likely to do exceptionally well. (Cf. heavy-tailed impact etc.) Among other things, this pushes you to optimize for the timelines scenario where you can be very successful, and to do the job with the best personal fit.

Comment by SoerenMind on ($1000 bounty) How effective are marginal vaccine doses against the covid delta variant? · 2021-07-25T16:08:33.777Z · LW · GW

Age around 30 and not overweight or obviously unhealthy

Comment by SoerenMind on ($1000 bounty) How effective are marginal vaccine doses against the covid delta variant? · 2021-07-25T16:06:42.076Z · LW · GW

Some standard ones like masks, but not at all times. They probably were in close or indoor contact with infected people without precautions.

Comment by SoerenMind on ($1000 bounty) How effective are marginal vaccine doses against the covid delta variant? · 2021-07-22T18:40:31.974Z · LW · GW

FWIW I've seen multiple double-mRNA-vaccinated people in my social circles who still got infected with delta (and in one case infected someone else who was double vaccinated). Two of the cases I know were symptomatic (but mild).

Comment by SoerenMind on ($1000 bounty) How effective are marginal vaccine doses against the covid delta variant? · 2021-07-22T18:36:01.011Z · LW · GW

According to one expert, the immune system essentially makes bets on how often it will face a given virus and how the virus will mutate in the future:

https://science.sciencemag.org/content/372/6549/1392

By that logic, being challenged more often means that the immune system should have a stronger and longer-lasting response:

The immune system treats any new exposure—be it infection or vaccination—with a cost-benefit threat analysis for the magnitude of immunological memory to generate and maintain. There are resource-commitment decisions: more cells and more protein throughout the body, potentially for decades. Although all of the calculus involved in these immunological cost-benefit analyses is not understood, a long-standing rule of thumb is that repeated exposures are recognized as an increased threat. Hence the success of vaccine regimens split into two or three immunizations.

The response becomes even stronger when challenging the immune system with different versions of the virus, in particular a vaccine and the virus itself (same link).

Heightened response to repeated exposure is clearly at play in hybrid immunity, but it is not so simple, because the magnitude of the response to the second exposure (vaccination after infection) was much larger than after the second dose of vaccine in uninfected individuals. [...] Overall, hybrid immunity to SARS-CoV-2 appears to be impressively potent.

For SARS-CoV-2 this leads to a 25-100x stronger antibody response. It also comes with enhanced neutralizing breadth, and therefore likely some protection against future variants.

Based on this, the article above recommends combining different vaccine modalities such as mRNA (Pfizer, Moderna) and vector (AZ) (see also here).

Lastly, your question may be hard to answer without data, if we extrapolate from a similar question where the answer seems hard to predict in advance:

Additionally, the response to the second vaccine dose was minimal for previously infected persons, indicating an immunity plateau that is not simple to predict.

Comment by SoerenMind on Formal Inner Alignment, Prospectus · 2021-07-02T18:10:14.473Z · LW · GW

Suggestion for content 2: relationship to invariant causal prediction

Lots of people in ML these days seem excited about getting out-of-distribution generalization with techniques like invariant causal prediction. See e.g. this, this, section 5.2 here and related background. This literature seems promising but it's missing from discussions about inner alignment. It seems useful to discuss how far it can go in helping solve inner alignment.

Comment by SoerenMind on Formal Inner Alignment, Prospectus · 2021-07-02T18:00:16.191Z · LW · GW

Suggestion for content 1: relationship to ordinary distribution shift problems

When I mention inner alignment to ML researchers, they often think of it as an ordinary problem of (covariate) distribution shift.

My suggestion is to discuss if a solution to ordinary distribution shift is also a solution to inner alignment. E.g. an 'ordinary' robustness problem for imitation learning could be handled safely with an approach similar to Michael's: maintain a posterior over hypotheses, with a sufficiently flexible hypothesis class, and ask for help whenever the model is uncertain about the output y for a new input x.

One interesting subtopic is whether inner alignment is an extra-ordinary robustness problem because it is adversarial: even the tiniest difference between train and test inputs might cause the model to misbehave. (See also this.)

Comment by SoerenMind on Formal Inner Alignment, Prospectus · 2021-07-02T17:32:03.134Z · LW · GW

Feedback on your disagreements with Michael:

I agree with "the consensus algorithm still gives inner optimizers control of when the system asks for more feedback".

Most of your criticisms seem to be solvable by using a less naive strategy for active learning and inference, such as Bayesian Active Learning by Disagreement (BALD). Its main drawback is that exact posterior inference in deep learning is expensive since it requires integrating over a possibly infinite/continuous hypothesis space. But approximations exist.

BALD (and similar methods) help with most criticisms:

• It only needs one run, not 100. Instead, it samples hypotheses (let's say 100) from a posterior.
• It doesn't suffer from dependence between runs because there's only 1 run. It just has to take iid samples from its own posterior (many inference techniques do this).
• It doesn't require that the true hypothesis is always right. Instead each hypothesis defines a distribution over answers and it only gets ruled out when it puts 0% chance on the human's answer. (For imitation learning, that should never happen)
• It doesn't require that there is one among the 100 hypotheses that is safe on all inputs. Drawback: It requires the weaker condition that, for every input we encounter, there is one hypothesis (among the 100) that is safe.
• It converges faster because it actively searches for inputs where hypotheses disagree.
• (Bayesian ML can even be adversarially robust with exact posterior inference.)
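To make the "searches for inputs where hypotheses disagree" point concrete, here is a minimal sketch of the standard BALD acquisition score: the entropy of the averaged prediction minus the average entropy of each sampled hypothesis's prediction. The function names and the toy prediction lists are illustrative assumptions, not anything from Michael's paper, and posterior sampling itself is left out.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(q * math.log(q) for q in dist if q > 0)


def bald_score(predictions):
    """BALD score for one input.

    predictions: one categorical distribution over answers per hypothesis
    sampled from the posterior. The score is high when individual hypotheses
    are confident but disagree with each other.
    """
    n = len(predictions)
    k = len(predictions[0])
    mean = [sum(p[j] for p in predictions) / n for j in range(k)]
    return entropy(mean) - sum(entropy(p) for p in predictions) / n


# Toy examples (made-up numbers): hypotheses that agree vs. confidently disagree.
agree = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
disagree = [[1.0, 0.0], [0.0, 1.0]]

print("agree:", bald_score(agree))
print("disagree:", bald_score(disagree))
```

An active learner along these lines would query the human on the inputs with the highest score, so uncertainty that is shared by all hypotheses (noise) is not mistaken for disagreement between them.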

Apologies if I missed details from Michael's paper.

Comment by SoerenMind on Comment on the lab leak hypothesis · 2021-06-12T11:03:10.183Z · LW · GW

Re 1) the codons, according to Christian Drosten, have precedent for evolving naturally in viruses. That could be because viruses evolve much faster than e.g. animals. Source: search for 'codon' and use translate here: https://www.ndr.de/nachrichten/info/92-Coronavirus-Update-Woher-stammt-das-Virus,podcastcoronavirus322.html

The link also has a bunch of content about the evolution of furin cleavage sites, from a leading expert.

Comment by SoerenMind on SoerenMind's Shortform · 2021-06-11T20:19:14.935Z · LW · GW

Favoring China in the AI race
In a many-polar AI deployment scenario, a crucial challenge is to solve coordination problems between non-state actors: ensuring that companies don't cut corners and monitoring them, just to name a few challenges. And in many ways, China is better than western countries at solving coordination problems within their borders. For example, they can use their authority over companies as these tend to be state-owned or owned by some fund that is owned by a fund that is state-owned. Could this mean that, in a many-polar scenario, we should favor China in the race to build AGI?

Of course, the benefits of China-internal coordination may be outweighed by the disadvantages of Chinese leadership in AI. But these disadvantages seem smaller in a many-polar world because many actors, not just the Chinese government, share ownership of the future.

Comment by SoerenMind on Suggestions of posts on the AF to review · 2021-06-08T10:44:29.385Z · LW · GW

Thanks - I agree there's value to public peer review. Personally I'd go further than notifying authors and instead ask for permission. We already have a problem where many people (including notably highly accomplished authors) feel discouraged from posting due to the fear of losing reputation. Worse, your friends will actually read reviews of your work, unlike OpenReview. And I wouldn't want to make this worse by implicitly making authors opt into a public peer review if that makes sense.

There are also some differences between forums and academia. Forums allow people to share unpolished work and see how the community reacts. I worry that highly visible public reviews may discourage some authors from posting this work, unless it's obvious that they won't get a highly visible negative review for their off-the-cuff thoughts without opting into it.  Which seems doable within your (very useful) approach. I agree there's a fine line here; just want to point out that not everyone is emotionally ready for this.

Comment by SoerenMind on Habryka's Shortform Feed · 2021-06-08T07:30:13.872Z · LW · GW

There's also a strong chance that delta is the most transmissible variant we know even without its immune evasion (source: I work on this, don't have a public source to share). I agree with your assessment that delta is a big deal.

Comment by SoerenMind on Suggestions of posts on the AF to review · 2021-06-08T07:18:02.601Z · LW · GW

This seems useful. But do you ask the authors for permission to review and give them an easy way out? Academic peer review is for good reasons usually non-public. The prospect of having one's work reviewed in public seems likely to be extremely emotionally uncomfortable for some authors and may discourage them from writing.

Comment by SoerenMind on The case for aligning narrowly superhuman models · 2021-05-22T18:43:39.068Z · LW · GW

Google seems to have solved some problem like the above for a multi-language-model (MUM):

"Say there’s really helpful information about Mt. Fuji written in Japanese; today, you probably won’t find it if you don’t search in Japanese. But MUM could transfer knowledge from sources across languages, and use those insights to find the most relevant results in your preferred language."

Comment by SoerenMind on MIRI location optimization (and related topics) discussion · 2021-05-16T16:52:08.304Z · LW · GW

Some reactions:

• The Oxford/London nexus seems like a nice combination. It's 38min by train between the two, plus getting to the stations (which in London can be a pain).
• Re intellectual life "behind the walls of the colleges": I haven't perceived much intellectual life in my college, and much more outside. Maybe the part inside the colleges is for undergraduates?
• I don't have experience with long-range commuting into Oxford. But you can commute in 10-15 minutes by bike from the surrounding villages like Botley / Headington.

Comment by SoerenMind on MIRI location optimization (and related topics) discussion · 2021-05-12T21:54:36.776Z · LW · GW

I don't think anyone has mentioned Oxford, UK yet? It's tiny. You could literally live on a farm here and still be 5-10 minutes from the city centre. And obviously it's a realistic place for a rationalist hub. I haven't perceived anti-tech sentiment here but haven't paid attention either.

Comment by SoerenMind on Three reasons to expect long AI timelines · 2021-04-25T10:11:00.176Z · LW · GW

I agree that 1-3 need more attention, thanks for raising them.

Many AI scientists in the 1950s and 1960s incorrectly expected that cracking computer chess would automatically crack other tasks as well.

There’s a simple disconnect here between chess and self-supervised learning. You're probably aware of it but it's worth mentioning. Chess algorithms were historically designed to win at chess. In contrast, the point of self-supervised learning is to extract representations that are useful in general. For example, to solve a new task we can feed the representations into a linear regression, another general algorithm. ML researchers have argued for ages that this should work and we already have plenty of evidence that it does.

Comment by SoerenMind on The case for aligning narrowly superhuman models · 2021-03-24T17:17:07.597Z · LW · GW

How useful would it be to work on a problem where what the LM "knows" cannot be superhuman, but it still knows how to do well and needs to be incentivized to do so? A currently prominent example problem is that LMs produce "toxic" content:
https://lilianweng.github.io/lil-log/2021/03/21/reducing-toxicity-in-language-models.html

Comment by SoerenMind on Demand offsetting · 2021-03-23T18:00:55.861Z · LW · GW

Put differently, buying eggs only hurt hens via some indirect market effects, and I’m now offsetting my harm at that level before it turns into any actual harm to a hen.

I probably misunderstand but isn't this also true about other offsetting schemes like convincing people to go vegetarian? They also lower demand.

Comment by SoerenMind on Acetylcholine = Learning rate (aka plasticity) · 2021-03-18T14:21:28.148Z · LW · GW

Related: acetylcholine has been hypothesized to signal to the rest of the brain that unfamiliar/uncertain things are about to happen:
https://www.sciencedirect.com/science/article/pii/S0896627305003624
http://www.gatsby.ucl.ac.uk/~dayan/papers/yud2002.pdf

Comment by SoerenMind on Where is human level on text prediction? (GPTs task) · 2021-03-03T19:19:44.518Z · LW · GW

FWIW I wouldn't read much into it if LMs were outperforming humans at next-word-prediction. You can improve on it by having superhuman memory and doing things like analyzing the author's vocabulary. I may misremember but I thought we've already outperformed humans on some LM dataset?

Comment by SoerenMind on Will OpenAI's work unintentionally increase existential risks related to AI? · 2021-01-04T16:59:59.372Z · LW · GW

No. Amodei led the GPT-3 project, he's clearly not opposed to scaling things. Idk why they're leaving but since they're all starting a new thing together, I presume that's the reason.

Comment by SoerenMind on New SARS-CoV-2 variant · 2020-12-21T19:12:54.793Z · LW · GW

Some expert commentary here:  https://www.sciencemag.org/news/2020/12/mutant-coronavirus-united-kingdom-sets-alarms-its-importance-remains-unclear

Noteworthy:

• We previously thought a strain from Spain was spreading faster than the rest but it was just because of people returning from holiday in Spain.
• Chance events can help a strain spread faster.
• The UK (and Denmark) do more gene sequencing than other countries - that may explain why they picked up the new variant first.
• The strain has acquired 17 mutations at once which is very high. Not clear what that means.

Comment by SoerenMind on Continuing the takeoffs debate · 2020-11-24T11:00:08.501Z · LW · GW

For example, moving from a 90% chance to a 95% chance of copying a skill correctly doubles the expected length of any given transmission chain, allowing much faster cultural accumulation. This suggests that there’s a naturally abrupt increase in the usefulness of culture

This makes sense when there's only one type of thing to teach / imitate. But some things are easier to teach and imitate than others (e.g. catching a fish vs. building a house). And while there may be an abrupt jump in the ability to teach or imitate each particular skill, this argument doesn't show that there will be a jump in the number of skills that can be taught / imitated (which is what matters).
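For reference, the doubling claim in the quoted passage can be checked with a small sketch. It assumes each copying step succeeds independently with a fixed probability p, so the number of steps up to and including the first failed copy is geometric with mean 1 / (1 - p). The function name is hypothetical.

```python
def expected_chain_length(p_success):
    """Expected number of copying steps up to and including the first failure.

    Assumes independent steps with a fixed success probability, i.e. a
    geometric distribution with mean 1 / (1 - p_success).
    """
    return 1.0 / (1.0 - p_success)


length_90 = expected_chain_length(0.90)
length_95 = expected_chain_length(0.95)

print(f"p = 0.90: about {length_90:.1f} steps")
print(f"p = 0.95: about {length_95:.1f} steps")
print(f"ratio: about {length_95 / length_90:.2f}")
```

This only covers a single skill with a single fidelity; a population of skills with different values of p would each cross their own threshold at a different point.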

Comment by SoerenMind on Covid Covid Covid Covid Covid 10/29: All We Ever Talk About · 2020-11-01T11:25:40.762Z · LW · GW

Right, to be clear that's the sort of number I have in mind and wouldn't call far far lower.

Comment by SoerenMind on Covid Covid Covid Covid Covid 10/29: All We Ever Talk About · 2020-10-31T11:38:19.679Z · LW · GW

the infection fatality rate is far, far lower [now]

Just registering that, based on my reading of people who study the IFR over time, this is a highly contentious claim especially in the US.

Comment by SoerenMind on interpreting GPT: the logit lens · 2020-08-31T22:56:19.199Z · LW · GW

Are these known facts? If not, I think there's a paper in here.

Comment by SoerenMind on Will OpenAI's work unintentionally increase existential risks related to AI? · 2020-08-21T14:32:08.715Z · LW · GW

But what if they reach AGI during their speed up?

I agree, but I think it's unlikely OpenAI will be the first to build AGI.

(Except maybe if it turns out AGI isn't economically viable).

Comment by SoerenMind on Will OpenAI's work unintentionally increase existential risks related to AI? · 2020-08-17T19:13:39.771Z · LW · GW

OpenAI's work speeds up progress, but in a way that likely smooths progress later on. If you spend as much compute as possible now, you reduce potential surprises in the future.

Comment by SoerenMind on Are we in an AI overhang? · 2020-08-02T22:02:09.752Z · LW · GW

Last year it only took Google Brain half a year to make a Transformer 8x larger than GPT-2 (the T5). And they concluded that model size is a key component of progress. So I won't be surprised if they release something with a trillion parameters this year.

Comment by SoerenMind on Delegate a Forecast · 2020-07-30T19:49:55.138Z · LW · GW

I'm not sure if a probability counts as continuous?

If so, what's the probability that this paper would get into Nature (main journal) if submitted? Or even better, how much more likely is it to get into The Lancet Public Health vs Nature? I can give context by PM. https://doi.org/10.1101/2020.05.28.20116129

Comment by SoerenMind on The Puzzling Linearity of COVID-19 · 2020-06-30T14:03:59.371Z · LW · GW

https://www.medrxiv.org/content/10.1101/2020.05.22.20110403v1

"Why are most COVID-19 infection curves linear?

Many countries have passed their first COVID-19 epidemic peak. Traditional epidemiological models describe this as a result of non-pharmaceutical interventions that pushed the growth rate below the recovery rate. In this new phase of the pandemic many countries show an almost linear growth of confirmed cases for extended time-periods. This new containment regime is hard to explain by traditional models where infection numbers either grow explosively until herd immunity is reached, or the epidemic is completely suppressed (zero new cases). Here we offer an explanation of this puzzling observation based on the structure of contact networks. We show that for any given transmission rate there exists a critical number of social contacts, Dc, below which linear growth and low infection prevalence must occur. Above Dc traditional epidemiological dynamics takes place, as e.g. in SIR-type models. When calibrating our corresponding model to empirical estimates of the transmission rate and the number of days being contagious, we find Dc ~ 7.2. Assuming realistic contact networks with a degree of about 5, and assuming that lockdown measures would reduce that to household-size (about 2.5), we reproduce actual infection curves with a remarkable precision, without fitting or fine-tuning of parameters. In particular we compare the US and Austria, as examples for one country that initially did not impose measures and one that responded with a severe lockdown early on. Our findings question the applicability of standard compartmental models to describe the COVID-19 containment phase. The probability to observe linear growth in these is practically zero."

Comment by SoerenMind on The ground of optimization · 2020-06-22T23:42:14.710Z · LW · GW

Seconded that the academic style really helped, particularly discussing the problem and prior work early on. One classic introduction paragraph that I was missing is "what have prior works left unaddressed?".

Comment by SoerenMind on FHI paper on COVID-19 government countermeasures · 2020-06-08T23:26:03.789Z · LW · GW

Think of it like one-sided vs two-sided. You can have a 95% CI that overlaps with zero, like [-2, 30], because 2.5% of the probability mass is on >30 and 2.5% on <-2, but still the probability of >0 effect can be >95%. This can also happen with Frequentist CIs.

A credible interval is the Bayesian analog to a confidence interval.
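A quick numerical sketch of the [-2, 30] example, assuming (purely for illustration) that the posterior is normal and that the interval is the central 95% interval. It uses only Python's standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

lower, upper = -2.0, 30.0  # the two-sided 95% interval from the example

# Assumption: the posterior is normal, so the interval is mean +/- z * sd.
z = NormalDist().inv_cdf(0.975)
mean = (lower + upper) / 2
sd = (upper - lower) / (2 * z)
posterior = NormalDist(mean, sd)

# One-sided question: how much probability mass lies above zero?
prob_positive = 1 - posterior.cdf(0.0)
print(f"P(effect > 0) = {prob_positive:.3f}")
```

Under these assumptions the interval overlaps zero while the probability of a positive effect is still a bit above 95%, which is the one-sided vs two-sided distinction.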

Comment by SoerenMind on FHI paper on COVID-19 government countermeasures · 2020-06-08T23:14:13.747Z · LW · GW

We have no info on that, sorry. That's because we have a single feature which is switched on when most schools are closed. Universities were closed 75% of the time when that happened IIRC.

Comment by SoerenMind on How to do remote co-working · 2020-05-09T12:46:02.445Z · LW · GW

Yes these are also great options. I used them in the past but somehow didn't keep it up.

Co-working with a friend is a good option for people like myself who benefit from having someone who expects me to be there (and who I'm socially comfortable with).

Comment by SoerenMind on The Puzzling Linearity of COVID-19 · 2020-04-25T02:53:53.433Z · LW · GW

Maybe this is not the type of explanation you're looking for but logistic curves (and other S-curves) look linear for surprisingly long.
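A small sketch of how close to linear a logistic curve is around its midpoint. It uses the standard logistic function scaled to a final size of 1 and compares it with its tangent line at the inflection point over the stretch where the curve goes from 25% to 75% of its final size. The window choice and names are illustrative assumptions.

```python
import math


def logistic(t):
    """Standard logistic curve with final size 1 and midpoint at t = 0."""
    return 1.0 / (1.0 + math.exp(-t))


def tangent_at_midpoint(t):
    """Straight line through the midpoint with the curve's maximum slope (1/4)."""
    return 0.5 + t / 4.0


# The curve passes 25% and 75% of its final size at t = -ln(3) and t = +ln(3).
t_end = math.log(3.0)
grid = [-t_end + 2 * t_end * i / 200 for i in range(201)]

max_gap = max(abs(logistic(t) - tangent_at_midpoint(t)) for t in grid)
print(f"largest gap from the straight line: {max_gap:.4f} of final size")
```

Over the middle half of the total growth, the straight line stays within a few percent of the final size, which is easy to miss in noisy case data.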

Comment by SoerenMind on Will COVID-19 survivors suffer lasting disability at a high rate? · 2020-04-21T23:27:28.613Z · LW · GW

The second study has a classic 'adjusting for observed confounders' methodology which comes with classic limitations such as that you don't observe all confounders. For example, they control for alcohol, drug abuse, but not smoking (!)

The first study also acknowledges possible confounding but I haven't checked it in detail.

Comment by SoerenMind on Any rebuttals of Christiano and AI Impacts on takeoff speeds? · 2020-03-30T22:16:46.847Z · LW · GW

Looking forward to it :)

Comment by SoerenMind on AGI in a vulnerable world · 2020-03-30T22:02:19.827Z · LW · GW

I'm using the colloquial meaning of 'marginal' = 'not large'.

Comment by SoerenMind on AGI in a vulnerable world · 2020-03-27T13:40:32.672Z · LW · GW

Hmm, in my model most of the x-risk is gone if there is no incentive to deploy. But I expect actors will deploy systems because their system is aligned with a proxy. At least this leads to short-term gains. Maybe the crux is that you expect these actors to suffer a large private harm (death) and I expect a small private harm (for each system, a marginal distributed harm to all of society)?

Comment by SoerenMind on AGI in a vulnerable world · 2020-03-27T13:08:31.714Z · LW · GW

I agree that coordination between mutually aligned AIs is plausible.

I think such coordination is less likely in our example because we can probably anticipate and avoid it for human-level AGI.

I also think there are strong commercial incentives to avoid building mutually aligned AGIs. You can't sell (access to) a system if there is no reason to believe the system will help your customer. Rather, I expect systems to be fine-tuned for each task, as in the current paradigm. (The systems may successfully resist fine-tuning once they become sufficiently advanced.)

I'll also add that two copies of the same system are not necessarily mutually aligned. See for example debate and other self-play algorithms.

Comment by SoerenMind on AGI in a vulnerable world · 2020-03-26T18:10:58.245Z · LW · GW

This reasoning can break if deployment turns out to be very cheap (i.e. low marginal cost compared to fixed cost); then there will be lots of copies of the most impressive system. Then it matters a lot who uses the copies. Are they kept secret and only deployed for internal use? Or are they sold in some form? (E.g. the supplier sells access to its system so customers can fine-tune e.g. to do financial trading.)