What's holding back outsourcing to cloud labs?

post by ChristianKl · 2020-10-28T23:07:19.509Z · LW · GW · 1 comment

This is a question post.

Contents

  Answers
    7 ryan_b
    2 jeff_y
1 comment

I would have expected Emerald Cloud Lab or similar competitors to grow a lot and be successful over the last five years. As far as I know, Emerald Cloud Lab has had only modest growth, and no competitors have grown strongly. Outsourcing to cloud labs seems like it gives the laboratory benefits of scale and virtualization that drive down costs, and is easier to use than working in a wet lab. Is there something holding back this trend that I'm not seeing? Alternatively, what's going on?

Answers

answer by ryan_b · 2020-10-29T15:06:44.679Z · LW(p) · GW(p)

Disclaimer: I do not work in a lab, and never have beyond a short stint as a research assistant in undergrad.

That being said, I can think of several reasons. In no particular order:

  • Habit: researchers are used to what they have been doing, and do not want to change.
  • Control: many are hands-on scientists who like to roll up their sleeves.
  • Secrecy: if a cloud lab does the experiment, then a cloud lab has the data.
  • Not Free: why spend money on a cloud lab when they have already-paid-for equipment and interns in their lab?
  • Training: most of the actual routine work is done by grad students; this is a critical part of their training as scientists. If experiments are outsourced, how will they learn to use the equipment and to design experiments of their own?
  • Cognitive burden: another tool chain to remember? It's not even a python script!
  • Publication bias: I have read several accounts of papers being rejected because they used the wrong code in their analysis; the reviewers preferred R or Python. Do any journals accept a description of a proprietary software workflow from a single company in the methods section?
  • Experimental design: things have improved from a replication standpoint since the replication crisis, but I don't see much movement on the bias towards novel results. The scalability argument Emerald Cloud Labs is making doesn't appeal as much if the goal of an experimenter is to design the most novel possible experiment.
  • Inadequate discovery equilibrium: this is essentially another facet of the previous point, but researchers may assume that because the experiments are so easy to replicate and scale that their efforts will not be sufficiently rewarded, even if they can think of good experiments to run.
  • Too few non-academic researchers: business investment in R&D has plummeted from its previous levels, as most corporations moved to shift investment into shorter-payoff projects. They are likely not even evaluating this kind of product anymore.
  • Competition from computers: a significant chunk of the big data/machine learning revolution is going into producing better models and simulations; this directly competes with the repeatability and scalability pitch that cloud labs are making. Come to think of it, the best use might be validating or building a model or simulation.
comment by ChristianKl · 2020-10-29T22:11:01.776Z · LW(p) · GW(p)

Competition from computers: a significant chunk of the big data/machine learning revolution is going into producing better models and simulations; this directly competes with the repeatability and scalability pitch that cloud labs are making. Come to think of it, the best use might be validating or building a model or simulation.

My expectation is that nothing coming out of big data/machine learning models at the moment is going to be trusted directly; it needs to be verified in an actual experiment. Do you believe differently?

Replies from: ryan_b
comment by ryan_b · 2020-10-30T14:07:29.986Z · LW(p) · GW(p)

Only slightly, and that is a matter of emphasis. In my view the crux of the matter is that the relationship between modelling and the traditional lab is very similar to the relationship between a cloud lab and the traditional lab; both add value by improving scale and repetition.

Weighing against my point, it does appear to me that the areas where modelling is emphasized the most are areas where experiments are very difficult or impossible, like nuclear fusion or climate science.

I do not see anywhere on Emerald Cloud Labs' website claims that they offer experiments which cannot be achieved in a traditional lab. This leads me to suspect that the feedback loop between modelling and the traditional lab is better than that between a cloud lab and a traditional lab, because in spite of the similar value-add pitch, it remains the case that the cloud lab is primarily a substitute for the traditional lab, and modelling is primarily a complement.

Another detail I thought of: we remain stuck very much in the mode of hypothesis->experiment->data being a package deal. If it became popular to disentangle them, like through likelihood functions or through one of the compression paradigms [LW · GW], then bulk data generation would become independently valuable and it would make a lot of sense to run lots of permutations of the same basic experiment, without even a specific hypothesis in mind.

Replies from: ChristianKl
comment by ChristianKl · 2020-10-30T16:22:48.680Z · LW(p) · GW(p)

I do not see anywhere on Emerald Cloud Labs' website claims that they offer experiments which cannot be achieved in a traditional lab. 

Yes, it's more about being able to do experiments more efficiently than about making new kinds of experiments.

The problem with modeling is that the modeling results are not the real world. If you care about which molecule binds to which protein, you can model a lot of different reactions to find good candidates to validate in real experiments. The cloud lab actually gives you the real experiment.

answer by jeff_y · 2021-11-02T15:28:49.778Z · LW(p) · GW(p)

ECL is a good fit for my use case, which is running a small-scale, independent, non-profit research lab. I agree with a lot of what @ryan_b said. In addition, it's very expensive. Billed annually, a nonprofit can get the price down to $24k / month. There's a similar discount for startups. Pay monthly, and add some tutoring, and you're looking at almost $49k / month.

In comparison, renting a bench (with standard equipment) in Cambridge, MA costs around $3k - 4k / month, and other bio incubators might charge as little as $500 / month.
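
To put the figures above side by side, here is a rough sketch that annualizes them. The numbers are only the rounded monthly prices quoted in this answer ("almost $49k" is treated as 49,000, which is an assumption), and the dictionary labels and the `ratio` helper are made up for illustration.

```python
# Monthly prices in USD, taken from the rounded figures quoted in the answer.
# "almost $49k" is approximated as 49_000 (assumption).
monthly_usd = {
    "ECL, nonprofit, billed annually": 24_000,
    "ECL, monthly plus tutoring": 49_000,
    "Cambridge bench, low": 3_000,
    "Cambridge bench, high": 4_000,
    "cheapest bio incubator": 500,
}

# Annualized cost for each option.
annual_usd = {name: 12 * price for name, price in monthly_usd.items()}


def ratio(a: str, b: str) -> float:
    """How many times more expensive option a is than option b (hypothetical helper)."""
    return monthly_usd[a] / monthly_usd[b]


for name, cost in annual_usd.items():
    print(f"{name}: ${cost:,} / year")
```

On these numbers, the discounted ECL price is roughly 6 to 8 times the cost of a Cambridge bench and about 48 times the cheapest incubator figure.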

1 comment

Comments sorted by top scores.

comment by johnswentworth · 2020-10-29T16:28:41.147Z · LW(p) · GW(p)

Possibly-relevant subquestion: how do grantmakers feel about grantees using cloud labs?