Posts

Comments

Comment by sun_harmonics on [deleted post] 2024-10-20T04:22:46.517Z

In Machines of Loving Grace, Dario Amodei writes, “...[biological] data is often lacking—not so much in quantity, but quality: there is always a dearth of clear, unambiguous data that isolates a biological effect of interest from the other 10,000 confounding things that are going on, or that intervenes causally in a given process, or that directly measures some effect (as opposed to inferring its consequences in some indirect or noisy way)... Given all this, many biologists have long been skeptical of the value of AI and “big data” more generally in biology. … there’s still a perception that AI is (and will continue to be) useful in only a limited set of circumstances. A common formulation is “AI can do a better job analyzing your data, but it can’t produce more data or improve the quality of the data. Garbage in, garbage out”. But I think that pessimistic perspective is thinking about AI in the wrong way. If our core hypothesis about AI progress is correct, then the right way to think of AI is not as a method of data analysis, but as a virtual biologist who performs all the tasks biologists do…”

The lab MedARC, for example, analyzes large amounts of brain imaging data with machine learning pipelines, but neuroscientists are critical of this type of study: https://www.reddit.com/r/datascience/s/eGS8hZrRTn

Open labs such as MedARC, which recruit volunteers to work on superficially attractive subjects, don't seem to be good opportunities. In reality, many of the tasks are probably fake work, and the labs offer no real compensation. Amodei's discussion of data in biology might make us cautious about other tasks in biology too, but I think this is an instance of a general sort of error that might occur in any subject.