Five routes of access to scientific literature
post by DirectedEvolution (AllAmericanBreakfast) · 2022-07-03T20:53:47.044Z
Tl;dr: It seems like you can get access to a large amount of scientific literature on DeepDyve for $500. This might be enough to meet the needs of researchers outside the university system.
The eleventh virtue is scholarship. Study many sciences and absorb their power as your own. Each field that you consume makes you larger. If you swallow enough sciences the gaps between them will diminish and your knowledge will become a unified whole. If you are gluttonous you will become vaster than mountains. It is especially important to eat math and science which impinges upon rationality: Evolutionary psychology, heuristics and biases, social psychology, probability theory, decision theory. But these cannot be the only fields you study. The Art must have a purpose other than itself, or it collapses into infinite recursion. - Eliezer Yudkowsky, Twelve Virtues of Rationality
Information Wants To Be Free. Information also wants to be expensive... That tension will not go away. - Stewart Brand
We have a literature on LessWrong on the costs and benefits of scholarship. Yet we might still neglect it. For a good, broad treatment of content fragmentation, check out Who Has All the Content? from 2017. Here, I promote this virtue by helping the reader consider the tradeoffs of five routes of access to the scientific literature.
Epistemic limitations: I haven't used all the services I'm describing. This is based on my own research needs, and it's possible that others would find it easier or harder to access what they need. This isn't comprehensive: it doesn't look at the cost of textbooks, for example.
Moral limitations: Some people think pirating the scientific literature is wrong, others think that information wants to be free. I'm not taking a side here. I'm just listing practical options, as well as some reasons why people might practice scientific literature piracy - even when they have legal access to the article they're looking for.
Free legal options
Within-section Tl;dr: This is probably adequate if you rarely need access to scientific articles, but happen to need a handful of new papers for some non-urgent reason.
There are free or low-cost alternatives to piracy. Preprints sometimes appear on arXiv.org. Sometimes, universities may offer public access to their articles for a moderate fee. If you email the corresponding authors (often indicated by a little 'letter' icon next to their name in the byline), they will often send articles for free on request.
However, these alternatives are very slow. Last night, I spent about 3 hours doing biomedical research. In the process, I accessed 30 articles. In most of them, I needed to extract just a few pieces of information from the body of the article. Often, I wasn't completely sure whether it contained what I needed - or even exactly what piece of information I needed to find. This is the problem of "not knowing what you don't know."
Waiting for days or weeks for the chance of getting access to the article I need would be a little like trying to do scientific research with severe brain damage. Everything would take vastly longer than it should. I would waste a tremendous amount of time. It would be a nuisance, both for me and for the authors.
Piracy
Within-section Tl;dr: Sci-hub has old articles, but not many post-2021 articles. And who knows if they'll stick around?
There are ways to pirate scientific literature. Sci-hub seems to be the most common way, and is apparently now adding articles again. There is also a range of piratical alternatives to sci-hub. However, if I wanted to read a recent paper, such as "Delivering precision oncology to patients with cancer," published in Nature Medicine about 6 weeks ago, it's not on sci-hub, Library Genesis (libgen), z-library, or Pirate Bay. Scientific piracy may sound awesome, but, much like actual piracy, its golden age might be in the past.
Paying tuition
Within-section Tl;dr: If you have it, great, but this is primarily aimed at people who don't.
The most expensive way to access the scientific literature is probably by paying college tuition. As a master's student at a major US research university, I have institutional access to any scientific journal I want to read. One nice part is that I can copy/paste and download any articles my university provides access to. It may be possible to get institutional access by taking a single remote class as a non-matriculated student at a cheap university, but I don't know what the best option would be.
It's so annoying to use my institutional access to get articles that I often use sci-hub anyway. For some articles, I logged in once and never had to bother again. For others, I simply can't find a way to make that happen, and have to go through a slow, four-click search via my university's library web page to get to the article. It's faster just to pop over to sci-hub and get it, so I often do that instead when possible.
Paying for individual articles and personal subscriptions
Within-section Tl;dr: Unless you expect to read < 12 recent articles per year and you need them urgently, it's probably better to either email the authors or purchase a DeepDyve subscription.
I obtained a rough estimate of the cost of buying personal subscriptions to the journals I'd need to keep up with the times in my fields of interest.
I'm not reporting the specific article names, but this is an easy experiment to repeat for your own field. It took me a couple hours. My browser normally automatically logs me in via the institutional access I'm paying for, but I ensured articles were truly closed-access by looking at them in incognito mode to reveal the paywall.
Old research
Since I'm in the life sciences, I looked up the cost of subscribing to the top 20 journals in Life and Earth Sciences by the 2022 Google Scholar metrics. Most are open-access. Several are in Nature, and a Nature+ subscription gives you access to all their articles for $30/month. The lowest annual total cost added up to $1567/year, which in some cases reflects a 15% discount for buying a three-year subscription.
I also used my browser history from that 30-article research binge I did last night. Again, most articles were open-access. Others were in journals on the Google Scholar top 20 list I'd just priced. Beyond these, Annual Review of Pharmacology and Toxicology and Cell Chemical Biology offer annual personal subscriptions for a combined total of $655. On top of that, Pharmaceutical Research and Journal of Biomedical Materials Research appear to only offer institutional subscriptions. Individual readers only have the option to buy access to individual articles, which would have added $52 to the cost of my research binge.
New research
All of the articles from my research binge were open access, on sci-hub, or had preprints on bioRxiv. This is in part because I was studying old literature on a topic that was new to me, rather than attempting to keep up with the latest. To look at this issue, I searched for the name of my field in Google Scholar, limited to articles published in 2022. I took the top 10 articles that popped up and priced them.
Almost none of them had personal subscription options, and 8/10 were paid access. Purchasing the 7 paywalled individual articles that had no subscription option would have cost $246, and the subscription to Critical Reviews in Analytical Chemistry would have cost an additional $391. Within the 7 paywalled, institutional access-only articles, there were 5 journals, with 2 of the journals each containing 2 of the articles.
This seems to be the area in which "keeping up with the latest" becomes extremely expensive for those without institutional access. Of course, it may also be the area where people without institutional access have the least need to read the articles.
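For readers who want to repeat the pricing exercise in their own field, here is a rough Python tally of the figures quoted in this section. The dollar amounts come from the text; the `CostItem` structure and the `recurring` flag are my own illustrative choices, and treating the $391 subscription as an annual price is an assumption. Adding one-off article purchases to annual subscriptions gives only a ballpark total, not a true yearly budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostItem:
    label: str
    usd: int
    recurring: bool  # True for subscriptions, False for one-off article purchases


# Dollar figures as quoted in the post; the labels are shorthand.
ITEMS = [
    CostItem("Top-20 journal subscriptions (lowest annual total)", 1567, True),
    CostItem("Annual Review of Pharmacology and Toxicology + Cell Chemical Biology", 655, True),
    CostItem("Single articles from the research binge", 52, False),
    CostItem("7 single articles from the 2022 sample", 246, False),
    # Assumption: the $391 subscription price is per year.
    CostItem("Critical Reviews in Analytical Chemistry subscription", 391, True),
]


def tally(items):
    """Split the costs into recurring and one-off, and add them up."""
    recurring = sum(item.usd for item in items if item.recurring)
    one_off = sum(item.usd for item in items if not item.recurring)
    return {"recurring": recurring, "one_off": one_off, "total": recurring + one_off}


if __name__ == "__main__":
    for key, usd in tally(ITEMS).items():
        print(f"{key:>9}: ${usd}")
```

Swapping in your own journals and article prices in `ITEMS` is all it takes to rerun the estimate.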
Real problems
A use case where this may create real problems is a student in between their undergraduate and PhD work, or in between an MS and a PhD. They might want to keep up with their field, both to prepare for and to choose their university and research direction, and be unable to do so. Another example is an entrepreneur who wants to launch a biotechnology startup. A third example is a freelance biomedical researcher, or a patient who is trying to research their own medical issues.
Paying for DeepDyve
Within-section Tl;dr: This is probably the way to go if you need access to new literature.
DeepDyve is a business offering access to "100M+ papers." Their Pro subscription costs $499/year. Given the pricing information I found above, this seems to be a much better option for high-demand users than buying personal subscriptions and individual articles. It provided access to all the articles listed in "New Research," where accessibility was lowest via both personal purchases and piracy.
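One way to sanity-check that comparison is a break-even calculation. The sketch below assumes that the average price of the 7 single-article purchases in the 2022 sample ($246 in total) is representative of a typical paywalled article, which may not hold in other fields. The function names are hypothetical.

```python
import math

# Figures quoted in the post (USD).
DEEPDYVE_PRO_PER_YEAR = 499
SAMPLE_ARTICLE_TOTAL = 246  # 7 single-article purchases in the 2022 sample
SAMPLE_ARTICLE_COUNT = 7

# Assumption: the sample average stands in for a typical paywalled article.
AVG_ARTICLE_PRICE = SAMPLE_ARTICLE_TOTAL / SAMPLE_ARTICLE_COUNT


def breakeven_articles(annual_fee: float, price_per_article: float) -> int:
    """Smallest number of single-article purchases costing at least the annual fee."""
    return math.ceil(annual_fee / price_per_article)


def cheaper_option(articles_per_year: int) -> str:
    """Compare buying each article individually against the flat annual fee."""
    pay_per_article = articles_per_year * AVG_ARTICLE_PRICE
    return "per-article" if pay_per_article < DEEPDYVE_PRO_PER_YEAR else "subscription"


if __name__ == "__main__":
    print(breakeven_articles(DEEPDYVE_PRO_PER_YEAR, AVG_ARTICLE_PRICE))
    print(cheaper_option(10), cheaper_option(30))
```

With these inputs the break-even lands at roughly a dozen or so articles per year, in the same neighborhood as the threshold in the Tl;dr of the previous section. The exact number is sensitive to the per-article price you assume.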
I haven't used DeepDyve beyond just a cursory examination of its search results, and I can't vouch for its quality or coverage of other fields. Given what I know at this time, it's where I'd start, should I find myself cut off from institutional access. It might even cover all my needs.
One limitation of DeepDyve is that it only gives you access to reading online, not downloading or printing. Most crucially for me, this appears to prohibit copy and paste. I copy important bits of articles constantly, and this would be a crucial inhibitor for me. The way I'd most likely deal with this is by using the "snip" feature on my computer to take a screenshot of the text I want, then using image-to-text to convert it. This would slow me down, but not by too much.
As a side note, one of the most helpful research tools I've found in the last two years is the ability to take "snip" screenshots of a portion of your screen. On Windows, the "snip" app is an annoying popup window. I use Ubuntu, and can initiate a "snip" to my clipboard just by pressing ctrl-shift-a, clicking and dragging across the screen area I want to screenshot, and releasing the mouse button. I use this all day, every day.
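The screenshot-then-image-to-text workaround could be scripted along these lines. This is a minimal sketch, not something I've tested against DeepDyve: it assumes the open-source Tesseract OCR command-line tool is installed (the post doesn't name a specific image-to-text tool), and that the snip has been saved to a file. The function names are hypothetical.

```python
import re
import shutil
import subprocess


def build_ocr_command(image_path: str) -> list[str]:
    """Command line asking Tesseract to print recognized text to stdout."""
    return ["tesseract", image_path, "stdout"]


def tidy_ocr_text(raw: str) -> str:
    """Undo line wrapping so the text pastes cleanly into notes."""
    # Rejoin words split across a line break: "bio-\nmarker" -> "biomarker".
    # Caveat: this also merges genuine hyphenated compounds broken at a line end.
    joined = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Keep blank lines as paragraph breaks; turn single newlines into spaces.
    paragraphs = re.split(r"\n\s*\n", joined)
    return "\n\n".join(" ".join(p.split()) for p in paragraphs if p.strip())


def screenshot_to_text(image_path: str) -> str:
    """Run OCR on a saved screenshot; requires the tesseract binary on PATH."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    result = subprocess.run(
        build_ocr_command(image_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return tidy_ocr_text(result.stdout)
```

Calling `screenshot_to_text("snip.png")` on a saved snip would return the recognized text with line wrapping removed. OCR output still needs a quick proofread, especially for numbers and units.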
Conclusion and future directions
Unavoidably, this is an of-the-moment post rather than the sort of timeless content we typically prefer on LessWrong. I see this as a complement to articles like The Best Textbooks on Every Subject and The Best Software For Every Need. While this post focuses on content delivery, it would be great to have a review of tools for article discovery as well.
I'm also interested in "program discovery" or "lab discovery." How does a student with a particular set of research interests efficiently find a compatible lab in an area they want to live in? Right now, I'm doing this for myself by trawling through faculty publication lists at universities I'm considering, as well as looking at the institutional affiliations of papers I read. I also email faculty and ask them for suggestions, but usually don't get any useful information from them. Overall, this process is slow and inefficient. I'd love to see it done better.
In conclusion, let's have a toast to scholarship!