How and why to turn everything into audio

post by KatWoods (ea247), AmberDawn · 2022-08-11T08:55:55.871Z · LW · GW · 20 comments

Contents

    Read while doing other things
    Listen and read at the same time
  How to convert text to audio
    What I look for in a text-to-voice app
    Apps I use and recommend
      Evie for books on your phone
      @Voice for articles on your phone
      Natural Reader for articles on your computer
    Other apps
    You can get EA-related audio content from the Nonlinear Library
20 comments

If you love podcasts and audiobooks and find yourself occasionally facing that particular nerd-torture of discovering that an obscure book isn’t available on Audible, read on.

I’m kind of obsessed with listening to content (hence building the Nonlinear Library [EA · GW]), and there are easy ways to turn pretty much all reading materials into audio, including most books and even blog posts, like LessWrong.

In this post I’ll share my system for turning everything into audio, along with my rationale, for people who haven’t yet discovered the joys of reading with their ears.

If you’re already sold on listening to everything, skip to the section “Apps I use and recommend” for the practical nitty-gritty of how to turn everything into audio.

Read while doing other things

Have you ever reluctantly dragged yourself away from a really engaging EA post [EA · GW] so that you could make dinner or drive to work? Have you ever procrastinated on doing the household chores because what you’re reading is so much more exciting and probably higher impact too, now that you think about it? If you have an audio version, you don’t have to choose between reading and chores: you can keep reading as you cook or travel. It’s a way to actually productively multi-task.

Generally, if you convert your books to audio, you can read at times when your mind is not occupied, but holding a physical book would be inconvenient. For example, you can read while you are:

With audiobooks, you can consume content almost all the time, if you want. This is great if you’re a bibliophile whose bookshelves and internet browsers are overflowing with enticing unread material: you can get through more content and make the more mundane parts of your day more interesting.

Listen and read at the same time

A little-known life hack: you can read a book with your eyes and listen to the audio version at the same time. It’s called immersive reading or two-channel reading. I find that this requires much less energy than reading on its own. It’s more like a movie, where if your attention flags, it’ll draw your mind back in. There’s also something about engaging more of your senses. When I do this, I can stay focused for longer and reading feels more relaxing. So, if you’d like to read but are feeling tired or are struggling to read something important but kind of dry [EA · GW], try listening and reading at the same time.

How to convert text to audio

Lots of people like the idea of listening to books and articles, but aren’t sure how to get texts into audio format. Here’s what I look for in a text-to-speech app, and some recommendations of specific apps to try.

What I look for in a text-to-voice app

In my opinion, the best text-to-voice apps display the text you’re reading as they play the audio, and allow you to skip around by clicking on the text. This means that if you don’t catch something, or your attention wanders, you can easily go back to the last part you heard. You can also skim the text and skip forward if you’re bored or if the current section isn’t relevant.

Here’s an example of what I mean:


On voices, sometimes people give up on these apps very quickly because they find the reading voices annoying.

Firstly, if you haven't tried for a while, try again. The voices have gotten way better in the last couple of years. Something to do with AI getting better at a completely non-terrifying pace. Nothing to worry about there...

Secondly, if you find the voices weird to start with, I suggest that you persevere for a while - most people get used to the voices after only a few hours. I even feel affectionate and nostalgic towards some of the weird robot voices in my apps - I’ve read so many books that way!

I also have an Android so I don’t know what apps are best if you have an iPhone. If you have any recommendations for people with different phones, please post them in the comments!

In fact, make recommendations for any apps that you use in the comments. These apps work well enough for my uses, but they’ve all got their own issues and I make no claim that they’re the best out there.

Apps I use and recommend

Evie for books on your phone

My favorite app for listening to books is Evie. It’s free if you’re not fussy about voices; the nicer voices cost money.

Unfortunately, it’s only available on Android, and it only works on DRM-free books. Most ebooks that you buy on e-readers like Kindle are DRM-protected. There are two solutions to this:

  1. You buy your book the usual way, then get a DRM-free version from LibGen.
  2. You remove the DRM from your Kindle book using Epubor Ultimate, convert it to a PDF, then open it in Evie.

@Voice for articles on your phone

@Voice is my favorite app for listening to articles on your phone. It’s also free. It allows you to make playlists of articles that you want to read. Also, like Evie, you can see the text as you listen and skip around.

Natural Reader for articles on your computer

Natural Reader is my go-to for listening to articles on a computer. It’s more difficult to pause than @Voice and Evie, and you can’t make playlists, but it’s convenient if you’re on a computer and want to listen as you read.

Other apps

The Nonlinear Library uses Beyondwords.io and Amazon Polly voices. These work well, but they’re designed for industrial use; they don’t work as well for personal use. They’re a better fit if you have a regular source of content that you want to convert for a large number of people and send to podcast platforms. If you want to do this for your personal use, Tayl is probably better, though I don’t have as much experience with it.

Lots of people like Audible, and the voices are recorded by humans rather than generated by text-to-speech algorithms. However, you can only listen to audiobooks that they have in their collection, so you can’t find an article and listen to it with Audible. Additionally, it’s expensive and doesn’t show you where you are in the text. Still, if you have a strong preference for non-robotic voices, this might be the app for you.

Kindle has an immersive reading mode that allows you to listen and read at the same time if you own both the ebook and audiobook. You can often buy the audiobook at a discount if you buy it along with the ebook. This is more expensive than using Evie but might be a good option for people who prefer human readers.

Many people really like Pocket, so you might want to give it a shot. For me and many others I’ve asked, it’s glitchy to the point of unusability. The main problems for me are that if you pause, it’ll often lose your place, and that it doesn’t have the feature where it highlights the text as you read and allows you to start from where you want. Its playlist feature is also extremely rigid.

These apps are what I use, but honestly, they’re still quite bad. For whatever reason, the TTS industry seems terrible, which is part of why I decided to build the Nonlinear Library [EA · GW] which automatically turns top rationalist content into podcast format. The Nonlinear Library now has separate feeds [EA · GW] for the EA Forum, LessWrong, and the Alignment Forum, as well as “top of all time” playlists [EA · GW] featuring around 400 of the most upvoted posts from each forum for your binging pleasure.

This post was written collaboratively by Kat Woods and Amber Dawn Ace as part of Nonlinear’s experimental Writing Internship program. The ideas are Kat’s; Kat explained them to Amber, and Amber wrote them up. We would like to offer this service to other EAs who want to share their as-yet unwritten ideas or expertise.

If you would be interested in working with Amber to write up your ideas, fill out this form.

20 comments

Comments sorted by top scores.

comment by Gunnar_Zarncke · 2022-08-11T11:11:16.797Z · LW(p) · GW(p)

Hearing may work better for some people, but for some, like me, reading works much better - and faster. Also: Humans Who Are Not Concentrating Are Not General Intelligences [LW · GW] - if you just passively absorb audio, then it better be aligned with your values (corollary: Also applies to music).

This may sound critical, but it is intended in the spirit of the Rule of Equal and Opposite Advice. I upvoted this because there is a lot of good advice, and there are people who need to hear it.

Some thoughts on the related topic of video can be found in this thread [LW(p) · GW(p)]. 

Replies from: angmoh
comment by angmoh · 2022-08-11T23:44:22.212Z · LW(p) · GW(p)

The speed issue is the #1 factor stopping me from trying audiobooks. A book might take me 4-8 hours to read but the internet tells me audio is 2-3x slower. I have a lot of other prejudices against audiobooks (flipping / skimming less easy, less focus on the task etc) but that's the main one.

Replies from: korin43, rsaarelm, phil-scadden, mikbp
comment by Brendan Long (korin43) · 2022-08-13T19:01:10.876Z · LW(p) · GW(p)

I largely agree, but most apps will let you listen to things at faster speeds if you want to (my max speed is 1.25x - 1.5x depending on content dryness though).

comment by rsaarelm · 2022-08-12T05:32:27.183Z · LW(p) · GW(p)

Still makes sense if you listen when walking or driving when you couldn't read a book anyway. I mostly listen to podcasts instead of audiobooks though, a book is a really long commitment compared to a podcast episode.

comment by Phil Scadden (phil-scadden) · 2022-08-12T01:19:13.462Z · LW(p) · GW(p)

Couldn't agree more. I have no patience for audio and video. Too slow. Might watch an instructional video if I can't find a decent manual. Not much into conferences either - just let me see the papers.

comment by mikbp · 2022-10-07T16:20:15.347Z · LW(p) · GW(p)

How can you read 2-3x faster than a person speaks (1x)? Do you mean that when you "read" you just skim most of the time and really read only the parts you are interested in?

As others mention, most readers allow you to increase the speed of the audio. Up to 2x, for light content and with headphones, is usually alright if you can concentrate on the audio. Faster than that, I find it really difficult to follow, so you are probably still faster.

Replies from: pjeby
comment by pjeby · 2022-10-08T00:56:48.317Z · LW(p) · GW(p)

How can you read 2-3x faster than a person speaks (1x)?

From Wikipedia:

Subvocalization readers (Mental readers) generally read at approximately 250 words per minute, auditory readers at approximately 450 words per minute and visual readers at approximately 700 words per minute. Proficient readers are able to read 280–350 wpm without compromising comprehension.

Conversational speech is generally 100 to 180 wpm, so even subvocalizing readers already have a leg up. "Proficient" readers by Wikipedia's definition are easily in the 2-3x range over this, and visual readers even more so.
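
The comparison above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the words-per-minute figures quoted in this comment; the function and constant names are made up for illustration, and the 300 wpm / 150 wpm pair in the last line is an arbitrary example, not a measured value.

```python
def speed_ratio_range(read_wpm, speech_wpm):
    """Return (lowest, highest) ratio of reading speed to speech speed.

    Both arguments are (low, high) tuples in words per minute.
    The lowest ratio pairs the slowest reader with the fastest speaker,
    the highest ratio pairs the fastest reader with the slowest speaker.
    """
    read_lo, read_hi = read_wpm
    speech_lo, speech_hi = speech_wpm
    return read_lo / speech_hi, read_hi / speech_lo


def playback_speed_to_match(read_wpm, narrator_wpm):
    """Playback multiplier at which audio keeps pace with silent reading."""
    return read_wpm / narrator_wpm


# Figures quoted in the comment above (words per minute).
CONVERSATIONAL_WPM = (100, 180)
READER_WPM = {
    "subvocalizing": (250, 250),
    "proficient": (280, 350),
    "auditory": (450, 450),
    "visual": (700, 700),
}

for label, wpm in READER_WPM.items():
    low, high = speed_ratio_range(wpm, CONVERSATIONAL_WPM)
    print(f"{label}: {low:.1f}x to {high:.1f}x conversational speech")

# Hypothetical example: a 300 wpm reader listening to a 150 wpm narrator.
print(f"playback speed needed: {playback_speed_to_match(300, 150):.2f}x")
```

With these figures, proficient readers come out at roughly 1.6x to 3.5x conversational speech, so the 2-3x estimate covers most of the range but not its low end.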

comment by Mitchell_Porter · 2022-08-11T18:32:35.447Z · LW(p) · GW(p)

What's the best way to turn audio to text?

Replies from: ea247, rsaarelm
comment by KatWoods (ea247) · 2022-08-12T09:34:19.439Z · LW(p) · GW(p)

I'm not super into that, but I've heard good things from people about Otter.ai

comment by rsaarelm · 2022-08-12T05:29:49.949Z · LW(p) · GW(p)

Podcast transcription services, probably. They seem to cost around $1 per minute nowadays. I expect they'll keep getting disrupted by AI. There are already audio transcription AIs, like the autogenerated subtitles on YouTube, but they get context-dependent ambiguous words wrong. It seems like an obvious idea to plug them into a GPT-style language model that can recognize the topic being talked about and use that to pick an appropriate transcription for homonyms.
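
To make the idea above concrete, here is a toy sketch of topic-aware rescoring. It is not how any real transcription service works: a hand-written keyword table stands in for the language model, and the confusion sets, topic vocabularies, and function names are all hypothetical.

```python
# Hypothetical sets of words a transcriber might confuse because they sound alike.
CONFUSION_SETS = [
    {"bass", "base"},
    {"cell", "sell"},
]

# Hypothetical topic vocabularies, standing in for a language model's knowledge.
TOPIC_WORDS = {
    "biology": {"cell", "protein", "membrane", "dna"},
    "music": {"bass", "guitar", "drum", "chord"},
}


def guess_topic(words):
    """Pick the topic whose vocabulary overlaps most with the transcript."""
    scores = {
        topic: sum(word in vocab for word in words)
        for topic, vocab in TOPIC_WORDS.items()
    }
    return max(scores, key=scores.get)


def rescore(transcript):
    """Swap each ambiguous word for the sound-alike that fits the guessed topic."""
    words = transcript.lower().split()
    vocab = TOPIC_WORDS[guess_topic(words)]
    fixed = []
    for word in words:
        # Find the confusion set containing this word, if there is one.
        alternatives = next((s for s in CONFUSION_SETS if word in s), {word})
        on_topic = sorted(alternatives & vocab)
        # Keep the original word when no alternative matches the topic.
        fixed.append(on_topic[0] if on_topic else word)
    return " ".join(fixed)


print(rescore("the membrane of the sell contains protein"))
print(rescore("the base guitar and drum"))
```

A real system would score whole candidate sentences with a language model rather than count keywords, but the shape is the same: use the surrounding context to choose among words that sound alike.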

comment by mikbp · 2022-10-07T17:10:23.221Z · LW(p) · GW(p)

I love to listen to stuff. Books, articles, podcasts, music, radio... But I must warn the audience that modern life leaves us very few opportunities for mind-wandering, and that listening to articles/books while doing chores, commuting, exercising, etc. diminishes these even further. So be mindful of this and try freeing some time for you to wander / not concentrate on anything.

comment by MondSemmel · 2023-08-08T15:07:14.995Z · LW(p) · GW(p)

Thanks for this post! From what I can tell, there have been a bunch of advancements in this field in the last year, though. Given that, I'm wondering whether you still use and recommend this current set of apps and services. Or do you have any updated recommendations?

Replies from: ea247
comment by KatWoods (ea247) · 2023-08-10T09:47:21.470Z · LW(p) · GW(p)

Yes, it has improved a lot! Due to laziness I haven't really switched, but Speechify seems like a winner compared to Natural Reader on a laptop. But I almost never read with my ears when I'm at my laptop, so can't say for sure. 

Inspired by this comment, I'm trying Speechify on my phone for a book and it's taking ages to "process" it, so I might give up. I do have bad internet right now, but also, I often have bad internet, so I can't work with something that requires good internet all the time. Also, I opened the epub in such a way that I can't seem to click on a word and then skip to that section. Or at least, I can, but it's super glitchy and selects a massive chunk.

Bear in mind, could also be that it just takes some getting used to. Evie is also a pain in the ass in many ways, but I'm just used to it now :P 

comment by mikbp · 2022-10-07T17:01:41.866Z · LW(p) · GW(p)

To read a website on your computer (blog post, news article, etc):

  • If you use Firefox, the built-in tool "Reader view" (you can access it by pressing F9 or clicking a small written-paper icon at the right-most side of the URL bar, left of the bookmarking star) has an option to listen to the text. You can control the speed (up to a point; it does not let you speed it up enough, in my view) and the voice. It is not awesome, but by the standards of the (free) text-to-speech options, I find it good and have used it very often. A very useful plus of the Reader view is that it indicates the approximate time one would need to read the text.
  • If you use Chrome, you can download an extension called Reader View. It basically does the same as the Firefox Reader View (and it looks very similar as well; I actually believe that this is deliberate). There are other extensions offering more or less the same. I settled on this one because it also indicates the approximate time one needs to read the text.

Reading PDFs with headers/footers:

  • Every app I tried sucks at reading PDFs with headers/footers! I have not found a way to make the reader ignore them. However, there is a hilarious workaround: open the PDF file with MS Word and use the built-in tool to read the text. It takes a while for Word to open long PDFs, but Word "understands" the PDF's headers and footers (it formats them as headers/footers in Word), and the reading tool does not read them.

None of these are perfect, but they are alright in my opinion. However, listening to (or reading+listening to) a text with too many citations is very tedious. (Free) readers do not handle them well. They read them at a very weird and slow pace.

comment by DirectedEvolution (AllAmericanBreakfast) · 2022-08-12T00:43:15.387Z · LW(p) · GW(p)

It's great to make readers aware of their options for TTS. I like your idea of adding an audio component to your reading. Readers can also just read out loud, although I do (weakly) find that it seems different to read aloud, vs. listening to a recording of yourself, vs. listening to a recording of someone else reading the text.

I also agree with your implied claim that the limiting factor on comprehension [LW · GW] is having the time to process it.

Without wanting to be critical of your good and useful post, I doubt that TTS versions will help much with the time-and-attention bottleneck for learning new information.

The reason I think TTS will not help appreciably with comprehending new information at a faster rate, at least among this audience, is that most adults can't passively absorb new, complex information. The "sponge" metaphor for learning is misleading.

 Information intake requires constant, fine-grained adjustments. If you didn't understand the last sentence, you need to go back and re-read it. If you've forgotten how the current argument ties into the overall piece, you need to scroll back even further. If figures and diagrams are mentioned, you need to be able to see them. If an unfamiliar word is used, or a reference calls your attention, you need to be able to follow that link. You need time to work through your own thoughts as they arise during the reading. And of course, you need to apply the information to solve problems, not just read it.

This is just a sample of the many micro-tasks that go into learning. Virtually all of them become impossible when you are doing other tasks, or absorbing the text in audiobook form.

I suspect there are certain learning tasks that could be done well via audiobook. Examples include speed reading and reading for entertainment, as well as practicing flashcards.

Replies from: TekhneMakre
comment by TekhneMakre · 2022-10-07T16:27:57.033Z · LW(p) · GW(p)

I find this is true in different amounts for different kinds of content. I'd never listen to a math book but would almost always prefer an audio history book if it exists.

comment by Gordon Seidoh Worley (gworley) · 2022-08-11T15:51:40.941Z · LW(p) · GW(p)

Thanks for the app suggestions. I use Pocket today but as you noted it has some limitations and can be glitchy at times. I'll give your suggested apps a try!

comment by TekhneMakre · 2022-10-07T16:29:11.824Z · LW(p) · GW(p)

I strongly dislike the robotic voices, but strongly like audio. I'm hoping that AI really dials in TTS. 

comment by mingyuan · 2022-08-16T10:02:56.845Z · LW(p) · GW(p)

I use WebOutLoud on iPhone. You can use it for free (with lots of free voices to choose from) and it allows you to follow along / skip around the text. It's not a mindblowingly perfect app, but I can't really think of anything in particular I'd change about it.

Also, just saying, when I started using TTS I was surprised at how natural-sounding computer-generated voices have become. Not that you'd mistake them for human, but the cadence is pretty decent.

comment by janshi · 2022-09-28T07:09:06.027Z · LW(p) · GW(p)

Hint: Macs and iOS devices come with built-in “accessibility” tools that read out loud everything on screen. The voices can be improved even more by downloading the “Siri enhanced” voice in the settings.