Podcast with Oli Habryka on LessWrong / Lightcone Infrastructure

post by DanielFilan · 2023-02-05T02:52:06.632Z · LW · GW · 20 comments

This is a link post for https://thefilancabinet.com/episodes/2023/02/05/6-oliver-habryka.html

OK sorry to over-advertise but it seemed like this one would be of interest to the LessWrong and EA communities. Episode description below, audio is here, or search for "The Filan Cabinet Habryka" wherever you listen to podcasts.


In this episode I speak with Oliver Habryka, head of Lightcone Infrastructure, the organization that runs the internet forum LessWrong, about his projects in the rationality and existential risk spaces. Topics we talk about include:

20 comments

Comments sorted by top scores.

comment by Rafael Harth (sil-ver) · 2023-02-05T23:59:08.782Z · LW(p) · GW(p)

That was great. I legit had no idea how big of a role Oliver played in making LW 2.0 happen; I always assumed he was just hired.

Replies from: MondSemmel
comment by MondSemmel · 2023-02-06T18:24:25.173Z · LW(p) · GW(p)

Agreed, this was a great podcast.

Replies from: DanielFilan
comment by DanielFilan · 2023-02-07T02:16:58.371Z · LW(p) · GW(p)

Glad to hear that people liked it :)

comment by Nisan · 2023-02-06T09:32:35.251Z · LW(p) · GW(p)

Is there a transcript available?

Replies from: MondSemmel, DanielFilan, MondSemmel
comment by MondSemmel · 2023-02-14T08:07:50.433Z · LW(p) · GW(p)

After way more effort than I thought it could possibly require, there is now a full transcript here [LW · GW].

Replies from: DanielFilan
comment by DanielFilan · 2023-02-14T08:28:24.360Z · LW(p) · GW(p)

Indeed - it feels like it should be so easy to turn audio into text. Did you do it by using Otter and then manually going over it? FWIW if you use rev.com, you can save a lot of time by spending quite a bit of money.

Replies from: MondSemmel
comment by MondSemmel · 2023-02-14T09:02:58.072Z · LW(p) · GW(p)

I used a service with an OpenAI Whisper backend as a first pass (specifically, revolvdiv this time), then manually transcribed everything, discovered that leaving all the speech filler words in made the transcript very hard to read, and did another editing pass.

I agree that, if I do this again in the future, rev.com would be a relevant choice.

Anyway, ultimately the hard part was not mainly turning audio into text, but doing so at a (self-inflicted, probably unreasonably) high standard of accuracy. No, even that's not quite right. The problem is that you want high accuracy (so you don't put words into someone's mouth), but not regarding the literal spoken words (which are full of filler, and word repetitions, and unintelligible mumbling, and sentences that don't have correct grammar - all because people don't speak like they write), but rather the meaning the speakers wanted to convey.

But also, this is the kind of thing at which one gets much better with experience, which I lacked.
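
To illustrate the filler-removal pass described above, here is a minimal Python sketch. It is not the workflow actually used for this transcript; the filler list, the regular expressions, and the function name `clean_transcript` are all assumptions made for illustration, and a real editing pass would still need a human to check that the meaning is preserved.

```python
import re

# Hypothetical filler list (an assumption, not taken from any real workflow).
# Each filler is removed together with an optional trailing comma and whitespace.
FILLER_PATTERN = re.compile(
    r"\b(?:um+|uh+|er+|you know|I mean)\b,?\s*",
    flags=re.IGNORECASE,
)

# Collapses immediate word repetitions such as "I I" or "the the".
# Caveat: this also collapses legitimate repeats like "had had".
REPEAT_PATTERN = re.compile(r"\b(\w+)(?:\s+\1\b)+", flags=re.IGNORECASE)


def clean_transcript(text: str) -> str:
    """Strip common filler words and stuttered repetitions from a transcript line."""
    text = FILLER_PATTERN.sub("", text)
    text = REPEAT_PATTERN.sub(r"\1", text)
    # Tidy up any doubled spaces left behind by the removals.
    text = re.sub(r"\s{2,}", " ", text)
    return text.strip()


if __name__ == "__main__":
    raw = "Um, so I I think the the forum is uh pretty good."
    print(clean_transcript(raw))
```

A purely mechanical pass like this only handles the easy cases; ungrammatical sentences and unintelligible passages still need manual judgment.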

comment by DanielFilan · 2023-02-06T16:42:09.601Z · LW(p) · GW(p)

No, sorry. Since a few people have asked: transcripts are pretty money- and time-consuming to produce, and I wanted to have a podcast where I make the trade-off of having more episodes but with less polish.

comment by MondSemmel · 2023-02-06T16:50:09.917Z · LW(p) · GW(p)

If there isn't, I recommend that the podcast creator consult e.g. the Clearer Thinking podcast team on how they do cost-effective, partly automated transcripts nowadays. Here's an article on their thinking from early 2022, which was before e.g. OpenAI Whisper was released.

I think this LW post would be significantly more useful with a full transcript, even if automated, for instance because it's easier to discuss quotes in the comments. (On the other hand, there's a risk of getting misquoted or directing excessive scrutiny to language that's less polished than it would be in essay form, or that may suffer from outright transcription errors.)

Replies from: DanielFilan
comment by DanielFilan · 2023-02-06T18:39:25.244Z · LW(p) · GW(p)

You mean this? FWIW I do transcriptions for AXRP - it takes a bunch of time to get something I perceive as "not embarrassingly bad", and as mentioned in the sister comment to yours, I'm basically making the tradeoff for this episode of publishing more episodes while having them be less polished.

Replies from: DanielFilan, MondSemmel
comment by DanielFilan · 2023-02-06T18:43:37.660Z · LW(p) · GW(p)

Concretely, I guess it would cost me ~$270 plus 2-4 hours of my time.

comment by MondSemmel · 2023-02-06T22:29:28.696Z · LW(p) · GW(p)

Yes, that's the article I meant. I understand the tradeoff you're making, and given the costs you cite, I can totally see that that's not worthwhile for you, especially if higher quality trades off against higher quantity.

That said, I've mailed the Clearer Thinking podcast team to ask about more details regarding their current transcription workflow (which is currently a combination of automatic transcription via Otter.ai, followed by a hired human transcriptionist, to minimize required staff time), and will post any responses I get.

Alternatively, if someone offered to pay me $200, or $40 per hour I actually needed (whichever is lower), I'd produce the transcript myself. (As a general matter of economic arbitrage, nobody who's being paid California salaries should spend their own time to produce transcripts themselves.)
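
As a quick sketch of the arithmetic behind that offer: "whichever is lower" means the hourly rate applies up to a $200 cap, which is reached at 5 hours. The function name `transcript_fee` and its parameter names are made up for this example.

```python
def transcript_fee(hours_needed: float,
                   flat_cap: float = 200.0,
                   hourly_rate: float = 40.0) -> float:
    """Return the lower of the flat cap and the hourly total, in dollars."""
    return min(flat_cap, hourly_rate * hours_needed)


if __name__ == "__main__":
    for hours in (3, 5, 8):
        print(hours, "hours ->", transcript_fee(hours))
```

Under these numbers, 3 hours of work would cost $120, and anything from 5 hours upward would cost $200.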

Replies from: T3t, DanielFilan
comment by RobertM (T3t) · 2023-02-07T05:57:35.023Z · LW(p) · GW(p)

@MondSemmel [LW · GW] Lightcone will pay for this. DM me if you want to discuss details :)

Replies from: MondSemmel
comment by MondSemmel · 2023-02-07T10:57:32.442Z · LW(p) · GW(p)

Sure, I've sent you a DM.

comment by DanielFilan · 2023-02-07T02:16:31.945Z · LW(p) · GW(p)

> especially if higher quality trades off against higher quantity.

Yeah - this podcast is a side-project of my main podcast which is a side-project relative to my day job (CHAI PhD student), so time minimization is of the essence.

> Alternatively, if someone offered to pay me $200, or $40 per hour I actually needed (whichever is lower), I'd produce the transcript myself.

I'll chuck in US$30 of my own money. (would be more if I had a better sense of the quality bar you were going to reach)

comment by Quadratic Reciprocity · 2023-02-14T08:33:21.830Z · LW(p) · GW(p)

I thought it was interesting when Oli said that there are so many good ideas in mechanism design and that the central bottleneck of mechanism design is that nobody understands UI design to take advantage of them. Would be very interested if other folks have takes or links to good mechanism design ideas that are neglected/haven't been properly tried enough or people/blogs that talk about stuff like that. 

comment by MondSemmel · 2023-02-06T16:52:33.949Z · LW(p) · GW(p)

I may be blind, but the link to the audio doesn't seem to allow me to actually download the audio. Which wouldn't be so bad if the Google Podcasts site didn't cause a bunch of issues for me, e.g. when I rewind back by 10s, the audio cuts off for 10++ seconds, which defeats the purpose of rewinding.

EDIT: Thanks for the audio links!

Replies from: adamzerner, DanielFilan, DanielFilan
comment by Adam Zerner (adamzerner) · 2023-02-06T17:39:58.642Z · LW(p) · GW(p)

You can download it on Player FM. Click the three horizontal dots, then "Download/Open", then right click the audio player, then "Save Audio As".

comment by DanielFilan · 2023-02-14T01:37:43.179Z · LW(p) · GW(p)

OK: now on the official podcast website there are Dropbox links to download the mp3s. This is sort of a 'beta' feature because I'm not sure how many people want mp3s or whether these links are robust to me reorganizing my Dropbox, but since another person wanted the mp3s, there they are.

comment by DanielFilan · 2023-02-06T18:46:33.170Z · LW(p) · GW(p)

Notes:

  • There are various podcast apps where you can download the podcast to your phone. These apps generally allow audio to be sped up, rewound easily, etc, and might be suitable for your needs.
  • The RSS feed that Google Podcasts and others are using is hosted on this website, which should also let you download episodes. I think this URL is less likely to stop working (precisely because I normally don't think about or advertise this website).
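
For anyone who would rather pull the audio files straight from an RSS feed, here is a minimal Python sketch of how that generally works: podcast feeds list each episode as an `<item>` whose `<enclosure>` element carries the audio URL. The sample feed, its example.com URLs, and the function name `episode_audio_urls` are placeholders invented for illustration; they are not this podcast's actual feed.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed standing in for a real podcast RSS document.
SAMPLE_FEED = """
<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/audio/episode-1.mp3"
                 length="12345" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://example.com/audio/episode-2.mp3"
                 length="67890" type="audio/mpeg"/>
    </item>
  </channel>
</rss>
"""


def episode_audio_urls(feed_xml: str) -> list[tuple[str, str]]:
    """Return (episode title, audio URL) pairs found in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml.strip())
    episodes = []
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        enclosure = item.find("enclosure")
        # Skip items that have no downloadable audio attached.
        if enclosure is not None and enclosure.get("url"):
            episodes.append((title, enclosure.get("url")))
    return episodes


if __name__ == "__main__":
    for title, url in episode_audio_urls(SAMPLE_FEED):
        print(f"{title}: {url}")
```

With a real feed, the XML text would first be fetched from the feed's URL, and each enclosure URL could then be downloaded with any HTTP client.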