[Site Feature] Link Previews

raemon

post by Raemon · 2019-09-17T23:03:12.818Z · LW · GW · 14 comments

Links in posts and comments now display a hover-preview when linking to LessWrong content.

For instance, this link to an old post by Scott Alexander [LW · GW] will display the title, first couple paragraphs, and other meta-data.

For comments, it will display the post-name and the comment-in-question, such as on this shortform comment by Buck about effective tutoring [LW(p) · GW(p)].

We're considering ways to effectively do hover previews for particular common external links (such as Wikipedia and arXiv), but for the immediate future, all external links just provide a simple "url" hover-over.

14 comments

Comments sorted by top scores.

comment by Said Achmiz (SaidAchmiz) · 2019-09-18T21:11:10.092Z · LW(p) · GW(p)

We’re considering ways to effectively do hover previews for particular common external links (such as Wikipedia and arXiv), but for the immediate future, all external links just provide a simple “url” hover-over.

I’ve been helping gwern develop exactly this feature for gwern.net (which now has hover previews for internal links, in-page links, all external links, and even locally hosted PDFs and other media), so I’d be happy to share how we did it, tips, etc. Feel free to find me or gwern on IRC to chat about this.

(It’s particularly easy to do this for Wikipedia, since they’ve got an excellent REST API.)
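As a rough illustration of why Wikipedia is the easy case: the REST API exposes a page-summary endpoint per article, so a preview only needs to map an article link to that endpoint and fetch the JSON. The sketch below does just the mapping. The function name `wikipedia_summary_endpoint` is hypothetical, and this is not necessarily how gwern.net or LessWrong does it.

```python
from typing import Optional
from urllib.parse import quote, unquote, urlsplit


def wikipedia_summary_endpoint(link: str) -> Optional[str]:
    """Map a Wikipedia article link to its REST API summary endpoint.

    Returns None for anything that is not a /wiki/<title> link on a
    *.wikipedia.org host. (Hypothetical helper, for illustration only.)
    """
    parts = urlsplit(link)
    host = parts.hostname or ""
    prefix = "/wiki/"
    if not host.endswith(".wikipedia.org") or not parts.path.startswith(prefix):
        return None
    # Decode first so an already-escaped title is not escaped twice.
    title = unquote(parts.path[len(prefix):])
    if not title:
        return None
    return f"https://{host}/api/rest_v1/page/summary/{quote(title, safe='')}"
```

Fetching the returned URL (with any HTTP client) gives a JSON summary of the page that a hover preview could render; caching and error handling are left out here.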

Replies from: gwern, Raemon

↑ comment by gwern · 2019-09-18T22:08:07.041Z · LW(p) · GW(p)

The basic idea behind the popups is that at Markdown->HTML compile time, I do a lookup in a hashmap of (URL, (Title, Author, Date, DOI, Summary)). If there is an entry, it gets inlined into the HTML as some quiet metadata; if there isn't, various scripts are called to try to generate metadata, and if nothing works, it falls back to a headless web browser taking a screenshot and saving it to a file. Then at runtime in the user's browser, some JS checks every link for the metadata, and if it exists and the user mouses over the link, it pops up a form with the metadata filled in. If there is no metadata, it tries to fetch SHA1(URL) from a gwern.net folder under the assumption there will be the fallback screenshot.
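A minimal sketch of that fallback chain, in Python rather than the Haskell/R/shell/JS of the real implementation. Every name here (`LinkMetadata`, `annotate`, `screenshot_name`, the `scrapers` list, the `take_screenshot` callback) is hypothetical, and writing scraped results back into the database is an assumption rather than something stated above.

```python
import hashlib
from typing import Callable, Dict, List, NamedTuple, Optional


class LinkMetadata(NamedTuple):
    title: str
    author: str
    date: str
    doi: str
    summary: str


# Hypothetical: a scraper takes a URL and returns metadata, or None on failure.
Scraper = Callable[[str], Optional[LinkMetadata]]


def screenshot_name(url: str) -> str:
    """Hex SHA-1 of the URL, used to name the fallback screenshot.

    The folder and any file extension are not specified in the description.
    """
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


def annotate(
    url: str,
    database: Dict[str, LinkMetadata],
    scrapers: List[Scraper],
    take_screenshot: Callable[[str, str], None],
) -> Optional[LinkMetadata]:
    """Return metadata to inline for `url`, or None if only a screenshot exists."""
    # 1. Known entry: inline it directly.
    if url in database:
        return database[url]
    # 2. Otherwise try each scraper in turn.
    for scrape in scrapers:
        meta = scrape(url)
        if meta is not None:
            database[url] = meta  # assumption: generated entries are stored
            return meta
    # 3. Nothing worked: save a screenshot under SHA1(URL).
    take_screenshot(url, screenshot_name(url))
    return None
```

At runtime, the popup script would mirror this: show the inlined metadata if present, and otherwise request the file named by the same SHA-1 hash.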

There are a lot of fiddly details in how exactly you scrape from Arxiv or Pubmed Central or use the WP REST API, unfortunately. So many links must be handled by hand-writing definitions.

The relevant files in call order:

https://www.gwern.net/hakyll.hs
https://www.gwern.net/LinkMetadata.hs
https://www.gwern.net/linkAbstract.R
https://www.gwern.net/linkScreenshot.sh
https://www.gwern.net/static/js/popups.js
the generated/hand-written 'database' files:

↑ comment by Raemon · 2019-09-18T23:36:44.936Z · LW(p) · GW(p)

Yeah, I was happy when I saw you/gwern were also doing this. We've looked a bit at your implementation and may ping you about the details once we get to our own external links.

comment by Raemon · 2019-09-18T18:30:02.577Z · LW(p) · GW(p)

At least one person wasn't a fan of the little "LW" tag that appears after LessWrong post-links. I'm curious to get multiple opinions here.

I'm particularly interested in what people expect to want in the long term, when we have hover previews that go to many common sites such as the EA Forum, Wikipedia, arXiv (and possibly, if we're lucky, generic WordPress blog posts and the like). In that world, do you think you'd want a visual indicator for which links go where? (See gwern.net for an example of what it might look like to have multiple indicators.)

There's also an option where we just use color-coding for onsite vs offsite links, possible example of what that might look like here:

Replies from: gwern, SaidAchmiz, jacobjacob

↑ comment by gwern · 2019-10-01T23:33:08.776Z · LW(p) · GW(p)

One problem with link coloring is that it is already used by the browser as semantic annotation, and has been for decades by every web browser I am aware of: specifically, to show whether a link has been visited before or is novel. When I look at your screenshot, I can't read it as 'off vs on site', I can only read it as 'ah, Raemon has not yet read about foreign Sequences'. It's too ingrained, and the colors are fighting >20 years of browser conditioning. Adding more colors to overload coloring doesn't help this, as that makes even more to learn (what would it be, dark green for 'unread on-site', light green for 'read on-site', etc.?). It is also somewhat difficult to tune them, and more obscurely, you have issues with color-blindness (depending on the colors you pick, some fraction of readers, under 10%, will struggle or be entirely unable to perceive the difference).

Link icons, on the other hand, are additions, rather than overloads, used at least somewhat occasionally online already, can be understood by anyone who isn't blind (assuming grayscale like mine), and relatively self-explanatory (assuming good choice of logos).

Replies from: habryka4

↑ comment by habryka (habryka4) · 2019-10-02T00:08:29.200Z · LW(p) · GW(p)

Yeah, this and a bunch of other reasons caused us to mostly make the call against link-coloring. We are currently going with annotations, though I would want to make them a lot smaller than they currently are on gwern.net (I think they are fine on gwern.net for the kind of content that you produce, but would be too distracting for LW content).

↑ comment by Said Achmiz (SaidAchmiz) · 2019-09-18T21:12:35.764Z · LW(p) · GW(p)

I’m a fan of link icons, and not a fan of color coding.

Replies from: Raemon

↑ comment by Raemon · 2019-09-18T23:58:19.636Z · LW(p) · GW(p)

After thinking a bit my current plan is actually to use the little degree symbol° that you used on Read the Sequences, which is sort of the minimum viable "this is slightly different" symbol without drawing too much attention.

↑ comment by jacobjacob · 2019-09-19T13:29:38.898Z · LW(p) · GW(p)

I liked the little "LW".

Though why does it appear at the end of the link?

Since all links begin in the same place but end in different places there's an annoying micro-suspense before I reach the end of the link and figure out whether it's LW or not.

Replies from: Raemon

↑ comment by Raemon · 2019-09-19T21:26:53.513Z · LW(p) · GW(p)

Curious how you feel about the LW compared to the "little circle option"

(Seems like there's a direct tradeoff in "how obvious it is", where being obvious is good if you want to immediately see LW links, and bad if you're just trying to read the text like normal)

The "micro-suspense" thing is an interesting aspect of the experience. I think it's pretty rare for formatting to include meta-data at the beginning like that (whereas there's an established convention for footnote-like-things), which I think makes it look more professional. So I'd lean towards keeping it to the end, but it's an interesting argument to keep in mind.

Replies from: leggi

↑ comment by leggi · 2019-09-20T03:30:36.430Z · LW(p) · GW(p)

I definitely appreciate the idea of a "safe link" marker to other pages on LW.

The LW is (fairly) obvious as to what it means (especially to new users), and I don't find it obtrusive (I am on a biggish screen).

Full disclosure - I like seeing the LW on my links!

comment by johnswentworth · 2019-09-18T18:19:51.008Z · LW(p) · GW(p)

Awesome job, I love this.

comment by John_Maxwell (John_Maxwell_IV) · 2019-09-18T19:53:40.563Z · LW(p) · GW(p)

Nice! I know people have complained about jargon use on LW in the past. Have you thought about an option new users could activate which autolinks jargon to the corresponding wiki entry/archive post?

Replies from: Raemon

↑ comment by Raemon · 2019-09-18T20:06:18.083Z · LW(p) · GW(p)

I've thought about things in that space but not that particular implementation, which is interesting. One problem is that an author might intend a different use of a given jargon term (esp. since some jargon is actually just "regular English words", such as "focusing.")

A couple other options are:

The author receives a prompt to consider adding a link to a jargon word as they use it (which they can ignore).
At the bottom of the post, there's a collection of jargon-links, and the author can decide whether each one matches their intended use.

Both of those run the risk of being annoying for the writer.