Proposal: Community Curated Anki Decks For MIRI Recommended Courses

post by iconreforged · 2014-05-07T18:48:32.833Z · LW · GW · Legacy · 16 comments

Spaced repetition is optimal for recalling factual information. It won't necessarily teach you anything that you haven't already learned. It helps you retain knowledge, and won't necessarily help you develop skills. But, within the domain of factual information that you can already comprehend, spaced repetition systems are pretty optimal. So, if you want to train your brain on a bunch of Spanish-to-English sentence translations, or stock market tickers, or definitions, or sample questions, you should use something like Anki. 

Once you start using spaced repetition, you learn that one of the biggest limits is the card-making process. Making your own cards is time-consuming, although experience will make you much faster. Experience will also teach you what makes a better card. The 20 Rules of Formatting Knowledge pretty much spells it out for you, but I still had to make my own cards, find the sticking points, and edit them until I got a good sense for proper context and suitably short, distinct answers. 

You can find other people's shared decks and skip the card making process yourself, but not without some new problems. First, you are going to learn more making the cards yourself than studying someone else's. If it's subtle material, you can make cards that fill in the gaps in your particular understanding. But if someone else studies something and makes cards to fill in their own gaps, that means that what you're studying may not cover material that you don't know you don't know. It would be nice if everyone's shared decks were a completely thorough treatment of the material, but, alas, it is not so. And the only way that you can tell is by comparing the deck and the material during your own studying.

Shared decks also just aren't all that good sometimes. Someone, I can't recall who, wrote a script to scrape the entire LW wiki and then cloze-delete the title from each article. I appreciate the idea of SRSing the LW wiki, and scripting the whole thing was undoubtedly really efficient. However, the result was usually question and answer text hundreds of words long, with tables of contents in the middle, and probably too many cards of insufficient value.
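For illustration only, here is a rough sketch of how a title-cloze script could avoid the hundreds-of-words problem: cloze the title in the first sentence and skip anything that is still too long. This is not the original script. The function name `cloze_title` and the `max_words` cutoff are made up for this example; the only thing taken from Anki is its `{{c1::...}}` cloze syntax.

```python
import re

def cloze_title(title, article, max_words=40):
    """Return the first sentence with the title cloze-deleted, or None."""
    # Assumption: a "sentence" ends at ., ! or ? followed by whitespace.
    first_sentence = re.split(r"(?<=[.!?])\s+", article.strip(), maxsplit=1)[0]
    if len(first_sentence.split()) > max_words:
        return None  # too long to make a useful card
    pattern = re.compile(re.escape(title), re.IGNORECASE)
    clozed, count = pattern.subn(
        lambda m: "{{c1::" + m.group(0) + "}}", first_sentence
    )
    # No card at all if the title never appears in the sentence.
    return clozed if count else None

print(cloze_title("Spaced repetition",
                  "Spaced repetition is a study technique. More text follows."))
```

Even with a length cap, a script like this can't judge whether a card is worth having, which is the harder half of the problem.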

Despite their problems, I think that shared decks have way more potential than their current use suggests. A well-crafted deck that gives its subject matter a thorough treatment could be more valuable than a textbook, and about as difficult to compose. But, looking at some of the best Anki decks I've come across, it will likely take more than one person to get such a deck off the ground. 

Anki's .apkg files are sorta unwieldy to edit collaboratively, because there's not really a way to merge edits from multiple contributors. Luckily, we can export and import decks as text, and use version control like Git (hosted on GitHub) to merge those text files instead. With a GitHub-hosted collaborative deck, a team of people studying a textbook, like Thinking and Deciding, could all make flashcards as they go, add them all to the same deck, remove redundant cards, standardize the layout, tag cards appropriately, and share them with whomever else comes along. Then, anyone else who wants to study the textbook has a high-quality Anki deck to use in conjunction, and if they know how a question can be asked better, or if they find an error, or if the seventh chapter didn't really get much coverage, they can contribute to the deck, too.
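As a rough sketch of the "remove redundant cards" step, assuming each contributor exports notes as tab-separated text with the question in the first field: the function below merges several exports, drops notes whose question duplicates an earlier one (ignoring case and spacing), and sorts the result so diffs stay small. The name `merge_exports` and the "first contributor wins" rule are assumptions of this example, as is treating lines that start with `#` as header comments.

```python
import csv
import io

def merge_exports(*exports):
    """Merge tab-separated note exports, dropping duplicate questions."""
    seen = {}
    for text in exports:
        for row in csv.reader(io.StringIO(text), delimiter="\t"):
            if not row or row[0].startswith("#"):
                continue  # skip blank lines and header comments (assumed)
            # Normalize whitespace and case so near-identical fronts collide.
            key = " ".join(row[0].split()).lower()
            seen.setdefault(key, row)  # first contributor wins
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for key in sorted(seen):  # stable order keeps version-control diffs small
        writer.writerow(seen[key])
    return out.getvalue()

alice = "What is a cloze?\tA fill-in-the-blank card\tsrs\n"
bob = "what is  a cloze?\tFill in the blank\tsrs\nDefine SRS\tSpaced repetition system\tsrs\n"
print(merge_exports(alice, bob))
```

Exact-match deduplication only catches the easy cases; two differently worded cards on the same fact would still need a human reviewer.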

This huge list of material put together by Louie Helm should be Anki-fied. Hopefully we can unite the efforts of many autodidacts and start to curate decks for each of the areas covered. Maybe a group of friends is about to work through a course on Quantum Computing or Set Theory. The rest of LW would benefit from their work making flashcards, but especially so if they leave the project open to collaboration. 

So, the things needed to move forward:

16 comments

comment by ChristianKl · 2014-05-07T19:17:08.149Z · LW(p) · GW(p)

But if someone else studies something and makes cards to fill in their own gaps, that means that what you're studying may not cover material that you don't know you don't know.

This gets Anki wrong. You are not supposed to make cards about things that you don't know. That sets you up for cards that you forget. Anki exists to prevent forgetting of newly learned information.

According to Wozniak, the first two rules of SRS are:

  1. Do not learn if you do not understand
  2. Learn before you memorize

When one starts sharing decks it frequently happens that people violate those rules and try to learn Anki cards for material that they haven't learned beforehand.

Replies from: TylerJay, iconreforged, wedrifid
comment by TylerJay · 2014-05-07T21:50:08.276Z · LW(p) · GW(p)

Exactly. However, it could be useful to go through someone else's Anki deck after you've already learned the material yourself. Everyone who's ever taken a test after making their own study guide knows that when questions are worded differently from the study guide, you have to think harder to answer them. I suspect that it would help reinforce the learning or fill gaps you may have missed.

comment by iconreforged · 2014-05-08T19:39:23.688Z · LW(p) · GW(p)

I totally agree. Shared decks encourage a lot of SRS vices.

But, given that they exist and that people are going to use them, is there a way to raise the quality of a shared deck significantly above the average? You can page through the shared decks on Anki's shared decks page and dredge up extremely low quality. If you look at the repository of decks by LW users, the average quality is much better, but could still be improved.

I propose that the MIRI courses are valuable, and that people learning them could benefit from Anki decks. I think the best way to make these Anki decks is a wiki-style collaborative effort.

comment by wedrifid · 2014-05-09T17:09:00.485Z · LW(p) · GW(p)

When one starts sharing decks it frequently happens that people violate those rules and try to learn Anki cards for material that they haven't learned beforehand.

Fortunately there are enough exceptions to Wozniak's guideline that violating rule 2 can often be beneficial. For me this applies to any information that I am able to understand (rule 1) from the terse information in the cards themselves. In the same way I tended to learn most efficiently from practice exams when studying.

For material that is too complicated (or simply insufficiently specified) to learn from the Anki deck I do honor rules 1 and 2. This means I keep the deck in my active reviews only if (and when) I am sufficiently curious about the subject matter that I will naturally be inspired to look up every term I encounter and do not understand. Just In Time learning is viable.

comment by [deleted] · 2014-05-09T03:17:59.955Z · LW(p) · GW(p)

You might be interested in an inactive sideproject of mine, Space: https://launchpad.net/space

Space is a markup language that generates Anki cards. I use Space to take notes for all my classes, books I read, and anything else where I want to make the Anki cards it's good at making fast.

Space is very limited, and I haven't had time to update it to Anki 2 yet, but I've been getting more emails about it recently and might get around to doing that in the next few weeks. But it makes a good "source format" for Anki decks, and you might want to consider something like it. Space was not difficult to write, and if you're a decent hacker you could whip together something similar and less constrained in a day or less.

Replies from: wedrifid
comment by wedrifid · 2014-05-10T04:05:32.777Z · LW(p) · GW(p)

But it makes a good "source format" for Anki decks, and you might want to consider something like it.

Excellent idea. I'm now wondering what the best way to implement 'updating' (importing the new stuff from others) would be, i.e. ways to add the new stuff while keeping all the memorization data.

Replies from: None
comment by [deleted] · 2014-05-14T19:20:52.855Z · LW(p) · GW(p)

Right now, I just make multiple Space files. It works very well for things that have a natural delineation point, like lectures or book chapters.

I can think of a few ways to do updating from an already-compiled Space file, but none of them are very elegant, which is why I never tried to do it before.

Replies from: dvf
comment by dvf · 2014-09-16T10:33:03.815Z · LW(p) · GW(p)

I've tried space. About sharing: how about having unique textual IDs (say, UUIDs) for each space entry, that get carried around in cards and used to update cards in place?

Then space entries.spc deckname would update the entries in deckname, but also invent and add such IDs to any entries in entries.spc that are missing them. Then all we need to take care with is to only check in spc files that have IDs for all entries, which also serves to ensure the spc file is used at least once before checking in.

Since the ID is per space entry, not per card, need to figure out how to deal with intervals. Inheriting the smallest interval in any related old cards for all new cards seems just-workable, but not elegant at all.
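To make the interval idea concrete, a minimal sketch, assuming intervals are available as plain day counts keyed by some card identifier. The name `inherit_intervals` and the `default_days` fallback for brand-new entries are hypothetical; nothing here talks to Anki itself.

```python
def inherit_intervals(old_intervals, new_card_ids, default_days=0):
    """Give every regenerated card the smallest interval among the old cards.

    old_intervals: {card_id: interval_in_days} for the entry's previous cards.
    new_card_ids:  ids of the cards produced from the updated entry.
    """
    # With no old cards at all, the entry is new, so start from default_days.
    smallest = min(old_intervals.values(), default=default_days)
    return {card_id: smallest for card_id in new_card_ids}

print(inherit_intervals({"cloze-1": 30, "cloze-2": 4},
                        ["cloze-1", "cloze-2", "cloze-3"]))
```

The cost is visible in the example: a card that had reached 30 days drops back to 4 because a sibling was weaker, which is the inelegant part.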

But I think the proposal of this page is great, and update would be necessary, so why not. The other approach that might be better would be to change anki itself to use textual sources natively, but that would probably be a much larger change.

Replies from: None
comment by [deleted] · 2014-09-29T19:17:39.713Z · LW(p) · GW(p)

The problem is that the Space file format is intended to be human-readable plain-text. I'm not willing to compromise on that personally; I want to have plain-text files that I can check into a repository.

Without this, I think the best thing you could do is fuzzy matching on the content, but I'm not sure how well this would work in practice.
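A minimal sketch of what fuzzy matching could look like, using Python's standard `difflib`. The function name `best_match` and the 0.8 threshold are arbitrary choices for illustration, not anything Space does today.

```python
from difflib import SequenceMatcher

def best_match(new_text, old_texts, threshold=0.8):
    """Return the old entry most similar to new_text, or None if none is close."""
    best, best_score = None, 0.0
    for old in old_texts:
        # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
        score = SequenceMatcher(None, new_text, old).ratio()
        if score > best_score:
            best, best_score = old, score
    return best if best_score >= threshold else None

old_entries = [
    "The quick brown fox jumped over the lazy dog.",
    "OCaml :: A fast, functional, strongly-typed programming language.",
]
print(best_match("The quick brown fox jumps over the lazy dog.", old_entries))
```

The weak spot is the threshold: a heavily rewritten card falls below it and looks new, and two near-duplicate cards can be confused for each other.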

That said, you could adapt the markup language for use in Anki's editor, storing it in something structured like XML or JSON that would have UUIDs of the type you propose.

Also, thanks for trying space! It's not really polished or even finished and portable, but it's really gratifying to see people touching things I've made.

Replies from: dvf
comment by dvf · 2014-09-30T20:41:24.748Z · LW(p) · GW(p)

My proposal doesn't seem to me to compromise human readability and editability (and certainly doesn't compromise version control) so just to make sure we mean the same thing, an example:

space update mydeck.spc

where mydeck.spc = "

The [quick] brown fox jumped over the [lazy] dog.
;UID=SDFGHJKERTYUICVBN;

Three commonly-used nonsense variable names:
1. foo
2. bar
3. baz
;;
OCaml :: A fast, functional, strongly-typed programming language.
;;"

would find in the deck the first note using the UID, and then add the second and third (having no UIDs, they cannot be found), generate new UIDs, and update mydeck.spc to something like = "

The [quick] brown fox jumped over the [lazy] dog.
;UID=SDFGHJKERTYUICVBN;

Three commonly-used nonsense variable names:
1. foo
2. bar
3. baz
;UID=LUKAGSDSDFGHJKEE;
OCaml :: A fast, functional, strongly-typed programming language.
;UID=HJKERTYUIVHBFJVBN;"

which you can then check in. The only edit we would expect users to perform on a UID is to delete it when copy-pasting, so that a new UID is assigned to the edited version, and that is certainly feasible. Do you still find this objectionable?
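The file-rewriting half of this can be sketched in a few lines, assuming every entry ends with either a bare `;;` line or a `;UID=...;` line as in the example. This only covers stamping IDs into the text; it does not touch Anki, and `add_missing_uids` is a made-up name, not a real Space command.

```python
import re
import uuid

# A terminator is ";;" or ";UID=<id>;" alone on a line (assumed format).
TERMINATOR = re.compile(r"^;(?:UID=([0-9A-Za-z-]+))?;$")

def add_missing_uids(source, new_uid=lambda: uuid.uuid4().hex.upper()):
    """Replace each bare ';;' terminator with ';UID=<fresh id>;'."""
    out = []
    for line in source.splitlines():
        match = TERMINATOR.match(line.strip())
        if match and match.group(1) is None:
            line = f";UID={new_uid()};"  # entry had no ID yet
        out.append(line)  # entries that already have an ID pass through untouched
    return "\n".join(out)

deck = "The [quick] brown fox.\n;UID=SDFGHJKERTYUICVBN;\n\nfoo\n;;\nOCaml :: A language.\n;;"
print(add_missing_uids(deck))
```

Running it a second time changes nothing, so a checked-in file that already has IDs for every entry stays stable.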

Replies from: None
comment by [deleted] · 2014-10-15T23:51:39.795Z · LW(p) · GW(p)

I do, for the following reasons:

  • Space shouldn't mutate the file in the general case, because it should be possible to check out a repo of Space files, run Space over all of them, and not have any changes in your tracked files.
  • Currently, ;; is a separator; it's optional after the last card. The change you propose is backwards-incompatible with existing Space files. This is only an issue for me, since I'm likely the only human with a lot of space files, but it's a pain point.
  • Those IDs are not human-readable. You could do high-entropy human-readable IDs with work, but they would necessitate shipping a dictionary with Space. I don't mind doing this so this isn't really a rebuttal.
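For what the third bullet might look like in practice, a small sketch of word-based IDs. The 16-word list is a stand-in (a real dictionary would need thousands of words to reach useful entropy; a 7776-word list gives about 12.9 bits per word), and `readable_id` and `entropy_bits` are hypothetical names.

```python
import math
import secrets

# Stand-in dictionary; a shipped word list would be far larger.
WORDS = ["anchor", "basil", "cedar", "delta", "ember", "fjord", "grove", "heron",
         "inlet", "jade", "kelp", "lotus", "maple", "nectar", "opal", "pine"]

def readable_id(words=WORDS, n_words=4):
    """Build an ID like 'cedar-opal-heron-jade' from randomly chosen words."""
    return "-".join(secrets.choice(words) for _ in range(n_words))

def entropy_bits(words=WORDS, n_words=4):
    """Bits of entropy in an ID of n_words drawn uniformly from the list."""
    return n_words * math.log2(len(set(words)))

print(readable_id())
print(entropy_bits())  # 16 words x 4 picks -> 16.0 bits, far too few for real use
```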

Generally, the first bullet point is the only really hard problem. There's no global name scheme we can use that will allow people to put global IDs for cards into a Space file. This is reducible to the problem that Bitcoin solves, so we could solve it with Bitcoin, but that'd be a massive pain. There are hackier solutions, but they're hacks, and I'm really hesitant to include something that doesn't get it right the first time because in the very unlikely case Space explodes in between the hack and the fix for the hack, Space will just be crippled forever.

Notice that I can't reply very frequently because as an open feminist on Less Wrong, I'm targeted by a large amount of actively anti-feminist users who downvote productive comments like the above.

comment by [deleted] · 2014-05-08T22:36:48.859Z · LW(p) · GW(p)

Someone learned in IP tell me what kind of licensing or copyright applies here. Should people post these with a Creative Commons or a GPL? Obviously we don't want to start plagiarizing or copyright-violating in the process of making this work. We don't want to abscond with other people's decks and start building on them, I think.

IANAL.

It probably makes the most sense to go with a CC-BY-SA license. That gives MIRI/Less Wrong attribution and linkback, and it keeps the decks freely redistributable.

Adding NC will piss off free culture advocates, won't stop it from being stolen and put on Amazon by Chinese bots, and will limit the ability of people to distribute them in weird circumstances. Adding ND prevents them from being improved on.

Though, honestly, what matters more, copyright law or raising the sanity waterline? If you're associated with MIRI/CFAR/a liable legal fiction you shouldn't jeopardize that, but if you're just a person on the Internet, making an Anki deck that violates some copyright is pretty safe to do.

So to me the thing that makes the most sense is to use CC-BY-SA if you're very serious about this, but just steal shamelessly if you just want people to learn.

Replies from: dvf
comment by dvf · 2014-09-16T10:26:48.261Z · LW(p) · GW(p)

Though, honestly, what matters more, copyright law or raising the sanity waterline?

The choice you offer is false, in my opinion. If you violate copyright law, you will never gather a community effort, because who wants to work on something that can get DMCA'ed out of existence at any moment?

I think CC-BY-(maybe SA) will work fine, and we can just use appropriately licensed basic sources like Wikipedia.

Replies from: None
comment by [deleted] · 2014-09-29T19:07:37.093Z · LW(p) · GW(p)

Plenty of people will work on things that are in legally nebulous territory. That's the entirety of the WINE project, ReactOS, and a large number of modding communities.

In practice, it's impossible for any of these projects to die, because the material they hack on is distributed between all their members and there's no single point of failure.

Replies from: dvf
comment by dvf · 2014-09-30T20:51:04.611Z · LW(p) · GW(p)

The more nebulous, the fewer contributors. I certainly would prefer to contribute to properly licensed projects; I've had the fun of putting work into a project that for silly license reasons couldn't get into Debian/GSOC/... I'm willing to forgo it in the future.

I haven't done any deck making, so give this low weight, but I imagine a truly collaborative and creative joint project where making up, say, fallacious arguments -> fallacy name notes that are actually challenging is half the fun, and the benefit of copy-pasta is small anyway.

comment by Vlad Sitalo (harcisis) · 2016-09-18T09:46:34.304Z · LW(p) · GW(p)

Hey, I'm wondering if you had any success with this idea? I also thought you might be interested in my Anki plugin, which allows a full-featured import/export of Anki decks to/from JSON. By full-featured I mean that it exports not just cards converted to JSON, but Notes, Decks, Models, Media, etc. So you can export, modify the result or merge changes from someone else, and on import those changes are reflected in your existing cards/decks with no information or metadata lost.

You can read more about it here: https://www.reddit.com/r/Anki/comments/50j7i7/crowdanki_comprehensive_json_representation_of/, https://github.com/Stvad/CrowdAnki/, https://ankiweb.net/shared/info/1788670778.