Billion-scale semi-supervised learning for state-of-the-art image and video classification 2019-10-19T15:10:17.267Z · score: 5 (2 votes)
What are your strategies for avoiding micro-mistakes? 2019-10-04T18:42:48.777Z · score: 18 (9 votes)
What are effective strategies for mitigating the impact of acute sleep deprivation on cognition? 2019-03-31T18:31:29.866Z · score: 26 (11 votes)
So you want to be a wizard 2019-02-15T15:43:48.274Z · score: 16 (3 votes)
How do we identify bottlenecks to scientific and technological progress? 2018-12-31T20:21:38.348Z · score: 31 (9 votes)
Babble, Learning, and the Typical Mind Fallacy 2018-12-16T16:51:53.827Z · score: 4 (2 votes)
An1lam's Short Form Feed 2018-08-11T18:33:15.983Z · score: 14 (3 votes)
The Case Against Education: Why Do Employers Tolerate It? 2018-06-10T23:28:48.449Z · score: 17 (5 votes)


Comment by an1lam on An1lam's Short Form Feed · 2019-12-13T21:35:30.350Z · score: 2 (2 votes) · LW · GW

Link post for a short post I just published describing my way of understanding Simpson's Paradox.

Comment by an1lam on Bayesian examination · 2019-12-11T21:47:05.057Z · score: 1 (1 votes) · LW · GW

If you really want to try and get traction on this, I'd recommend emailing Andrew Gelman (stats blogger and stats prof at Columbia). He's written previously (I can't seem to find the article unfortunately) about how statisticians should take their own ideas more seriously with respect to education and at the very least I can see him blogging about this.

Comment by an1lam on An1lam's Short Form Feed · 2019-12-10T12:45:32.897Z · score: 3 (2 votes) · LW · GW

Weird thought I had based on a tweet about gradient descent in the brain: it seems like one under-explored perspective on computational graphs is the causal one. That is, we can view propagating gradients through the computational graph as assessing the effect of an intervention on some variable on all of a node's children.
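A minimal sketch of what this could look like, on a made-up three-node graph (the graph, the `forward` function, and the `do_h` parameter are all invented for illustration, not taken from the tweet): override an intermediate node with a slightly nudged value, cutting it off from its parent, and compare the per-unit change in its child against the chain-rule gradient.

```python
import math

def forward(x, do_h=None):
    """Tiny graph: x -> h = tanh(2x) -> y = h**2 + 3h.

    If do_h is given, node h is overridden (an "intervention"),
    which cuts it off from its parent x.
    """
    h = math.tanh(2.0 * x) if do_h is None else do_h
    y = h ** 2 + 3.0 * h
    return h, y

def grad_y_wrt_h(h):
    # Analytic dy/dh for y = h**2 + 3h.
    return 2.0 * h + 3.0

x = 0.5
h, y = forward(x)

# Intervene: set h to h + eps and see how the child y responds.
eps = 1e-6
_, y_do = forward(x, do_h=h + eps)
interventional_effect = (y_do - y) / eps

analytic = grad_y_wrt_h(h)
print(interventional_effect, analytic)
```

The two numbers should agree closely, but only because the nudge is tiny; a larger intervention on `h` would pick up the curvature of `y` and drift away from the gradient.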

Reason to think this might be useful:

  • *Maybe* this can act as a different lens for examining NN training?

Reasons why this might not be useful:

  • It's not obvious that it makes sense to think of nodes in an NN (or any differentiable computational graph) as causally related in the sense we usually talk about in causal inference.
  • A causal interpretation of gradients isn't obvious because they're so local, whereas most causal inference focuses on the impact of more non-trivial interventions. OTOH, there are some NN interpretability techniques that try to solve this, so maybe these have better causal interpretations?

Comment by an1lam on Connectome-Specific Harmonic Waves · 2019-12-05T02:29:21.909Z · score: 3 (3 votes) · LW · GW

FFNNs’ inability to process time series data was a contributing factor to the Uber self-crashing car.

I don't see the evidence for this in the linked post and don't recall seeing this in any of the other few articles/pieces I've read on the crash. Can you point me to the evidence for this?

Comment by an1lam on Paper-Reading for Gears · 2019-12-05T00:10:31.696Z · score: 4 (2 votes) · LW · GW

I wonder how hard it would be to formalize this claim about mediation in terms of the space of causal DAGs. I haven't done the work to try it so I'm mostly spitballing.

Informally, I associate mediation with the front-door criterion in causality. So, the usefulness of mediation should reflect that the front-door criterion is empirically easier to satisfy than the back-door (and other) criteria, maybe because real causal chains tend to be narrow but long? Thinking about it a bit more, it's probably more like the min cut of real-world causal graphs tends to be relatively small?

Comment by an1lam on [1911.08265] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | Arxiv · 2019-11-21T16:22:24.267Z · score: 1 (1 votes) · LW · GW

In other words, the real world is just a mixture distribution over simulations :).

Comment by an1lam on The Power to Draw Better · 2019-11-18T15:10:05.843Z · score: 4 (3 votes) · LW · GW

Any resources you'd recommend that describe the constructive style further? I've read Drawing on the Right Side of the Brain and so would be curious to read about this other approach more.

Comment by an1lam on Chris Olah’s views on AGI safety · 2019-11-16T19:20:10.510Z · score: 3 (2 votes) · LW · GW

I'd be curious whether Chris and the other members of the Clarity team have thought about (in their most optimistic vision) how "microscope AI" could enable other sciences -- biology, neuroscience, physics, etc. -- and if so whether they'd be willing to share. AIUI, the lack of interpretability of neural networks is much more of an immediate issue for trying to use deep learning in these areas because just being able to predict tends to be less useful.

Comment by an1lam on Robin Hanson on the futurist focus on AI · 2019-11-14T16:13:53.839Z · score: 1 (1 votes) · LW · GW

Mostly unrelated to your point about AI, your comments about the 100,000 fans having the potential to cause harm rang true to me.

Are there other areas in which you think the many non-expert fans problem is especially bad (as opposed to computer security, which you view as healthy in this respect)?

Then the experts can be reasonable and people can say, “Okay,” and take their word seriously, although they might not feel too much pressure to listen and do anything. If you can say that about computer security today, for example, the public doesn’t scream a bunch about computer security.

Would you consider progress on image recognition and machine translation as outside view evidence for lumpiness? Error rates on ImageNet, an image classification benchmark, dropped by >10% over a 4-year period (graph below) mostly due to the successful scaling up of a type of neural network.

This also seems relevant to your point about AI researchers who have been in the field for a long time being more skeptical. My understanding is that most AI researchers would not have predicted such rapid progress on this benchmark before it happened.

That said, I can see how you still might argue this is an example of over-emphasizing a simple form of perception, which in reality is much more complicated and involves a bunch of different interlocking pieces.

Comment by an1lam on An1lam's Short Form Feed · 2019-11-13T23:52:49.041Z · score: 1 (1 votes) · LW · GW

In an interesting turn of events, John Carmack announced today that he'll be pivoting to work on AGI.

Comment by an1lam on Open & Welcome Thread - November 2019 · 2019-11-08T01:05:57.689Z · score: 8 (5 votes) · LW · GW

First of all, I apologize, I think my comment was too snarky and took a tone of "this is so surprising" that I regret on reflection.

Under what circumstances do you feel introducing new policy ideas with the word "maybe this could be a good idea" is acceptable? To be clear, I think introducing the idea is totally fine, I just have a decently strong prior against widespread bans on "businesses that do this".

I don't expect anyone important to be reading this thread, certainly not important policymakers. Even if they were, I think it was pretty clear I was spitballing.

Agreed, I was not worried about this.

Let's not fall prey to the halo effect. Eliezer also wrote a long post about the necessity of back-and-forth debate, and he's using a platform which is uniquely bad at this. At some point, one starts to wonder whether Eliezer is a mortal human being who suffers from akrasia and biases just like the rest of us.

Fair enough. I agree the Eliezer point isn't strong evidence.

I didn't make much of an effort to assemble arguments that Twitter is bad. But I think there are good arguments out there. How do you feel about the nuclear diplomacy that's happened on Twitter?

I don't have time to respond at length to this part at the moment (I wanted to reply quickly to apologize mostly) but I agree it's the most useful question to discuss and will try to respond more later. To summarize, I acknowledge it's possible that Twitter is bad for the collective but think people may overestimate the bad parts by focusing on how politicians / people fighting about politics use Twitter (which does seem bad) and that even if it is "bad", it's not clear that banning short response websites would lead to a better long-term outcome. For example, maybe people would just start fighting with pictures on Instagram. I don't think this specific outcome is likely but think it's in a class of outcomes that would result from banning that seems decently likely.

Comment by an1lam on Open & Welcome Thread - November 2019 · 2019-11-07T18:46:43.862Z · score: 8 (4 votes) · LW · GW

Woah, this seems like a big jump to a form of technocracy / paternalism that I would think would typically require more justification than spending a short amount of time brainstorming in a comment thread why the thing millions of people use daily is actually bad.

Like, banning sites from offering free services if a character limit is involved because high status members of communities you like enjoy such sites seems like bad policy and also a weird way to try and convince these people to write on forums you prefer. Now, one counterargument would be that "coordination problems" mean those writers would prefer to write somewhere else. But presumably if anyone's aware of "inadequate equilibria" like this and able to avoid them, it would be Eliezer.

Re-reading my comment, I realize it may come off as snarky but I'm not really sure how better to convey my surprise that this would be the first idea that comes to mind.

ETA: it's not clear to me that Twitter is so toxic, especially for people in the broad bubble that encompasses LW / EA / tech / etc. I agree it's not the best possible version of a platform by any means, but to say it's obviously net negative seems like a stretch without further evidence.

Comment by an1lam on An1lam's Short Form Feed · 2019-11-05T14:00:23.575Z · score: 12 (3 votes) · LW · GW

Interesting Bill Thurston quote, sadly from his obituary:

I’ve always taken a “lazy” attitude toward calculations. I’ve often ended up spending an inordinate amount of time trying to figure out an easy way to see something, preferably to see it in my head without having to write down a long chain of reasoning. I became convinced early on that it can make a huge difference to find ways to take a step-by-step proof or description and find a way to parallelize it, to see it all together all at once—but it often takes a lot of struggle to be able to do that. I think it’s much more common for people to approach long case-by-case and step-by-step proofs and computations as tedious but necessary work, rather than something to figure out a way to avoid. By now, I’ve found lots of “big picture” ways to look at the things I understand, so it’s not as hard.

To prevent mis-interpretation, I think people often look at quotes like this (I've seen similar ones about Feynman) and think "ah yes, see anyone can do it". But IME the thing he's describing is much harder to achieve than the "case-by-case"/"step-by-step" stuff.

Comment by an1lam on Matthew Barnett's Shortform · 2019-11-05T03:57:32.465Z · score: 2 (2 votes) · LW · GW

Weirdly enough, I was doing something today that made me think about this comment. The thought I had is that you caught onto something good here which is separate from the pressure aspect. There seems to be a benefit to trying to separate different aspects of a task more than may feel natural. To use the final exam example, as someone mentioned before, part of the reason final exams feel productive is because you were forced to do so much prep beforehand to ensure you'd be able to finish the exam in a fixed amount of time.

Similarly, I've seen benefit when I (haphazardly since I only realized this recently) clearly segment different aspects of an activity and apply artificial constraints to ensure that they remain separate. To use your VAE blog post example, this would be like saying, "I'm only going to use a single page of notes to write the blog post" to force yourself to ensure you understand everything before trying to write.

YMMV warning: I'm especially bad about trying to produce outputs before fully understanding and therefore may get more bandwidth out of this than others.

Comment by an1lam on Book Review: Design Principles of Biological Circuits · 2019-11-05T02:15:19.805Z · score: 3 (3 votes) · LW · GW

No problem, also, in case it wasn't obvious based on the fact that I commented basically right after you posted this, I liked the post a lot and now plan to read this book ASAP.

Comment by an1lam on Book Review: Design Principles of Biological Circuits · 2019-11-04T23:17:58.749Z · score: 8 (6 votes) · LW · GW

Note: the talk you mentioned was by Drew Endy and the electrical engineering colleague was the one and only Gerald Sussman of SICP fame (source: I don't remember but I'm very confident about being right).

Comment by an1lam on Epistemic Spot Check: The Role of Deliberate Practice in the Acquisition of Expert Performance · 2019-11-03T13:26:59.091Z · score: 1 (1 votes) · LW · GW

In this interview, Don Knuth gives the strong impression that he works more than 4 hours a day not necessarily doing deliberate practice but definitely hard cognitive work (writing a book that most people consider quite challenging to read). That said, Knuth is kind of a monster in general in terms of combining really high technical ability and really high conscientiousness, so it wouldn't surprise me if he's similar to the other outliers you mentioned and is not representative.

A few examples to back up my conscientiousness point:

  • The following story about doing all the problems in the textbook (from that interview):

But Thomas’s Calculus would have the text, then would have problems, and our teacher would assign, say, the even numbered problems, or something like that. I would also do the odd numbered problems. In the back of Thomas’s book he had supplementary problems, the teacher didn’t assign the supplementary problems; I worked the supplementary problems. I was, you know, I was scared I wouldn’t learn calculus, so I worked hard on it, and it turned out that of course it took me longer to solve all these problems than the kids who were only working on what was assigned, at first.

  • Writing an entire compiler on his own over a summer.
  • Finishing his PhD in three years while also consulting with Burroughs.

Comment by an1lam on The Technique Taboo · 2019-10-30T22:18:53.447Z · score: 8 (4 votes) · LW · GW

I agree with this criticism and have made it myself before. However, I do think there's something to it when it comes to especially slow typing (arbitrarily, <= 35 WPM). At least for me, sometimes I need to "clear out my mental buffer" by actually writing out the code I've thought about before I can think about the next thing. When I'm in this position, being able to type relatively quickly seems to help me stay in the flow and get to the next thought faster.

Also, frankly, sometimes you just need to churn out 10s of lines of code that are somewhat mindless (yes, DRY, but I still think some situations require this) and while doing them fast doesn't save you that much time overall, it certainly can help with maintaining sanity.

Related to that, I think the big problem with hunt and peck typing is that it isn't just slower, it also takes your attention off the flow of the code by forcing you to focus on the characters you're typing.

ETA: all that said, I definitely agree with Said that it's not necessary to learn typing in a formal setting and definitely would not encourage colleges to teach it. I actually did have a computer class in elementary school that taught touch typing but still got much faster mostly by using AIM in middle school.

Comment by an1lam on Turning air into bread · 2019-10-30T14:32:36.271Z · score: 1 (1 votes) · LW · GW

Thanks for sharing your perspective!

Comment by an1lam on Turning air into bread · 2019-10-29T22:31:58.617Z · score: 11 (3 votes) · LW · GW

As crazy as the prophet Malthus sounded, some people continuously try to heed his warning, thankfully we're learning how to coordinate our population growth to support a good life within the limited carrying capacity of our natural resources better and better over time. Time and time again, a wizard makes a new gizmo and we all get away with it.

The modified version of your first paragraph from above feels as or more accurate to me. I'd be curious to hear why you think it won't always be like this (besides X-risk, which I totally understand would lead to a "not always like this" situation).

Comment by an1lam on An1lam's Short Form Feed · 2019-10-29T13:58:44.367Z · score: 5 (2 votes) · LW · GW

Blockchain idea inspired by 80,000 Hours's interview of Vitalik Buterin: a lot of podcasts either have terrible transcriptions or presumably pay a service to transcribe their sessions. However, even these services make minor typos such as "ASX" instead of "ASICs" in the linked interview.

Now, most people who read these transcripts presumably notice at least a subset of these typos but don't want to go through the effort of emailing podcasters to tell them about it. On the flip side, there's no good way for hosts to scalably audit flagged typos to see if they're actually typos. What we really want is a mostly automated mechanism to aggregate flagged typos and accept fixes which multiple people agree upon that only pays people (micro amounts) for correctly identifying typos.

This mechanism seems like something that could live on a blockchain in some sort of smart contract. Obviously, like almost every blockchain application, you *could* do it without a blockchain, but using blockchain makes it easy to audit and distribute the micro-payments rather than having to program the voting scheme from scratch on top of a centralized database.
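To make the mechanism a bit more concrete, here's a rough off-chain sketch of just the agreement-and-payout logic (no blockchain involved). Everything in it is hypothetical: the `TypoLedger` name, the agreement threshold of 3 distinct users, and the reward amount are assumptions picked for illustration.

```python
from collections import defaultdict

class TypoLedger:
    """Hypothetical aggregator: accept a fix once enough distinct users agree."""

    def __init__(self, threshold=3, reward=0.01):
        self.threshold = threshold      # distinct flaggers needed (assumed value)
        self.reward = reward            # micro-payment per correct flag (assumed value)
        self.flags = defaultdict(set)   # (original, fix) -> users who proposed it
        self.accepted = {}              # original -> accepted fix
        self.balances = defaultdict(float)

    def flag(self, user, original, fix):
        """Record a proposed fix; return True if this call gets it accepted."""
        if original in self.accepted:
            return False  # already resolved; late flags earn nothing
        voters = self.flags[(original, fix)]
        voters.add(user)  # a set, so repeat flags by one user count once
        if len(voters) < self.threshold:
            return False
        self.accepted[original] = fix
        for voter in voters:
            self.balances[voter] += self.reward
        return True

ledger = TypoLedger()
ledger.flag("alice", "ASX", "ASICs")
ledger.flag("alice", "ASX", "ASICs")   # duplicate from the same user, ignored
ledger.flag("bob", "ASX", "AXS")       # competing fix, never reaches the threshold
ledger.flag("carol", "ASX", "ASICs")
accepted_now = ledger.flag("dave", "ASX", "ASICs")
print(accepted_now, ledger.accepted, dict(ledger.balances))
```

Only the users who backed the accepted fix get paid here. A real version would also need some defense against one person flagging from many accounts, which this sketch ignores.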

Comment by an1lam on An1lam's Short Form Feed · 2019-10-24T19:57:41.166Z · score: 3 (2 votes) · LW · GW

If algebra's a deal with the devil where you get the right answer but don't know why, then geometric intuition's a deal with the devil where you always get an answer but don't know whether it's right.

Comment by an1lam on Raemon's Scratchpad · 2019-10-24T19:56:16.073Z · score: 1 (1 votes) · LW · GW

Uh, what if you forget to do your habit troubleshooting habit and then you have to troubleshoot why you forgot it? And then you forget it twice and you have to troubleshoot why you forgot to troubleshoot forgetting to troubleshoot!

(I'm joking about all this in case it's not obvious.)

Comment by an1lam on An1lam's Short Form Feed · 2019-10-24T19:35:35.608Z · score: 3 (3 votes) · LW · GW

Someone should write the equivalent of TAOCP for machine learning.

(Ok, maybe not literally the equivalent. I mean Knuth is... Knuth. So it doesn't seem realistic to expect someone to do something as impressive as TAOCP. And yes, this is authority worship and I don't care. He's Knuth goddamn it.)

Specifically, a book where the theory/math's rigorous but the algorithms are described in their efficient forms. I haven't found this in the few ML books I've read parts of (Bishop's Pattern Recognition and Machine Learning, MacKay's Information Theory, and Tibshirani et al.'s Elements of Statistical Learning), so if it's already out there, let me know.

Note that I don't mean that whoever does this should do the whole MMIX thing and write their own language and VM.

Comment by an1lam on Raemon's Scratchpad · 2019-10-24T19:34:35.205Z · score: 3 (2 votes) · LW · GW

Just don't get trapped in infinite recursion and end up overloading your habit stack frame!

Comment by an1lam on bgaesop's Shortform · 2019-10-24T19:34:02.056Z · score: 12 (6 votes) · LW · GW

As one data point, I'm a silent (until now) skeptic in the sense that I got really into meditation in college (mostly of the The Mind Illuminated variety), and felt like I was benefiting but not significantly more than I did from other more mundane activities like exercise, eating well, etc. Given the required time investment of ~1 hour a day to make progress, I just decided to stop at some point.

I don't talk about it much because I know the response will be that I never got to the point of "enlightenment" or needed a teacher (both possibilities that I acknowledge), but I figured given your post, I'd leave a short reply mentioning my not necessarily generalizable experience.

Comment by an1lam on An1lam's Short Form Feed · 2019-10-24T00:05:47.284Z · score: 1 (1 votes) · LW · GW

My takeaway from the article was that, to your point, their brains weren't using more energy. Rather, the best hypothesis was just that their adrenal hormones remained elevated for many hours of the day, leading to higher metabolism during that period. Running an hour a day is definitely not enough to burn 6000 calories for the record (a marathon burns around 3500).

Maybe I wasn't clear, but that's what I meant by the following.

The article’s claiming that chess players burn more energy purely from the side effects of stress, not because their brains are doing more work. So why am I revisiting this question?

Comment by an1lam on An1lam's Short Form Feed · 2019-10-23T14:11:55.288Z · score: 3 (3 votes) · LW · GW

I feel like the center often shifts as I learn more about a topic (because I develop new interests within it). The questions I ask myself are more like "How embarrassed would I be if someone asked me this and I didn't know the answer?" and "How much does knowing this help me learn more about the topic or related topics?" (These aren't ideal phrasings of the questions my gut is asking.)

Those seem like good questions to ask as well. In particular, the second one is something I ask myself although, similar to you, in my gut more than verbally. I also deal with the "center shifting" by revising cards aggressively if they no longer match my understanding. I even revise simple phrasing differences when I notice them. That is, if I repeatedly phrase the answer to a card one way in my head and have it phrased differently on the actual card, I'll change the card.

In my experience, I often still forget things I've entered into Anki either because the card was poorly made or because I didn't add enough "surrounding cards" to cement the knowledge. So I've shifted away from this to thinking something more like "at least Anki will make it very obvious if I didn't internalize something well, and will give me an opportunity in the future to come back to this topic to understand it better instead of just having it fade without detection".

I think both this and the original motivational factor I described apply for me.

I'm confused about what you mean by this. (One guess I have is big-O notation, but big-O notation is not sensitive to constants, so I'm not sure what the 5 is doing, and big-O notation is also about asymptotic behavior of a function and I'm not sure what input you're considering.)

You're right. Sorry about that... I just heinously abuse big-O notation and sometimes forget to not do it when talking with others/writing. Edited the original post to be clearer ("on the order of 10").

I think there are few well-researched and comprehensive blog posts, but I've found that there is a lot of additional wisdom the spaced repetition community has accumulated, which is mostly written down in random Reddit comments and smaller blog posts. I feel like I've benefited somewhat from reading this wisdom (but have benefited more from just trying a bunch of things myself).

Interesting, I've perused the Anki sub-reddit a fair amount, but haven't found many posts that do what I'm looking for, which is both give good guidelines and back them up with specific examples. This is probably the closest thing I've read to what I'm looking for, but even this post mostly focuses on high level recommendations and doesn't talk about the nitty-gritty such as different types of cards for different types of skills. If you've saved some of your favorite links, please share!

I agree that trying stuff myself has worked better than reading.

For myself, I've considered writing up what I've learned about using Anki, but it hasn't been a priority because (1) other topics seem more important to work on and write about; (2) most newcomers cannot distinguish between good and bad advice, so I anticipate having low impact by writing about Anki; (3) I've only been experimenting informally and personally, and it's difficult to tell how well my lessons generalize to others.

Regarding other topics being more important, I admit I mostly wrote up the above because I couldn't stop thinking about it rather than based on some sort of principled evaluation of how important it would be. That said, I personally would get a lot of value out of having more people write up detailed case reports of how they've been using Anki and what does/doesn't work well for them that give lots of examples. I think you're right that this won't necessarily be helpful for newcomers, but I do think it will be helpful for people trying to refine their practice over long periods of time. Given that most advice is targeted at newcomers, while the overall impact may be lower, I'd argue "advice for experts" is more neglected and more impactful on the margin.

Regarding takeaways not generalizing, this is why I think giving lots of concrete examples is good because it basically makes your claims reproducible. That is, someone can go out and try what you described fairly easily and see if it works for them.

Comment by an1lam on An1lam's Short Form Feed · 2019-10-23T14:00:34.746Z · score: 3 (2 votes) · LW · GW

You're right on both accounts. Maybe I should've discussed this in my original post... At least for me, Anki serves different purposes at different stages of learning.

Key definitions tend to be useful in the early stages, especially if I'm learning something on and off, as a way to prevent myself from having to constantly refer back and make it easier to think about what they actually mean when I'm away from the source. E.g., I've been exploring alternate interpretations of d-separation in my head during my commute and it helps that I remember the precise conditions in addition to having a visual picture.

Once I've mastered something, I agree that the "concepts and competencies" ("mental moves" is my preferred term) become more important to retain. E.g., I remember the spectral theorem but wish I remembered the sketch of what it looks like to develop the spectral theorem from scratch. Unfortunately, I'm less clear/experienced on using Anki to do this effectively. I think Michael Nielsen's blog post on seeing through a piece of mathematics is a good first step. Deeply internalizing core proofs from an area presumably should help for retaining the core mental moves involved in being effective in that area. But, this is quite time intensive and also prioritizes breadth over depth.

I actually did mention two things that I think may help with retaining concepts and competencies - Anki-izing the same concepts in different ways (often visually) and Anki-izing examples of concepts. I haven't experienced this yet, but I'm hopeful that remembering alternative visual versions of definitions, analogies to them, and examples of them may help with the types of problems where you can see the solution at a glance if you have the right mental model (more common in some areas than others). For example, I remember feeling (usually after agonizing over a problem for a while) like Linear Algebra Done Right had a lot of exercises where the right geometric intuition or representative example would allow you to see the solution relatively quickly and then just have to convert it to words.

Another idea for how to Anki-ize concepts and competencies better that I haven't tried (yet) but will share anyway is succinctly capturing strategies that pop up again and again in similar forms. To use another Linear Algebra Done Right example, there are a lot of exercises with solutions of the form "construct some arbitrary linear map that does what we want" and show it... does what we want. I remember this technique but worry that my pattern matching machinery for the types of problems to which it tends to apply has decayed. On the other hand, if I had an Anki card that just listed short descriptions of a few exercises and asked me which technique was core to their solutions, maybe I'd retain that competency better.

Comment by an1lam on An1lam's Short Form Feed · 2019-10-23T01:23:24.083Z · score: 19 (7 votes) · LW · GW

Anki's Not About Looking Stuff Up

Attention conservation notice: if you've read Michael Nielsen's stuff about Anki, this probably won't be new for you. Also, this is all very personal and YMMV.

In a number of discussions of Anki here and elsewhere, I've seen Anki's value measured in terms of time saved by not having to look stuff up. For example, Gwern's spaced repetition post includes a calculation of the threshold at which it's worth Anki-izing something, although I would be surprised if Gwern hasn't already thought about the claim I'm going to make.

While I occasionally use Anki to remember things that I would otherwise have to Google, e.g. statistics, I almost never Anki-ize things so that I can avoid Googling them in the future. And I don't think in terms of time saved when deciding what to Anki-ize.

Instead, (as Michael Nielsen discusses in his posts) I almost always Anki-ize with the goal of building a connected graph of knowledge atoms about an area in which I'm interested. As a result, I tend to evaluate what to Anki-ize based on two criteria:

  1. Will this help me think about this domain without paper or a computer better?
  2. In the Platonic graph of this domain's knowledge ontology, how central is this node? (Pedantic note: it's easier to visualize distance to the root of the tree, but this requires removing cycles from the graph.)
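As a toy illustration of the second criterion, here's a sketch that measures how far each concept sits from a chosen root using breadth-first search. The graph is a made-up fragment of a causal inference ontology (the edges are a guess for illustration, not a canonical structure). BFS sidesteps the cycle problem because it only ever records the first, shortest route to each node.

```python
from collections import deque

# Hypothetical "concept A is needed to understand concept B" edges.
concept_graph = {
    "causal DAG": ["path", "d-separation"],
    "path": ["backdoor path", "d-separation"],
    "d-separation": ["admissible set"],
    "backdoor path": ["admissible set"],
    "admissible set": ["d-separation"],  # deliberate cycle
}

def depth_from_root(graph, root):
    """Return the shortest hop count from root to every reachable concept."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in depth:  # already-seen nodes are skipped, so cycles are harmless
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth

depths = depth_from_root(concept_graph, "causal DAG")
for concept, d in sorted(depths.items(), key=lambda kv: kv[1]):
    print(d, concept)
```

Under this (assumed) reading, a smaller depth means a more central node and so a stronger candidate for a card. Other notions of centrality would work too; distance from the root is just the easiest one to picture.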

To make this more concrete, let's look at an example of a topic I've been Anki-izing recently, causal inference. I just started Anki-izing this topic a week ago, so it'll be easier for me to avoid idealizing the process. Looking at my cards so far, I have questions about and definitions of things like "d-separation", "sufficient/admissible sets", and "backdoor paths". Notably, for each of these, I don't just have a cloze card to recall the definition, I also have image cards that quiz me on examples and conceptual questions that clarify things I found confusing upon first encountering these concepts. I've found that making these cards both forces me to ensure I understand concepts (because writing cards requires breaking them down) and makes it easier to bootstrap my understanding over the course of multiple days. Furthermore, knowing that I'll remember at least the stuff I've Anki-ized has a surprisingly strong motivational impact on me on a gut level.

All that said, I suspect there are some people for whom Anki-izing wouldn't be helpful.

The first is people who have the time and a career in which they focus on a narrow enough set of topics such that they repeatedly see the same concepts and rarely go for long periods without revisiting them. I've experienced this myself for Python - I learned it well before starting to use Anki and used it every day for many years. So even if I forget some stuff, it's very easy for me to use the language fluently after time away from it.

The second is, for lack of a better term, actual geniuses. Like, if you're John von Neumann and you legitimately have an approximation of a photographic memory (I'm really skeptical that he actually had an eidetic memory but regardless...) and can understand any concept incredibly quickly, you probably don't need Anki. Also, if you're the second coming of John von Neumann and you're reading this, cool!

To give another example, Terry Tao is a genius who also has spent his entire life doing math. Probably doesn't need Anki (or advice from me in general in case it wasn't obvious).

Finally, I do think how to use Anki well is an under-explored topic given that there's on the order of 10 actual blog posts about it. Given this, I'm still figuring things out myself, in particular around how to Anki-ize stuff that's more procedural, e.g. "when you see a problem like this, consider these three strategies" or something. If you're also experimenting with Anki, I'd love to hear from you!

Comment by an1lam on Raemon's Scratchpad · 2019-10-20T20:25:38.205Z · score: 2 (2 votes) · LW · GW

In my experience, trade can work well here. That is, you care more about cleanliness than your roommate, but they either care abstractly about your happiness or care about some concrete other thing you care about less, e.g. temperature of the apartment. So, you can propose a trade where they agree to be cleaner than they would be otherwise in exchange for you either being happier or doing something else that they care about.

Semi-serious connection to AI: It's kind of like merging your utility functions but it's only temporary.

Comment by an1lam on Billion-scale semi-supervised learning for state-of-the-art image and video classification · 2019-10-19T20:10:36.540Z · score: 1 (1 votes) · LW · GW

Interesting, I somehow hadn't seen this. Thanks! (Editing to reflect this as well.)

I'm curious - even though this isn't new, do you agree with my vague claim that the fact that this and the paper you linked work pertains to the feasibility of amplification-style strategies?

Comment by an1lam on Declarative Mathematics · 2019-10-19T16:16:28.342Z · score: 3 (2 votes) · LW · GW

To your request for examples, my impression is that Black Box Variational Inference is slowly but surely becoming the declarative replacement for MCMC for a lot of generative modeling stuff.

Comment by an1lam on What's your big idea? · 2019-10-19T15:48:14.990Z · score: 1 (1 votes) · LW · GW

Have you read any of Cosma Shalizi's stuff on computational mechanics? Seems very related to your interests.

Comment by an1lam on Open & Welcome Thread - October 2019 · 2019-10-17T14:34:18.948Z · score: 3 (3 votes) · LW · GW

I, like eigen, am also a fan of your blog! Welcome!

Comment by An1lam on [deleted post] 2019-10-15T23:28:52.911Z

Got it - I agree discussing further probably doesn't make sense without a concrete thing to talk about.

Comment by An1lam on [deleted post] 2019-10-15T21:03:58.472Z

Not saying you're wrong or that your experience is invalid but this does not match my experience trying to do applied ML work or ML research. (More experienced people feel free to chime in and tell me how wrong I am...) Granted, I'm not that experienced, but my experience so far has been that most ideas I've had turn out to be ones that other people already had and either tried or actually published on.

Further, my sense is that multiple experienced researchers often have all had similar ideas and the one who succeeds is the one who overcomes some key obstacle that others failed to overcome previously, so it's not just that I'm new/bad.

I do think this mostly applies to areas that ML researchers think are interesting and somewhat tractable. So, for example, I wouldn't make this claim about safety research and don't know how much it applies to gesture recognition in particular.

That said, since you're claiming your tools are better, I will note that the ML community does seem open to switching tools in general, as evidenced by the Theano to TensorFlow / PyTorch (somewhat) shifts of the past few years.

Would you be willing to describe at least at a high level what these tools let you do?

Edit: I'm more skeptical of the object-level claims than the title claim. Assuming you're using the classical definition of efficient market, I agree that "ML tooling" adoption doesn't follow efficient market dynamics in at least one respect.

Comment by an1lam on An1lam's Short Form Feed · 2019-10-14T14:50:40.298Z · score: 2 (2 votes) · LW · GW

Thing I desperately want: tablet native spaced repetition software that lets me draw flashcards. Cloze deletions are just boxes or hand-drawn occlusions.

Comment by an1lam on Gears vs Behavior · 2019-10-11T02:43:30.790Z · score: 1 (1 votes) · LW · GW

Good post.

I agree with your point regarding ML's historical focus on blackbox prediction. That said, there has been some intriguing recent work (example 1, example 2), which I've only just started looking at, on trying to learn causal models.

I bring this up because I think the question of how causal model learning happens and how learning systems can do it may potentially be relevant to the work you've been writing about. It's primarily of interest to me for different reasons, related to applying ML systems to scientific discovery, in particular in domains where coming up with causal hypotheses is harder at scale.

Comment by an1lam on Are there technical/object-level fields that make sense to recruit to LessWrong? · 2019-10-09T00:06:04.877Z · score: 1 (1 votes) · LW · GW

Fixed, same HTTPS problem Raemon commented on above.

Comment by an1lam on What are your strategies for avoiding micro-mistakes? · 2019-10-07T16:10:49.138Z · score: 1 (1 votes) · LW · GW

Just wanted to note this was definitely helpful and not too general. Weirdly enough, I've read parts of the SRE book but for some reason was compartmentalizing it in my "engineering" bucket rather than seeing the connection you pointed out.

Comment by an1lam on Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More · 2019-10-05T23:25:19.524Z · score: 30 (13 votes) · LW · GW

Meta: This is in response to both this and comments further up the chain regarding the level of the debate.

It's worth noting that, at least from my perspective, Bengio, who's definitely not in the LW bubble, made good points throughout and did a good job of moderating.

On the other hand, Russell, obviously more partial to the LW consensus view, threw out some "zingers" early on (such as the following one) that didn't derail the debate but easily could've.

Thanks for clearing that up - so 2+2 is not equal to 4, because if the 2 were a 3, the answer wouldn't be 4? I simply pointed out that in the MDP as I defined it, switching off the human is the optimal solution, despite the fact that we didn't put in any emotions of power, domination, hate, testosterone, etc etc. And your solution seems, well, frankly terrifying, although I suppose the NRA would approve.

Comment by an1lam on [Link] Tools for thought (Matuschak & Nielson) · 2019-10-05T17:21:33.578Z · score: 1 (1 votes) · LW · GW

The latter. To be clear, exercises are great but I think they're often not enough, in particular for topics where it's harder to build intuition just by thinking. The visualizations in that post would be an example of a prototype of the sorts of visualizations I'd want for a data-heavy topic.

Regarding textbook problems, the subset of things for which textbook problems substitute for rather than complement interactive visualizations seems relatively small, especially outside of more theoretical domains. Even for something like math, imagine if, in addition to exercises, your textbook let you play with 3Blue1Brown-style visualizations of the thing you're learning about.

To give another example, say I'm learning about economics at the intro level. Typical textbooks will have questions about supply & demand curves, diminishing marginal utility, etc. My claim is that most people will build a deeper understanding of these concepts by having access to some sort of interactive models to probe in addition to the standard exercises at the end of the chapter.

Comment by an1lam on [Link] Tools for thought (Matuschak & Nielson) · 2019-10-05T04:38:59.408Z · score: 2 (2 votes) · LW · GW

One issue I have is his idea that the medium of content should be mnemonically based. This bothers me because I presume that if your content is really good, professionals and experts will read it as well. And since the way that they read is different from the manner of a novice, they should be able to ignore and not be interrupted or slowed by tools designed for novices. I feel like there are two rebuttals to this:

  1. In both this and previous essays, Michael Nielsen has specifically addressed the point that tools of thought should scale up to research-level thinking, so he (and presumably Matuschak) is aware of the issue of only appealing to novices.
  2. I read the post you linked and I agree with your points about well-written books' tables of contents acting as a memory aid and outline. But, Matuschak's points about books still ring true to me. Just because books are great tools for learning doesn't mean we can't do better. To give one example of a limitation of books outside of memory not addressed by your post, books don't provide any way for me to answer questions about the ideas being discussed beyond what I can visualize in my head (in particular in cases where the ideas are heavily quantitative). As an example of what a better medium could look like, Bret Victor has posed the thought experiment of what it would be like if you could explore models of climate change as you read about them.

Comment by an1lam on What are your strategies for avoiding micro-mistakes? · 2019-10-05T04:23:51.960Z · score: 1 (1 votes) · LW · GW

This is a great point. Not sure why I phrased it the way I did originally in retrospect. I updated the question to reflect your point.

Comment by an1lam on What are your strategies for avoiding micro-mistakes? · 2019-10-05T00:12:16.508Z · score: 1 (1 votes) · LW · GW

Yeah, with coding, unit testing plus assertions plus checking my intuitions against the code as John Wentworth described does in fact seem to work fairly well for me. I think the difficulty with algebra is that there's not always an obvious secondary check you can do.
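
To make the "unit testing plus assertions" combination concrete, here's a minimal sketch in Python. The `normalize` function and its test cases are hypothetical, made up purely for illustration. The assertions check assumptions inside the function as it runs, while the unit tests act as the independent secondary check that algebra often lacks.

```python
import unittest


def normalize(weights):
    """Scale non-negative weights so that they sum to 1 (hypothetical example)."""
    # Preconditions: catch bad inputs right where they enter.
    assert len(weights) > 0, "need at least one weight"
    assert all(w >= 0 for w in weights), "weights must be non-negative"
    total = sum(weights)
    assert total > 0, "weights must not all be zero"

    result = [w / total for w in weights]

    # Postcondition: an independent check on the arithmetic above.
    assert abs(sum(result) - 1.0) < 1e-9, "normalized weights must sum to 1"
    return result


class NormalizeTest(unittest.TestCase):
    def test_known_value(self):
        # Compare against a case worked out by hand.
        self.assertEqual(normalize([1, 1, 2]), [0.25, 0.25, 0.5])

    def test_rejects_negative_weight(self):
        # The precondition assertion should fire on invalid input.
        with self.assertRaises(AssertionError):
            normalize([1, -1])
```

Saved to a file, the tests can be run with `python -m unittest <filename>`. Note that assertions are skipped if Python is run with `-O`, so this sketch assumes a normal development run.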

Comment by an1lam on Are there technical/object-level fields that make sense to recruit to LessWrong? · 2019-09-30T02:17:15.170Z · score: 4 (3 votes) · LW · GW

Yeah, I have a ton of confirmation bias pushing me to agree with this (because for me the two are definitely related), but I'll add that I also think spending a lot of time programming helped me make reductionism "a part of me" in a way it wasn't before. There are just very few other activities where you're forced to express what you want or a concept to something that fundamentally can only understand a limited logical vocabulary. Math is similar but I think programming makes the reductionist element more salient because of the compiler and because programming tends to involve more mundane work.

Comment by an1lam on Are there technical/object-level fields that make sense to recruit to LessWrong? · 2019-09-30T02:11:36.217Z · score: 2 (2 votes) · LW · GW

Wow, thanks for your detailed reply! I'm going to just sort of reply to a random sampling of stuff you said (hope that's OK).

I suspect one thing that might appeal to these sorts of people, which we have a chance of being able to provide, is an interesting applied-researcher-targeted semi-plain-language (or highly-visual, or flow-chart/checklist, or otherwise accessibly presented) explanation of certain aspects of statistics that are particularly likely to be relevant to these fields.

Makes sense, I've been learning more statistics recently and would have appreciated something like this too.

Small sample sizes, but I think in the biology reference class, I've seen more people bounce off of Eliezer's writing style than the programming reference class does (fairly typical "reads-as-arrogant" stuff; I didn't personally bounce off it, so I'm transmitting this secondhand). I don't think there's anything to be done about this; just sharing the impression. Personally, I've felt moments of annoyance with random LWers who really don't have an intuitive feel for the nuances for evolution, but Eliezer is actually one of the people who seems to have a really solid grasp on this particular topic.

Speculation but do you think this might also be because people in more applied sciences tend to be more skeptical of long chains of reasoning in general? My sense is that doing biology (or chemistry) lab work gives you a mostly healthy but strong skepticism of theorizing without feedback loops because theorizing about biology is so hard.

Networking and career-development-wise... quite frankly, I think we have some, but not a ton to offer biologists directly.

That's fair. I do think it's worth distinguishing between the rationalist community in a specific case and LW itself, even though they're obviously strongly overlapping. I say this because I can imagine a world where LW attracts a mostly socially separate group of biology-interested folks who post and engage but don't necessarily live in Berkeley.

Comment by an1lam on Attainable Utility Theory: Why Things Matter · 2019-09-28T21:36:27.756Z · score: 1 (1 votes) · LW · GW

Kind of weird question: what are you using to write these? A tablet of some sort presumably?

Comment by an1lam on An1lam's Short Form Feed · 2019-09-27T03:19:38.809Z · score: 3 (3 votes) · LW · GW

Today I attended the first of two talks in a two-part mini-workshop on Variational Inference. It's interesting to think of from the perspective of my recent musings about more science-y vs. engineering mindsets because it highlighted the importance of engineering/algorithmic progress in widening Bayesian methods' applicability.

The presenter, who's a fairly well known figure in probabilistic ML and has developed some well known statistical inference algorithms, talked about how part of the reason so much time was spent debating philosophical issues in the past was because Bayesian inference wasn't computationally tractable until the development of Gibbs Sampling in the '90s by Gelfand & Smith.

To be clear, the type of progress I'm talking about is still "scientific" in the sense that it mostly involves applied math and finding good ways to approximate posterior distributions. But, it's "engineering" in the sense that it's the messy sort of work I talked about in my other post, where messy means a lot of the methods don't have a good theoretical backing and involve making questionable (at least ex ante) statistical assumptions. Now, the counter is of course that we don't have a theoretical backing yet, but there still may be one in the future.
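
For a sense of what the Gibbs sampling mentioned above looks like in practice, here's a minimal sketch in Python. It isn't from the talk: the target distribution (a standard bivariate normal with correlation `rho`) is a toy I picked because its conditionals are known in closed form, and the function names are my own. The idea is to draw from a joint distribution by repeatedly sampling each variable conditioned on the current value of the other.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Sample a standard bivariate normal with correlation rho via Gibbs sampling."""
    rng = random.Random(seed)
    # For this target, x | y ~ N(rho * y, 1 - rho^2), and symmetrically for y | x.
    cond_sd = math.sqrt(1.0 - rho ** 2)
    x, y = 0.0, 0.0
    samples = []
    for i in range(burn_in + n_samples):
        x = rng.gauss(rho * y, cond_sd)  # resample x given the current y
        y = rng.gauss(rho * x, cond_sd)  # resample y given the new x
        if i >= burn_in:  # discard early draws taken before the chain settles
            samples.append((x, y))
    return samples


def sample_correlation(samples):
    """Empirical correlation of the (x, y) pairs, as a sanity check on the sampler."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples) / n
    var_y = sum((y - mean_y) ** 2 for _, y in samples) / n
    return cov / math.sqrt(var_x * var_y)
```

With enough samples, `sample_correlation(gibbs_bivariate_normal(0.8, 20000))` should land close to 0.8. In real models the conditionals are rarely this convenient, which is part of where the messy engineering comes in.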

I'll probably have more to say about this when the workshop's over but I partly just wanted to record my thoughts while they were fresh.