Billion-scale semi-supervised learning for state-of-the-art image and video classification 2019-10-19T15:10:17.267Z · score: 5 (2 votes)
What are your strategies for avoiding micro-mistakes? 2019-10-04T18:42:48.777Z · score: 18 (9 votes)
What are effective strategies for mitigating the impact of acute sleep deprivation on cognition? 2019-03-31T18:31:29.866Z · score: 26 (11 votes)
So you want to be a wizard 2019-02-15T15:43:48.274Z · score: 16 (3 votes)
How do we identify bottlenecks to scientific and technological progress? 2018-12-31T20:21:38.348Z · score: 31 (9 votes)
Babble, Learning, and the Typical Mind Fallacy 2018-12-16T16:51:53.827Z · score: 4 (2 votes)
An1lam's Short Form Feed 2018-08-11T18:33:15.983Z · score: 14 (3 votes)
The Case Against Education: Why Do Employers Tolerate It? 2018-06-10T23:28:48.449Z · score: 17 (5 votes)


Comment by an1lam on An1lam's Short Form Feed · 2019-10-24T00:05:47.284Z · score: 1 (1 votes) · LW · GW

My takeaway from the article was that, to your point, their brains weren't using more energy. Rather, the best hypothesis was just that their adrenal hormones remained elevated for many hours of the day, leading to higher metabolism during that period. Running an hour a day is definitely not enough to burn 6000 calories for the record (a marathon burns around 3500).

Maybe I wasn't clear, but that's what I meant by the following.

The article's claiming that chess players burn more energy purely from the side effects of stress, not because their brains are doing more work. So why am I revisiting this question?

Comment by an1lam on An1lam's Short Form Feed · 2019-10-23T14:11:55.288Z · score: 2 (2 votes) · LW · GW
I feel like the center often shifts as I learn more about a topic (because I develop new interests within it). The questions I ask myself are more like "How embarrassed would I be if someone asked me this and I didn't know the answer?" and "How much does knowing this help me learn more about the topic or related topics?" (These aren't ideal phrasings of the questions my gut is asking.)

Those seem like good questions to ask as well. In particular, the second one is something I ask myself although, similar to you, in my gut more than verbally. I also deal with the "center shifting" by revising cards aggressively if they no longer match my understanding. I even revise simple phrasing differences when I notice them. That is, if I repeatedly phrase the answer to a card one way in my head and have it phrased differently on the actual card, I'll change the card.

In my experience, I often still forget things I've entered into Anki either because the card was poorly made or because I didn't add enough "surrounding cards" to cement the knowledge. So I've shifted away from this to thinking something more like "at least Anki will make it very obvious if I didn't internalize something well, and will give me an opportunity in the future to come back to this topic to understand it better instead of just having it fade without detection".

I think both this and the original motivational factor I described apply for me.

I'm confused about what you mean by this. (One guess I have is big-O notation, but big-O notation is not sensitive to constants, so I'm not sure what the 5 is doing, and big-O notation is also about asymptotic behavior of a function and I'm not sure what input you're considering.)

You're right. Sorry about that... I just heinously abuse big-O notation and sometimes forget to not do it when talking with others/writing. Edited the original post to be clearer ("on the order of 10").

I think there are few well-researched and comprehensive blog posts, but I've found that there is a lot of additional wisdom the spaced repetition community has accumulated, which is mostly written down in random Reddit comments and smaller blog posts. I feel like I've benefited somewhat from reading this wisdom (but have benefited more from just trying a bunch of things myself).

Interesting, I've perused the Anki sub-reddit a fair amount, but haven't found many posts that do what I'm looking for, which is both give good guidelines and back them up with specific examples. This is probably the closest thing I've read to what I'm looking for, but even this post mostly focuses on high level recommendations and doesn't talk about the nitty-gritty such as different types of cards for different types of skills. If you've saved some of your favorite links, please share!

I agree that trying stuff myself has worked better than reading.

For myself, I've considered writing up what I've learned about using Anki, but it hasn't been a priority because (1) other topics seem more important to work on and write about; (2) most newcomers cannot distinguish between good and bad advice, so I anticipate having low impact by writing about Anki; (3) I've only been experimenting informally and personally, and it's difficult to tell how well my lessons generalize to others.

Regarding other topics being more important, I admit I mostly wrote up the above because I couldn't stop thinking about it rather than based on some sort of principled evaluation of how important it would be. That said, I personally would get a lot of value out of having more people write up detailed case reports of how they've been using Anki and what does/doesn't work well for them that give lots of examples. I think you're right that this won't necessarily be helpful for newcomers, but I do think it will be helpful for people trying to refine their practice over long periods of time. Given that most advice is targeted at newcomers, while the overall impact may be lower, I'd argue "advice for experts" is more neglected and more impactful on the margin.

Regarding takeaways not generalizing, this is why I think giving lots of concrete examples is good because it basically makes your claims reproducible. That is, someone can go out and try what you described fairly easily and see if it works for them.

Comment by an1lam on An1lam's Short Form Feed · 2019-10-23T14:00:34.746Z · score: 3 (2 votes) · LW · GW

You're right on both counts. Maybe I should've discussed this in my original post... At least for me, Anki serves different purposes at different stages of learning.

Key definitions tend to be useful in the early stages, especially if I'm learning something on and off, as a way to prevent myself from having to constantly refer back and make it easier to think about what they actually mean when I'm away from the source. E.g., I've been exploring alternate interpretations of d-separation in my head during my commute and it helps that I remember the precise conditions in addition to having a visual picture.

Once I've mastered something, I agree that the "concepts and competencies" ("mental moves" is my preferred term) become more important to retain. E.g., I remember the spectral theorem but wish I remembered the sketch of what it looks like to develop the spectral theorem from scratch. Unfortunately, I'm less clear/experienced on using Anki to do this effectively. I think Michael Nielsen's blog post on seeing through a piece of mathematics is a good first step. Deeply internalizing core proofs from an area presumably should help for retaining the core mental moves involved in being effective in that area. But, this is quite time intensive and also prioritizes breadth over depth.

I actually did mention two things that I think may help with retaining concepts and competencies - Anki-izing the same concepts in different ways (often visually) and Anki-izing examples of concepts. I haven't experienced this yet, but I'm hopeful that remembering alternative visual versions of definitions, analogies to them, and examples of them may help with the types of problems where you can see the solution at a glance if you have the right mental model (more common in some areas than others). For example, I remember feeling (usually after agonizing over a problem for a while) like Linear Algebra Done Right had a lot of exercises where the right geometric intuition or representative example would allow you to see the solution relatively quickly and then just have to convert it to words.

Another idea for how to Anki-ize concepts and competencies better that I haven't tried (yet) but will share anyway is succinctly capturing strategies that pop up again and again in similar forms. To use another Linear Algebra Done Right example, there are a lot of exercises with solutions of the form "construct some arbitrary linear map that does what we want" and show it... does what we want. I remember this technique but worry that my pattern matching machinery for the types of problems to which it tends to apply has decayed. On the other hand, if I had an Anki card that just listed short descriptions of a few exercises and asked me which technique was core to their solutions, maybe I'd retain that competency better.

Comment by an1lam on An1lam's Short Form Feed · 2019-10-23T01:23:24.083Z · score: 18 (6 votes) · LW · GW

Anki's Not About Looking Stuff Up

Attention conservation notice: if you've read Michael Nielsen's stuff about Anki, this probably won't be new for you. Also, this is all very personal and YMMV.

In a number of discussions of Anki here and elsewhere, I've seen Anki's value measured in terms of time saved by not having to look stuff up. For example, Gwern's spaced repetition post includes a calculation of the threshold at which it's worth Anki-izing something, although I would be surprised if Gwern hasn't already thought about the claim I'm going to make.
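
For readers who haven't seen the time-saved framing, it boils down to a break-even comparison between expected lookup time and lifetime review time. The sketch below is only an illustration of that kind of calculation: the function name, its parameters, and the 300-second default review cost are all assumptions made here, not figures taken from any particular post.

```python
def worth_memorizing(seconds_per_lookup, expected_lookups,
                     lifetime_review_seconds=300):
    """Return True if expected lookup time exceeds the assumed review cost.

    lifetime_review_seconds is a placeholder for the total time spent
    reviewing one card over its lifetime (an assumption, not a measurement).
    """
    return seconds_per_lookup * expected_lookups > lifetime_review_seconds


print(worth_memorizing(30, 20))  # 600 s of lookups vs 300 s of review
print(worth_memorizing(30, 5))   # 150 s of lookups vs 300 s of review
```

Note that this only captures the "avoid Googling" use case; it says nothing about whether a card helps you think about a domain.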

While I occasionally use Anki to remember things that I would otherwise have to Google, e.g. statistics, I almost never Anki-ize things so that I can avoid Googling them in the future. And I don't think in terms of time saved when deciding what to Anki-ize.

Instead, (as Michael Nielsen discusses in his posts) I almost always Anki-ize with the goal of building a connected graph of knowledge atoms about an area in which I'm interested. As a result, I tend to evaluate what to Anki-ize based on two criteria:

  1. Will this help me think about this domain without paper or a computer better?
  2. In the Platonic graph of this domain's knowledge ontology, how central is this node? (Pedantic note: it's easier to visualize distance to the root of the tree, but this requires removing cycles from the graph.)
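
The second criterion can be made a bit more concrete with a toy graph. This is a minimal sketch, assuming a hand-written and purely hypothetical prerequisite graph for causal inference; breadth-first search gives each concept's distance from the root, and ignoring already-visited nodes is one simple way to "remove cycles".

```python
from collections import deque

# Hypothetical prerequisite graph; the edges are illustrative only and are
# not a claim about the "right" ontology for causal inference.
CONCEPT_GRAPH = {
    "causal inference": ["causal DAGs", "potential outcomes"],
    "causal DAGs": ["d-separation", "backdoor paths"],
    "d-separation": ["backdoor paths", "causal DAGs"],  # edge back to parent
    "backdoor paths": ["sufficient/admissible sets"],
    "potential outcomes": ["sufficient/admissible sets"],
    "sufficient/admissible sets": [],
}


def depth_from_root(graph, root):
    """Breadth-first search: fewest hops from root to each reachable concept.

    A node's depth is recorded only on first visit, so back edges are
    ignored and the result can be read as a tree rooted at `root`.
    """
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth


depths = depth_from_root(CONCEPT_GRAPH, "causal inference")
for concept, d in sorted(depths.items(), key=lambda kv: kv[1]):
    print(d, concept)
```

Under this toy model, concepts with a small depth are the ones most worth Anki-izing first.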

To make this more concrete, let's look at an example of a topic I've been Anki-izing recently, causal inference. I just started Anki-izing this topic a week ago, so it'll be easier for me to avoid idealizing the process. Looking at my cards so far, I have questions about and definitions of things like "d-separation", "sufficient/admissible sets", and "backdoor paths". Notably, for each of these, I don't just have a cloze card to recall the definition, I also have image cards that quiz me on examples and conceptual questions that clarify things I found confusing upon first encountering these concepts. I've found that making these cards both forces me to ensure I understand concepts (because writing cards requires breaking them down) and makes it easier to bootstrap my understanding over the course of multiple days. Furthermore, knowing that I'll remember at least the stuff I've Anki-ized has a surprisingly strong motivational impact on me on a gut level.

All that said, I suspect there are some people for whom Anki-izing wouldn't be helpful.

The first is people who have the time and a career in which they focus on a narrow enough set of topics such that they repeatedly see the same concepts and rarely go for long periods without revisiting them. I've experienced this myself for Python - I learned it well before starting to use Anki and used it every day for many years. So even if I forget some stuff, it's very easy for me to use the language fluently after time away from it.

The second is, for lack of a better term, actual geniuses. Like, if you're John Von Neumann and you legitimately have an approximation of a photographic memory (I'm really skeptical that he actually had an eidetic memory but regardless...) and can understand any concept incredibly quickly, you probably don't need Anki. Also, if you're the second coming of John Von Neumann and you're reading this, cool!

To give another example, Terry Tao is a genius who also has spent his entire life doing math. Probably doesn't need Anki (or advice from me in general in case it wasn't obvious).

Finally, I do think how to use Anki well is an under-explored topic given that there's on the order of 10 actual blog posts about it. Given this, I'm still figuring things out myself, in particular around how to Anki-ize stuff that's more procedural, e.g. "when you see a problem like this, consider these three strategies" or something. If you're also experimenting with Anki, I'd love to hear from you!

Comment by an1lam on Raemon's Scratchpad · 2019-10-20T20:25:38.205Z · score: 2 (2 votes) · LW · GW

In my experience, trade can work well here. That is, you care more about cleanliness than your roommate, but they either care abstractly about your happiness or care about some concrete other thing you care about less, e.g. temperature of the apartment. So, you can propose a trade where they agree to be cleaner than they would be otherwise in exchange for you either being happier or doing something else that they care about.

Semi-serious connection to AI: It's kind of like merging your utility functions but it's only temporary.

Comment by an1lam on Billion-scale semi-supervised learning for state-of-the-art image and video classification · 2019-10-19T20:10:36.540Z · score: 1 (1 votes) · LW · GW

Interesting, I somehow hadn't seen this. Thanks! (Editing to reflect this as well.)

I'm curious - even though this isn't new, do you agree with my vague claim that the fact that this and the paper you linked work pertains to the feasibility of amplification-style strategies?

Comment by an1lam on Declarative Mathematics · 2019-10-19T16:16:28.342Z · score: 3 (2 votes) · LW · GW

To your request for examples, my impression is that Black Box Variational Inference is slowly but surely becoming the declarative replacement for MCMC for a lot of generative modeling stuff.

Comment by an1lam on What's your big idea? · 2019-10-19T15:48:14.990Z · score: 1 (1 votes) · LW · GW

Have you read any of Cosma Shalizi's stuff on computational mechanics? Seems very related to your interests.

Comment by an1lam on Open & Welcome Thread - October 2019 · 2019-10-17T14:34:18.948Z · score: 3 (3 votes) · LW · GW

I, like eigen, am also a fan of your blog! Welcome!

Comment by an1lam on ML is an inefficient market · 2019-10-15T23:28:52.911Z · score: 3 (3 votes) · LW · GW

Got it - I agree discussing further probably doesn't make sense without a concrete thing to talk about.

Comment by an1lam on ML is an inefficient market · 2019-10-15T21:03:58.472Z · score: 3 (3 votes) · LW · GW

Not saying you're wrong or that your experience is invalid but this does not match my experience trying to do applied ML work or ML research. (More experienced people feel free to chime in and tell me how wrong I am...) Granted, I'm not that experienced, but my experience so far has been that most ideas I've had are either ideas that people had and tried or actually published on.

Further, my sense is that multiple experienced researchers often have all had similar ideas and the one who succeeds is the one who overcomes some key obstacle that others failed to overcome previously, so it's not just that I'm new/bad.

I do think this mostly applies to areas that ML researchers think are interesting and somewhat tractable. So, for example, I wouldn't make this claim about safety research and don't know how much it applies to gesture recognition in particular.

That said, since you're claiming your tools are better, I will note that the ML community does seem open to switching tools in general, as evidenced by the (partial) shifts from Theano to TensorFlow / PyTorch over the past few years.

Would you be willing to describe at least at a high level what these tools let you do?

Edit: I'm more skeptical of the object-level claims than the title claim. Assuming you're using the classical definition of efficient market, I agree that "ML tooling" adoption doesn't follow efficient market dynamics in at least one respect.

Comment by an1lam on An1lam's Short Form Feed · 2019-10-14T14:50:40.298Z · score: 2 (2 votes) · LW · GW

Thing I desperately want: tablet native spaced repetition software that lets me draw flashcards. Cloze deletions are just boxes or hand-drawn occlusions.

Comment by an1lam on Gears vs Behavior · 2019-10-11T02:43:30.790Z · score: 1 (1 votes) · LW · GW

Good post.

I agree with your point regarding ML's historical focus on blackbox prediction. That said, there has been some intriguing recent work (example 1, example 2), which I've only just started looking at, on trying to learn causal models.

I bring this up because I think the question of how causal model learning happens and how learning systems can do it may potentially be relevant to the work you've been writing about. It's primarily of interest to me for different reasons, related to applying ML systems to scientific discovery, in particular in domains where coming up with causal hypotheses at scale is hard.

Comment by an1lam on Are there technical/object-level fields that make sense to recruit to LessWrong? · 2019-10-09T00:06:04.877Z · score: 1 (1 votes) · LW · GW

Fixed, same HTTPS problem Raemon commented on above.

Comment by an1lam on What are your strategies for avoiding micro-mistakes? · 2019-10-07T16:10:49.138Z · score: 1 (1 votes) · LW · GW

Just wanted to note this was definitely helpful and not too general. Weirdly enough, I've read parts of the SRE book but for some reason was compartmentalizing it in my "engineering" bucket rather than seeing the connection you pointed out.

Comment by an1lam on Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More · 2019-10-05T23:25:19.524Z · score: 29 (12 votes) · LW · GW

Meta: This is in response to both this and comments further up the chain regarding the level of the debate.

It's worth noting that, at least from my perspective, Bengio, who's definitely not in the LW bubble, made good points throughout and did a good job of moderating.

On the other hand, Russell, obviously more partial to the LW consensus view, threw out some "zingers" early on (such as the following one) that didn't derail the debate but easily could've.

Thanks for clearing that up - so 2+2 is not equal to 4, because if the 2 were a 3, the answer wouldn't be 4? I simply pointed out that in the MDP as I defined it, switching off the human is the optimal solution, despite the fact that we didn't put in any emotions of power, domination, hate, testosterone, etc etc. And your solution seems, well, frankly terrifying, although I suppose the NRA would approve.

Comment by an1lam on [Link] Tools for thought (Matuschak & Nielson) · 2019-10-05T17:21:33.578Z · score: 1 (1 votes) · LW · GW

The latter. To be clear, exercises are great but I think they're often not enough, in particular for topics where it's harder to build intuition just by thinking. The visualizations in that post would be an example of a prototype of the sorts of visualizations I'd want for a data-heavy topic.

Regarding textbook problems, the subset of things for which textbook problems substitute for rather than complement interactive visualizations seems relatively small, especially outside of more theoretical domains. Even for something like math, imagine if, in addition to exercises, your textbook let you play with 3Blue1Brown-style visualizations of the thing you're learning about.

To give another example, say I'm learning about economics at the intro level. Typical textbooks will have questions about supply & demand curves, diminishing marginal utility, etc. My claim is that most people will build a deeper understanding of these concepts by having access to some sort of interactive models to probe in addition to the standard exercises at the end of the chapter.

Comment by an1lam on [Link] Tools for thought (Matuschak & Nielson) · 2019-10-05T04:38:59.408Z · score: 2 (2 votes) · LW · GW

One issue I have is his idea that the medium of content should be mnemonically based. This bothers me because I presume that if your content is really good, professionals and experts will read it as well. And since the way that they read is different from the manner of a novice, they should be able to ignore and not be interrupted or slowed by tools designed for novices.

I feel like there are two rebuttals to this:

  1. In both this and previous essays, Michael Nielsen has specifically addressed the point that tools of thought should scale up to research-level thinking, so he (and presumably Matuschak) is aware of the issue of only appealing to novices.
  2. I read the post you linked and I agree with your points about well-written books' tables of contents acting as a memory aid and outline. But Matuschak's points about books still ring true to me. Just because books are great tools for learning doesn't mean we can't do better. To give one example of a limitation of books outside of memory not addressed by your post, books don't provide any way for me to answer questions about the ideas being discussed beyond what I can visualize in my head (in particular in cases where the ideas are heavily quantitative). As an illustration of what doing better could look like, Bret Victor has posed the thought experiment of what it would be like if you could explore models of climate change as you read about them.

Comment by an1lam on What are your strategies for avoiding micro-mistakes? · 2019-10-05T04:23:51.960Z · score: 1 (1 votes) · LW · GW

This is a great point. Not sure why I phrased it the way I did originally in retrospect. I updated the question to reflect your point.

Comment by an1lam on What are your strategies for avoiding micro-mistakes? · 2019-10-05T00:12:16.508Z · score: 1 (1 votes) · LW · GW

Yeah, with coding, unit testing plus assertions plus checking my intuitions against the code as John Wentworth described does in fact seem to work fairly well for me. I think the difficulty with algebra is that there's not always an obvious secondary check you can do.

Comment by an1lam on Are there technical/object-level fields that make sense to recruit to LessWrong? · 2019-09-30T02:17:15.170Z · score: 4 (3 votes) · LW · GW

Yeah, I have a ton of confirmation bias pushing me to agree with this (because for me the two are definitely related), but I'll add that I also think spending a lot of time programming helped me make reductionism "a part of me" in a way it wasn't before. There are just very few other activities where you're forced to express what you want or a concept to something that fundamentally can only understand a limited logical vocabulary. Math is similar but I think programming makes the reductionist element more salient because of the compiler and because programming tends to involve more mundane work.

Comment by an1lam on Are there technical/object-level fields that make sense to recruit to LessWrong? · 2019-09-30T02:11:36.217Z · score: 2 (2 votes) · LW · GW

Wow, thanks for your detailed reply! I'm going to just sort of reply to a random sampling of stuff you said (hope that's OK).

I suspect one thing that might appeal to these sorts of people, which we have a chance of being able to provide, is an interesting applied-researcher-targeted semi-plain-language (or highly-visual, or flow-chart/checklist, or otherwise accessibly presented) explanation of certain aspects of statistics that are particularly likely to be relevant to these fields.

Makes sense, I've been learning more statistics recently and would have appreciated something like this too.

Small sample sizes, but I think in the biology reference class, I've seen more people bounce off of Eliezer's writing style than in the programming reference class (fairly typical "reads-as-arrogant" stuff; I didn't personally bounce off it, so I'm transmitting this secondhand). I don't think there's anything to be done about this; just sharing the impression. Personally, I've felt moments of annoyance with random LWers who really don't have an intuitive feel for the nuances of evolution, but Eliezer is actually one of the people who seems to have a really solid grasp on this particular topic.

Speculation but do you think this might also be because people in more applied sciences tend to be more skeptical of long chains of reasoning in general? My sense is that doing biology (or chemistry) lab work gives you a mostly healthy but strong skepticism of theorizing without feedback loops because theorizing about biology is so hard.

Networking and career-development-wise... quite frankly, I think we have some, but not a ton to offer biologists directly.

That's fair. I do think it's worth distinguishing between the rationalist community in a specific case and LW itself, even though they're obviously strongly overlapping. I say this because I can imagine a world where LW attracts a mostly socially separate group of biology-interested folks who post and engage but don't necessarily live in Berkeley.

Comment by an1lam on Attainable Utility Theory: Why Things Matter · 2019-09-28T21:36:27.756Z · score: 1 (1 votes) · LW · GW

Kind of weird question: what are you using to write these? A tablet of some sort presumably?

Comment by an1lam on An1lam's Short Form Feed · 2019-09-27T03:19:38.809Z · score: 3 (3 votes) · LW · GW

Today I attended the first of two talks in a two-part mini-workshop on Variational Inference. It's interesting to think of from the perspective of my recent musings about more science-y vs. engineering mindsets because it highlighted the importance of engineering/algorithmic progress in widening Bayesian methods' applicability.

The presenter, who's a fairly well known figure in probabilistic ML and has developed some well known statistical inference algorithms, talked about how part of the reason so much time was spent debating philosophical issues in the past was that Bayesian inference wasn't computationally tractable until Gelfand & Smith popularized Gibbs sampling for Bayesian inference in 1990.

To be clear, the type of progress I'm talking about is still "scientific" in the sense that it mostly involves applied math and finding good ways to approximate posterior distributions. But, it's "engineering" in the sense that it's the messy sort of work I talked about in my other post, where messy means a lot of the methods don't have a good theoretical backing and involve making questionable (at least ex ante) statistical assumptions. Now, the counter is of course that we don't have a theoretical backing yet, but there still may be one in the future.

I'll probably have more to say about this when the workshop's over but I partly just wanted to record my thoughts while they were fresh.

Comment by an1lam on An1lam's Short Form Feed · 2019-09-23T19:43:19.590Z · score: 1 (1 votes) · LW · GW

From cursory Googling, it looks like tensor networks are mostly used for understanding quantum systems. I'm not opposed to learning about them, but is there a good resource you can point me to that introduces them independent of the physics concepts? Were you learning them for use in physics?

For example, have you happened to read this Google AI paper introducing their TensorNetworks library and giving an overview?

Comment by an1lam on An1lam's Short Form Feed · 2019-09-22T20:07:22.944Z · score: 4 (3 votes) · LW · GW

ML-related math trick: I find it easier to imagine a 4D tensor, say of dimensions $a \times b \times c \times d$, as a big matrix with dimensions $a \times b$ within which are nested matrices of dimensions $c \times d$. The nice thing about this is, at least for me, it makes it easier to imagine applying operations over the matrices in parallel, which is something I've had to think about a number of times doing ML-related programming, e.g. trying to figure out how to write the code to apply a 1D convolution-like operation to an entire batch in parallel.
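
To make the trick concrete, here is a minimal NumPy sketch. The axis sizes, the weight matrix `w`, and the length-3 `kernel` are all assumptions chosen for illustration; the point is only that indexing the two leading axes picks out a nested matrix, and that operations on the trailing axes apply to every nested matrix at once.

```python
import numpy as np

# Assumed sizes for illustration: an a x b outer "matrix" whose entries are
# c x d nested matrices.
a, b, c, d = 2, 3, 4, 5
x = np.arange(a * b * c * d, dtype=float).reshape(a, b, c, d)

# x[i, j] is the nested c x d matrix at row i, column j of the outer matrix.
assert x[1, 2].shape == (c, d)

# Matrix multiplication acts on the trailing two axes, so every nested
# c x d matrix is multiplied by the same d x 2 matrix in one call.
w = np.ones((d, 2))
y = x @ w                      # shape (a, b, c, 2)

# 1D convolution-like op over the whole batch: slide a length-3 kernel along
# the last axis of every row in parallel (no padding, no kernel flipping).
kernel = np.array([1.0, 0.0, -1.0])
windows = np.lib.stride_tricks.sliding_window_view(x, kernel.size, axis=-1)
conv = windows @ kernel        # shape (a, b, c, d - 2)

print(y.shape, conv.shape)
```

`sliding_window_view` requires NumPy 1.20 or later; on older versions the same effect can be had by summing shifted slices of `x`.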

Comment by an1lam on Reframing Impact · 2019-09-21T01:33:27.951Z · score: 5 (3 votes) · LW · GW

Yup, I have (and the untrollable mathematician one). I dashed off that comment but really meant something like, "I hope this trend takes off."

Comment by an1lam on Reframing Impact · 2019-09-20T22:38:11.510Z · score: 7 (2 votes) · LW · GW

I enjoyed the post and in particular really liked the illustrated format. Definitely planning to read the rest!

I'm now wishing more technical blog posts were illustrated like this...

Comment by an1lam on Matt Goldenberg's Short Form Feed · 2019-09-17T23:51:26.400Z · score: 1 (1 votes) · LW · GW

This note won't make sense to anyone who isn't already familiar with the Sociopath framework in which you're discussing this, but I did want to call out that Venkat Rao (at least when he wrote the Gervais Principle) explicitly stated that sociopaths are amoral and has fairly clearly (especially relative to his other opinions) stated that he thinks having more Sociopaths wouldn't be a bad thing. Here are a few quotes from Morality, Compassion, and the Sociopath which talk about this:

So yes, this entire edifice I am constructing is a determinedly amoral one. Hitler would count as a sociopath in this sense, but so would Gandhi and Martin Luther King.

In all this, the source of the personality of this archetype is distrust of the group, so I am sticking to the word “sociopath” in this amoral sense. The fact that many readers have automatically conflated the word “sociopath” with “evil” in fact reflects the demonizing tendencies of loser/clueless group morality. The characteristic of these group moralities is automatic distrust of alternative individual moralities. The distrust directed at the sociopath though, is reactionary rather than informed.

Sociopaths can be compassionate because their distrust only extends to groups. They are capable of understanding and empathizing with individual pain and acting with compassion. A sociopath who sets out to be compassionate is strongly limited by two factors: the distrust of groups (and therefore skepticism and distrust of large-scale, organized compassion), and the firm grounding in reality. The second factor allows sociopaths to look unsentimentally at all aspects of reality, including the fact that apparently compassionate actions that make you “feel good” and assuage guilt today may have unintended consequences that actually create more evil in the long term. This is what makes even good sociopaths often seem callous to even those among the clueless and losers who trust the sociopath’s intentions. The apparent callousness is actually evidence that hard moral choices are being made.

When a sociopath has the resources for (and feels the imperative towards) larger scale do-gooding, you get something like Bill Gates’ behavior: a very careful, cautious, eyes-wide-open approach to compassion. Gates has taken on a world-hunger sized problem, but there is very little ceremony or posturing about it. It is sociopath compassion. Underlying the scale is a residual distrust of the group — especially the group inspired by oneself — that leads to the “reluctant messiah” effect. Nothing is as scary to the compassionate and powerful sociopath as the unthinking adulation and following inspired by their ideas. I suspect the best among these lie awake at night worrying that if they were to die, the headless group might mutate into a monster driven by a frozen, unexamined moral code. Which is why the smartest attempt to engineer institutionalized doubt, self-examination and formal checks and balances into any systems they design.

I hope my explanation of the amorality of the sociopath stance makes a response mostly unnecessary: I disagree with the premise that “more sociopaths is bad.” More people taking individual moral responsibility is a good thing. It is in a sense a different reading of Old Testament morality — eating the fruit of the tree of knowledge and learning to tell good and evil apart is a good thing. An atheist view of the Bible must necessarily be allegorical, and at the risk of offending some of you, here’s my take on the Biblical tale of the Garden of Eden: Adam and Eve were clueless, having abdicated moral responsibility to a (putatively good) sociopath: God. Then they became sociopaths in their own right. And were forced to live in an ecosystem that included another sociopath — the archetypal evil one, Satan — that the good one could no longer shield them from. This makes the “descent” from the Garden of Eden an awakening into freedom rather than a descent into baseness. A good thing.

I apologize if this just seems like nitpicking your terminology, but I'm calling it out because I'm curious whether you agree with his abstract definition but disagree with his moral assessment of Sociopaths, vice versa, or something else entirely? As a concrete example, I think Venkat would argue that early EA was a form of Sociopath compassion and that for the sorts of world-denting things a lot of LWers tend to be interested in, Sociopathy (again, as he defines it) is going to be the right stance to take.

Comment by an1lam on Are there technical/object-level fields that make sense to recruit to LessWrong? · 2019-09-16T13:11:02.185Z · score: 22 (11 votes) · LW · GW

As I've been talking about on my shortform, I'd be excited about attracting more "programmer's programmers". AFAICT, a lot of LW users are programmers, but a large fraction of these users either are more interested in transitioning into theoretical alignment research or just don't really post about programming. As a small piece of evidence for this claim, I've been consistently surprised to see the relatively lukewarm reaction to Martin Sustrik's posts on LW. I read Sustrik's blog before he started posting and consistently find his posts there and here pretty interesting (I am admittedly a bit biased because I was already impressed by Sustrik's work on ZeroMQ).

I think that's a bit of a shame because I personally have found LW-style thinking useful for programming. My debugging process has especially benefited from applying some combination of informal probabilistic reasoning and "making beliefs pay rent", which enabled me to make more principled decisions about which hypotheses to falsify first when finding root causes. For a longer example, see this blog post about reproducing a deep RL paper, which discusses how noticing confusion helped the author make progress (CFAR is specifically mentioned). LW-style thinking has also helped me stop obsessing over much of the debate around some of the more mindkiller-y topics in programming like "should you always write tests first", "are type-safe languages always better than dynamic ones". In my ideal world, LW-style thinking applied to fuzzier questions about programming would help us move past these "wrong questions".
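To make "which hypotheses to falsify first" a bit more concrete, here's a minimal sketch of the kind of informal calculation I have in mind. It ranks candidate root causes by estimated probability per minute of checking cost. The hypothesis names, probabilities, and costs are all made up for illustration, and the probability-per-cost ordering is just one simple heuristic I'm assuming here, not something taken from the linked posts.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    probability: float    # rough subjective estimate that this is the root cause
    check_minutes: float  # rough cost of the test that would falsify it


def falsification_order(hypotheses):
    """Sort so that likely, cheap-to-check hypotheses come first."""
    return sorted(
        hypotheses,
        key=lambda h: h.probability / h.check_minutes,
        reverse=True,
    )


# Hypothetical values for illustration only, not measurements.
candidates = [
    Hypothesis("stale cache", 0.5, 5),
    Hypothesis("race condition", 0.3, 60),
    Hypothesis("bad config value", 0.2, 1),
]
ordered_names = [h.name for h in falsification_order(candidates)]
```

With these numbers the least likely hypothesis ("bad config value") still gets checked first because ruling it out takes a minute, while the race condition goes last despite being fairly likely.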

Programming already has a few other internet loci such as Hacker News and, but I think those places have fewer "people who know how to integrate evidence and think probabilistically in confusing domains."

Assuming this seems appealing, one way to approach getting more people of the type I'm talking about would be to reach out to prominent bloggers who seem like they're already somewhat sympathetic to the LW meme-plex and see if they'd be willing to cross-post their content. Examples of the sorts of people I'm thinking about include:

  • Hillel Wayne: who writes about empiricism in software engineering and formal methods.

  • Jimmy Koppel: who writes about insights for programming he's gleaned from his "day job" as a programming tools researcher (I think he has a LW account already).

  • Julia Evans: who writes about programming practice and questions she's interested in. A blog post of hers that seems especially LW-friendly is What does debugging a program look like?

Last, I do want to add a caveat for all this which I think applies to reaching out to basically any group: there's a big risk of culture clash/dilution if the outreach effort succeeds (see Geeks, MOPs, and sociopaths for one exploration of this topic). How to mitigate this is probably a separate question, but I did want to call it out in case it seems like I'm just recommending blindly trying to get more users.

Comment by an1lam on Are there technical/object-level fields that make sense to recruit to LessWrong? · 2019-09-16T12:30:20.954Z · score: 8 (6 votes) · LW · GW

Minor conflict of interest disclaimer: I've recently become much more interested in computational biology and therefore have a personal interest in having more content related to biology in general on LW.

I'd be excited about having more representation from the experimental sciences, e.g. biology, certain areas of physics, chemistry, on LessWrong. I don't have a good sense of how many total LW users come from these fields, but it certainly doesn't seem like many prominent posters/commenters do. The closest thing to a prominent poster who talks about experimental science is Scott Alexander.

My sense from random conversations I've had over the years is that there's a lot of tacit but important knowledge about how to do experimental research and lab work well that isn't written down anywhere and could make for interesting complementary content to the wealth of content on LW about the connection between rationality and doing theory well. There's also an untapped treasure trove of stories about important discoveries in these areas that could make for good LW post series. I'd love to see someone take me through the history of Barbara McClintock's discoveries or the development of CRISPR from a rationalist perspective (i.e. what were the cognitive strategies that went along with discovering these things). There are books on discoveries like this of course, but there are also books on most of the material in the Sequences.

Having more LWers from experimental sciences could also provide a foundation for more detailed discussion of X-risks outside of transformative AI, bio-risks in particular.

In terms of attracting these sorts of people, one challenge is that younger researchers in these areas in particular tend to have long hours due to the demands of lab work and therefore may have less time to post on LW.

Comment by an1lam on An1lam's Short Form Feed · 2019-09-15T17:24:05.749Z · score: 5 (3 votes) · LW · GW

Epistemic status: Thinking out loud.

Introducing the Question

Scientific puzzle I notice I'm quite confused about: what's going on with the relationship between thinking and the brain's energy consumption?

On one hand, I'd always been told that thinking harder sadly doesn't burn more energy than normal activity. I believed that and had even come up with a plausible story about how evolution optimizes for genetic fitness not intelligence, and introspective access is pretty bad as it is, so it's not that surprising that we can't crank up our brains' energy consumption to think harder. This seemed to jibe with the notion that our brain's putting way more computational resources towards perceiving and responding to perception than abstract thinking. It also fit well with recent results calling ego depletion into question and into the framework in which mental energy depletion is the result of a neural opportunity cost calculation.

Going even further, studies like this one left me with the impression that experts tended to require less energy to accomplish the same mental tasks as novices. Again, this seemed plausible under the assumption that experts' brains developed some sort of specialized modules over the thousands of hours of practice they'd put in.

I still believe that thinking harder doesn't use more energy, but I'm now much less certain about the reasons I'd previously given for this.

Chess Players' Energy Consumption

This recent ESPN (of all places) article about chess players' energy consumption during tournaments has me questioning this story. The two main points of the article are:

  1. Chess players burn a lot of energy during tournaments, potentially on the order of 6000 calories a day (that's about what marathon runners burn in a day). This results from intense mental stress leading to an elevated heart rate and, as a result, increased oxygen consumption. Chess players also tend to eat less during competitions, which also contributes to weight loss during tournaments (apparently Karpov once lost 20 pounds during an extended chess championship).
  2. Chess players and their coaches now understand that humans aren't Cartesian, i.e. our physical health impacts our cognitive performance, and have responded accordingly with intense physical training regimens.

On the surface, none of this contradicts the claims I cited above. The article's claiming that chess players burn more energy purely from the side effects of stress, not because their brains are doing more work. So why am I revisiting this question?

Gaps in the Evolutionary Justification

First, reading the chess article led me to notice a big gap in the explanation I gave above for why we shouldn't expect a connection between thinking hard and energy consumption. In my explanation, I mentioned that we should expect our brains to spend much more energy on perceptive and reactive processing than on abstract thinking. This still makes sense to me as a general claim about the median mammal, but now seems less plausible to me as it relates to humans specifically. This recent study, for example, provides evidence that our (humans') big brains are one of two primary causes for our increased energy consumption compared to other primates. As far as I can tell, humans don't seem to have meaningfully better coordination or perceptive abilities than chimps. Chimps have opposable thumbs and big toes, and spend their days picking bugs off of each other and climbing trees. Given this, while I admittedly haven't looked into studies on this, I find it hard to imagine that human brains spend much more energy than chimp brains on perception.

Let's say that we put aside the question of what exactly human brains use their extra energy for and bucket it into the loose category of "higher mental functions". This still leaves me with a relevant question: why didn't brains evolve to use varying amounts of energy depending on what they were doing? In particular, if we assume that humans are the first and only mammals that spend large fractions of their calories on "extra" brain functions, then why wasn't there selection pressure to have those functions only use energy when they were needed instead of all the time?

Bringing things back to my original point, in my initial story, thinking didn't impact energy consumption because our brains spend most of their energy on other stuff anyway, so there wasn't strong selective pressure to connect thinking intensity to energy consumption. However, I've just given some evidence that "higher brain functions" actually did come with a significant energy cost, so we might expect that those functions' energy consumption would in fact be context-dependent.

Second, it's weird that what we're doing (mentally) can so dramatically impact our energy consumption due to elevated heart rate and other stress-triggered adaptations but has no impact on the energy our brain consumes. To be clear, it makes sense that physical activity and stress would be intimately connected as this connection is presumably very important for balancing the need to eat/escape predators with the need to not use too much energy when sitting around. What doesn't yet make sense to me is that, even though neurons evolved from the same cells as all the rest of our biology, they proved so resistant to optimization for variable energy consumption.

Rescuing the Original Hypothesis

The best explanation I can come up with for the two puzzles I just discussed is that, for whatever reason, evolution didn't select for a neural architecture that could selectively up- and down-regulate its energy consumption depending on the circumstances. For example, maybe the fact that neurons die when they don't have energy is somehow intimately coupled with their architecture such that there's no way to fix it short of something only a goal-directed consequentialist (and therefore not a hill-climbing process) could accomplish. If this is true, even though humans plausibly would've benefited at some point during our evolutionary history from being able to spend more or less energy on thinking, we shouldn't be surprised that it never happened.

Another weaker (IMO) explanation is that human brains do use more energy in certain situations for some "higher mental functions" but it's not the situations you'd expect. For example, maybe humans use a ton of energy for social cognition and if we could measure the neocortex's energy consumption during parties, we'd find it uses a lot more energy than usual.

Comment by an1lam on crabman's Shortform · 2019-09-15T14:08:21.734Z · score: 1 (1 votes) · LW · GW

That is interesting! I should be clear that my odds ratios are pretty tentative given the uncertainty around the challenge. For example, I literally woke up this morning and thought that my 1/3 odds might be too conservative given recent progress on 8th grade science tests and theorem proving.

I created three PredictionBook predictions to track this if anyone's interested (5 years, 10 years, 20 years).

Comment by an1lam on crabman's Shortform · 2019-09-15T04:33:54.704Z · score: 2 (2 votes) · LW · GW

Can you quantify soon :) ? For example, I'd be willing to bet at 1/3 odds that this will be solved in the next 10 years conditional on a certain amount of effort being put in and more like 1/1 odds for the next 20 years. It's hard to quantify the conditional piece but I'd cash it out as something like if researchers put in the same amount of effort into this that they put into NLP/image recognition benchmarks. I don't think that'll happen, so this is purely a counterfactual claim, but maybe it will help ground any subsequent discussion with some sort of concrete claim?

Comment by an1lam on Jimrandomh's Shortform · 2019-09-15T04:28:52.001Z · score: 5 (2 votes) · LW · GW

Sure, but let me clarify that I'm probably not drawing as hard a boundary between "ordinary paranoia" and "deep security" as I should be. I think Bruce Schneier's and Eliezer's buckets for "security mindset" blended together in the months since I read both posts. Also, re-reading the logistic success curve post reminded me that Eliezer calls into question whether someone who lacks security mindset can identify people who have it. So it's worth noting that my ability to identify people with security mindset is itself suspect by this criterion (there's no public evidence that I have security mindset and I wouldn't claim that I have a consistent ability to do "deep security"-style analysis).

With that out of the way, here are some of the examples I was thinking of.

First of all, at a high level, I've noticed that you seem to consistently question assumptions other posters are making and clarify terminology when appropriate. This seems like a prerequisite for security mindset, since it's a necessary first step towards constructing systems.

Second and more substantively, I've seen you consistently raise concerns about human safety problems (also here). I see this as an example of security mindset because it requires questioning the assumptions implicit in a lot of proposals. The analogy to Eliezer's post here would be that ordinary paranoia is trying to come up with more ways to prevent the AI from corrupting the human (or something similar) whereas I think a deep security solution would look more like avoiding the assumption that humans are safe altogether and instead seeking clear guarantees that our AIs will be safe even if we ourselves aren't.

Last, you seem to be unusually willing to point out flaws in your own proposals, the prime example being UDT. The most recent example of this is your comment about the bomb argument, but I've seen you do this quite a bit and could find more examples if prompted. On reflection, this may be more of an example of "ordinary paranoia" than "deep security", but it's still quite important in my opinion.

Let me know if that clarifies things at all. I can probably come up with more examples of each type if requested, but it will take me some time to keep digging through posts and comments so figured I'd check in to see if what I'm saying makes sense before continuing to dig.

Comment by an1lam on Jimrandomh's Shortform · 2019-09-14T21:50:20.849Z · score: 3 (2 votes) · LW · GW

In fairness, I'm probably over-generalizing from a few examples. For example, my biggest inspiration from the field of crypto is Daniel J. Bernstein, a cryptographer who's in part known for building qmail, which has an impressive security track record & guarantee. He discusses principles for secure software engineering in this paper, which I found pretty helpful for my own thinking.

To your point about hashing the results of several different hash functions, I'm actually kind of surprised to hear that this might protect against the sorts of advances I'd expect to break hash algorithms. I was under the very amateur impression that basically all modern hash functions relied on the same numerical algorithmic complexity (and number-theoretic results). If there are any resources you can point me to about this, I'd be interested in getting a basic understanding of the different assumptions hash functions can depend on.
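For concreteness, here's a minimal sketch of what I take "hashing the results of several different hash functions" to mean, using Python's standard `hashlib`. The specific choices (SHA-256, SHA3-256, and BLAKE2b as the inner functions, SHA-512 as the outer one, and plain concatenation to combine them) are my own assumptions for illustration. This isn't a vetted construction and I'm not claiming anything about its security properties.

```python
import hashlib


def combined_digest(data: bytes) -> str:
    """Hash data with several unrelated functions, then hash the joined digests."""
    inner_digests = [
        hashlib.sha256(data).digest(),    # SHA-2 family (Merkle-Damgard design)
        hashlib.sha3_256(data).digest(),  # SHA-3 family (sponge design)
        hashlib.blake2b(data).digest(),   # BLAKE2 family
    ]
    # Assumption: simple concatenation followed by one outer hash.
    return hashlib.sha512(b"".join(inner_digests)).hexdigest()


example = combined_digest(b"hello")
```

The idea, as I understand it, is that forging a collision in the combined output would require simultaneous trouble for all of the inner functions rather than just one of them.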

Comment by an1lam on Who's an unusual thinker that you recommend following? · 2019-09-13T16:41:12.093Z · score: 8 (7 votes) · LW · GW

Drew Endy's a professor at Stanford and synthetic biology pioneer. I discovered him via this talk he gave at a hacker conference about programming DNA. He unfortunately doesn't have a ton of recent public content, but I recommend watching that talk I linked and reading this interview. Two of my favorite quotes of his are (from memory so paraphrased), "biology is nanotech that works" and "what's the most advanced thing on a person's desk? It's not their iPhone; it's the plant they keep there."

I think Drew (also) satisfies most of your desiderata.

  • He thinks biology is overly focused on science and should focus more on maturing as an engineering discipline by building reusable modular parts.
  • (See the prior bullet, which was pretty novel to me when I first heard it.)
  • Drew caused me to update towards thinking that synthetic biology was more promising as a solution to problems beyond health, e.g. for producing cheaper materials, democratizing production, and producing more sustainable energy. He also caused me to update towards the view that synthetic biology was making very rapid progress.
  • Pushing forward synthetic biology seems pretty interesting to me. Drew also created the iGEM competition, which has enabled tens of thousands of students to participate in synthetic biology projects.
  • Drew created iGEM and generally advocates for learning by doing as an alternative to just studying in the abstract.

Comment by an1lam on Who's an unusual thinker that you recommend following? · 2019-09-13T16:33:31.201Z · score: 7 (5 votes) · LW · GW

David Ha's (Twitter, blog) one of the more interesting deep learning researchers I follow. He works loosely on Model-based Reinforcement Learning and Evolutionary Algorithms, but in practice seems to explore whatever interests him. His most recent paper, Weight Agnostic Neural Networks, looks at what happens when you do architecture search over neural nets initialized with random weights to try and better understand how much work structure is doing in neural nets.

I believe David satisfies all of your desiderata.

  • Often disagrees with the consensus on various questions around what are promising AI research directions.
  • Consistently produces original deep learning research that makes me go "wow, I never would have thought of that."
  • Has caused me to update on which aspects of neural nets are important for performance.
  • Is definitely effective as a researcher (see above).
  • Writes much-clearer-than-average papers and also often uses visual aids and blog posts to explain his and others' work (this is for the last two together).

Comment by an1lam on An1lam's Short Form Feed · 2019-09-12T19:02:29.950Z · score: 9 (5 votes) · LW · GW

At this point, I basically agree that we agree and that the most useful follow up action is for someone (read: me) to actually be the change they want to see and write some (object-level), and ideally good, content from a more engineering-y bent.

As I mentioned in my reply to jimrandomh, a book review seems like a good place for me to start.

Comment by an1lam on jp's Shortform · 2019-09-12T02:17:07.767Z · score: 4 (3 votes) · LW · GW

I tried this today and it went well. Got through ~15 cards in only a few sets. It did cause me to take longer rests between sets (I can't seem to consistently use a timer) but I'm not that worried about long rests anyway.

Entering cards seems harder for me though. Most of my cards include some sort of LaTeX formatting, which I don't think the Android app supports applying.

Comment by an1lam on Jimrandomh's Shortform · 2019-09-12T02:13:38.445Z · score: 8 (5 votes) · LW · GW

I like this post!

Some evidence that security mindset generalizes across at least some domains: the same white hat people who are good at finding exploits in things like kernels seem to also be quite good at finding exploits in things like web apps, real-world companies, and hardware. I don't have a specific person to give as an example, but this observation comes from going to a CTF competition and talking to some of the people who ran it about the crazy stuff they'd done that spanned a wide array of different areas.

Another slightly different example: Wei Dai is someone who I actually knew about outside of Less Wrong from his early work on cryptocurrency stuff, so he was at least at one point involved in a security-heavy community (I'm of the opinion that early cryptocurrency folks were on average much better about security mindset than the average current cryptocurrency community member). Based on his posts and comments, he generally strikes me as having security-mindset-style thinking, and from my perspective he has contributed a lot of good stuff to AI alignment.

Theo de Raadt is notoriously... opinionated, so it would definitely be interesting to see him thrown on an AI team. That said, I suspect someone like Ralph Merkle, who's a bona fide cryptography wizard (he's one of the inventors of public key cryptography and invented Merkle trees!) and is heavily involved in the cryonics and nanotech communities, could fairly easily get up to speed on AI control work and contribute from a unique security/cryptography-oriented perspective. In particular, now that there seems to be more alignment/control work that involves at least exploring issues with concrete proposals, I think someone like this would have less trouble finding ways to contribute. That said, having cryptography experience in addition to security experience does seem helpful. Cryptography people are probably more used to combining their security mindset with their math intuition than your average white-hat hacker.

Comment by an1lam on Is competition good? · 2019-09-10T16:15:23.187Z · score: 6 (4 votes) · LW · GW

FYI: Robin Hanson has two recent posts (first, second) on a very similar topic.

Comment by an1lam on The 3 Books Technique for Learning a New Skilll · 2019-09-10T16:09:41.582Z · score: 8 (4 votes) · LW · GW

I think CLRS is a pretty questionable book for someone who hasn't programmed. I don't think it's great as a reference for writing algorithms, e.g. I think internet searching will often help you find better resources. And in terms of a straight read-through, it's one of the more theoretical algorithms texts, and a large fraction of its exercises are proofs.

If the OP is interested in an algorithms book but has never done any programming or CS, I'd recommend The Algorithm Design Manual (which I've read much of and done a decent number of exercises) or Jeff Erickson's free algorithms book (which I've read sections of and been impressed by).

Comment by an1lam on The 3 Books Technique for Learning a New Skilll · 2019-09-10T16:05:40.830Z · score: 1 (1 votes) · LW · GW

Whoops, will edit my comment to reflect that.

Comment by an1lam on The 3 Books Technique for Learning a New Skilll · 2019-09-10T02:48:08.448Z · score: 2 (2 votes) · LW · GW

For programming, I think starting with a project and using that to decide what books to read may work best. Assuming you want to learn to program rather than learn Computer Science, the books that will be helpful will depend highly on the area in which you're interested.

Do you just generally want to see if you'll be good at programming? Even if so, is there a specific area which you'd be interested in writing a program, e.g. an operating system, a server, a web app, etc.?

I agree with the comment below that SICP is a good "Why" book but did want to note that I personally didn't find SICP nearly as enlightening when I started programming as many others seem to. I've gone back to it since and loved it, but it definitely was not the thing that motivated me to practice programming a lot. Like everything else, it depends on your personality.

Comment by an1lam on An1lam's Short Form Feed · 2019-09-10T02:33:40.011Z · score: 3 (2 votes) · LW · GW

These are all good points.

After I saw that Benito did a transcript post, I considered doing one for one of Carmack's talks or a recent interview of Yann LeCun I found pretty interesting (based on the talks of his I've listened to, LeCun has a pretty engineering-y mindset even though he's nominally a scientist). Not going to happen immediately though since it requires a pretty big time investment.

Alternatively, maybe I'll review Masters of Doom, which is where I learned most of what I know about Carmack.

Comment by an1lam on Seven habits towards highly effective minds · 2019-09-06T15:39:55.620Z · score: 1 (1 votes) · LW · GW

I've recently been experimenting with being a Hammer. That is, applying ideas I come up with/things I learn aggressively to figure out their limits and occasionally discover an unexpected connection. I just started trying this recently, in particular with Linear Algebra, so I can't promise amazing results, but it does seem useful as a less magical way to generate more ideas.

Comment by an1lam on Seven habits towards highly effective minds · 2019-09-06T15:35:58.221Z · score: 4 (3 votes) · LW · GW

I would add "identify bottlenecks". I discuss this a bit here and it's also the topic of The Goal, the only business novel I've read. To summarize, in situations where completing a task requires taking a sequential series of steps, e.g. producing good thoughts, you're often rate-limited by the slowest step, so effort put into speeding up other faster steps is mostly wasted.

For example, I've actually seen something like the following play out in a software context. There's a team working on a project. Completing the project involves two high-level phases--coding and not coding.

Writing the code is parallelizable across team members and takes X hours per person. All the other stuff--actually running all the tests, deploying the code, etc.--takes Y hours per person, where Y > X. The team manager becomes frustrated that the project's taking too long and proposes adding more people to speed it up. However, even if adding more people cuts down coding time by the optimal amount, i.e. per person coding time goes from something (hand-wavy) like X to X·N/(N+k), where N is the original number of people involved and k is the number added, the actual time to completion of the project will still be bottlenecked by Y!
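To put rough numbers on this, here's a small sketch. It assumes the time to completion is just per-person coding time plus a fixed chunk of non-coding time, and that coding time scales inversely with headcount. The specific hours and team sizes are invented for illustration.

```python
def project_hours(coding_hours, other_hours, original_team, new_team):
    """Time to completion when only the coding phase parallelizes."""
    # Optimistic assumption: coding time shrinks in proportion to headcount.
    scaled_coding = coding_hours * original_team / new_team
    # The non-coding phase (tests, deploys, etc.) is unaffected by headcount.
    return scaled_coding + other_hours


# Hypothetical numbers: 10 coding hours, 40 non-coding hours, team of 4.
before = project_hours(coding_hours=10, other_hours=40, original_team=4, new_team=4)
after = project_hours(coding_hours=10, other_hours=40, original_team=4, new_team=8)
speedup = before / after
```

Doubling the team here only takes the total from 50 hours to 45, because the 40 non-coding hours are untouched.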

But what does this have to do with being a "highly effective mind"? I think there's a similar dynamic at play with the ideas to crystallized theories/principles/heuristics pipeline. If someone has a lot of ideas but takes a long time to crystallize them, they're better off practicing at crystallizing ideas than trying to have even more ideas. On the flip side, if they can crystallize ideas quickly but take a long time to come up with them, they could benefit from practice that emphasizes generating a lot of ideas quickly.

The above may seem obvious, but I think the useful part is using the frame of "identify bottlenecks" to figure out when different advice applies, even if the actual advice being applied is standard.

Comment by an1lam on Open & Welcome Thread - September 2019 · 2019-09-05T01:07:41.098Z · score: 6 (3 votes) · LW · GW

I don't mean this to sound confrontational, but what do you expect to do differently to enable yourself to go more quickly? I ask because my personal experience has been that just saying I'm going to go faster with self-learning doesn't work very well.

For example, do you plan to do fewer exercises, devote more time, etc.?

Comment by an1lam on The Transparent Society: A radical transformation that we should probably undergo · 2019-09-05T01:04:12.161Z · score: 4 (3 votes) · LW · GW

Strong upvoted and would add that we currently live in a world where surveillance is much more common than inverse surveillance, so proponents of a transparent society should, AFAICT, be much more focused on increasing inverse surveillance than surveillance at the moment.