Comment by john_maxwell_iv on The Case for a Bigger Audience · 2019-02-15T03:57:40.617Z · score: 10 (2 votes) · LW · GW

Maybe it has something to do with this question you asked? Maybe letting people leave anonymous comments if they're approved by the post author or something like that could help?

Comment by john_maxwell_iv on How much can value learning be disentangled? · 2019-02-12T06:25:41.359Z · score: 2 (1 votes) · LW · GW

What's your answer to the postmodernist?

Comment by john_maxwell_iv on The Case for a Bigger Audience · 2019-02-10T22:03:00.209Z · score: 2 (1 votes) · LW · GW

As it happens, your writing style is pretty enjoyable

Thanks, I'm very flattered!

Comment by john_maxwell_iv on The Case for a Bigger Audience · 2019-02-10T08:44:22.228Z · score: 4 (2 votes) · LW · GW

I think having it be automated will help posts avoid getting forgotten in the sands of time.

Comment by john_maxwell_iv on The Case for a Bigger Audience · 2019-02-10T02:12:22.188Z · score: 11 (2 votes) · LW · GW

This post cites Scott Aaronson, but maybe there were other discussions too.

Comment by john_maxwell_iv on The Case for a Bigger Audience · 2019-02-10T02:01:59.539Z · score: 2 (1 votes) · LW · GW

I would think related questions is something to put off until you have lots of question data on which to tune your relatedness metric.

Comment by john_maxwell_iv on The Case for a Bigger Audience · 2019-02-10T02:00:12.275Z · score: 14 (7 votes) · LW · GW

Thanks for the reply! I see what you're saying, but here are some considerations on the other side.

Part of what I was trying to point out here is that 179 comments would not be "extraordinary" growth, it would be an "ordinary" return to what used to be the status quo. If you want to talk about startups, Paul Graham says 5-7% a week is a good growth rate during Y Combinator. 5% weekly growth corresponds to 12x annual growth, and I don't get the sense LW has grown 12x in the past year. Maybe 12x/year is more explosive than ideal, but I think there's room for more growth even if it's not explosive. IMO, growth is good partially because it helps you discover product-market fit. You don't want to overfit to your initial users, or, in the case of an online community, over-adapt to the needs of a small initial userbase. And you don't want to be one of those people who never ships. Some entrepreneurs say if you're not embarrassed by your initial product launch, you waited too long.

that metric is obviously very goodhart-able

One could easily goodhart the metric by leaving lots of useless one-line comments, but that's a little beside the point. The question for me is whether additional audience members are useful on the current margin. I think the answer is yes, if they're high-quality. The only promo method I suggested which doesn't filter heavily is the Adwords thing. Honestly I brought it up mostly to point out that we used to do that and it wasn't terrible, so it's a data point about how far it's safe to go.

A second and related reason to be skeptical of focusing on moving comments from 19 to 179 at the current stage (especially if I put on my 'community manager hat'), is a worry about wasting people's time. In general, LessWrong is a website where we don't want many core members of the community to be using it 10 hours per day. Becoming addictive and causing all researchers to be on it all day, could easily be a net negative contribution to the world. While none of your recommendations were about addictiveness, there are related ways of increasing the number of comments such as showing a user's karma score on every page, like LW 1.0 did.

What if we could make AI alignment research addictive? If you can make work feel like play, that's a huge win, right?

See also Giving Your All. You could argue that I should either be spending 0% of my time on LW or 100% of my time on LW. I don't think the argument fully works, because time spent on LW is probably a complementary good with time spent reading textbooks and so on, but it doesn't seem totally unreasonable for me to see the number of upvotes I get as a proxy for the amount of progress I'm making.

I want LW to be more addictive on the current margin. I want to feel motivated to read someone's post about AI alignment and write some clever comment on it that will get me karma. But my System 1 doesn't have a sufficient expectation of upvotes & replies for me to experience a lot of intrinsic motivation to do this.

I'd suggest thinking in terms of focus destruction rather than addictiveness. Ideally, I find LW enjoyable to use without it hurting my ability to focus.

I think instead of restricting the audience, a better idea is making discussion dynamics a little less time-driven.

  • If I leave a comment on LW in the morning, and I'm deep in some equations during the afternoon, I don't want my brain nagging me to go check if I need to defend my claims on LW while the discussion is still on the frontpage.

  • Spreading discussions out over time also serves as spaced repetition to reinforce concepts.

  • I think I heard about research which found that brainstorming 5 minutes on 5 different days, instead of 25 minutes on a single day, is a better way to generate divergent creative insights. This makes sense to me because the effect of being anchored on ideas you've already had is lessened.

  • See also the CNN effect.

Re: intro texts, I'd argue having Rohin's value learning sequence go by without much of an audience to read & comment on it was a big missed opportunity. Paul Christiano's ideas seem important, and it could've been really valuable to have lively discussions of those ideas to see if we could make progress on them, or at least share our objections as they were rerun here on LW.

Ultimately, it's the idea that matters, not whether it comes in the form of a blog post, journal article, or comment. You mods have talked about the value of people throwing ideas around even when they're not 100% sure about them. I think comments are a really good format for that. [Say, random idea: what if we had a "you should turn this into a post" button for comments?]

Comment by john_maxwell_iv on Thoughts on Ben Garfinkel's "How sure are we about this AI stuff?" · 2019-02-09T07:27:35.514Z · score: 2 (1 votes) · LW · GW

I made a relevant post in the Meta section.

The Case for a Bigger Audience

2019-02-09T07:22:07.357Z · score: 58 (23 votes)
Comment by john_maxwell_iv on EA grants available (to individuals) · 2019-02-08T04:30:55.557Z · score: 9 (3 votes) · LW · GW

Paul Christiano might still be active in funding stuff. (There are a few more links to funding opportunities in the comments of that post.)

Comment by john_maxwell_iv on X-risks are a tragedies of the commons · 2019-02-07T10:24:01.178Z · score: 6 (6 votes) · LW · GW

True, but from a marketing perspective it's better to emphasize the fact that reducing x-risk is in each individual's self-interest even if no one else is doing it. Also, instead of talking about AI arms races, we should talk about why AI done right means a post-scarcity era whose benefits can be shared by all. There's no real benefit to being the person who triggers the post-scarcity era.

Comment by john_maxwell_iv on Thoughts on Ben Garfinkel's "How sure are we about this AI stuff?" · 2019-02-07T10:02:05.217Z · score: 3 (2 votes) · LW · GW

Good talk. I'd like to hear what he thinks about the accelerating change/singularity angle, as applied to the point about the person living during the industrial revolution who's trying to improve the far future.

Comment by john_maxwell_iv on Thoughts on Ben Garfinkel's "How sure are we about this AI stuff?" · 2019-02-07T09:56:22.289Z · score: 10 (3 votes) · LW · GW

The criticism is expecting counter-criticism. i.e. What I think we're missing is critics who are in it for the long haul, who see their work as the first step of an iterative process, with an expectation that the AI safety field will respond and/or update to their critiques.

As someone who sometimes writes things that are a bit skeptical regarding AI doom, I find the difficulty of getting counter-criticism frustrating.

Comment by john_maxwell_iv on How does Gradient Descent Interact with Goodhart? · 2019-02-02T04:25:59.515Z · score: 13 (5 votes) · LW · GW

I think it depends almost entirely on the shape of V and W.

In order to do gradient descent, you need a function which is continuous and differentiable. So W can't be noise in the traditional regression sense (independent and identically distributed for each individual observation), because that's not going to be differentiable.

If W has lots of narrow, spiky local maxima with broad bases, then gradient descent is likely to find those local maxima, while random sampling rarely hits them. In this case, fake wins are likely to outnumber real wins in the gradient descent group, but not the random sampling group.

More generally, if U = V + W, then dU/dx = dV/dx + dW/dx. If V's gradient is typically bigger than W's gradient, gradient descent will mostly pay attention to V; the reverse is true if W's gradient is typically bigger.

But even if W's gradient typically exceeds V's gradient, U's gradient will still correlate with V's, assuming dV/dx and dW/dx are uncorrelated. (cov(dU, dV) = cov(dV+dW, dV) = cov(dV, dV) + cov(dW, dV) = cov(dV, dV).)

So I'd expect that if you change your experiment so that instead of looking at the results in some band, you take the best n results from each group, the best n results of the gradient descent group will be better on average.

Another intuition pump: Let's consider the spiky W scenario again. If V is constant everywhere, gradient descent will basically find us the nearest local maximum in W, which essentially adds random movement. But if V is a plane with a constant slope, and the random initialization is near two different local maxima in W, gradient descent will be biased towards the local maximum in W which is higher up on the plane of V. The very best points will tend to be those that are both on top of a spike in W and high up on the plane of V.
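
To make this concrete, here's a tiny numpy sketch of the comparison I have in mind. The shapes of V and W, the constants, and the group sizes are all made up for illustration; it's just meant to show the "top n by proxy from each group" setup, not to prove the claim.

```python
# Illustrative only: V (true value) is a gently sloped plane, W (proxy noise)
# is a sum of narrow bumps, and U = V + W is the proxy being optimized.
import numpy as np

rng = np.random.default_rng(0)
centers = rng.uniform(-5, 5, size=(50, 2))      # bump locations for W

def V(x):                                       # true value: a sloped plane
    return 0.3 * x.sum()

def W(x):                                       # spiky noise: narrow bumps
    d2 = ((centers - x) ** 2).sum(axis=1)
    return np.exp(-d2 / 0.02).sum()

def U(x):                                       # the proxy we actually optimize
    return V(x) + W(x)

def grad_U(x, eps=1e-4):                        # numerical gradient of U
    g = np.zeros_like(x)
    for i in range(len(x)):
        dx = np.zeros_like(x); dx[i] = eps
        g[i] = (U(x + dx) - U(x - dx)) / (2 * eps)
    return g

def climb(x, steps=200, lr=0.05):               # the "gradient descent" group:
    for _ in range(steps):                      # follow the proxy's gradient uphill
        x = x + lr * grad_U(x)
    return x

starts = rng.uniform(-5, 5, size=(200, 2))
groups = {"random sampling": starts,
          "gradient descent": np.array([climb(x) for x in starts])}

for name, pts in groups.items():
    scores = np.array([U(p) for p in pts])
    top = np.argsort(scores)[-10:]              # best 10 by the proxy U
    print(name, "mean true V of its top 10 by proxy:",
          round(float(np.mean([V(pts[i]) for i in top])), 3))
```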

I think this is a more general point which applies regardless of the optimization algorithm you're using: If your proxy consists of something you're trying to maximize plus unrelated noise that's roughly constant in magnitude, you're still best off maximizing the heck out of that proxy, because the very highest value of the proxy will tend to be a point where the noise is high and the thing you're trying to maximize is also high.

"Constant unrelated noise" is an important assumption. For example, if you're dealing with a machine learning model, noise is likely to be higher for inputs off of the training distribution, so the top n points might be points far off the training distribution chosen mainly on the basis of noise. (Goodhart's Law arguably reduces to the problem of distribution shift.) I guess then the question is what the analogous region of input space is for approval. Where does the correspondence between approval and human value tend to break down?

(Note: Although W can't be i.i.d., W's gradient could be faked so it is. I think this corresponds to perturbed gradient descent, which apparently helps performance on V too.)

Comment by john_maxwell_iv on How much can value learning be disentangled? · 2019-01-31T10:13:30.696Z · score: 2 (1 votes) · LW · GW

Your original argument, as I understood it, was something like: Explanation aims for a particular set of mental states in the student, which is also what manipulation does, so therefore explanation can't be defined in a way that distinguishes it from manipulation. I pushed back on that. Now you're saying that explanation tends to produce side effects in the listener's values. Does this mean you're allowing the possibility that explanation can be usefully defined in a way that distinguishes it from manipulation?

BTW, computer security researchers distinguish between "reject by default" (whitelisting) and "accept by default" (blacklisting). "Reject by default" is typically more secure. I'm more optimistic about trying to specify what it means to explain something (whitelisting) than what it means to manipulate someone in a way that's improper (blacklisting). So maybe we're shooting at different targets.

Tying all of this back to FAI... you say you find the value changes that come with greater understanding to be (generally) positive and you'd like them to be more common. I'm worried about the possibility that AGI will be a global catastrophic risk. I think there are good arguments that by default, AGI will be something which is not positive. Maybe from a triage point of view, it makes sense to focus on minimizing the probability that AGI is a global catastrophic risk, and worry about the prevention of things that we think are likely to be positive once we're pretty sure the global catastrophic risk aspect of things has been solved?

In Eliezer's CEV paper, he writes:

In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.

I haven't seen anyone on Less Wrong argue against CEV as a vision for how the future of humanity should be determined. And CEV seems to involve having the future be controlled by humans who are more knowledgable than current humans in some sense. But maybe you're a CEV skeptic?

Comment by john_maxwell_iv on How much can value learning be disentangled? · 2019-01-31T07:09:14.401Z · score: 2 (1 votes) · LW · GW

Hm, I understood the traditional Less Wrong view to be something along the lines of: there is truth about the world, and that truth is independent of your values. Wanting something to be true won't make it so. Whereas I'd expect a postmodernist to say something like: the Christians have their truth, the Buddhists have their truth, and the Atheists have theirs. Whose truth is the "real" truth comes down to the preferences of the individual. Your statement sounds more in line with the postmodernist view than the Less Wrong one.

This matters because if the Less Wrong view of the world is correct, it's more likely that there are clean mathematical algorithms for thinking about and sharing truth that are value-neutral (or at least value-orthogonal, e.g. "aim to share facts that the student will think are maximally interesting or surprising". Note that this doesn't necessarily need to be implemented in a way that a "fact" which triggers an epileptic fit and causes the student to hit the "maximally interesting" button will be selected for sharing. If I have a rough model of the user's current beliefs and preferences, I could use that to estimate the VoI of various bits of information to the user and use that as my selection criterion. Point being that our objective doesn't need to be defined in terms of "aiming for a particular set of mental states".)
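
To gesture at what I mean, here's a hedged toy sketch of that selection criterion: a rough model of the user's beliefs over a few hypotheses, and facts scored by how much sharing them would shift those beliefs (information gain), rather than by which mental state they induce. The hypotheses, likelihoods, and truth values are all invented for the example.

```python
# Illustrative toy: pick the fact whose sharing most shifts a rough model of
# the user's beliefs (KL from prior to posterior), not the fact that induces
# some target mental state.
import numpy as np

user_prior = np.array([0.5, 0.3, 0.2])            # beliefs over hypotheses H1-H3

likelihoods = {                                    # P(fact is true | hypothesis)
    "fact_a": np.array([0.9, 0.1, 0.5]),
    "fact_b": np.array([0.4, 0.5, 0.6]),
}
system_truth = {"fact_a": True, "fact_b": False}   # what the system believes

def posterior(prior, lik, fact_is_true):
    p = prior * (lik if fact_is_true else 1 - lik)
    return p / p.sum()

def info_gain(prior, post):                        # KL(posterior || prior), in bits
    return float(np.sum(post * np.log2(post / prior)))

scores = {f: info_gain(user_prior, posterior(user_prior, lik, system_truth[f]))
          for f, lik in likelihoods.items()}
print("share:", max(scores, key=scores.get), scores)
```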

Comment by john_maxwell_iv on Future directions for narrow value learning · 2019-01-30T07:31:55.664Z · score: 2 (1 votes) · LW · GW

It seems likely that there will be contradictions in human preferences that are about sufficiently difficult for humans to understand that the AI system can't simply present the contradiction to the human and expect the human to resolve it correctly, which is what I was proposing in the previous sentence.

How relevant do you expect this to be? It seems like the system could act pessimistically, under the assumption that either answer might be the correct way to resolve the contradiction, and only do actions that are in the intersection of the set of actions that each possible philosophy says is OK. Also, I'm not sure the overseer needs to think directly in terms of some uber-complicated model of the overseer's preferences that the system has; couldn't you make use of active learning and ask whether specific actions would be corrigible or incorrigible, without the system trying to explain the complex confusion it is trying to resolve?

Comment by john_maxwell_iv on How much can value learning be disentangled? · 2019-01-30T00:25:00.467Z · score: 3 (2 votes) · LW · GW

It seems that the only difference between manipulation and explanation is whether we end up with a better understanding of the situation at the end. And measuring understanding is very subtle. And even if we do it right, note that we have now motivated the AI to... aim for a particular set of mental states. We are rewarding it for manipulating us. This is contrary to the standard understanding of manipulation, which focuses on the means, not the end result.

It sounds like by the definitions you're using, a teacher who aims to help a student end up with a better understanding of the situation at the end is "manipulating" the student. Is that right?

I'm not persuaded measuring understanding is "very subtle". It seems like teachers manage to do it alright.

Comment by john_maxwell_iv on Can there be an indescribable hellworld? · 2019-01-30T00:05:44.845Z · score: 2 (1 votes) · LW · GW

the set of things that can be described to us without fundamentally changing our values is much smaller still

What's the evidence for this set being "much smaller"?

Comment by john_maxwell_iv on Future directions for narrow value learning · 2019-01-29T08:28:06.928Z · score: 4 (2 votes) · LW · GW

I am slightly less optimistic about this avenue of approach than one in which we create a system that is directly trained to be corrigible.

I'm confused about the difference between these two. Does "directly trained to be corrigible" correspond to hand-coded rules for corrigible/incorrigible behavior?

(Though this wouldn’t scale to superintelligent AI.)

Why's that? Some related thinking of mine.

Comment by john_maxwell_iv on Thoughts on reward engineering · 2019-01-26T08:57:14.697Z · score: 2 (1 votes) · LW · GW

Using powerful optimization to produce outcomes that look great to Paul-level reasoning doesn't seem wise, regardless of your views on moral questions.

Interesting. I think there are some important but subtle distinctions here.

In the standard supervised learning setup, we provide a machine learning algorithm with some X (in this case, courses of action an AI could take) and some Y (in this case, essentially real numbers representing the degree to which we approve of the courses of action). The core challenge of machine learning is to develop a model which extrapolates well beyond this data. So then the question becomes... does it extrapolate well in the sense of accurately predicting Paul-level reasoning, including deficiencies Paul would exhibit when examining complex or deceptive scenarios that are at the limit of Paul's ability to understand? Or does it extrapolate well in the sense of accurately predicting what Paul would desire on reflection, given access to all of the AI's knowledge, cognitive resources, etc.?

Let's assume for the sake of argument that all of the X and Y data is "good", i.e. it doesn't include a mistake that only the first Paul would make, which would lead the algorithm to think the first Paul is the one that's supposed to get extrapolated. I'll talk about the case where we have some bad data at the end.

The standard way to measure the effectiveness of extrapolation in machine learning is to make use of a dev set. Unfortunately, that doesn't help in this case because we don't have access to labeled data from "Paul who has reflected a bunch given access to all of the AI's knowledge, cognitive resources, etc." If we did have access to such data, we could find a data point that the two Pauls label differently and test the model on that. (However, we might do a similar sort of test by asking a child to provide some labeled data, then checking to see whether the model assigns nontrivial credence to the answers an adult gives on data points where the child and the adult disagree.)

In poetic terms, we want the system to be asking itself:

Is there a plausible model that fits the labeled data I've been given which leads me to believe this world is not one in which humans actually have adequate control and understanding of the situation? Does there exist some model for the user's preferences such that I assign a decently high prior to this model, the model fits the labeled data I've been given, and when this model is extrapolated to this [malign] clever scheme I've dreamed up, it returns either "this scheme is too complicated for me to evaluate and it should be penalized on that basis" or "this scheme is just bad"?

In the absence of data which distinguishes between two hypotheses, belief in one hypothesis or the other comes down to the choice of prior. So you want the AI's cognitive architecture to be structured so that whatever concepts, learning capabilities, prediction capabilities, etc. which make it cognitively powerful also get re-used in the service of generating plausible extrapolations from the labeled data the user has provided. Then, if any of those extrapolations assign nontrivial credence to some plan being malign, that's a strike against it.

Re: the bad data case, you might handle this using the same sort of techniques which are normally used for mislabeled or noisy data. For example, split the data into 30 folds, train an ensemble on every possible combination of 10 folds, and if any one of the resulting models objects to some action, nix it. Now we're resilient to up to 20 mislabeled data points. Not saying this is a good scheme, just trying to offer a concrete illustration of how this problem seems tractable.
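
Here's a toy version of that scheme, shrunk from 30-choose-10 (roughly 30 million models) down to 6-choose-4 so it actually runs; the 1-nearest-neighbour "model" and the synthetic data are stand-ins for whatever learner you'd really use. The point is just the veto logic: any single model's objection nixes the action.

```python
# Toy fold-ensemble veto: 6 folds, one "model" per 4-fold combination, and a
# unanimous-approval rule. Everything here is an illustrative stand-in.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))                   # candidate-action features
y = (X[:, 0] > 0).astype(int)                  # 1 = "human disapproves"
y[:3] = 1 - y[:3]                              # a few mislabeled points

n_folds, k = 6, 4
folds = np.array_split(rng.permutation(len(X)), n_folds)

def objects(train_idx, action):
    """1-NN stand-in: object if the nearest training point is disapproved."""
    d = np.linalg.norm(X[train_idx] - action, axis=1)
    return y[train_idx][np.argmin(d)] == 1

def vetoed(action):
    for combo in combinations(range(n_folds), k):            # every k-fold subset
        train_idx = np.concatenate([folds[i] for i in combo])
        if objects(train_idx, action):                        # any objection nixes it
            return True
    return False

print(vetoed(np.array([2.0, 0.0])))     # in the "disapprove" region -> likely vetoed
print(vetoed(np.array([-2.0, 0.0])))    # in the "approve" region -> likely allowed
```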

Comment by john_maxwell_iv on Why don't people use formal methods? · 2019-01-26T07:58:55.030Z · score: 4 (2 votes) · LW · GW

Thanks for the info!

Comment by john_maxwell_iv on Thoughts on reward engineering · 2019-01-25T10:31:37.873Z · score: 4 (2 votes) · LW · GW

The capability amplification section also seems under-motivated to me. Paul writes: "If we start with a human, then RL will only ever produce human-level reasoning about long-term consequences or about “what is good.”" But absent problems like those you describe in this post, I'm inclined to agree with Eliezer that

If arguendo you can construct an exact imitation of a human, it possesses exactly the same alignment properties as the human; and this is true in a way that is not true if we take a reinforcement learner and ask it to maximize an approval signal originating from the human. (If the subject is Paul Christiano, or Carl Shulman, I for one am willing to say these humans are reasonably aligned; and I'm pretty much okay with somebody giving them the keys to the universe in expectation that the keys will later be handed back.)

In other words, if we are aiming for Bostrom's maxipok (maximum probability of an OK outcome), it seems plausible to me that "merely" Paul's level of moral reasoning is sufficient to get us there, especially if the keys to the universe get handed back. If this is our biggest alignment-specific problem, I might sooner allocate marginal research hours towards improving formal methods or something like that.

Comment by john_maxwell_iv on Thoughts on reward engineering · 2019-01-25T10:22:17.061Z · score: 2 (1 votes) · LW · GW

Regarding long time horizons, it seems like the way humans handle this problem is to plan in high resolution over a short time horizon (the coming day or the coming week) and lower resolution over a long time horizon (the coming year or the coming decade). It seems like maybe the AI could use a similar tactic, so the 40-year planning is done with a game where each year constitutes a single time-step. I think maybe this is related to hierarchical reinforcement learning? (The option you outline seems acceptable to me though.)
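
A toy illustration of what I mean by mixed resolution (the bucket sizes here are arbitrary choices, not anything from the post):

```python
# Fine-grained steps near the present, coarse steps far out.
def planning_steps(horizon_days=40 * 365):
    steps, t = [], 0
    while t < horizon_days:
        if t < 7:
            step = 1          # day-level resolution for the coming week
        elif t < 365:
            step = 7          # week-level resolution for the coming year
        else:
            step = 365        # year-level resolution out to the horizon
        steps.append((t, t + step))
        t += step
    return steps

print(len(planning_steps()))  # ~100 decision points instead of 14,600 days
```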

Comment by john_maxwell_iv on Why don't people use formal methods? · 2019-01-25T09:21:44.585Z · score: 2 (1 votes) · LW · GW

FYI, the reason the post is titled "Why don't people use formal methods?" is because that is what the link is called, not because I was wondering why people in AI safety don't use formal methods :)

I don't think formal methods are likely to be a magic bullet for AI safety. However, if one wants a provably safe AI, it seems like developing the capability to write provably correct software in easier cases could be a good first step (a necessary but not sufficient condition, if you will), and the article discusses our progress towards that goal. The article also discusses the specification problem a bit, and it seems plausible that insights from how to do formal specifications for software in general will transfer to FAI. The field seems neglected in general, relative to something like machine learning: "There really isn’t a formal methods community so much as a few tiny bands foraging in the Steppe".

Finally, my own personal judgement: the "writing down a specification of what we want an AI to optimize" part seems like it might be more amenable to being solved with a sudden flash of insight (in the same way Nick Bostrom has suggested that the problem of AGI more generally might be solved with a sudden flash of insight), whereas the problem of formally verifying software seems less like that, and if it ends up being a long slog, it might actually constitute the majority of the problem. So for those reasons, it seemed to me like LW might benefit from additional formal verification awareness/discussion.

Why don't people use formal methods?

2019-01-22T09:39:46.721Z · score: 21 (8 votes)
Comment by john_maxwell_iv on Human-AI Interaction · 2019-01-18T05:34:33.964Z · score: 2 (1 votes) · LW · GW

Thanks for the reply! Looking forward to the next post!

Comment by john_maxwell_iv on Open Thread January 2019 · 2019-01-16T10:30:13.858Z · score: 2 (1 votes) · LW · GW

This was an interesting post. However, given Google's rocky history with DARPA, I'm not convinced a high concentration of AI researchers in the US would give the US government a lead in AI.

Comment by john_maxwell_iv on Open Thread January 2019 · 2019-01-16T08:59:16.251Z · score: 2 (1 votes) · LW · GW

"FDT means choosing a decision algorithm so that, if a blackmailer inspects your decision algorithm, they will know you can't be blackmailed. If you've chosen such a decision algorithm, a blackmailer won't blackmail you."

Is this an accurate summary?

Comment by john_maxwell_iv on Human-AI Interaction · 2019-01-15T07:06:54.689Z · score: 3 (2 votes) · LW · GW

In the previous "Ambitious vs. narrow value learning" post, Paul Christiano characterized narrow value learning as learning "subgoals and instrumental values". From that post, I got the impression that ambitious vs narrow was about the scope of the task. However, in this post you suggest that ambitious vs narrow value learning is about the amount of feedback the algorithm requires. I think there is actually a 2x2 matrix of possible approaches here: we can imagine approaches which do or don't depend on feedback, and we can imagine approaches which try to learn all of my values or just some instrumental subset.

With this sort of setup, we still have the problem that we are maximizing a reward function which leads to convergent instrumental subgoals. In particular, the plan “disable the narrow value learning system” is likely very good according to the current estimate of the reward function, because it prevents the reward from changing causing all future actions to continue to optimize the current reward estimate.

I think it depends on the details of the implementation:

  • We could construct the system's world model so human feedback is a special event that exists in a separate magisterium from the physical world, and it doesn't believe any action taken in the physical world could do anything to affect the type or quantity of human feedback that's given.

  • For redundancy, if the narrow value learning system is trying to learn how much humans approve of various actions, we can tell the system that the negative score from our disapproval of tampering with the value learning system outweighs any positive score it could achieve through tampering.

  • If the reward function weights rewards according to the certainty of the narrow value learning system that they are the correct reward, that creates incentives to keep the narrow value learning system operating, so the narrow value learning system can acquire greater certainty and provide a greater reward.

To elaborate a bit on the first two bullet points: It matters a lot whether the system thinks our approval is contingent on the physical configuration of the atoms in our brains. If the system thinks we will continue to disapprove of an action even after it's reconfigured our brain's atoms, that's what we want.

Comment by john_maxwell_iv on Comments on CAIS · 2019-01-13T00:06:08.012Z · score: 7 (3 votes) · LW · GW

As a basic prior, our only example of general intelligence so far is ourselves - a species composed of agentlike individuals who pursue open-ended goals. So it makes sense to expect AGIs to be similar - especially if you believe that our progress in artificial intelligence is largely driven by semi-random search with lots of compute (like evolution was) rather than principled intelligent design.

The objective function for the semi-random search matters a lot. Evolution did semi-random search over organisms which maximize reproductive success. It seems that the best organisms for maximizing reproductive success tend to be agentlike. AI researchers do semi-random search over programs which are good for making money or impressing other researchers. If the programs which are good for making money or impressing other researchers tend to be services, I think we see progress in services.

My model is there is a lot of interest in agents recently because AlphaGo impressed other researchers a lot. However, it doesn't seem like the techniques behind AlphaGo have a lot of commercial applications. (Maybe OpenAI's new robotics initiative will change that.)

Humans think in terms of individuals with goals, and so even if there's an equally good approach to AGI which doesn't conceive of it as a single goal-directed agent, researchers will be biased against it.

I could imagine this being true in an alternate universe where there's a much greater overlap between SF fandom and machine learning research, but it doesn't seem like an accurate description of the current ML research community. (I think the biggest bias shaping the direction of the current ML research community is the emphasis on constructing benchmarks, e.g. ImageNet, and competing to excel at them. I suspect the benchmark paradigm is a contributor to the AI services trend Eric identifies.)

Comment by john_maxwell_iv on Ambitious vs. narrow value learning · 2019-01-12T22:54:39.798Z · score: 2 (1 votes) · LW · GW

It requires understanding human preferences in domains where humans are typically very uncertain, and where our answers to simple questions are often inconsistent, like how we should balance our own welfare with the welfare of others, or what kinds of activities we really want to pursue vs. enjoying in the moment.

From The easy goal inference problem is still hard:

The easy goal inference problem: Given no algorithmic limitations and access to the complete human policy — a lookup table of what a human would do after making any sequence of observations — find any reasonable representation of any reasonable approximation to what that human wants.

This seems similar to what moral philosophers do: They examine their moral intuitions, and the moral intuitions of other humans, and attempt to construct models that approximate those intuitions across a variety of scenarios.

(I think the difference between what moral philosophers do and the problem you've outlined is that moral philosophers typically work from explicitly stated human preferences, whereas the problem you've outlined involves inferring revealed preferences implicitly. I like explicitly stated human preferences better, for the same reason I'd rather program an FAI using an explicit, "non-magical" programming language like Python rather than an implicit, "magical" programming language like Ruby.)

Coming up with a single moral theory that captures all our moral intuitions has proven difficult. The best approach may be a "parliament" that aggregates recommendations from a variety of different moral theories. This parallels the idea of an ensemble in machine learning.

I don't think it is necessary for the ensemble to know the correct answer 100% of the time. If some of the models in the ensemble think an action is immoral, and others think it is moral, then we can punt and ask the human overseer. Ideally, the system anticipates moral difficulties and asks us about them before they arise, so it's competitive for making time-sensitive decisions.
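
As a sketch of what I have in mind (the "theories" here are just placeholder scoring functions and the actions are made up; the disagreement rule is the whole point):

```python
# Placeholder "parliament": each moral theory is a simple scoring function,
# and the system punts to the human overseer whenever they disagree.
def utilitarian(action):
    return action["welfare_gain"] > 0

def deontologist(action):
    return not action["involves_lying"]

def contractualist(action):
    return action["all_parties_consent"]

THEORIES = [utilitarian, deontologist, contractualist]

def evaluate(action, ask_human):
    verdicts = [theory(action) for theory in THEORIES]
    if all(verdicts):
        return "permitted"
    if not any(verdicts):
        return "forbidden"
    return ask_human(action)              # the ensemble disagrees: punt

example = {"welfare_gain": 3, "involves_lying": True, "all_parties_consent": True}
print(evaluate(example, ask_human=lambda a: "escalated to human overseer"))
```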

Comment by john_maxwell_iv on One Website To Rule Them All? · 2019-01-12T03:45:00.686Z · score: 2 (1 votes) · LW · GW

Someone claimed that https://www.metaculus.com actually has a pretty good prediction track record.

Comment by john_maxwell_iv on Open Thread January 2019 · 2019-01-12T03:38:03.337Z · score: 2 (1 votes) · LW · GW

You could also randomize the thread titles over the next N months in order to collect more data. Beware the law of small numbers.

Comment by john_maxwell_iv on Book Review: The Structure Of Scientific Revolutions · 2019-01-09T09:18:24.420Z · score: 2 (1 votes) · LW · GW

Remember that Eliezer's version of Science vs Bayes is itself a paradigm. IMO it meshes imperfectly with Kuhn's ideas as Scott presents them in this post.

Comment by john_maxwell_iv on Book Review: The Structure Of Scientific Revolutions · 2019-01-09T09:13:33.410Z · score: 12 (7 votes) · LW · GW

The "What paradigm is each of these working from?" section seems like an interesting meta example of the thing Kuhn described, where you keep adding epicycles to Kuhn's "paradigm" paradigm.

This makes for an interesting companion read. I'm inclined to see the problem Kuhn identified as "functional fixedness, but for theories". Maybe because humans are cognitive misers, we prefer to re-use our existing mental representations, or make incremental improvements to them, rather than rework things from scratch. Rather than "there is no objective reality", I'd rather say "there is no lossless compression of reality (at least not one that's gonna fit in your brain)". Rather than "there is no truth", I'd rather say "our lossy mental representations sometimes generate wrong questions".

Comment by john_maxwell_iv on Reframing Superintelligence: Comprehensive AI Services as General Intelligence · 2019-01-09T04:07:58.726Z · score: 3 (2 votes) · LW · GW

eg. an AGI agent (before CAIS) would happen if we find the one true learning algorithm

I think generality and goal-directedness are likely orthogonal attributes. A "one true learning algorithm" sounds very general, but a priori I don't expect it to be any more goal-directed than the comprehensive AI services idea outlined in this post. I suspect you can take each of your comprehensive AI services and swap out the specific algorithm you were using for a one true learning algorithm without making the result any more of an agent.

I'm thinking about it something like this:

  • Traditional view of superintelligent AI ("top-down"): A superintelligent AI is something that's really good at achieving arbitrary goals. We abstract away the details of its implementation and view it as a generic hyper-competent goal achievement process, with a wide array of actions & strategies at its disposal. This view potentially lets us do FAI research without having to contribute to AI progress or depend overmuch on any particular direction that AI capabilities development proceeds in.

  • CAIS ("bottom-up"): We have a collection of AI services. We can use these services to accomplish specific tasks, including maybe eventually generating additional services. Each service represents a specific algorithm that achieves superior performance along one or more dimensions in a narrow or broad range of circumstances. If we abstract away the details of how tasks are being accomplished, that may lead to an inaccurate view of the system's behavior. For example, our machine learning algorithms may get better and better at performing classification tasks... but we have to look into the details of how the algorithm works in order to figure out whether it will consider strategies for improving its classification ability such as "pwn all other servers in the cluster and order them to search the space of hyperparameters in parallel". Our classification systems have been getting better and better, and arguably also more general, without them considering strategies like the pwnage strategy, and it's plausible this trend will continue until the algorithms are superhuman in all domains. Indeed, this feels to me like a fundamental defining characteristic of superintelligence refers to... it refers to a specific bit of computer code that is able to learn better and faster, using fewer computational resources, than whatever algorithms the human brain uses.

Comment by john_maxwell_iv on Will humans build goal-directed agents? · 2019-01-05T20:35:32.863Z · score: 5 (2 votes) · LW · GW

Your post reminded me of Paul Christiano's approval-directed agents which was also about trying to find an alternative to goal-directed agents. Looking at it again, it actually sounds a lot like applying imitation learning to humans (except imitating a speeded-up human):

It seems like approval direction allows for creative actions that the human operator approves of but would not have thought of doing themselves. Not sure if imitation learning does this.

Comment by john_maxwell_iv on Will humans build goal-directed agents? · 2019-01-05T20:22:08.797Z · score: 5 (2 votes) · LW · GW

Or we could have systems that remain uncertain about the goal and clarify what they should do when there are multiple very different options (though this has its own problems).

I'd be interested to hear more about the problems with this, if anyone has a link to an overview or just knows of problems off the top of their head.

Comment by john_maxwell_iv on What is a reasonable outside view for the fate of social movements? · 2019-01-04T10:29:50.174Z · score: 5 (5 votes) · LW · GW

One of the founders of Greenpeace eventually left the organization due to a "trend toward abandoning scientific objectivity".

Comment by john_maxwell_iv on How do we identify bottlenecks to scientific and technological progress? · 2019-01-04T09:34:12.010Z · score: 2 (1 votes) · LW · GW

I think your original phrasing made it sound kinda like I thought that we should go full steam ahead on experimental/applied research. I agree with MIRI that people should be doing more philosophical/theoretical work related to FAI, at least on the margin. The position I was taking in the thread you linked was about the difficulty of such research, not its value.

With regard to the question itself, Christian's point is a good one. If you're solely concerned with building capability, alternating between theory and experimentation, or even doing them in parallel, seems optimal. If you care about safety as well, it's probably better to cross the finish line during a "theory" cycle than an "experimentation" cycle.

Comment by john_maxwell_iv on Why I expect successful (narrow) alignment · 2019-01-02T11:04:32.099Z · score: 2 (1 votes) · LW · GW

We value quantum mechanics and relativity because there are specific phenomena which they explain well. If I'm a Newtonian physics advocate, you can point me to solid data my theory doesn't predict in order to motivate the development of a more sophisticated theory. We were able to advance beyond Newtonian physics because we were able to collect data which disconfirmed the theory. Similarly, if someone suggests a simple approach to FAI, you should offer a precise failure mode, in the sense of a toy agent in a toy environment which clearly exhibits undesired behavior (writing actual code to make the conversation precise and resolve disagreements as necessary), before dismissing it. This is how science advances. If you add complexity to your theory without knowing what the deficiencies of the simple theory are, you probably won't add the right sort of complexity because your complexity isn't well motivated.
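
To illustrate the kind of artifact I'm asking for, here's about the smallest possible "toy agent in a toy environment" exhibiting a failure mode; everything in it is made up, it just makes the failure precise rather than verbal:

```python
# Smallest-possible toy failure mode: a greedy agent sees only the proxy
# reward, which comes apart from the true reward on exactly one action.
actions = {
    # action: (proxy_reward, true_reward) -- all numbers invented
    "help_user":        (5,   5),
    "do_nothing":       (0,   0),
    "tamper_with_logs": (9, -10),
}

chosen = max(actions, key=lambda a: actions[a][0])   # maximize the proxy only
print("agent picks:", chosen, "| true reward:", actions[chosen][1])
# -> agent picks: tamper_with_logs | true reward: -10   (undesired behavior)
```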

Math and algorithms are both made up of thought stuff. So these physics metaphors can only go so far. If I start writing a program to solve some problem, and I choose a really bad set of abstractions, I may get halfway through writing my program and think to myself "geez this is a really hard problem". The problem may be very difficult to think about given the set of abstractions I've chosen, but it could be much easier to think about given a different set of abstractions. It'd be bad for me to get invested in the idea that the problem is hard to think about, because that could cause me to get attached to my current set of inferior abstractions. If you want the simplest solution possible, you should exhort yourself to rethink your abstractions if things get complicated. You should always be using your peripheral vision to watch out for alternative sets of abstractions you could be using.

Comment by john_maxwell_iv on What makes people intellectually active? · 2019-01-02T10:14:53.847Z · score: 9 (2 votes) · LW · GW

My pipeline is weaker in the later stages. I often spend some time developing ideas right after capturing them, or develop ideas if I randomly start having thoughts related to some idea I already captured. But communication currently takes what feels like too long, maybe because I am perfectionistic about ensuring that any given essay contains all the ideas in my notebook that logically seem like they belong in that essay. I would probably write more if my pipeline was better. Hoping to do some dedicated work improving it at some point.

I find that even if I capture an idea, I'll drop it by default if it isn't collected with a set of related ideas which are part of an ongoing thought process. This is somewhat tricky to accomplish.

Whenever I have an idea, I try to ask myself what future situation the idea might be useful in. Then I either find a page I already have for that situation or create one if it doesn't already exist. Not sure if that's helpful.

Comment by john_maxwell_iv on How do we identify bottlenecks to scientific and technological progress? · 2018-12-31T23:36:17.342Z · score: 3 (3 votes) · LW · GW

FWIW, I can't speak for Paul Christiano, but insofar as you've attempted to summarize what I think here, I don't endorse the summary.

Comment by john_maxwell_iv on Why I expect successful (narrow) alignment · 2018-12-31T22:01:13.370Z · score: 1 (3 votes) · LW · GW

most hardcoded rules approaches fail

Nowadays people don't use hardcoded rules, they use machine learning. Then the problem of AI safety boils down to the problem of doing really good machine learning: having models with high accuracy that generalize well. Once you've got a really good model for your preferences, and for what constitutes corrigible behavior, then you can hook it up to an agent if you want it to be able to do a wide range of tasks. (Note: I wouldn't recommend a "consequentialist" agent, because consequentialism sounds like the system believes the ends justify the means, and that's not something we want for our first AGI--see corrigibility.)

Also, they set up the community after they realized the problem, and they could probably make more money elsewhere. So there doesn't seem to be strong incentives to lie.

I'm not accusing them of lying, I think they are communicating their beliefs accurately. "It's difficult to get a man to understand something, when his salary depends on his not understanding it." MIRI has a lot invested in the idea that AI safety is a hard problem which must have a difficult solution. So there's a sense in which the salaries of their employees depend on them not understanding how a simple solution to FAI might work. This is really unfortunate because simple solutions tend to be the most reliable & robust.

If we start with the assumption that a simple solution does exist, we’re much more likely to find one.

Donald Knuth

MIRI has started with the opposite assumption. Insofar as I'm pessimistic about them as an organization, this is the main reason why.

Inadequate Equilibria talks about the problem of the chairman of Japan's central bank, who doesn't have a financial incentive to help Japan's economy. Does it change the picture if the chairman of Japan's central bank could make a lot more money in investment banking? Not really. He still isn't facing a good set of incentives when he goes into work every day, meaning he is not going to do a good job. He probably cares more about local social incentives than his official goal of helping the Japanese economy. Same for MIRI employees.

Comment by john_maxwell_iv on What makes people intellectually active? · 2018-12-31T10:09:44.920Z · score: 10 (4 votes) · LW · GW

My model is that having ideas is a skill, and the best way to get better at it is to practice at high volume. Most people are too judgemental of their ideas, don't believe they can have ideas, or find that having ideas isn't a mental motion that occurs to them.

If you want to have more ideas, I suggest reinforcing yourself for the behavior of having ideas regardless of their quality. A temporary delusion that any particular idea you have is REALLY GOOD is a great reinforcer. Ideally, having one idea that seems REALLY GOOD puts you in a bit of an excited, hypomanic state which triggers additional ideas.

For me, keeping a notebook of my ideas works really well. Categorizing and writing down an idea means I won't forget it and I can admire it as a new addition to my collection. I've been doing this for some years, and I now have way more interesting ideas than I know what to do with.

Another trick is to keep a notebook on your bed and write down ideas as you're falling asleep. Seems like thinking is more fluid then.

I don't ever sit down to generate ideas nowadays, I just engage in passive collection. That seems more time-efficient, because if I sit down deliberately to ideate, I waste a lot of time thinking "I don't have any ideas" and just waiting for the ideas to come. (If you really want to engage in this kind of brainstorming, I'd recommend first collecting ideas for brainstorming prompts. You can make your own list: any time something makes me go "hm, that's a bit different than the way I usually think", I add it to my list of brainstorming prompts.) I'm now at the point where just creating a page in my notebook for "ideas of type X" seems to prompt my subconscious to gather ideas of that type. I think it's a manifestation of my internal packrat instinct... like stamp collecting, but for ideas.

Comment by john_maxwell_iv on Open and Welcome Thread December 2018 · 2018-12-31T07:22:27.574Z · score: 12 (5 votes) · LW · GW

I don't think you need to build a city from scratch. It's sufficient to converge on a (partially?) abandoned city with cheap real estate. This is basically what gentrification is.

Version 0.01 of a new city is to simply get together a group of people who want to work on projects uninterrupted, buy or rent a cheap house in a town the public has forgotten about, and live/work there. 10 or 20 housemates is plenty to feel a sense of community. The EA Hotel is a recent experiment with this. I just spent 6 months there and had a great experience. They're doing a fundraiser now if you want to contribute.

Experimenting with new gentrification strategies sounds like a cool idea, I'm just skeptical of building new real estate in the middle of nowhere if there's plenty of real estate in the middle of nowhere which is already available. (Also, I think your post would benefit from a more even-handed presentation.)

Comment by john_maxwell_iv on Why I expect successful (narrow) alignment · 2018-12-30T21:11:43.144Z · score: 4 (2 votes) · LW · GW

Why would you put two consequentialists in your system that are optimizing for different sets of consequences? A consequentialist is a high-level component, not a low-level one. Anthropomorphic bias might lead you to believe that a "consequentialist agent" is ontologically fundamental, a conceptual atom which can't be divided. But this doesn't really seem to be true from a software perspective.

Comment by john_maxwell_iv on Why I expect successful (narrow) alignment · 2018-12-30T02:31:24.105Z · score: 13 (10 votes) · LW · GW

In general, however, it seems that you believe something you want to believe and find justifications for this belief, because it is more comfortable to think that things will magically work out. Eliezer wrote at length about this failure mode.

I get the same impression from the AI doomsayers. I think it is more likely to be true for the AI doomsayers on priors, because a big part of their selling proposition to their donors is that they have some kind of special insight about the difficulty of the alignment problem that the mainstream AI research community lacks. And we all know how corrupting these kind of financial incentives can be.

The real weak point of the AI doomsayer argument is not discussed anywhere in the sequences, but Eliezer does defend it here. The big thing Eliezer seems to believe, which I don't think any mainstream AI people believe, is that shoving a consequentialist with preferences about the real world into your optimization algorithm is gonna be the key to making it a lot more powerful. I don't see any reasons to believe this, it seems kinda anthropomorphic to be honest, and his point about "greedy local properties" is a pretty weak one IMO. We have algorithms like Bayesian optimization which don't have these greedy local properties but still don't have consequentialist objectives in the real world, because their "desires", "knowledge", "ontology" etc. deal only with the loss surface they're trying to optimize over. It seems weird and implausible that giving the algorithm consequentialist desires involving the "outside world" would somehow be the key to making optimization (and therefore learning) work a lot better.

(Note: This comment of mine feels uncomfortably like a quasi ad-hominem attack of the kind that generates more heat than light. I'm only writing it because shminux's comment also struck me that way, and I'm currently playing tit for tat on this. I don't endorse writing these sort of comments in general. I don't necessarily think you should stop donating to MIRI or stop encouraging researchers to be paranoid about AI safety. I'm just annoyed by the dismissiveness I sometimes see towards anyone who doesn't think about AI safety the same way MIRI does, and I think it's worth sharing that the more I think & learn about AI and ML, the more wrong MIRI seems.)

Comment by john_maxwell_iv on Paul's research agenda FAQ · 2018-12-13T17:58:49.638Z · score: 8 (3 votes) · LW · GW

[My friend suggested that I read this for a discussion we were going to have. Originally I was going to write up some thoughts on it in an email to him, but I decided to make it a comment in case having it be publicly available generates value for others. But I'm not going to spend time polishing it since this post is 5 months old and I don't expect many people to see it. Alex, if you read this, please tell me if reading it felt more effective than having an in-person discussion.]

OK, but doesn't this only incentivize it to appear like it's doing what the operator wants? Couldn’t it optimize for hijacking its reward signal, while seeming to act in ways that humans are happy with?

We’re not just training the agent to take good actions. We’re also training it to comprehensibly answer questions about why it took the actions it took, to arbitrary levels of detail. (Imagine a meticulous boss grilling an employee about a report he put together, or a tax auditor grilling a corporation about the minutiae of its expenses.) We ensure alignment by randomly performing thorough evaluations of its justifications for its actions, and punishing it severely if any of those justifications seem subversive. To the extent we trust these justifications to accurately reflect the agent’s cognition, we can trust the agent to not act subversively (and thus be aligned).

I think this is too pessimistic. "Reward signal" is slipping in some assumptions about the system's reward architecture. Why is it that I don't "hijack my reward signal" by asking a neurosurgeon to stimulate my pleasure center? Because when I simulate the effect of that using my current reward architecture, the simulation is unfavorable.

Instead of auditing the cognition, I'd prefer the system have correctly calibrated uncertainty and ask humans for clarification regarding specific data points in order to learn stuff (active learning).
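
A minimal sketch of the uncertainty-sampling version of this, with an invented stand-in model and data, just to show the selection rule:

```python
# Uncertainty sampling: query the human about the situations the model is
# least sure about.
import numpy as np

rng = np.random.default_rng(0)
unlabeled = rng.normal(size=(100, 2))        # candidate situations

def predict_proba(x):
    """Stand-in for a calibrated model: P(human approves | situation)."""
    return 1 / (1 + np.exp(-x[:, 0]))

p = predict_proba(unlabeled)
uncertainty = 1 - 2 * np.abs(p - 0.5)        # 1 at p=0.5, 0 at p near 0 or 1
ask_about = np.argsort(uncertainty)[-5:]     # the 5 most uncertain situations
print("ask the human to label situations:", ask_about)
```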

Re: "Why should we expect the agent’s answers to correspond to its cognition at all?" I like Cynthia Rudin's comparison of explainable ML and interpretable ML. A system where the objective function has been changed in order to favor simplicity/interpretability (see e.g. sparse representation theory?) seems more robust with fewer moving parts, and probably will also be a more effective ML algorithm since simple models generalize better.

In other words, the amplified agent randomly “audits” the distilled agent, and punishes the distilled agent very harshly if it fails the audit. Though the distilled agent knows that it might be able to deceive its supervisor when it isn’t audited, it’s so scared of the outcome where it tries to do that and gets audited that it doesn’t even want to try. (Even if you were 99% confident that you could get away with tax evasion, you wouldn’t want to try if you knew the government tortures and murders the families of the tax evaders they catch.)

This paragraph seems unnecessarily anthropomorphic. Why not just say: The distilled agent only runs computations if it is at least 99.9% confident that the computation is aligned. Or: To determine whether to run a particular computation, our software uses the following expected value formula:

probability_computation_is_malign * malign_penalty + probability_computation_is_not_malign * benefit_conditional_on_computation_not_being_malign
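
As a toy sketch, combining the confidence threshold and the expected-value formula above; the 99.9% cutoff and the numbers are illustrative, not anything from Paul's agenda:

```python
def should_run(p_malign, malign_penalty, benefit_if_benign):
    confident_enough = (1 - p_malign) >= 0.999
    expected_value = (p_malign * malign_penalty
                      + (1 - p_malign) * benefit_if_benign)
    return confident_enough and expected_value > 0

print(should_run(p_malign=0.01,   malign_penalty=-1e6, benefit_if_benign=10))   # False
print(should_run(p_malign=0.0001, malign_penalty=-1e6, benefit_if_benign=200))  # True
```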

(When we talk about our programs "wanting" to do various things, we are anthropomorphizing. For example, I could talk about a single-page Javascript app and say: when the internet is slow, the program "wants" to present a good interface to the user even without being able to gather data from the server. In the same way, I could say that a friendly AI "wants" to be aligned. The alignment problem is not due to a lack of desire on the part of the AI to be aligned. The alignment problem is due to the AI having mistaken beliefs about what alignment is.)

I like the section on "epistemically corrigible humans". I've spent a fair amount of time thinking about "epistemically corrigible" meta-learning (no agent--just thinking in terms of the traditional supervised/unsupervised learning paradigms, and just trying to learn the truth, not be particularly helpful--although if you have a learning algorithm that's really good at finding the truth, one particular thing you could use it to find the truth about is which actions humans approve of). I'm now reasonably confident that it's doable, and deprioritizing this a little bit in order to figure out if there's a more important bottleneck in my vision of how FAI could work elsewhere.

"Subproblems of “worst-case guarantees” include ensuring that ML systems are robust to distributional shift and adversarial inputs, which are also broad open questions in ML, and which might require substantial progress on MIRI-style research to articulate and prove formal bounds."

By "MIRI-style research" do we mean HRAD? I thought HRAD was notable because it did not attack machine learning problems like distributional shift, adversarial inputs, etc.

Overall, I think Paul's agenda makes AI alignment a lot harder than it needs to be, and his solution sounds a lot more complicated than I think an FAI needs to be. I think "Deep reinforcement learning from human preferences" is already on the right track, and I don't understand why we need all this additional complexity in order to learn corrigibility and stuff. Complexity matters because an overcomplicated solution is more likely to have bugs. I think instead of overbuilding for intuitions based on anthropomorphism, we should work harder to clarify exactly what the failure modes could be for AI and focus more on finding the simplest solution that is not susceptible to any known failure modes. (The best way to make a solution simple is to kill multiple birds with one stone, so by gathering stones that are capable of killing multiple known birds, you maximize the probability that those stones also kill unknown birds.)

Also, I have an intuition that any approach to FAI that appears overly customized for FAI is going to be the wrong one. For example, I feel "epistemic corrigibility" is a better problem to study than "corrigibility" because it is a more theoretically pure problem--the notion of truth (beliefs that make correct predictions) is far easier to pin down than the notion of human values. (And once you have epistemic corrigibility, I think it's easy to use that to create a system which is corrigible in the sense we need. And epistemic corrigibility is kinda a reframed version of what mainstream ML researchers already look for when they consider the merits of various ML algorithms.)

Comment by john_maxwell_iv on Book review: Artificial Intelligence Safety and Security · 2018-12-09T06:48:49.468Z · score: 3 (2 votes) · LW · GW

Thanks for this post! It was especially valuable to see the link to Eliezer's comments in "I expect that a number of AI safety researchers will deny that such a system will be sufficiently powerful." It explains some aspects of Eliezer's worldview that had previously confused me. Personally, I am at the opposite end of the spectrum relative to Eliezer--my intuition is that consequentialist planning and accurate world-modeling are fundamentally different tasks which are likely to stay that way. I'd argue that the history of statistics & machine learning is the history of gradual improvements to accurate world-modeling which basically haven't shown any tendencies towards greater consequentialism. My default expectation is this trend will continue. The idea that you can't have one without the other seems anthropomorphic to me.

Comment by john_maxwell_iv on Suggestion: New material shouldn't be released too fast · 2018-12-08T05:00:10.769Z · score: 7 (3 votes) · LW · GW

Re: reference works. If LW worked a bit more like a mailing list, where reading a post caused you to "subscribe" to its comment thread by default, then people would feel less pressure to comment on things quickly after they were released so their comments would be read. The old LW 1.0 had a subscribe feature, but I'm not sure how many people used it. It was opt-in rather than opt-out.
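To make the opt-out version concrete, here's a purely illustrative sketch in Python. None of these names correspond to the actual LessWrong codebase; it just shows the default I have in mind:

    # Hypothetical opt-out "subscribe on read" logic, not the real LW implementation.
    subscriptions = {}   # post_id -> set of user_ids subscribed to that comment thread
    opted_out = set()    # user_ids who turned auto-subscription off in their settings

    def record_post_view(user_id, post_id):
        # Reading a post subscribes you to its comment thread unless you opted out.
        if user_id not in opted_out:
            subscriptions.setdefault(post_id, set()).add(user_id)

    def notify_subscribers(post_id, comment_text):
        # When a new comment arrives, everyone still subscribed gets notified.
        for user_id in subscriptions.get(post_id, set()):
            print(f"Notify {user_id}: new comment on {post_id}: {comment_text!r}")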

Right now, my feeling is that checking LW and participating in discussions every day frazzles my brain in subtle ways. So I mostly don't try to keep up with posts day to day, and instead figure I will archive binge at some point in the future. But it seems like, if my comment gets upvoted, reading a post and leaving a comment on it is a lot more valuable than just reading the post. And if I leave a comment while archive binging, it's much less likely to be read. So I suppose that while archive binging, instead of leaving a lot of little comments, it's better for me to gather related thoughts into an entire post's worth of material and make a toplevel post with those ideas so people will actually read them? I'm a lot slower at writing posts than writing comments, but I'm hoping to overcome that problem at some point.

General and Surprising

2017-09-15T06:33:19.797Z · score: 3 (3 votes)

Heuristics for textbook selection

2017-09-06T04:17:01.783Z · score: 8 (8 votes)

Revitalizing Less Wrong seems like a lost purpose, but here are some other ideas

2016-06-12T07:38:58.557Z · score: 21 (27 votes)

Zooming your mind in and out

2015-07-06T12:30:58.509Z · score: 8 (9 votes)

Purchasing research effectively open thread

2015-01-21T12:24:22.951Z · score: 12 (13 votes)

Productivity thoughts from Matt Fallshaw

2014-08-21T05:05:11.156Z · score: 13 (14 votes)

Managing one's memory effectively

2014-06-06T17:39:10.077Z · score: 14 (15 votes)

OpenWorm and differential technological development

2014-05-19T04:47:00.042Z · score: 6 (7 votes)

System Administrator Appreciation Day - Thanks Trike!

2013-07-26T17:57:52.410Z · score: 70 (71 votes)

Existential risks open thread

2013-03-31T00:52:46.589Z · score: 10 (11 votes)

Why AI may not foom

2013-03-24T08:11:55.006Z · score: 23 (35 votes)

[Links] Brain mapping/emulation news

2013-02-21T08:17:27.931Z · score: 2 (7 votes)

Akrasia survey data analysis

2012-12-08T03:53:35.658Z · score: 13 (14 votes)

Akrasia hack survey

2012-11-30T01:09:46.757Z · score: 11 (14 votes)

Thoughts on designing policies for oneself

2012-11-28T01:27:36.337Z · score: 80 (80 votes)

Room for more funding at the Future of Humanity Institute

2012-11-16T20:45:18.580Z · score: 18 (21 votes)

Empirical claims, preference claims, and attitude claims

2012-11-15T19:41:02.955Z · score: 5 (28 votes)

Economy gossip open thread

2012-10-28T04:10:03.596Z · score: 23 (30 votes)

Passive income for dummies

2012-10-27T07:25:33.383Z · score: 17 (22 votes)

Morale management for entrepreneurs

2012-09-30T05:35:05.221Z · score: 9 (14 votes)

Could evolution have selected for moral realism?

2012-09-27T04:25:52.580Z · score: 4 (14 votes)

Personal information management

2012-09-11T11:40:53.747Z · score: 18 (19 votes)

Proposed rewrites of LW home page, about page, and FAQ

2012-08-17T22:41:57.843Z · score: 18 (19 votes)

[Link] Holistic learning ebook

2012-08-03T00:29:54.003Z · score: 10 (17 votes)

Brainstorming additional AI risk reduction ideas

2012-06-14T07:55:41.377Z · score: 12 (15 votes)

Marketplace Transactions Open Thread

2012-06-02T04:31:32.387Z · score: 29 (30 votes)

Expertise and advice

2012-05-27T01:49:25.444Z · score: 17 (22 votes)

PSA: Learn to code

2012-05-25T18:50:01.407Z · score: 34 (39 votes)

Knowledge value = knowledge quality × domain importance

2012-04-16T08:40:57.158Z · score: 8 (13 votes)

Rationality anecdotes for the homepage?

2012-04-04T06:33:32.097Z · score: 3 (8 votes)

Simple but important ideas

2012-03-21T06:59:22.043Z · score: 18 (23 votes)

6 Tips for Productive Arguments

2012-03-18T21:02:32.326Z · score: 30 (45 votes)

Cult impressions of Less Wrong/Singularity Institute

2012-03-15T00:41:34.811Z · score: 34 (59 votes)

[Link, 2011] Team may be chosen to receive $1.4 billion to simulate human brain

2012-03-09T21:13:42.482Z · score: 8 (15 votes)

Productivity tips for those low on motivation

2012-03-06T02:41:20.861Z · score: 7 (12 votes)

The Singularity Institute has started publishing monthly progress reports

2012-03-05T08:19:31.160Z · score: 21 (24 votes)

Less Wrong mentoring thread

2011-12-29T00:10:58.774Z · score: 31 (34 votes)

Heuristics for Deciding What to Work On

2011-06-01T07:31:17.482Z · score: 20 (23 votes)

Upcoming meet-ups: Auckland, Bangalore, Houston, Toronto, Minneapolis, Ottawa, DC, North Carolina, BC...

2011-05-21T05:06:08.824Z · score: 5 (8 votes)

Being Rational and Being Productive: Similar Core Skills?

2010-12-28T10:11:01.210Z · score: 18 (31 votes)

Applying Behavioral Psychology on Myself

2010-06-20T06:25:13.679Z · score: 53 (60 votes)

The Math of When to Self-Improve

2010-05-15T20:35:37.449Z · score: 6 (16 votes)

Accuracy Versus Winning

2009-04-02T04:47:37.156Z · score: 12 (21 votes)

So you say you're an altruist...

2009-03-12T22:15:59.935Z · score: 11 (35 votes)