Posts

'Longtermism' definitional discussion on EA Forum 2019-08-02T23:53:03.731Z · score: 17 (6 votes)
Henry Kissinger: AI Could Mean the End of Human History 2018-05-15T20:11:11.136Z · score: 46 (10 votes)
AskReddit: Hard Pills to Swallow 2018-05-14T11:20:37.470Z · score: 17 (6 votes)
Predicting Future Morality 2018-05-06T07:17:16.548Z · score: 22 (8 votes)
AI Safety via Debate 2018-05-05T02:11:25.655Z · score: 40 (9 votes)
FLI awards prize to Arkhipov’s relatives 2017-10-28T19:40:43.928Z · score: 12 (5 votes)
Functional Decision Theory: A New Theory of Instrumental Rationality 2017-10-20T08:09:25.645Z · score: 36 (13 votes)
A Software Agent Illustrating Some Features of an Illusionist Account of Consciousness 2017-10-17T07:42:28.822Z · score: 16 (3 votes)
Neuralink and the Brain’s Magical Future 2017-04-23T07:27:30.817Z · score: 6 (7 votes)
Request for help with economic analysis related to AI forecasting 2016-02-06T01:27:39.810Z · score: 6 (7 votes)
[Link] AlphaGo: Mastering the ancient game of Go with Machine Learning 2016-01-27T21:04:55.183Z · score: 14 (15 votes)
[LINK] Deep Learning Machine Teaches Itself Chess in 72 Hours 2015-09-14T19:38:11.447Z · score: 8 (9 votes)
[Link] First almost fully-formed human [foetus] brain grown in lab, researchers claim 2015-08-19T06:37:21.049Z · score: 7 (8 votes)
[Link] Neural networks trained on expert Go games have just made a major leap 2015-01-02T15:48:16.283Z · score: 15 (16 votes)
[LINK] Attention Schema Theory of Consciousness 2013-08-25T22:30:01.903Z · score: 3 (4 votes)
[LINK] Well-written article on the Future of Humanity Institute and Existential Risk 2013-03-02T12:36:39.402Z · score: 16 (19 votes)
The Center for Sustainable Nanotechnology 2013-02-26T06:55:18.542Z · score: 4 (11 votes)

Comments

Comment by esrogs on ChristianKl's Shortform · 2019-10-13T09:27:30.707Z · score: 4 (3 votes) · LW · GW

Interesting short thread on this here.

Comment by esrogs on Thoughts on "Human-Compatible" · 2019-10-12T19:10:50.058Z · score: 2 (1 votes) · LW · GW
Also, did you mean “wasn’t”? :)

Lol, you got me.

Comment by esrogs on A simple sketch of how realism became unpopular · 2019-10-12T08:15:35.045Z · score: 4 (2 votes) · LW · GW
I don't know who, if anyone, noted the obvious fallacy in Berkeley's master argument prior to Russell in 1912

Not even Moore in 1903?

Russell's criticism is in line with Moore's famous 'The Refutation of Idealism' (1903), where he argues that if one recognizes the act-object distinction within conscious states, one can see that the object is independent of the act.

Isn't that the same argument Russell was making?

Comment by esrogs on Thoughts on "Human-Compatible" · 2019-10-12T07:47:51.737Z · score: 2 (1 votes) · LW · GW
Why does it matter so much that we point exactly to be human?

Should that be "to the human" instead of "to be human"? Wasn't sure if you meant to say simply that, or if more words got dropped.

Or maybe it was supposed to be: "matter so much that what we point exactly to be human?"

Comment by esrogs on Thoughts on "Human-Compatible" · 2019-10-12T07:37:02.613Z · score: 2 (1 votes) · LW · GW

FWIW, this reminds me of Holden Karnofsky's formulation of Tool AI (from his 2012 post, Thoughts on the Singularity Institute):

Another way of putting this is that a "tool" has an underlying instruction set that conceptually looks like: "(1) Calculate which action A would maximize parameter P, based on existing data set D. (2) Summarize this calculation in a user-friendly manner, including what Action A is, what likely intermediate outcomes it would cause, what other actions would result in high values of P, etc." An "agent," by contrast, has an underlying instruction set that conceptually looks like: "(1) Calculate which action, A, would maximize parameter P, based on existing data set D. (2) Execute Action A." In any AI where (1) is separable (by the programmers) as a distinct step, (2) can be set to the "tool" version rather than the "agent" version, and this separability is in fact present with most/all modern software. Note that in the "tool" version, neither step (1) nor step (2) (nor the combination) constitutes an instruction to maximize a parameter - to describe a program of this kind as "wanting" something is a category error, and there is no reason to expect its step (2) to be deceptive.

If I understand correctly, his "agent" is your Consequentialist AI, and his "tool" is your Decoupled AI 1.
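To make sure I'm reading the step (1)/(2) split right, here's a minimal sketch in my own words (the helper names are hypothetical, not anything from Holden or from your post): both modes share the same maximization step, and differ only in whether the result is reported or executed.

```python
from typing import Any, Callable

Action = Any

def best_action(candidates: list[Action], score: Callable[[Action], float]) -> Action:
    """Step (1): calculate which action A would maximize parameter P."""
    return max(candidates, key=score)

def run_as_tool(candidates: list[Action], score: Callable[[Action], float]) -> str:
    """'Tool' mode: step (2) summarizes the calculation for a human."""
    a = best_action(candidates, score)
    return f"Recommended action: {a!r} (estimated P = {score(a):.3f})"

def run_as_agent(candidates: list[Action], score: Callable[[Action], float],
                 execute: Callable[[Action], None]) -> None:
    """'Agent' mode: the same step (1), but step (2) executes the action."""
    execute(best_action(candidates, score))
```

The point being that, as Holden says, step (1) is separable, so switching between the two modes is just a choice about what happens after the calculation.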

Comment by esrogs on Thoughts on "Human-Compatible" · 2019-10-12T07:25:06.494Z · score: 2 (1 votes) · LW · GW
Here's my summary: reward uncertainty through some extension of a CIRL-like setup, accounting for human irrationality through our scientific knowledge, doing aggregate preference utilitarianism for all of the humans on the planet, discounting people by how well their beliefs map to reality, perhaps downweighting motivations such as envy (to mitigate the problem of everyone wanting positional goods).

Perhaps a dumb question, but is "reward" being used as a noun or verb here? Are we rewarding uncertainty, or is "reward uncertainty" a goal we're trying to achieve?

Comment by esrogs on What are your strategies for avoiding micro-mistakes? · 2019-10-06T23:09:55.556Z · score: 8 (3 votes) · LW · GW

Incidentally, a similar consideration leads me to want to avoid re-using old metaphors when explaining things. If you use multiple metaphors you can triangulate on the meaning -- errors in the listener's understanding will interfere destructively, leaving something closer to what you actually meant.

For this reason, I've been frustrated that we keep using "maximize paperclips" as the stand-in for a misaligned utility function. And I think reusing the exact same example again and again has contributed to the misunderstanding Eliezer describes here:

Original usage and intended meaning: The problem with turning the future over to just any superintelligence is that its utility function may have its attainable maximum at states we'd see as very low-value, even from the most cosmopolitan standpoint.
Misunderstood and widespread meaning: The first AGI ever to arise could show up in a paperclip factory (instead of a research lab specifically trying to do that). And then because AIs just mechanically carry out orders, it does what the humans had in mind, but too much of it.

If we'd found a bunch of different ways to say the first thing, and hadn't just said, "maximize paperclips" every time, then I think the misunderstanding would have been less likely.

Comment by esrogs on What are your strategies for avoiding micro-mistakes? · 2019-10-06T23:02:14.868Z · score: 4 (2 votes) · LW · GW

One mini-habit I have is to try to check my work in a different way from the way I produced it.

For example, if I'm copying down a large number (or string of characters, etc.), then when I double-check it, I read off the transcribed number backwards. I figure this way my brain is less likely to go "Yes yes, I've seen this already" and skip over any discrepancy.

And in general I look for ways to do the same kind of thing in other situations, such that checking is not just a repeat of the original process.

Comment by esrogs on Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More · 2019-10-04T23:24:30.910Z · score: 9 (4 votes) · LW · GW
And I think claim 5 is basically in line with what, say, Bostrom would discuss (where stabilization is a thing to do before we attempt to build a sovereign).

You mean in the sense of stabilizing the whole world? I'd be surprised if that's what Yann had in mind. I took him just to mean building a specialized AI to be a check on a single other AI.

Comment by esrogs on FB/Discord Style Reacts · 2019-10-04T17:43:40.351Z · score: 2 (1 votes) · LW · GW
Maybe try out giving people an optional prompt about why they upvoted or downvoted things that is quite short

I like this idea.

Comment by esrogs on List of resolved confusions about IDA · 2019-10-01T04:32:57.291Z · score: 14 (5 votes) · LW · GW
act-based = based on short-term preferences-on-reflection

For others who were confused about what "short-term preferences-on-reflection" would mean, I found this comment and its reply to be helpful.

Putting it into my own words: short-term preferences-on-reflection are about what you would want to happen in the near term, if you had a long time to think about it.

By way of illustration, AlphaZero's long-term preference is to win the chess game, its short-term preference is whatever its policy network spits out as the best move to make next, and its short-term preference-on-reflection is the move it wants to make next after doing a fuck-ton of MCTS.
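As a toy version of that last distinction (my own sketch, with made-up numbers and plain rollouts standing in for MCTS, not real AlphaZero code): the "on reflection" answer is just what the same system prefers after spending more compute on the question.

```python
import random

random.seed(0)

# Each "move" has a true expected value; the "policy network" only sees a
# cheap noisy estimate, while "reflection" spends compute to sharpen it.
TRUE_VALUE = {"a": 0.3, "b": 0.55, "c": 0.5}

def policy_network(move):
    """Cheap, noisy guess at a move's value (short-term preference)."""
    return TRUE_VALUE[move] + random.gauss(0, 0.2)

def rollout_estimate(move, n_rollouts=10_000):
    """Expensive simulation-based estimate (short-term preference-on-reflection)."""
    wins = sum(random.random() < TRUE_VALUE[move] for _ in range(n_rollouts))
    return wins / n_rollouts

moves = list(TRUE_VALUE)
print("short-term preference:              ", max(moves, key=policy_network))
print("short-term preference-on-reflection:", max(moves, key=rollout_estimate))
```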

Comment by esrogs on If the "one cortical algorithm" hypothesis is true, how should one update about timelines and takeoff speed? · 2019-08-26T23:25:12.251Z · score: 4 (2 votes) · LW · GW
and that displacement cells both exist and exist in neocortex

Both exist and exist?

Comment by esrogs on Troll Bridge · 2019-08-24T00:19:38.513Z · score: 4 (4 votes) · LW · GW
there is a troll who will blow up the bridge with you on it, if you cross it "for a dumb reason"

Does this way of writing "if" mean the same thing as "iff", i.e. "if and only if"?

Comment by esrogs on "Designing agent incentives to avoid reward tampering", DeepMind · 2019-08-15T03:06:35.947Z · score: 7 (4 votes) · LW · GW

I can't resist giving this pair of rather incongruous quotes from the paper

Could you spell out what makes the quotes incongruous with each other? It's not jumping out at me.

Comment by esrogs on Can we really prevent all warming for less than 10B$ with the mostly side-effect free geoengineering technique of Marine Cloud Brightening? · 2019-08-05T16:19:08.622Z · score: 4 (2 votes) · LW · GW
1 billion per year per W/m^2 of reduced forcing

For others who weren't sure what "reduced forcing" refers to: https://en.wikipedia.org/wiki/Radiative_forcing

And to put that number in context, the "net anthropogenic component" of radiative forcing appears to be about 1.5 W/m^2 (according to an image in the wikipedia article), so canceling out the anthropogenic component would have an ongoing cost of 1.5 billion per year.
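Spelling out the arithmetic, taking the quoted figure at face value and assuming the cost scales linearly with the forcing offset:

$$ \$1\ \text{billion/yr per }\mathrm{W/m^2} \;\times\; 1.5\ \mathrm{W/m^2} \;\approx\; \$1.5\ \text{billion/yr} $$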

Comment by esrogs on Writing children's picture books · 2019-08-03T04:03:02.664Z · score: 3 (2 votes) · LW · GW

Or you could imagine writing for a smarter but less knowledgeable person. E.g. 10 y.o. Feynman.

Comment by esrogs on Another case of "common sense" not being common? · 2019-07-31T20:48:38.178Z · score: 8 (4 votes) · LW · GW
Okay, that is probably not that good a characterization.

I appreciate the caveat, but I'm actually not seeing the connection at all. What is the relationship you see between common sense and surprisingly simple solutions to problems?

Comment by esrogs on Just Imitate Humans? · 2019-07-27T02:40:24.970Z · score: 5 (2 votes) · LW · GW
Could enough human-imitating artificial agents (running much faster than people) prevent unfriendly AGI from being made?

This seems very related to the question of whether uploads would be safer than some other kind of AGI. Offhand, I remember a comment from Eliezer suggesting that he thought that would be safer (but that uploads would be unlikely to happen first).

Not sure how common that view is though.

Acquiring data: put a group of people in a house with a computer. Show them things (images, videos, audio files, etc.) and give them a chance to respond at the keyboard. Their keyboard actions are the actions, and everything between actions is an observation. Then learn the policy of the group of humans.

Wouldn't this take an enormous amount of observation time to generate enough data to learn a human-imitating policy?

Comment by esrogs on The Self-Unaware AI Oracle · 2019-07-24T18:21:10.207Z · score: 7 (4 votes) · LW · GW

Just want to note that I like your distinctions between Algorithm Land and the Real World and also between Level-1 optimization and Level-2 optimization.

I think some discussion of AI safety hasn't been clear enough on what kind of optimization we expect in which domains. At least, it wasn't clear to me.

But a couple things fell into place for me about 6 months ago, which very much rhyme with your two distinctions:

1) Inexploitability only makes sense relative to a utility function, and if the AI's utility function is orthogonal to yours (e.g. because it is operating in Algorithm Land), then it may be exploitable relative to your utility function, even though it's inexploitable relative to its own utility function. See this comment (and thanks to Rohin for the post that prompted the thought).

2) While some process that's optimizing super-hard for an outcome in Algorithm Land may bleed out into affecting the Real World, this would sort of be by accident, and seems much easier to mitigate than a process that's trying to affect the Real World on purpose. See this comment.

Putting them together, a randomly selected superintelligence doesn't care about atoms, or about macroscopic events unfolding through time (roughly the domain of what we care about). And just because we run it on a computer that from our perspective is embedded in this macroscopic world, and that uses macroscopic resources (compute time, energy), doesn't mean it's going to start caring about macroscopic Real World events, or start fighting with us for those resources. (At least, not in a Level-2 way.)

On the other hand, powerful computing systems we build are not going to be randomly selected from the space of possible programs. We'll have economic incentives to create systems that do consider and operate on the Real World.

So it seems to me that a randomly selected superintelligence may not actually be dangerous (because it doesn't care about being unplugged -- that's a macroscopic concept that seems simple and natural from our perspective, but would not actually correspond to something in most utility functions), but that the superintelligent systems anyone is likely to actually build will be much more likely to be dangerous (because they will model and or act on the Real World).

Comment by esrogs on The AI Timelines Scam · 2019-07-23T04:37:04.547Z · score: 4 (2 votes) · LW · GW

I see two links in your comment that are both linking to the same place -- did you mean for the first one (with the text: "the criticism that the usage of "scam" in the title was an instance of the noncentral fallacy") to link to something else?

Comment by esrogs on The Self-Unaware AI Oracle · 2019-07-23T04:05:50.918Z · score: 2 (1 votes) · LW · GW
The way I read it, Gwern's tool-AI article is mostly about self-improvement.

I'm not sure I understand what you mean here. I linked Gwern's post because your proposal sounded very similar to me to Holden's Tool AI concept, and Gwern's post is one of the more comprehensive responses I can remember coming across.

Is it your impression that what you're proposing is substantially different from Holden's Tool AI?

When I say that your idea sounded similar, I'm thinking of passages like this (from Holden):

Another way of putting this is that a “tool” has an underlying instruction set that conceptually looks like: “(1) Calculate which action A would maximize parameter P, based on existing data set D. (2) Summarize this calculation in a user-friendly manner, including what Action A is, what likely intermediate outcomes it would cause, what other actions would result in high values of P, etc.” An “agent,” by contrast, has an underlying instruction set that conceptually looks like: “(1) Calculate which action, A, would maximize parameter P, based on existing data set D. (2) Execute Action A.” In any AI where (1) is separable (by the programmers) as a distinct step, (2) can be set to the “tool” version rather than the “agent” version, and this separability is in fact present with most/all modern software. Note that in the “tool” version, neither step (1) nor step (2) (nor the combination) constitutes an instruction to maximize a parameter - to describe a program of this kind as “wanting” something is a category error, and there is no reason to expect its step (2) to be deceptive….This is important because an AGI running in tool mode could be extraordinarily useful but far more safe than an AGI running in agent mode. In fact, if developing “Friendly AI” is what we seek, a tool-AGI could likely be helpful enough in thinking through this problem as to render any previous work on “Friendliness theory” moot.

Compared to this (from you):

Finally, we query the system in a way that is compatible with its self-unawareness. For example, if we want to cure cancer, one nice approach would be to program it to search through its generative model and output the least improbable scenario wherein a cure for cancer is discovered somewhere in the world in the next 10 years. Maybe it would output: "A scientist at a university will be testing immune therapy X, and they will combine it with blood therapy Y, and they'll find that the two together cure all cancers". Then, we go combine therapies X and Y ourselves.

Your, "Then, we go combine therapies X and Y ourselves." to me sounds a lot like Holden's separation of (1) Calculating the best action vs (2) Either explaining (in the case of Tool AI) or executing (in the case of Agent AI) the action. In both cases you seem to be suggesting that we can reap the rewards of superintelligence but retain control by treating the AI as an advisor rather than as an agent who acts on our behalf.

Am I right that what you're proposing is pretty much along the same lines as Holden's Tool AI -- or is there some key difference that I'm missing?

Comment by esrogs on The Self-Unaware AI Oracle · 2019-07-22T23:41:13.244Z · score: 6 (3 votes) · LW · GW

Also see these discussions of Drexler's Comprehensive AI Services proposal, which also emphasizes non-agency:

Comment by esrogs on The Self-Unaware AI Oracle · 2019-07-22T23:28:53.678Z · score: 4 (2 votes) · LW · GW

If you haven't already seen it, you might want to check out: https://www.gwern.net/Tool-AI

Comment by esrogs on 1hr talk: Intro to AGI safety · 2019-07-17T18:17:24.291Z · score: 4 (2 votes) · LW · GW

The Dive in! link in the last paragraph appears to be broken. It's taking me to: https://www.lesswrong.com/posts/DbZDdupuffc4Xgm7H/%E2%81%A0http://mindingourway.com/dive-in/

Comment by esrogs on What are we predicting for Neuralink event? · 2019-07-17T06:52:55.830Z · score: 15 (3 votes) · LW · GW

Scoring your predictions: it looks like you got all three "not see" predictions right, as well as #1 and #3 from "will see", with only #2 from "will see" missing (though you had merely predicted we'd see something "closer to" your "will see" list, so missing one doesn't necessarily mean you were wrong).

Comment by esrogs on Open Thread July 2019 · 2019-07-16T21:42:07.423Z · score: 9 (3 votes) · LW · GW
But while it may have been sensible to start (fully 10 years ago, now!)

Correction: CFAR was started in 2012 (though I believe some of the founders ran rationality camps the previous summer, in 2011), so it's been 7 (or 8) years, not 10.

Comment by esrogs on Integrity and accountability are core parts of rationality · 2019-07-16T19:53:57.543Z · score: 2 (1 votes) · LW · GW

Got it, that makes sense.

Comment by esrogs on Integrity and accountability are core parts of rationality · 2019-07-16T19:03:15.606Z · score: 2 (1 votes) · LW · GW

But being in a position of power filters for competence, and competence filters for accurate beliefs.

If the quoted bit had instead said:

This means that highly competent people in positions of power often have less accurate beliefs than highly competent people who are not in positions of power.

I wouldn't necessarily have disagreed. But as is I'm pretty skeptical of the claim (again depending on what is meant by "often").

Comment by esrogs on Integrity and accountability are core parts of rationality · 2019-07-16T18:22:05.408Z · score: 6 (3 votes) · LW · GW
This means that highly competent people in positions of power often have less accurate beliefs than much less competent people who are not in positions of power.

Not sure how strong you intend this statement to be (due to ambiguity of "often"), but I would think that all-else-equal, a randomly selected competent person with some measure of power has more accurate beliefs than a less competent person w/o power, even after controlling for e.g. IQ.

Would you disagree with that?

I'd grant that the people with the very most accurate beliefs are probably not the same as the people who are the very most competent, but that's mostly just because the tails come apart.

I'd also grant that having power subjects one to new biases. But being competent and successful is a strong filter for your beliefs matching reality (at least in some domains, and to the extent that your behavior is determined by your beliefs), while incompetence often seems to go hand-in-hand with various kinds of self-deception (making excuses, blaming others, having unrealistic expectations of what will work or not).

So overall I'd expect the competent person's beliefs to be more accurate.

Comment by esrogs on What are we predicting for Neuralink event? · 2019-07-15T06:04:16.397Z · score: 2 (1 votes) · LW · GW
I notice I wanted to put 'dexterous motor control' on both lists, so I'm somehow confused; it seems like we already have prostheses that perform pretty well based on external nerve sites (like reading off what you wanted to do with your missing hand from nerves in your arm) but I somehow don't expect us to have the spatial precision or filtering capacity to do that in the brain.

We've had prostheses that let people control computer cursors via a connection directly to the brain at least since 2001. Would you not count that as dexterous motor control?

Comment by esrogs on The AI Timelines Scam · 2019-07-12T07:38:22.513Z · score: 12 (7 votes) · LW · GW

Sure, if you just call it "honest reporting". But that was not the full phrase used. The full phrase used was "honest reporting of unconsciously biased reasoning".

I would not call trimming that down to "honest reporting" a case of honest reporting! ;-)

If I claim, "Joe says X, and I think he honestly believes that, though his reasoning is likely unconsciously biased here", then that does not at all seem to me like an endorsement of X, and certainly not a clear endorsement.

Comment by esrogs on How much background technical knowledge do LW readers have? · 2019-07-12T07:03:59.214Z · score: 4 (2 votes) · LW · GW

Nitpick: "code" (in the computer programming sense) is a mass noun, so you don't say "codes" to refer to programs or snippets of computer code.

Comment by esrogs on Please give your links speaking names! · 2019-07-12T06:55:59.403Z · score: 11 (5 votes) · LW · GW
Unfortunately they are not rendered as footnotes when printed

This seems like a fault in the printing process.

If the author is optimizing for one reading format, and you want to convert it to another, and it's unsatisfactory in the new format, then perhaps the conversion process is what should be improved.

Comment by esrogs on The AI Timelines Scam · 2019-07-12T06:39:22.201Z · score: 4 (2 votes) · LW · GW

Could you say more about what you mean here? I don't quite see the connection between your comment and the point that was quoted.

I understand the quoted bit to be pointing out that if you don't know when a disaster is coming you _might_ want to prioritize preparing for it coming sooner rather than later (e.g. since there's a future you who will be available to prepare for the disaster if it comes in the future, but you're the only you available to prepare for it if it comes tomorrow).

Of course you could make a counter-argument that perhaps you can't do much of anything in the case where disaster is coming soon, but in the long-run your actions can compound, so you should focus on long-term scenarios. But the quoted bit is only saying that there's "an argument", and doesn't seem to be making a strong claim about which way it comes out in the final analysis.

Was your comment meaning to suggest the possibility of a counter-argument like this one, or something else? Did you interpret the bit you quoted the same way I did?

Comment by esrogs on The AI Timelines Scam · 2019-07-12T06:21:06.048Z · score: 23 (10 votes) · LW · GW
but be like, "let me think for myself whether that is correct".

From my perspective, describing something as "honest reporting of unconsciously biased reasoning" seems much more like an invitation for me to think for myself whether it's correct than calling it a "lie" or a "scam".

Calling your opponent's message a lie and a scam actually gets my defenses up that you're the one trying to bamboozle me, since you're using such emotionally charged language.

Maybe others react to these words differently though.

Comment by esrogs on How can guesstimates work? · 2019-07-10T23:16:08.079Z · score: 10 (6 votes) · LW · GW
We just have better medical and welfare systems which allow people to take more risks.

I would think it's something like this, though I would put it differently: we're not at subsistence anymore. If you're at subsistence, then probably most of what you're doing is just to get by, and if you deviate too much from it, you die and/or fail to reproduce.

Now that we have slack, we can take on bigger risks, speculate, and be creative and enterprising.

Comment by esrogs on What was the official story for many top physicists congregating in Los Alamos during the Manhattan Project? · 2019-07-03T22:42:37.349Z · score: 26 (10 votes) · LW · GW

If ever there was a question with your name on it... ;-)

Comment by esrogs on Risks from Learned Optimization: Introduction · 2019-06-01T23:33:37.620Z · score: 6 (3 votes) · LW · GW

Got it, that's helpful. Thank you!

Comment by esrogs on Risks from Learned Optimization: Introduction · 2019-06-01T21:11:23.857Z · score: 20 (7 votes) · LW · GW

Very clear presentation! As someone outside the field who likes to follow along, I very much appreciate these clear conceptual frameworks and explanations.

I did however get slightly lost in section 1.2. At first reading I was expecting this part:

which we will contrast with the outer alignment problem of eliminating the gap between the base objective and the intended goal of the programmers.

to say, "... gap between the behavioral objective and the intended goal of the programmers." (In which case the inner alignment problem would be a subcomponent of the outer alignment problem.)

On second thought, I can see why you'd want to have a term just for the problem of making sure the base objective is aligned. But to help myself (and others who think similarly) keep this all straight, do you have a pithy term for "the intended goal of the programmers" that's analogous to base objective, mesa objective, and behavioral objective?

Would meta objective be appropriate?

(Apologies if my question rests on a misunderstanding or if you've defined the term I'm looking for somewhere and I've missed it.)

Comment by esrogs on Open Thread April 2019 · 2019-04-28T19:50:25.182Z · score: 4 (2 votes) · LW · GW

Apparently the author is a science writer (makes sense), and it's his first book:

I’m a freelance science writer. Until January 2018 I was science writer for BuzzFeed UK; before that, I was a comment and features writer for the Telegraph, having joined in 2007. My first book, The Rationalists: AI and the geeks who want to save the world, for Weidenfeld & Nicolson, is due to be published spring 2019. Since leaving BuzzFeed, I’ve written for the Times, the i, the Telegraph, UnHerd, politics.co.uk, and elsewhere.

https://tomchivers.com/about/

Comment by esrogs on Open Thread April 2019 · 2019-04-28T18:37:49.268Z · score: 16 (5 votes) · LW · GW

Someone wrote a book about us:

Overall, they have sparked a remarkable change.  They’ve made the idea of AI as an existential risk mainstream; sensible, grown-up people are talking about it, not just fringe nerds on an email list.  From my point of view, that’s a good thing.  I don’t think AI is definitely going to destroy humanity.  But nor do I think that it’s so unlikely we can ignore it.  There is a small but non-negligible probability that, when we look back on this era in the future, we’ll think that Eliezer Yudkowsky and Nick Bostrom  — and the SL4 email list, and LessWrong.com — have saved the world.  If Paul Crowley is right and my children don’t die of old age, but in a good way — if they and humanity reach the stars, with the help of a friendly superintelligence — that might, just plausibly, be because of the Rationalists.

https://marginalrevolution.com/marginalrevolution/2019/04/the-ai-does-not-hate-you.html

H/T https://twitter.com/XiXiDu/status/1122432162563788800

Comment by esrogs on Book review: The Sleepwalkers by Arthur Koestler · 2019-04-25T02:40:33.084Z · score: 7 (4 votes) · LW · GW
Figuring out what's up with that seems like a major puzzle of our time.

Would be curious to hear more about your confusion and why it seems like such a puzzle. Does "when you aggregate over large numbers of things, complex lumpiness smooths out into boring sameness" not feel compelling to you?

If not, why not? Maybe you can confuse me too ;-)

Comment by esrogs on The Principle of Predicted Improvement · 2019-04-25T02:35:19.607Z · score: 15 (5 votes) · LW · GW
E[P(H|D)] ≥ E[P(H)]
In English the theorem says that the probability we should expect to assign to the true value of H after observing the true value of D is greater than or equal to the expected probability we assign to the true value of H before observing the value of D.

I have a very basic question about notation -- what tells me that H in the equation refers to the true hypothesis?

Put another way, I don't really understand why that equation has a different interpretation than the conservation-of-expected-evidence equation: E[P(H=h_i|D)] = P(H=h_i).

In both cases I would interpret it as talking about the expected probability of some hypothesis, given some evidence, compared to the prior probability of that hypothesis.
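Here's a small simulation (my own toy construction, a two-hypothesis coin example) of the distinction I suspect is intended: in the PPI expectation the posterior is evaluated at the sampled true hypothesis, while in conservation-of-expected-evidence it's evaluated at a fixed hypothesis h_i.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypotheses about a coin: fair (p=0.5) or biased (p=0.9).
prior = np.array([0.5, 0.5])          # P(H = h_i)
p_heads = np.array([0.5, 0.9])        # P(D = heads | H = h_i)

def posterior(heads: bool) -> np.ndarray:
    """P(H | D) after a single flip."""
    like = p_heads if heads else 1 - p_heads
    unnorm = prior * like
    return unnorm / unnorm.sum()

n = 200_000
ppi_samples = []    # P(H = true h | D): true h and D both sampled
coee_samples = []   # P(H = h_0 | D):    h_0 fixed, D sampled

for _ in range(n):
    h = rng.choice(2, p=prior)          # sample the true hypothesis
    d = rng.random() < p_heads[h]       # sample data given that hypothesis
    post = posterior(d)
    ppi_samples.append(post[h])         # evaluated at the *true* hypothesis
    coee_samples.append(post[0])        # evaluated at a *fixed* hypothesis

print("E[P(true H | D)] ~", np.mean(ppi_samples))   # > 0.5 here (PPI)
print("E[P(H)]          =", np.sum(prior**2))       # 0.5 (expected prior on the true H)
print("E[P(H=h_0 | D)]  ~", np.mean(coee_samples))  # ~ 0.5 (conservation of expected evidence)
```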

Comment by esrogs on Alignment Newsletter One Year Retrospective · 2019-04-12T17:09:29.049Z · score: 6 (3 votes) · LW · GW
I think I've commented on your newsletters a few times, but haven't commented more because it seems like the number of people who would read and be interested in such a comment would be relatively small, compared to a comment on a more typical post.

I am surprised you think this. Don't the newsletters tend to be relatively highly upvoted? They're one of the kinds of links that I always automatically click on when I see them on the LW front page.

Maybe I'm basing this too much on my own experience, but I would love to see more discussion on the newsletter posts.

Comment by esrogs on Degrees of Freedom · 2019-04-03T02:00:26.862Z · score: 9 (5 votes) · LW · GW

For freedom-as-arbitrariness, see also: Slack

Comment by esrogs on Degrees of Freedom · 2019-04-03T01:18:59.695Z · score: 11 (5 votes) · LW · GW
If your car was subject to a perpetual auction and ownership tax as Weyl proposes, bashing your car to bits with a hammer would cost you even if you didn’t personally need a car, because it would hurt the rental or resale value and you’d still be paying tax.

I don't think this is right. COST stands for "Common Ownership Self-Assessed Tax". The self-assessed part refers to the idea that you personally state the value you'd be willing to sell the item for (and pay tax on that value). Once you've destroyed the item, presumably you'd be willing to part with the remains for a lower price, so you should just re-state the value and pay a lower tax.

It's true that damaging the car hurts the resale value and thus costs you (in terms of your material wealth), but this would be true whether or not you were living under a COST regime.
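A toy version of the mechanics, with made-up numbers and a hypothetical tax rate:

```python
TAX_RATE = 0.07  # hypothetical annual rate, just for illustration

def annual_cost_tax(self_assessed_value: float) -> float:
    """Tax owed under a COST-style self-assessed valuation."""
    return TAX_RATE * self_assessed_value

print(annual_cost_tax(10_000))  # intact car, self-assessed at $10,000 -> $700/yr
print(annual_cost_tax(500))     # smashed car, restated at $500 -> $35/yr
```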

Comment by esrogs on How good is a human's gut judgement at guessing someone's IQ? · 2019-04-03T00:19:17.970Z · score: 5 (3 votes) · LW · GW
Whatever ability IQ tests and math tests measure, I believe that lacking that ability doesn’t have any effect on one’s ability to make a good social impression or even to “seem smart” in conversation.

That section of Sarah's post jumped out at me too, because it seemed to be the opposite of my experience. In my (limited, subject-to-confirmation-bias) experience, how smart someone seems to me in conversation seems to match pretty well with how they did on standardized tests (or other measures of academic achievement). Obviously not perfectly, but way way better than chance.

Comment by esrogs on How good is a human's gut judgement at guessing someone's IQ? · 2019-04-03T00:09:16.628Z · score: 4 (2 votes) · LW · GW
I would also expect that courtesy of things like Dunning-Kruger, people towards the bottom will be as bad at estimating IQ as they are competence at any particular thing.

FWIW, the original Dunning-Kruger study did not show the effect that it's become known for. See: https://danluu.com/dunning-kruger/

In particular:

In two of the four cases, there's an obvious positive correlation between perceived skill and actual skill, which is the opposite of the pop-sci conception of Dunning-Kruger.

Comment by esrogs on Unconscious Economics · 2019-03-28T17:42:56.240Z · score: 10 (3 votes) · LW · GW

I'm not totally sure I'm parsing this sentence correctly. Just to clarify, "large firm variation in productivity" means "large variation in the productivity of firms" rather than "variation in the productivity of large firms", right?

Also, the second part is saying that on average there is productivity growth across firms, because the productive firms expand more than the less productive firms, yes?

Comment by esrogs on What failure looks like · 2019-03-19T17:41:21.784Z · score: 2 (1 votes) · LW · GW

Not sure exactly what you mean by "numerical simulation", but you may be interested in https://ought.org/ (where Paul is a collaborator), or in Paul's work at OpenAI: https://openai.com/blog/authors/paul/ .