AlphaGo variant reaches superhuman play in multiple games 2017-12-26T16:19:35.804Z · score: 2 (4 votes)
DeepMind article: AI Safety Gridworlds 2017-11-30T16:13:42.603Z · score: 51 (19 votes)
FYI: Here is the RSS link 2017-11-11T17:21:26.617Z · score: 13 (8 votes)
Marginal Revolution Thoughts on Black Lives Matter Movement 2017-01-18T18:12:45.712Z · score: 1 (2 votes)
Mysterious Go Master Blitzes Competition, Rattles Game Community 2017-01-04T17:18:34.479Z · score: 5 (6 votes)
Barack Obama's opinions on near-future AI [Fixed] 2016-10-12T15:46:44.334Z · score: 5 (5 votes)
Seven Apocalypses 2016-09-20T02:59:20.173Z · score: 2 (5 votes)
Meme: Valuable Vulnerability 2016-06-27T23:54:15.107Z · score: 3 (4 votes)


Comment by scarcegreengrass on Disentangling arguments for the importance of AI safety · 2019-01-22T21:22:34.589Z · score: 4 (3 votes) · LW · GW

This is a little nitpicky, but i feel compelled to point out that the brain in the 'human safety' example doesn't have to run for a billion years consecutively. If the goal is to provide consistent moral guidance, the brain can set things up so that it stores a canonical copy of itself in long-term storage, runs for 30 days, then hands off control to another version of itself, loaded from the canonical copy. Every 30 days, control is handed to an instance of the canonical version of this person. The same scheme is possible for a group of people.
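The handoff scheme can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: `Mind`, `live_one_day`, and `run_with_handoffs` are hypothetical names, and a list of strings stands in for whatever drift an emulated person would accumulate over time.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Mind:
    """Hypothetical stand-in for an emulated person; `memories` models drift."""
    memories: list = field(default_factory=list)

    def live_one_day(self, day: int) -> None:
        self.memories.append(f"day {day}")


def run_with_handoffs(canonical: Mind, total_days: int, term_days: int = 30) -> int:
    """Run fresh copies of `canonical` in consecutive terms.

    Returns the largest number of memories any single instance accumulated.
    """
    longest = 0
    for start in range(0, total_days, term_days):
        # Load a new instance from the stored canonical copy.
        instance = copy.deepcopy(canonical)
        for day in range(start, min(start + term_days, total_days)):
            instance.live_one_day(day)
        longest = max(longest, len(instance.memories))
        # The instance is dropped here; the canonical copy is never mutated.
    return longest
```

However long the total run, no instance in this sketch ever carries more than `term_days` of accumulated experience, and the stored copy stays unchanged.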

But this is a nitpick, because i agree that there are probably weird situations in the universe where even the wisest human groups would choose bad outcomes given absolute power for a short time.

Comment by scarcegreengrass on Disentangling arguments for the importance of AI safety · 2019-01-22T21:12:08.096Z · score: 1 (1 votes) · LW · GW

I appreciate this disentangling of perspectives. I had been conflating them before, but i like this paradigm.

Comment by scarcegreengrass on Act of Charity · 2018-12-13T23:21:20.998Z · score: 12 (5 votes) · LW · GW

I found this uncomfortable and unpleasant to read, but i'm nevertheless glad i read it. Thanks for posting.

Comment by scarcegreengrass on LW Update 2018-11-22 – Abridged Comments · 2018-12-13T22:56:41.205Z · score: 1 (1 votes) · LW · GW

I think the abridgement sounds nice but don't anticipate it affecting me much either way.

I think the ability to turn this on/off in user preferences is a particularly good idea (as mentioned in Raemon's comment).

Comment by scarcegreengrass on Embedded World-Models · 2018-12-04T21:37:53.050Z · score: 2 (2 votes) · LW · GW

I can follow most of this, but i'm confused about one part of the premise.

What if the agent created a low-resolution simulation of its behavior, called it Approximate Self, and used that in its predictions? Is the idea that this is doable, but represents an unacceptably large loss of accuracy? Are we in a 'no approximation' context where any loss of accuracy is to be avoided?

My perspective: It seems to me that humans also suffer from the problem of embedded self-reference. I suspect that humans deal with this by thinking about a highly approximate representation of their own behavior. For example, when i try to predict how a future conversation will go, i imagine myself saying things that a 'reasonable person' might say. Could a machine use an analogous form of non-self-referential approximation?

Great piece, thanks for posting.

Comment by scarcegreengrass on The funnel of human experience · 2018-10-10T18:23:26.119Z · score: 4 (4 votes) · LW · GW

It's relevant to some forms of utilitarian ethics.

Comment by scarcegreengrass on Reframing misaligned AGI's: well-intentioned non-neurotypical assistants · 2018-04-10T18:19:31.013Z · score: 5 (2 votes) · LW · GW

I think this is a clever new way of phrasing the problem.

When you said 'friend that is more powerful than you', that also made me think of a parenting relationship. We can look at whether this well-intentioned personification of AGI would be a good parent to a human child. They might be able to give the child a lot of attention, an expensive education, and a lot of material resources, but they might take unorthodox actions in the course of pursuing human goals.

Comment by scarcegreengrass on Metaphilosophical competence can't be disentangled from alignment · 2018-04-10T17:10:40.322Z · score: 5 (2 votes) · LW · GW

(I'm not zhukeepa; i'm just bringing up my own thoughts.)

This isn't quite the same as an improvement, but one thing that is more appealing about normal-world metaphilosophical progress than empowered-person metaphilosophical progress is that the former has a track record of working*, while the latter is untried and might not work.

*Slowly and not without reversals.

Comment by scarcegreengrass on Against Occam's Razor · 2018-04-05T20:47:39.758Z · score: 3 (1 votes) · LW · GW
It implies that the Occamian prior should work well in any universe where the laws of probability hold. Is that really true?

Just to clarify, are you referring to the differences between classical probability and quantum amplitudes? Or do you mean something else?

Comment by scarcegreengrass on [deleted post] 2018-04-03T15:27:41.041Z

Why do you think so? It's a thought experiment about punitive acausal trade from before people realized that benevolent acausal trade was equally possible. I don't think it's the most interesting idea to come out of the Less Wrong community anymore.

Comment by scarcegreengrass on AlphaGo variant reaches superhuman play in multiple games · 2018-01-03T19:36:12.448Z · score: 3 (1 votes) · LW · GW


Sorry, i couldn't find the previous link here when i searched for it.

Comment by scarcegreengrass on The Mad Scientist Decision Problem · 2017-11-30T16:33:42.636Z · score: 3 (1 votes) · LW · GW

Just to be clear, i'm imagining counterfactual cooperation to mean the FAI building vaults full of paperclips in every region where there is a surplus of aluminium (or a similar metal). In the other possibility branch, the paperclip maximizer (which thinks identically) reciprocates by preserving semi-autonomous cities of humans among the mountains of paperclips.

If my understanding above is correct, then yes, i think these two would cooperate IF this type of software agent shares my perspective on acausal game theory and branching timelines.

Comment by scarcegreengrass on Increasing day to day conversational rationality · 2017-11-16T22:08:48.470Z · score: 10 (3 votes) · LW · GW

In the last 48 hours i've felt the need for more than one of the abilities above. These would be very useful conversational tools.

I think some of these would be harder than others. This one sounds hard: 'Letting them now that what they said set off alarms bells somewhere in your head, but you aren’t sure why.' Maybe we could look for both scripts that work between two people who already trust each other, and scripts that work with semi-strangers. Or scripts that do and don't require both participants to have already read a specific blog post, etc.

Comment by scarcegreengrass on Call for Ideas: Industrial scale existential risk research · 2017-11-09T22:32:48.993Z · score: 2 (1 votes) · LW · GW

Something like a death risk calibration agency? Could be very interesting. Do any orgs like this exist? I guess the CDC (in the US govt) probably quantitatively compares risks within the context of disease.

One quote in your post seems more ambitious than the rest: 'helping retrain people if a thing that society was worried about seems to not be such a problem'. I think that tons of people evaluate risks based on how scary they seem, not based on numerical research.

Comment by scarcegreengrass on Cutting edge technology · 2017-10-31T19:38:27.324Z · score: 2 (1 votes) · LW · GW

Note on 3D printing: Yeah, that one might take a while. It's actually been around for decades, but still hasn't become cheap enough to make a big impact. I think it'll be one of those techs that takes 50+ years to go big.

Source: I used to work in the 3D printer industry.

Comment by scarcegreengrass on Just a photo · 2017-10-29T01:22:12.400Z · score: 1 (1 votes) · LW · GW

I first see the stems, then i see the leaves.

I think humans spend a lot of time looking at our models of the world (maps) and not that much time looking at our actual sensory input.

Comment by scarcegreengrass on Zero-Knowledge Cooperation · 2017-10-26T18:16:11.271Z · score: 7 (4 votes) · LW · GW

A similar algorithm appears in Age of Em by Robin Hanson ('spur safes' in Chapter 14). Basically, a trusted third party allows copies of A and B to analyze each other's source code in a sealed environment, then deletes almost everything that is learned.

A and B both copy their source code into a trusted computing environment ('safe'), such as an isolated server or some variety of encrypted VM. The trusted environment instantiates a copy of A (A_fork) and gives it B_source to inspect. Similarly, B_fork is instantiated and allowed to examine A_source. There can be other inputs, such as some contextual information and a contract to discuss. They examine the code for several hours or so, but this is not risky to A or B because all information inside the trusted environment will mandatorily be deleted afterwards. The only outputs from the trusted environment are a secure channel from A_fork to A and one from B_fork to B. These may only ever output an extremely low-resolution one-time report. This can be one of the following 3 values: 'Enter into the contract with the other', 'Do not enter into the contract with the other', or 'Maybe enter the contract'.
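The information flow of such a 'safe' can be sketched as follows. This is a minimal illustration, not Hanson's specification: `Report`, `run_safe`, and `naive_inspector` are hypothetical names, source code is modeled as a plain string, and real isolation (a sealed server or encrypted VM) is not modeled at all. The only property the sketch tries to capture is that nothing leaves the function except one of three coarse verdicts per party.

```python
from enum import Enum
from typing import Callable, Tuple


class Report(Enum):
    """The only three values allowed to leave the safe."""
    ENTER = "Enter into the contract with the other"
    DECLINE = "Do not enter into the contract with the other"
    MAYBE = "Maybe enter the contract"


# Hypothetical inspector type: (other party's source, contract) -> Report.
Inspector = Callable[[str, str], Report]


def run_safe(a_inspect: Inspector, a_source: str,
             b_inspect: Inspector, b_source: str,
             contract: str) -> Tuple[Report, Report]:
    """Let each fork examine the other's source, then emit one verdict each."""
    a_fork_verdict = a_inspect(b_source, contract)  # A_fork reads B_source
    b_fork_verdict = b_inspect(a_source, contract)  # B_fork reads A_source
    for verdict in (a_fork_verdict, b_fork_verdict):
        if not isinstance(verdict, Report):
            raise ValueError("forks may only emit one of the three reports")
    # Everything else the forks learned stays in local scope and is dropped.
    return a_fork_verdict, b_fork_verdict


def naive_inspector(other_source: str, contract: str) -> Report:
    """Toy inspection rule, purely for illustration."""
    if "defect" in other_source:
        return Report.DECLINE
    if contract in other_source:
        return Report.ENTER
    return Report.MAYBE
```

For example, `run_safe(naive_inspector, "honor trade_pact", naive_inspector, "defect on trade_pact", "trade_pact")` gives A a `DECLINE` report and B an `ENTER` report, and neither party sees the other's source.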

This does require a trusted execution environment, of course.

I don't know if this idea is original to Hanson.

Comment by scarcegreengrass on Poets are intelligence assets · 2017-10-26T16:02:15.196Z · score: 15 (5 votes) · LW · GW

Favorite highlight:

'Likewise, great literature is typically an integrated, multi-dimensional depiction. While there is a great deal of compression, the author is still trying to report how things might really have happened, to satisfy their own sense of artistic taste for plausibility or verisimilitude. Thus, we should expect that great literature is often an honest, highly informative account of everything except what the author meant to put into it.'

Comment by scarcegreengrass on What Evidence Is AlphaGo Zero Re AGI Complexity? · 2017-10-23T18:25:12.631Z · score: 2 (1 votes) · LW · GW

The techniques you outline for incorporating narrow agents into more general systems have already been demoed, I'm pretty sure. A coordinator can apply multiple narrow algorithms to a task and select the most effective one, a la IBM Watson. And I've seen at least one paper that uses an RNN to cultivate a custom RNN with the appropriate parameters for a new situation.

Comment by scarcegreengrass on What Evidence Is AlphaGo Zero Re AGI Complexity? · 2017-10-23T16:38:19.252Z · score: 2 (1 votes) · LW · GW

I'm updating because I think you outline a very useful concept here. Narrow algorithms can be made much more general given a good 'algorithm switcher'. A canny switcher/coordinator program can be given a task and decide which of several narrow programs to apply to it. This is analogous to the IBM Watson system that competed in Jeopardy and to the human you describe using a PC to switch between applications. I often forget about this technique during discussions about narrow machine learning software.
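A toy version of such a switcher might look like the sketch below. It is an illustration only and not a description of how Watson works: the two narrow solvers, the `(answer, confidence)` convention, and the `coordinate` function are all assumptions made up for the example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical convention: each narrow solver returns (answer, confidence in [0, 1]).
Solver = Callable[[str], Tuple[str, float]]


def arithmetic_solver(task: str) -> Tuple[str, float]:
    """Handles only tasks of the form '<int> + <int>'."""
    tokens = task.split()
    if len(tokens) == 3 and tokens[1] == "+" and tokens[0].isdigit() and tokens[2].isdigit():
        return str(int(tokens[0]) + int(tokens[2])), 1.0
    return "", 0.0


def reverse_solver(task: str) -> Tuple[str, float]:
    """Handles only tasks of the form 'reverse <text>'."""
    prefix = "reverse "
    if task.startswith(prefix):
        return task[len(prefix):][::-1], 1.0
    return "", 0.0


SOLVERS: List[Solver] = [arithmetic_solver, reverse_solver]


def coordinate(task: str, solvers: List[Solver] = SOLVERS) -> Optional[str]:
    """Run every narrow solver and keep the most confident answer, if any."""
    answer, confidence = max((solve(task) for solve in solvers), key=lambda r: r[1])
    return answer if confidence > 0 else None
```

Each solver is useless outside its niche, but `coordinate` covers the union of their niches. A smarter switcher could predict which solver to run instead of running all of them.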

Comment by scarcegreengrass on Postmodernism for rationalists · 2017-10-19T16:04:00.337Z · score: 2 (1 votes) · LW · GW

Yes, i think a big aspect of postmodernist culture is speaking in riddles because you want to be interacting with people who like riddles.

I don't think that the ability to understand a confusingly-presented concept is quite the same thing as intellectual quality, however. I think it's a more niche skill.

Comment by scarcegreengrass on Postmodernism for rationalists · 2017-10-19T15:56:27.654Z · score: 10 (3 votes) · LW · GW

Conflict vs the Author: The novel White Noise by Don DeLillo has a humanities professor as a protagonist who likes to talk about reducing the number of plotlines in his life. Whenever something interesting happens to him he avoids it, doesn't investigate, tries to keep his life bland. There's an interplay between the book trying to present a story about a character and that character taking actions to minimize how narratively interesting his life is.

Comment by scarcegreengrass on 10/19/2017: Development Update (new vote backend, revamped user pages and advanced editor) · 2017-10-19T15:50:50.734Z · score: 8 (2 votes) · LW · GW

Thanks a lot for the new blog-specific header! A requested and appreciated feature!

Comment by scarcegreengrass on Alpha Go Zero comments · 2017-10-19T15:46:05.657Z · score: 5 (4 votes) · LW · GW

Presumably finding profitable new technology is a sufficient motive.

Comment by scarcegreengrass on Alpha Go Zero comments · 2017-10-19T15:44:38.291Z · score: 6 (2 votes) · LW · GW

Yes, altho it is of course possible that the protein folding search space has a low maximal speedup from software, and could turn out to be hardware bottlenecked.

Comment by scarcegreengrass on Defense against discourse · 2017-10-17T18:08:31.158Z · score: 6 (2 votes) · LW · GW

Very depressing!

I agree that it seems reasonable to expect some people to be blinded by distrust. That's a good point.

Reading O'Neil's article, i like the quadrant model more than i expected. That seems like a useful increase in resolution. However, i disagree about which demographics fall into which quadrant. Even if we limit our scope to the USA, i'm sure many women and people of color are worried about machines displacing humanity (Q2).

I think there is plenty of software in the world that encodes racist or otherwise unfair policies (as in the Q4 paragraphs), and the fact that this discrimination is sometimes concealed by the term 'AI' is a serious issue. But i think this problem deserves a more rigorous defense than this O'Neil article.

Comment by scarcegreengrass on Contra double crux · 2017-10-08T21:43:05.068Z · score: 6 (2 votes) · LW · GW

This is a good phrasing of my opinion also. I don't think this is an issue of resource scarcity.

Comment by scarcegreengrass on Why hive minds might work · 2017-10-07T02:15:07.186Z · score: 2 (1 votes) · LW · GW

I share your concern that users are not yet able to distinguish between blog posts and frontpage posts. I'm not sure how to tell either, aside from going to your blog and seeing if i can spot it there.

Comment by scarcegreengrass on The Anthropic Principle: Five Short Examples · 2017-10-05T22:09:41.050Z · score: 4 (2 votes) · LW · GW

Your 'if' statements made me update. I guess there is also a distinction between what conclusions one can draw from this type of anthropic reasoning.

One (maybe naive?) conclusion is that 'the anthropic principle is protecting us'. If you think the anthropic principle is relevant, then you continue to expect it to allow you to evade extinction.

The other conclusion is that 'the anthropic perspective is relevant to our past but not our future'. You consider anthropics to be a source of distortion on the historical record, but not a guide to what will happen next. Under this interpretation you would anticipate extinction of [humans / you / other reference class] to be more likely in the future than in the past.

I suspect this split depends on whether you weight your future timelines by how many observers are in them, etc.

Comment by scarcegreengrass on Beta - First Impressions · 2017-10-05T21:06:20.720Z · score: 5 (2 votes) · LW · GW

When i subscribe to a user, what happens? Does that affect the magical sorting algorithm or what?

Comment by scarcegreengrass on Infant Mortality and the Argument from Life History · 2017-10-05T20:50:19.857Z · score: 6 (3 votes) · LW · GW

Compelling arguments. I'm updating about how complex this topic is.

I also think that the 'zero line' we intuitively use to divide negative experience from positive experience is a little bit arbitrary. In planetary science, the sea level of a planet may vary over time, or the planet might have no seas. Because of this, scientists chose an arbitrary height ('datum') and consider that to be the geological zero altitude. I suspect that some disagreements about wild animal suffering might stem from people using different 'zero altitudes' for animal suffering. Some people think of animals as happy most of the time and some people think of animals as hungry and stressed most of the time.

Comment by scarcegreengrass on Notes From an Apocalypse · 2017-09-22T17:22:09.123Z · score: 4 (3 votes) · LW · GW

Great essay. I was totally unfamiliar with the idea that ~50% of modern animal phyla appeared at the same time.

Comment by scarcegreengrass on LW 2.0 Strategic Overview · 2017-09-19T16:05:08.527Z · score: 1 (1 votes) · LW · GW

This is a real dynamic that is worth attention. I particularly agree with removing HPMoR from the top of the front page.

Counterpoint: The serious/academic niche can also be filled by external sites, like and

Comment by scarcegreengrass on 2017 LessWrong Survey · 2017-09-19T15:59:44.391Z · score: 9 (9 votes) · LW · GW

I took the survey. It's probably my favorite survey of each year :) Thanks.

Comment by scarcegreengrass on New business opportunities due to self-driving cars · 2017-09-12T17:37:03.956Z · score: 0 (0 votes) · LW · GW

I agree, although i do not dislike Lumifer's comments in general, just the overly negative ones.

Comment by scarcegreengrass on New business opportunities due to self-driving cars · 2017-09-12T17:29:07.111Z · score: 0 (0 votes) · LW · GW

Something like this sounds plausible ... or at least, it's very similar to existing pickup laundry companies.

Comment by scarcegreengrass on New business opportunities due to self-driving cars · 2017-09-12T17:26:30.589Z · score: 0 (0 votes) · LW · GW

Maybe it only works in places with very straight freeways, like deserts :P

Comment by scarcegreengrass on Nasas ambitious plan to save earth from a supervolcano · 2017-08-29T17:32:52.262Z · score: 0 (0 votes) · LW · GW

Just because the magnitude of the bad outcome is enormous. Caution seems prudent for such a slow, dangerous process.

Comment by scarcegreengrass on Like-Minded Forums · 2017-08-29T17:28:01.290Z · score: 0 (0 votes) · LW · GW

is an interesting community that makes predictions about future events. The stakes are points, not money.

Comment by scarcegreengrass on Nasas ambitious plan to save earth from a supervolcano · 2017-08-24T15:46:32.700Z · score: 0 (0 votes) · LW · GW

This is assuming the danger from unknown unknowns is large. Speaking as a non-expert, i would guess that it is.

Comment by scarcegreengrass on Nasas ambitious plan to save earth from a supervolcano · 2017-08-24T15:45:18.283Z · score: 0 (0 votes) · LW · GW

Here is a variant idea (not sure if feasible): Set up an organization with the mission statement of building this geothermal plant by 2150 CE. It can start very low-staff, invest some money, and invest in relevant research. Then, after spending 100 or so years investigating the risks, it can start digging.

Motivation: We can spare 100 years when it comes to geology. We could approach this with next century's science and technology.

Comment by scarcegreengrass on Bi-Weekly Rational Feed · 2017-07-25T18:28:43.623Z · score: 0 (0 votes) · LW · GW

Thanks again for collecting these.

Comment by scarcegreengrass on Steelmanning as an alternative to Rationalist Taboo · 2017-07-25T16:50:03.823Z · score: 0 (0 votes) · LW · GW

I might update on this, thanks.

Comment by scarcegreengrass on Sleeping Beauty Problem Can Be Explained by Perspective Disagreement (II) · 2017-07-25T15:20:29.632Z · score: 0 (0 votes) · LW · GW

I'm not an expert, so i might be misunderstanding, but let me try to come up with a rebuttal.

'Obviously' is a strong word here. I think "I am currently awake rather than asleep. So there are likely a lot of red rooms" is pretty intuitive under these rules. After all, red rooms cause wakefulness.

Here's how i look at it: Imagine there is a city where all the hotels have 81 rooms (and exactly the same prices). Some hotels are almost full and some are almost empty. A travel agency books you a random room, distributed such that you are equally likely to be assigned any vacant room in the city. You are more likely to be assigned a room in one of the almost-empty hotels than in one of the almost-full hotels.
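The hotel arithmetic can be checked with a short calculation. The vacancy counts below are assumptions chosen for illustration (one hotel with 80 of its 81 rooms vacant, one with a single vacant room), and `hotel_probabilities` is a hypothetical helper name.

```python
from fractions import Fraction
from typing import Dict


def hotel_probabilities(vacancies: Dict[str, int]) -> Dict[str, Fraction]:
    """Chance of landing in each hotel when every vacant room is equally likely."""
    total_vacant = sum(vacancies.values())
    return {name: Fraction(count, total_vacant) for name, count in vacancies.items()}


# Assumed example city: two 81-room hotels.
city = {"almost_empty": 80, "almost_full": 1}
probs = hotel_probabilities(city)
```

With these assumed numbers, `probs["almost_empty"]` is 80/81 and `probs["almost_full"]` is 1/81, so finding yourself booked into a room is strong evidence that you are in a mostly vacant hotel.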

(The statement about existing people is more complicated.)

Comment by scarcegreengrass on Mini map of s-risks · 2017-07-10T01:08:56.182Z · score: 0 (0 votes) · LW · GW

I also don't understand that universe replication scenario. Maybe if there were a type of black-hole-like object that both created many universes and destroyed all stars in their vicinity (generally destroying their creators).

Comment by scarcegreengrass on Red Teaming Climate Change Research - Should someone be red-teaming Rationality/EA too? · 2017-07-07T14:25:00.349Z · score: 0 (0 votes) · LW · GW

Personally i think people often do indeed write rigorous criticisms of various points of rationality and EA consensus. It's not an under-debated topic. Maybe some of the very deep assumptions are less debated, eg some of the basic assumptions of humanism. But i think that's just because no one finds them faulty.

Comment by scarcegreengrass on Self-modification as a game theory problem · 2017-06-28T17:50:00.327Z · score: 1 (1 votes) · LW · GW

Note that source code can't be faked in the self modification case. Software agent A can set up a test environment (a virtual machine or simulated universe), create new agent B inside that, and then A has a very detailed and accurate view of B's innards.

However, logical uncertainty is still an obstacle, especially with agents not verified by theorem-proving.

Comment by scarcegreengrass on Putanumonit: What statistical power means, and why I'm terrified about psychology · 2017-06-22T21:35:29.724Z · score: 0 (0 votes) · LW · GW

That's some good snark about the six-pointed cross.

Comment by scarcegreengrass on Invitation to comment on a draft on multiverse-wide cooperation via alternatives to causal decision theory (FDT/UDT/EDT/...) · 2017-06-22T21:02:30.909Z · score: 0 (0 votes) · LW · GW

So, i'm trying to wrap my head around this concept. Let me sketch an example:

Far-future humans have a project where they create millions of organisms that they think could plausibly exist in other universes. They prioritize organisms that might have evolved given whatever astronomical conditions are thought to exist in other universes, and organisms that could plausibly base their behavior on moral philosophy and game theory. They also create intelligent machines, software agents, or anything else that could be common in the multiverse. They make custom habitats for each of these species and instantiate a small population in each one. The humans do this via synthetic biology, actual evolution from scratch (if affordable), or simulation. Each habitat is optimized to be an excellent environment to live in from the perspective of the species or agent inside it. This whole project costs a small fraction of the available resources of the human economy. The game theoretic motive is that, by doing something good for a hypothetical species, there might exist an inaccessible universe in which that species is both living and able to surmise that the humans have done this, and that they will by luck create a small utopia of humans when they do their counterpart project.

Is this an example of the type of cooperation discussed here?

Comment by scarcegreengrass on Invitation to comment on a draft on multiverse-wide cooperation via alternatives to causal decision theory (FDT/UDT/EDT/...) · 2017-06-07T17:19:23.467Z · score: 1 (1 votes) · LW · GW

Or alternately, one could indeed spend time on it, but be careful to remain aware of one's uncertainty.