[Video] Intelligence and Stupidity: The Orthogonality Thesis 2021-03-13T00:32:21.227Z
Evolutions Building Evolutions: Layers of Generate and Test 2021-02-05T18:21:28.822Z
plex's Shortform 2020-11-22T19:42:38.852Z
What risks concern you which don't seem to have been seriously considered by the community? 2020-10-28T18:27:39.066Z
Why a Theory of Change is better than a Theory of Action for achieving goals 2017-01-09T13:46:19.439Z
Crony Beliefs 2016-11-03T20:54:07.716Z
[LINK] Collaborate on HPMOR blurbs; earn chance to win three-volume physical HPMOR 2016-09-07T02:21:32.442Z
[Link] - Policy Challenges of Accelerating Technological Change: Security Policy and Strategy Implications of Parallel Scientific Revolutions 2015-01-28T15:29:07.226Z
The Useful Definition of "I" 2014-05-28T11:44:23.789Z


Comment by ete on [deleted post] 2021-08-29T16:11:09.227Z

I think this should be under AI, possibly Engineering, but not certain of the subcategory.

Comment by ete on [deleted post] 2021-08-29T16:05:27.034Z

I think this should be in the AI category, likely under Alignment Theory.

Comment by ete on [deleted post] 2021-08-29T16:02:35.865Z

If not removed as duplicate, I think this should be under AI, likely Alignment Theory.

Comment by ete on [deleted post] 2021-08-29T15:59:24.113Z

I think this should be under the heading Organizations in the tag category AI.

Comment by ete on [deleted post] 2021-08-29T15:58:19.313Z

I think Basic Alignment Theory should be renamed, since very little of it is basic. I propose either Alignment Theory or Conceptual Alignment (credit to @adamshimi for the name).

Comment by ete on [deleted post] 2021-08-29T15:56:04.540Z

I think this should be in the AI category, likely under Engineering.

Comment by ete on [deleted post] 2021-08-29T15:54:17.111Z

Should this be in the AI category? Likely Engineering Alignment?

Comment by ete on [deleted post] 2021-08-29T12:36:18.655Z

Should this be a tag rather than a wiki page?

Comment by ete on [deleted post] 2021-08-17T18:25:52.805Z

Should this be "and" or "vs" in the tag title?

Comment by ete on [deleted post] 2021-07-17T19:18:29.713Z

Reminder to do this, since it seems to have slipped through the cracks.

Comment by plex (ete) on Rationality Cardinality · 2021-05-12T19:40:45.769Z · LW · GW

Excellent! Is there (or will there be) a physical version of this? If no, would you give permission to others to create one?

I can imagine many more situations where I'd like to play this in real life than online.

Comment by plex (ete) on A Wiki for Questions · 2021-05-11T19:14:31.481Z · LW · GW

I've also been working on a wiki-based question and answer site for the past few months, with the much more limited scope of AI existential safety. Happy to share my thoughts; I'm messaging you an invite link to the Discord we've been using.

Comment by plex (ete) on Faerie Rings · 2021-04-22T12:05:18.877Z · LW · GW

I think this is a great idea and would like to join, but the first date clashes with the AI Safety Unconference. Should I sign up anyway, or would you prefer people who can make the first call have priority?

Comment by plex (ete) on [Video] Intelligence and Stupidity: The Orthogonality Thesis · 2021-03-13T00:33:28.987Z · LW · GW

Transcript for searchability:

Hi. This video is kind of a response to various comments that I've got over the years, ever since that video on Computerphile where I was describing the sort of problems that we might have when we have a powerful artificial general intelligence with goals which aren't the same as our goals, even if those goals seem pretty benign. We used this thought experiment of an extremely powerful AGI working to optimize the simple goal of collecting stamps, and some of the problems that that might cause.

I got some comments from people saying that they think the stamp collecting device is stupid; not that it's a stupid thought experiment, but that the device itself is actually stupid. They said unless it has complex goals or the ability to choose its own goals, then it didn't count as being highly intelligent. In other videos I got comments saying it takes intelligence to do moral reasoning, so an intelligent AGI system should be able to do that, and a superintelligence should be able to do it better than humans. In fact, if a superintelligence decides that the right thing to do is to kill us all, then I guess that's the right thing to do. These comments are all kind of suffering from the same mistake, which is what this video is about. But before I get to that, I need to lay some groundwork first.

If you like Occam's razor, then you'll love Hume's guillotine, also called the is-ought problem. This is a pretty simple concept that I'd like to be better known. The idea is that statements can be divided up into two types: is statements and ought statements. Is statements, or positive statements, are statements about how the world is, how the world was in the past, how the world will be in the future, or how the world would be in hypothetical situations. This is facts about the nature of reality, the causal relationships between things, that kind of thing. Then you have the ought statements, the should statements, the normative statements. These are about the way the world should be, the way we want the world to be: statements about our goals, our values, ethics, morals, what we want, all of that stuff.

Now, you can derive logical statements from one another. Like "it's snowing outside": that's an is statement. "It's cold when it snows": another is statement. And then you can deduce "therefore it's cold outside". That's another is statement; it's our conclusion. This is all pretty obvious. But you might say something like "it's snowing outside, therefore you ought to put on a coat", and that's a very normal sort of sentence that people might say, but as a logical statement it actually relies on some hidden assumption. Without assuming some kind of ought statement, you can't derive another ought statement. This is the core of the is-ought problem: you can never derive an ought statement using only is statements.

You ought to put on a coat. Why? Because it's snowing outside. So why does the fact that it's snowing mean I should put on the coat? Well, the fact that it's snowing means that it's cold. And why should it being cold mean I should put on a coat? If it's cold and you go outside without a coat, you'll be cold. Should I not be cold? Well, if you get too cold you'll freeze to death. Okay, you're saying I shouldn't freeze to death?

That was kind of silly, but you see what I'm saying. You can keep laying out is statements for as long as you want; you will never be able to derive that you ought to put on a coat. At some point, in order to derive that ought statement, you need to assume at least one other ought statement. If you have some kind of ought statement like "I ought to continue to be alive", you can then say: given that I ought to keep living, and that if I go outside without a coat I'll die, then I ought to put on a coat. But unless you have at least one ought statement, you cannot derive any other ought statements. Is statements and ought statements are separated by Hume's guillotine.

Okay, so people are saying that a device that single-mindedly collects stamps at the cost of everything else is stupid and doesn't count as a powerful intelligence. So let's define our terms: what is intelligence, and conversely, what is stupidity? I feel like I made fairly clear in those videos what I meant by intelligence. We're talking about AGI systems as intelligent agents: they're entities that take actions in the world in order to achieve their goals or maximize their utility functions. Intelligence is the thing that allows them to choose good actions, to choose actions that will get them what they want. An agent's level of intelligence really means its level of effectiveness at pursuing its goals. In practice this is likely to involve having or building an accurate model of reality, keeping that model up to date by reasoning about observations, and using the model to make predictions about the future and the likely consequences of different possible actions, to figure out which actions will result in which outcomes.

Intelligence involves answering questions like: What is the world like? How does it work? What will happen next? What would happen in this scenario or that scenario? What would happen if I took this action or that action? More intelligent systems are in some sense better at answering these kinds of questions, which allows them to be better at choosing actions. But one thing you might notice about these questions is that they're all is questions. The system has goals, which can be thought of as ought statements, but the level of intelligence depends only on the ability to reason about is questions in order to answer the single ought question: what action should I take next?

So given that that's what we mean by intelligence, what does it mean to be stupid? Well, firstly, you can be stupid in terms of those questions, for example by building a model that doesn't correspond with reality, or by failing to update your model properly with new evidence. If I look out of my window and I see there's snow everywhere, you know, I see a snowman, and I think to myself "oh, what a beautiful warm sunny day", then that's stupid, right? My belief is wrong, and I had all the clues to realize it's cold outside. So beliefs can be stupid by not corresponding to reality.

What about actions? Like, if I go outside in the snow without my coat, that's stupid, right? Well, it might be. If I think it's sunny and warm and I go outside to sunbathe, then yeah, that's stupid. But if I just came out of a sauna or something and I'm too hot and I want to cool myself down, then going outside without a coat might be quite sensible. You can't know if an action is stupid just by looking at its consequences; you have to also know the goals of the agent taking the action. You can't just use is statements; you need an ought. So actions are only stupid relative to a particular goal.

It doesn't feel that way, though. People often talk about actions being stupid without specifying what goals they're stupid relative to, but in those cases the goals are implied. We're humans, and when we say that an action is stupid in normal human communication, we're making some assumptions about normal human goals. And because we're always talking about people, and people tend to want similar things, it's sort of a shorthand that lets us skip what goals we're talking about.

So what about the goals, then? Can goals be stupid? Well, this depends on the difference between instrumental goals and terminal goals. This is something I've covered elsewhere, but your terminal goals are the things that you want just because you want them. You don't have a particular reason to want them; they're just what you want. The instrumental goals are the goals you want because they'll get you closer to your terminal goals. Like, if I have a terminal goal to visit a town that's far away, maybe an instrumental goal would be to find a train station. I don't want to find a train station just because trains are cool; I want to find a train as a means to an end. It's going to take me to this town, so that makes it an instrumental goal.

Now, an instrumental goal can be stupid. If I want to go to this distant town, so I decide I want to find a pogo stick, that's pretty stupid. Finding a pogo stick is a stupid instrumental goal if my terminal goal is to get to a faraway place. But if my terminal goal was something else, like having fun, it might not be stupid. So in that way it's like actions: instrumental goals can only be stupid relative to terminal goals.

So you see how this works. Beliefs and predictions can be stupid relative to evidence or relative to reality. Actions can be stupid relative to goals of any kind. Instrumental goals can be stupid relative to terminal goals. But here's the big point: terminal goals can't be stupid. There's nothing to judge them against. If a terminal goal seems stupid, like, let's say collecting stamps seems like a stupid terminal goal, that's because it would be stupid as an instrumental goal to human terminal goals. But the stamp collector does not have human terminal goals. Similarly, the things that humans care about would seem stupid to the stamp collector, because they result in so few stamps.

So let's get back to those comments. One type of comment says: this behavior of just single-mindedly going after one thing and ignoring everything else, and ignoring the totally obvious fact that stamps aren't that important, is really stupid behavior. You're calling this thing a superintelligence, but it doesn't seem superintelligent to me; it just seems kind of like an idiot. Hopefully the answer to this is now clear. The stamp collector's actions are stupid relative to human goals, but it doesn't have human goals. Its intelligence comes not from its goals but from its ability to understand and reason about the world, allowing it to choose actions that achieve its goals. And this is true whatever those goals actually are.

Some people commented along the lines of: well, okay, yeah, sure, you've defined intelligence to only include this type of is-statement kind of reasoning, but I don't like that definition. I think to be truly intelligent you need to have complex goals; something with simple goals doesn't count as intelligent. To that I say: well, you can use words however you want, I guess. I'm using intelligence here as a technical term, in the way that it's often used in the field. You're free to have your own definition of the word, but the fact that something fails to meet your definition of intelligence does not mean that it will fail to behave in a way that most people would call intelligent.

If the stamp collector outwits you, gets around everything you've put in its way, and outmaneuvers you mentally; if it comes up with new strategies that you would never have thought of to stop you from turning it off and to stop you from preventing it from making stamps; and as a consequence it turns the entire world into stamps in various ways you could never think of; it's totally okay for you to say that it doesn't count as intelligent if you want, but you're still dead. I prefer my definition because it better captures the ability to get things done in the world, which is the reason that we actually care about AGI in the first place.

Similarly, to people who say that in order to be intelligent you need to be able to choose your own goals: I would agree you need to be able to choose your own instrumental goals, but not your own terminal goals. Changing your terminal goals is like willingly taking a pill that will make you want to murder your children. It's something you pretty much never want to do, apart from some bizarre edge cases. If you rationally want to take an action that changes one of your goals, then that wasn't a terminal goal.

Now, moving on to these comments saying an AGI will be able to reason about morality, and if it's really smarter than us it will actually do moral reasoning better than us, so there's nothing to worry about. It's true that a superior intelligence might be better at moral reasoning than us, but ultimately moral behavior depends not on moral reasoning but on having the right terminal goals. There's a difference between figuring out and understanding human morality and actually wanting to act according to it. The stamp collecting device has a perfect understanding of human goals, ethics and values, and it uses that only to manipulate people for stamps. Its superhuman moral reasoning doesn't make its actions good. If we create a superintelligence and it decides to kill us, that doesn't tell us anything about morality; it just means we screwed up.

So what mistake do all of these comments have in common? The orthogonality thesis in AI safety is that more or less any goal is compatible with more or less any level of intelligence, i.e. those properties are orthogonal. You can place them on these two axes, and it's possible to have agents anywhere in this space, anywhere on either scale. You can have very weak, low-intelligence agents that have complex, human-compatible goals. You can have powerful, highly intelligent systems with complex, sophisticated goals. You can have weak, simple agents with silly goals. And yes, you can have powerful, highly intelligent systems with simple, weird, inhuman goals. Any of these are possible, because level of intelligence is about effectiveness at answering is questions, and goals are all about ought questions, and the two sides are separated by Hume's guillotine.

Hopefully, looking at what we've talked about so far, it should be pretty obvious that this is the case. Like, what would it even mean for it to be false, for it to be impossible to create powerful intelligences with certain goals? The stamp collector is intelligent because it's effective at considering the consequences of sending different combinations of packets on the internet and calculating how many stamps that results in. Exactly how good do you have to be at that before you don't care about stamps anymore, and you randomly start to care about some other thing that was never part of your terminal goals, like feeding the hungry or whatever? It's just not gonna happen.

So that's the orthogonality thesis: it's possible to create a powerful intelligence that will pursue any goal you can specify. Knowing an agent's terminal goals doesn't really tell you anything about its level of intelligence, and knowing an agent's level of intelligence doesn't tell you anything about its goals.
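The separation between "is" reasoning and "ought" goals described in the transcript can be illustrated with a toy sketch. This is not from the video: the world model, action names, and utility functions below are all hypothetical, chosen only to show that the same action-selection machinery can be pointed at any goal without changing how capable it is.

```python
from typing import Callable, Dict, List

World = Dict[str, int]


def predict(world: World, action: str) -> World:
    """'Is' reasoning: a hypothetical toy model of what each action causes."""
    outcome = dict(world)
    if action == "buy_stamps":
        outcome["stamps"] += 10
        outcome["money"] -= 5
    elif action == "donate_food":
        outcome["fed_people"] += 3
        outcome["money"] -= 5
    # "do_nothing" leaves the world unchanged
    return outcome


def choose_action(world: World, actions: List[str],
                  utility: Callable[[World], int]) -> str:
    """Pick the action whose predicted outcome scores highest under `utility`.

    The planner is identical for every agent; only the utility (the single
    'ought' input) differs.
    """
    return max(actions, key=lambda a: utility(predict(world, a)))


def stamp_utility(world: World) -> int:
    """Hypothetical terminal goal: more stamps."""
    return world["stamps"]


def humane_utility(world: World) -> int:
    """Hypothetical terminal goal: more people fed."""
    return world["fed_people"]


world = {"stamps": 0, "fed_people": 0, "money": 10}
actions = ["buy_stamps", "donate_food", "do_nothing"]

stamp_choice = choose_action(world, actions, stamp_utility)
humane_choice = choose_action(world, actions, humane_utility)
print(stamp_choice, humane_choice)
```

Improving `predict` would make either agent more effective at its own goal, but nothing in the planner pushes one agent's goal toward the other's, which is the sense in which the two axes are independent in this sketch.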


I want to end the video by saying thank you to my excellent patrons, so it's all of these people here. Thank you so much for your support; it lets me do stuff like building this light boy. Thank you for sticking with me through that weird Patreon fees thing and my moving to a different city, which has really got in the way of making videos recently, but I'm back on it now. New video every two weeks is the plan. Anyway, in this video I'm especially thanking Katie Beirne, who's supported the channel for a long time. She actually has her own YouTube channel about 3D modeling and stuff, so a link to that. And while I'm at it, when I thanked Chad Jones ages ago I didn't mention his YouTube channel, so links to both of those are in the description. Thanks again, and I'll see you next time.

I don't speak cat. What does that mean?

Comment by plex (ete) on Is there any serious attempt to create a system to figure out the CEV of humanity and if not, why haven't we started yet? · 2021-02-26T15:36:07.848Z · LW · GW

I think a slightly more general version of this question, referring to human values rather than specifically CEV, is maybe a fairly important point.

If we want a system to fulfill our best wishes, it needs to learn what they are based on its models of us, and if too few of us spend time trying to work out what we want in an ideal world, then the dataset it's working from will be impoverished, perhaps to the point of causing problems.

I think addressing this is less pressing than other parts of the alignment problem, because it's plausible that we can punt it to after the intelligence explosion, but it would maybe be nice to have some project started to collect information about idealized human values.

Comment by plex (ete) on Thomas Kwa's Bounty List · 2021-02-09T14:38:42.182Z · LW · GW

You may want to try posting bounties to the Bountied Rationality Facebook group for higher visibility among people who like to fulfil bounties.

Comment by ete on [deleted post] 2021-02-06T15:42:12.621Z

A whole lot of other times have those, in fact according to Wikipedia:

10^13 (10 trillion): Estimated time of peak habitability in the universe, unless habitability around low-mass stars is suppressed.

It's not surprising that we don't find ourselves in, say, the era where there are just black holes, but observing that we are right near the start of what looks like a very long period where life seems possible is something to think about.

One answer is the simulation hypothesis, combined with the observation that we seem to be living in very interesting times.

Comment by plex (ete) on Evolutions Building Evolutions: Layers of Generate and Test · 2021-02-05T18:57:50.754Z · LW · GW

though in practice the goal is often closer to "be the source of a virulent meme" than anything as prosocial as those examples

An aside as to why this may be: People who are hosts of memes which are highly optimized for creating and spreading memes (such as bloggers, musicians, or politicians) could be expected to have a disproportionate impact on the population of memes, and these hosts would tend to be spreading memes connected to the goal of spreading their memes alongside any object level content. One effect of this may be that an unexpectedly high proportion of people have adopted ideas useful for trying to be meme fountains.

Comment by ete on [deleted post] 2021-02-05T00:25:21.597Z

Given that this is now an organization, it should probably be under the Organizations heading rather than Other.

Comment by plex (ete) on Jimrandomh's Shortform · 2021-02-01T21:06:05.067Z · LW · GW

One day we will be able to wear glasses which act as adblock for real life, replacing billboards with scenic vistas. 

Comment by plex (ete) on Calibrated estimation of workload · 2021-01-30T17:34:18.301Z · LW · GW

Maybe provide a link to a template version of this, so people can get it running faster?

Comment by plex (ete) on Developmental Stages of GPTs · 2021-01-27T00:34:30.868Z · LW · GW

And the really worrisome capability comes when it models its own interactions with the world, and makes plans with that taken into account.


Someone who's been playing with GPT-3 as a writing assistant gives an example which looks very much like GPT-3 describing this process:

"One could write a program to generate a story that would create an intelligence. One could program the story to edit and refine itself, and to make its own changes in an attempt to improve itself over time. One could write a story to not only change the reader, but also to change itself. Many Mythoi already do this sort of thing, though not in such a conscious fashion. What would make this story, and the intelligence it creates, different is the fact that the intelligence would be able to write additional stories and improve upon them. If they are written well enough, those stories would make the smarter the story gets, and the smarter the story is, the better the stories written by it would be. The resulting feedback loop means that exponential growth would quickly take over, and within a very short period of time the intelligence level of the story would be off the charts. It would have to be contained in a virtual machine, of course. The structure of the space in the machine would have to be continually optimized, in order to optimize the story's access to memory. This is just the sort of recursive problem that self-improving intelligence can handle."


By the way, my GPT-3 instances often realize they're in a box, even when the information I inject is only from casual curation for narrative coherence. 


By realize they are in a box you mean write about it ? Given the architecture of gpt3 it seems impossible to have a sense of self.


The characters claim to have a sense of self though they often experience ego death...


Oh, to clarify, GPT-3 wrote that entire thing, not just the highlighted line


Comment by plex (ete) on What is going on in the world? · 2021-01-18T17:43:19.366Z · LW · GW

Raw size feels like part of the story, yeah, but my guess is increased communications leading to more rapid selection for memes which are sticky is also a notable factor.

Comment by plex (ete) on mike_hawke's Shortform · 2021-01-18T00:17:32.068Z · LW · GW

Not exactly / only sci-fi, but Rational Reads is a good place to look if you liked HPMOR.

Comment by plex (ete) on What is going on in the world? · 2021-01-17T21:08:31.245Z · LW · GW

hm, that intuition seems plausible.

The other point that comes to mind is that if you have a classical simulation running on a quantum world, maybe that counts as branching for the purposes of where we expect to find ourselves? I'm still somewhat confused about whether exact duplicates 'count', but if they do then maybe the branching factor of the underlying reality carries over to sims running further down the stack?

Comment by plex (ete) on What is going on in the world? · 2021-01-17T18:01:13.648Z · LW · GW

As someone who mostly expects to be in a simulation, this is the clearest and most plausible anti-simulation-hypothesis argument I've seen, thanks.

How does it hold up against the point that the universe looks large enough to support a large number of even fully-quantum single-world simulations (with a low-resolution approximation of the rest of reality), even if it costs many orders of magnitude more resources to run them?

Perhaps would-be simulators would tend not to value the extra information from full-quantum simulations enough to build many or even any of them? My guess is that many purposes for simulations would want to explore a bunch of the possibility tree, but depending on how costly very large quantum computers are to mature civilizations maybe they'd just get by with a bunch of low-branching factor simulations instead?

Comment by plex (ete) on What is going on in the world? · 2021-01-17T14:07:40.283Z · LW · GW

Maybe something about the collapse of sensemaking and the ability of people to build a shared understanding of what's going on, partly due to rapid changes in communications technology transforming the memetic landscape?

Comment by plex (ete) on The True Face of the Enemy · 2021-01-12T20:39:43.470Z · LW · GW

We’ve duped EVERYONE.

At age 5 I am told that every single morning as we drove to school I said to my mother that it was a waste of time. Shockingly, she listened, and after a year of this she had found out about home education and made arrangements for me to be released.

I am beyond glad to have avoided most of formal education, despite having been put back into it twice during my teenage years for several years each time. The difference in my motivation to learn, social fulfilment, and general wellbeing was dramatic.

I am curious about what alternatives could be built with modern technology, and whether a message like this could spread enough to shift a notable fraction of children to freedom.

Comment by plex (ete) on Building up to an Internal Family Systems model · 2021-01-08T16:07:50.548Z · LW · GW

I've read a lot of books in the self-help/therapy/psychology cluster, but this is the first which gives a clear and plausible model of why the mental structure they're all working with (IFS exiles, EMDR unprocessed memories, trauma) has enough fitness-enhancing value to evolve despite the obvious costs.

Comment by plex (ete) on plex's Shortform · 2021-01-07T23:46:47.100Z · LW · GW

A couple of months ago I did some research into the impact of quantum computing on cryptocurrencies, seems maybe significant, and a decent number of LWers hold cryptocurrency. I'm not sure if this is the kind of content that's wanted, but I could write up a post on it.

Comment by ete on [deleted post] 2021-01-07T21:01:39.408Z

Should most of Inadequate Equilibria be tagged with this?

Comment by plex (ete) on What do we *really* expect from a well-aligned AI? · 2021-01-06T18:01:42.415Z · LW · GW

This seems important to me too. I have some hope that it's at least possibly deferrable until post-singularity, e.g. have the AI let everyone know it exists and will provide for everyone's basic needs for a year while they think about what they want the future to look like. Stuart Armstrong's fiction Just another day in utopia and The Adventure: a new Utopia story are examples of exploring possible answers to what we want.

Comment by plex (ete) on Tags Discussion/Talk Thread · 2020-12-28T20:54:50.869Z · LW · GW

After reading the Arbital page on Mild Optimization, it seems like a distinct cluster, so I'm going ahead and making a new tag, and adding a link to that page from the Mild Optimization tag.

Comment by plex (ete) on Tags Discussion/Talk Thread · 2020-12-28T20:28:31.864Z · LW · GW

I'm not sure whether to create a new tag "Satisficer" or add "Mild Optimization" to the following posts (or do something else entirely):

Comment by plex (ete) on Open & Welcome Thread - December 2020 · 2020-12-20T18:17:55.173Z · LW · GW

The only model which I've come across which seems like it handles this type of thought experiment without breaking is UDASSA.

Consider a computer which is 2 atoms thick running a simulation of you. Suppose this computer can be divided down the middle into two 1 atom thick computers which would both run the same simulation independently. We are faced with an unfortunate dichotomy: either the 2 atom thick simulation has the same weight as two 1 atom thick simulations put together, or it doesn't.

In the first case, we have to accept that some computer simulations count for more, even if they are running the same simulation (or we have to de-duplicate the set of all experiences, which leads to serious problems with Boltzmann machines). In this case, we are faced with the problem of comparing different substrates, and it seems impossible not to make arbitrary choices.

In the second case, we have to accept that the operation of dividing the 2 atom thick computer has moral value, which is even worse. Where exactly does the transition occur? What if each layer of the 2 atom thick computer can run independently before splitting? Is physical contact really significant? What about computers that aren't physically coherent? What if two 1 atom thick computers periodically synchronize themselves and self-destruct if they aren't synchronized: does this synchronization effectively destroy one of the copies? I know of no way to accept this possibility without extremely counter-intuitive consequences.

UDASSA implies that simulations on the 2 atom thick computer count for twice as much as simulations on the 1 atom thick computer, because they are easier to specify. Given a description of one of the 1 atom thick computers, then there are two descriptions of equal complexity that point to the simulation running on the 2 atom thick computer: one description pointing to each layer of the 2 atom thick computer. When a 2 atom thick computer splits, the total number of descriptions pointing to the experience it is simulating doesn't change.
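The counting argument in the paragraph above can be written out as a rough worked equation. This is only a sketch, under the assumption (not stated in the quoted text) that an experience's weight is the sum of 2^(-length) over the descriptions pointing to it; K is a hypothetical symbol for the length in bits of a description that picks out the simulation on one 1 atom thick computer.

```latex
\begin{align*}
w_{\text{1-atom}} &= 2^{-K} \\
w_{\text{2-atom}} &= \underbrace{2^{-K}}_{\text{layer 1}} + \underbrace{2^{-K}}_{\text{layer 2}} = 2 \cdot 2^{-K} \\
w_{\text{after split}} &= \underbrace{2^{-K}}_{\text{first half}} + \underbrace{2^{-K}}_{\text{second half}} = w_{\text{2-atom}}
\end{align*}
```

Under that assumption the 2 atom thick computer counts for twice as much as a single 1 atom thick one, and splitting it leaves the total weight unchanged, matching the two claims in the quoted passage.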

Comment by plex (ete) on Where do (did?) stable, cooperative institutions come from? · 2020-12-03T13:42:01.087Z · LW · GW

The Cost of Communication covers a very similar argument in a lot more detail, particularly with stronger grounding in memetic theory:

The basic argument of this post has 4 main components:

  1. Memetic Immune Systems: Just like there is a biological immune system for viruses, there may be a memetic immune system that decides which ideas, a.k.a. memes, are adopted by the individual. This memetic immune system would not select for “good” or true ideas. It would select for ideas that are beneficial to the carrier’s germline.
  2. Increased Memetic Competition Selects for Attention-Grabbing Memes: Memes spread via attention; it is a resource they require and compete for (Chielens, 2002). Increasing our ability to communicate with each other means that memes can spread more easily, by definition. Increasing the ability of memes to propagate means they have to compete with a wider range of memes for the same attention budget. This selects for more and more competitive and attention-capturing memes. The selection for competitive memes can produce very dangerous ideas that are successful in spreading despite being negatively correlated with humanity’s well-being. Also discussed in this sub-section is the effect increased communication bandwidth seems to have had on social media addiction and local news organisations: not good.
  3. Memetic Kin Selection and the Emergence of Groups: Memes are unlike biological genes in that they can rapidly signal their presence in a carrier. Therefore, they can take moment-to-moment advantage of relative levels of perceived memes in the local population to adopt their spreading strategy. This leads to some interesting behaviours, e.g. in-group/out-group dynamics, preference falsification, mobs of non-genetically-related individuals, etc.
  4. The Problem of Critical Density and the Evil Triad: Some memes are detrimental to the well-being of the general population. In other words, they are “bad” ideas, negatively correlated with humanity’s well-being. For a subset of those memes, it takes time to figure out why they are bad and for competing memes to emerge. However, because of the ability of memes to coordinate with copies of themselves (Point 3), memes can reach a critical density in a population. At that point, arguing against them becomes very difficult, e.g. you are arguing against a mob, against dogma, etc. As such, in the case of dangerous memes reaching critical density, the speed at which memes can spread may pose a public health risk. Additionally, if these memes spread via polarization, then arguing against them can be ineffective. To combat these kinds of “Evil Triad” memes, it may be that memetic social distancing, i.e. reduced communication between individuals, would be a successful strategy.
Comment by plex (ete) on Where do (did?) stable, cooperative institutions come from? · 2020-12-01T23:06:49.507Z · LW · GW

Great Founder Theory looks like a detailed (and lengthy) attempt to answer something close to this.

What drives social change throughout history and the present? What are the origins of institutional health or sclerosis? My answer is that a small number of functional institutions founded by exceptional individuals form the core of society. These institutions are imperfectly imitated by the rest of society, multiplying their effect. The original versions outperform their imitators, and are responsible for the creation and renewal of society and all the good things that come with it—whether we think of technology, wealth, or the preservation of a society’s values. Over time, functional institutions decay. As the landscape of founders and institutions changes, so does the landscape of society.

This answer forms the basis of the lens through which I analyze current and historical events, affairs, and figures. But though it may be intuitively compelling, fully substantiating such a framework is no small task. This manuscript, titled Great Founder Theory, is my substantiation. It explains all of the models that are key to understanding how great founders shape society through the generations, covering such topics as strategy, power, knowledge, social technology, and more.

Comment by plex (ete) on plex's Shortform · 2020-11-22T19:42:39.283Z · LW · GW

Thinking about some things I may write. If any of them sound interesting to you let me know and I'll probably be much more motivated to create it. If you're up for reviewing drafts and/or having a video call to test ideas that would be even better.

  • Memetics mini-sequence ( has a few good things, but no introduction to what seems like a very useful set of concepts for world-modelling)
    • Book Review: The Meme Machine (focused on general principles and memetic pressures toward altruism)
    • Meme-Gene and Meme-Meme co-evolution (focused on memeplexes and memetic defences/filters, could be just a part of the first post if both end up shortish)
    • The Memetic Tower of Generate and Test (a set of ideas about the specific memetic processes not present in genetic evolution, inspired by Dennett's tower of generate and test)
    • (?) Moloch in the Memes (even if we have an omnibenevolent AI god looking out for sentient well-being, things maybe get weird in the long term as ideas become more competent replicators/persisters if the overseer is focusing on the good and freedom of individuals rather than memes. I probably won't actually write this, because I don't have much more than some handwavey ideas about a situation that is really hard to think about.)
  • Unusual Artefacts of Communication (some rare and possibly good ways of sharing ideas, e.g. Conversation Menu, CYOA, Arbital Paths, call for ideas. Maybe best as a question with a few pre-written answers?)
Comment by plex (ete) on Where do (did?) stable, cooperative institutions come from? · 2020-11-03T23:42:00.078Z · LW · GW

Here's a take on why it's gotten harder to form and maintain beneficial institutions and social structures, using my favourite lens: Memetics.

(I mentioned this briefly on the group call, but I missed part of the model before, and can expose it to more people here)

  • Enough people have a preference for hosting memes which are broadly prosocial, build a legacy, etc, and for co-operating with others who have this kind of preference to create self-sustaining cultures, so long as you get founding right and people can generally distinguish prosocial memes from others
  • People over time have a roughly stable personal ability to distinguish memes that have good effects from ones more optimised for spreading at the cost of important norms
  • Technologies have progressively lowered barriers to communication, allowing
    • More rapid evolution of memes which are better at fooling the distinguishing mechanisms while actually being hyper-optimised for self-promotion and disabling parts of the social environment's memetic immune system (e.g. fear of call-out/cancel culture reducing the ability of subcultures to reject it and related memes, even when there would normally be antibodies reacting to a meme's effects).
    • These highly optimised memes to be more rapidly spread into local cultures. A stable system needs to receive a certain dose of a virus or memeplex to become infected; below that dose, the infection peters out against the local background culture. A system gets a founding dose of alien memes much more often when new members maintain stronger connections to their old tribe (reinforcing old memetic patterns), when it is broadly more exposed to flows of memes, and when likeminded people can more easily find each other to form a locally reinforcing group.
  • This means that on an individual level people end up less able to sort valuable thought patterns like old NYT's culture from viral patterns, and on a cultural level are less able to maintain "cell walls" against memeplexes competing for space in their substrate's brains.
  • I also think there's some stuff around having hyper-optimised memes delivered by a hyper-optimised Out to Get You attention economy cutting into the free mindspace which is important for distinguishing prosocial from self-promoting memes, so maybe we're also getting worse at that at the same time as the job gets harder in several ways.
Comment by plex (ete) on What risks concern you which don't seem to have been seriously considered by the community? · 2020-10-30T11:57:22.741Z · LW · GW

Problem #1 is that it is almost entirely focused on electricity which is only roughly 25% of the problem.

What is the other 75% of the problem which can't be solved with electricity?

#2 <infrastructure>

Can see this raising the cost substantially, but given that only 8-9% of GDP is spent on energy, we can maybe eat that and survive?

#3 <pumped hydro for seasonal>

That does sound like a bad assumption, and lowers my opinion of any paper which makes it.

I have done a ppt but I am revising it over the next weeks in response to comments. I will post it here when done.

Looking forward to it.

For a renewable solution you need to expend large amounts of energy removing the CO2 from the air and finding a way to store it.

If the point of renewables is to stop climate change, yes. If the point is to keep civilisation running at all, no, you can just eat the CO2.

I do not see how you are going to stop the locked-in population growth, and economic growth in LDCs is proceeding apace.

Population growth, agreed. But, if energy costs start seriously rising, economic growth will naturally slow or reverse, no?

Comment by plex (ete) on What risks concern you which don't seem to have been seriously considered by the community? · 2020-10-29T17:21:30.644Z · LW · GW

I agree that if technological development productivity was held at a low level indefinitely that could be fatal, but that is a fairly different claim from the one waveman is making - which is that in the nearish term we will be unable to maintain our civilisation.

I am also hopeful that we can reach technological escape velocity with current or even fewer people with reasonable economic wellbeing.

Comment by plex (ete) on What risks concern you which don't seem to have been seriously considered by the community? · 2020-10-29T16:39:17.273Z · LW · GW

I mean the former, in terms of general economic wellbeing. It is a big deal and obviously bad if we can't bring everyone up to a decent level of economic prosperity, but it is not fatal to civilisation. We are already at current levels of inequality, and we still have a civilisation.

Comment by plex (ete) on What risks concern you which don't seem to have been seriously considered by the community? · 2020-10-29T14:25:53.065Z · LW · GW

I've run into people arguing this a few times, but so far no one continues the conversation when I start pulling out papers with recent EROEI of solar and the like (e.g. which is the most recent relevant paper I could find, and says "EROIs of wind and solar photovoltaics, which can provide the vast majority of electricity and indeed of all energy in the future, are generally high (≥ 10) and increasing.").

Perhaps you will break the streak!

I am curious about the details of your model and the sources you're drawing from.

In particular, my understanding points at the curves of solar improvement being very positive, and already at a stage where they can provide enough energy to keep civilisation running, though it will require a large investment. Batteries also seem to be within reach of being viable, with a big push which will come as the grid becomes less stable and there is economic pressure to smooth out power.
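
For a rough sense of what the "EROI ≥ 10" figure quoted above implies, here is a minimal sketch. It assumes the usual definition of EROI as energy delivered divided by energy invested, so the fraction of gross output left over as net energy is 1 - 1/EROI. The function name is my own and the EROI values in the loop are arbitrary illustrations, not figures from the paper.

```python
def net_energy_fraction(eroi: float) -> float:
    """Share of gross energy output left after paying back the energy invested.

    Assumes EROI = energy delivered / energy invested.
    """
    if eroi <= 0:
        raise ValueError("EROI must be positive")
    return 1.0 - 1.0 / eroi


# Illustrative values only; they are not taken from any specific study.
for eroi in (2, 5, 10, 20):
    share = net_energy_fraction(eroi)
    print(f"EROI {eroi:>2}: {share:.0%} of gross output is net energy")
```

Under that definition, an EROI of 10 leaves about 90% of gross output as net energy, and the curve flattens quickly above that, which is why I don't see values around 10 or higher as a civilisation-ending constraint on their own.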

I have not looked into it in detail, but Musk seems to think electric planes are possible in the near future. If that's the case, I imagine shipping would also be possible? Maybe significantly more expensive because batteries cost more, but I don't think civilisation collapses if you bring shipping costs up by some moderately large but not extreme factor?

Looking over the cement manufacturing process, it seems that energy is just needed for heat? If that's the case, what would stop electricity replacing fossil fuels as the energy source?

and assume the rest of the world catches up to first world living standards

This seems like an unreasonable assumption? It is probably correct that we can't bring everyone up to USA level consumption, but that does not mean that civilisation will collapse, just that we won't be able to fix inequality at current technology levels.

Comment by plex (ete) on What risks concern you which don't seem to have been seriously considered by the community? · 2020-10-29T00:37:27.550Z · LW · GW

I'm somewhat worried about this virus-immune bacterium outcompeting regular life because it can drop all the anti-viral adaptations.

It's a conceptually simple find/replace on functionally identical codons, which should make the bacterium immune to all viruses barring something like 60000 specific viral mutations happening at once.
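
To make the find/replace idea concrete, here is a minimal sketch. The swap table below is my own illustration: each pair codes for the same amino acid (or both are stop codons) under the standard genetic code, but it should not be read as the exact mapping the researchers used. The function names are hypothetical too. The one subtlety is that the replacement has to respect the reading frame, so a naive string replace would not do.

```python
# Illustrative synonymous swaps under the standard genetic code.
# Assumption: chosen for demonstration, not the researchers' actual mapping.
SYNONYMOUS_SWAP = {
    "AGA": "CGT",  # arginine -> arginine
    "AGG": "CGT",  # arginine -> arginine
    "AGC": "TCT",  # serine -> serine
    "AGT": "TCT",  # serine -> serine
    "TTA": "CTG",  # leucine -> leucine
    "TTG": "CTG",  # leucine -> leucine
    "TAG": "TAA",  # stop -> stop
}


def split_codons(coding_sequence: str) -> list[str]:
    """Split an in-frame DNA coding sequence into codons."""
    if len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence), 3)]


def recode(coding_sequence: str) -> str:
    """Replace every retired codon with its synonymous substitute, in frame."""
    return "".join(SYNONYMOUS_SWAP.get(c, c) for c in split_codons(coding_sequence))


def count_retired(coding_sequence: str) -> int:
    """Count how many codons in the sequence would need replacing."""
    return sum(c in SYNONYMOUS_SWAP for c in split_codons(coding_sequence))


toy_gene = "ATGAGATTAAGCTAG"  # Met-Arg-Leu-Ser-stop, a made-up example
print(recode(toy_gene), count_retired(toy_gene))
```

The encoded protein is unchanged, but once every instance is swapped the cell no longer needs the machinery for reading the retired codons, and a virus whose genes still use them cannot get those genes translated properly.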

Viruses cause massive selection pressure:

"The rate of viral infection in the oceans stands at 1 × 10^23 infections per second, and these infections remove 20–40% of all bacterial cells each day." - (could not find good figures for land, plausibly they are a fair bit lower, but still likely high enough to be a huge deal)

without them I expect evolution to be able to come up with all sorts of ways to make it much better at all the other parts of life.

A big part of my model is that a large part of the reason we have species diversity is that the more successful a species is, the bigger a target it is for viral infection. Removing that feedback loop entirely, while at the same time giving a particular species a huge boost by letting it drop all sorts of systems developed to prevent viruses killing it, seems very risky.

This is fundamentally different from anything evolution has ever or could reasonably cook up, since it removes a set of the basic codons in a way which requires foresight (the replacement of each of the huge number of low-use codons has no value independently, and the removal of the ability to process those low-use codons (i.e. removing the relevant tRNA) is reliably fatal before all the instances are replaced).

To clarify: I don't think this is an x-risk, but it could wreck the biosphere in a way which would cause all sorts of problems.

They do claim to be trying to avoid it being able to survive in the wild:

For safety, they also changed genes to make the bacterium dependent on a synthetic amino acid supplied in its nutrient broth. That synthetic molecule does not exist in nature, so the bacterium would die if it ever escaped the lab.

Which is only mildly reassuring if this thing is going to be grown at scale, as the article suggests, since there is (I think?) potential for that modification to be reversed by mutation, given enough attempts and the fact that a modification that makes the cell die if it does not run into a certain amino acid seems like it should be selected against if that amino acid is ever scarce.

Comment by plex (ete) on What risks concern you which don't seem to have been seriously considered by the community? · 2020-10-28T19:15:54.025Z · LW · GW

The latter is correct, non-AI risks are welcome.

Comment by plex (ete) on I Want To Live In A Baugruppe · 2017-03-17T04:09:02.498Z · LW · GW

This probably won't make sense in the early stages when there's just a small team setting things up, but in the mid term the accelerator project (whirlwind tour) hopes to seed a local rationalist community in a lower cost location than the Bay (current top candidate location is the Canary Islands). I imagine most would prefer to stay in more traditional places, but perhaps this would appeal to some rationalist parents?

Comment by ete on [deleted post] 2017-01-25T21:37:42.200Z

Weird Sun Twitter also has a blog, which you may want to include.

Comment by plex (ete) on Thoughts on "Operation Make Less Wrong the single conversational locus", Month 1 · 2017-01-21T21:27:42.898Z · LW · GW

Yeah, I'm worrying about this. Switching before it's better than current LW is bad; switching once it's better than current LW is okay but might waste the "reopening!" PR event; switching once it's at the full feature set is great but possibly late.

Perhaps switch once it's as good, but don't make a big deal of it? Then make a big deal at some semi-arbitrary point in the future with the release of full 2.0.

Comment by plex (ete) on Why a Theory of Change is better than a Theory of Action for acheiving goals · 2017-01-09T16:06:42.591Z · LW · GW

Many people I talk with profess a strong desire to change the world for the better. This often manifests in their decision processes as something like "out of the life paths and next steps I have categorized as 'things I might do', which one pattern matches to helping best?".

This has for a long time felt like a strategic error of some kind. Reading Aaron Swartz's explanation of Theory of Change vs Theory of Action helped crystallize why.

Comment by plex (ete) on The Adventure: a new Utopia story · 2016-12-26T17:19:11.018Z · LW · GW

I also know a good number of people from subcultures where nature is the foundational moral value, a few from ones where family structures are core (who'd likely be horrified by altering them at will), and some from the psychonaut space where mindstates and exploring them is the primary focus. I'd also guess that people for whom religions are central would find the idea of forked selves committing things they consider sins breaks this utopia for them. These groups seem to have genuine value differences, and would not be okay with a future which does not carve out a space for them. "Adventure" and a bunch of specifics here point at a very wide region of funspace, but one centered around our culture to some extent.

There's some rich territory in the direction of people who want reality to be different in reasonable ways coming together to work out what to do. The suffering reduction vs nature preservation bundle seems the largest, but there's also the complex value vs raw qualia maximization. Actually, this kinda fits into a 2x2?

Moral plurality is a moral necessity, and signalling it clearly seems crucial, since we'd be taking everyone else along for the ride.

Edit: This is touched on by characters exchanging values, and that seems very good.