Comment by mtrazzi on Corrigibility as Constrained Optimisation · 2019-04-11T11:54:03.971Z · score: 1 (1 votes) · LW · GW

Layman questions:

1. I don't understand what you mean by "state" in "Suppose, however, that the AI lacked any capacity to press its shutdown button, or to indirectly control its state". Do you include its utility function in its state, or just the observations it receives from the environment? What context/framework are you using?

2. Could you define U_S and U_N? From the Corrigibility paper, U_S appears to be a utility function favoring shutdown, and U_N a potentially flawed utility function, a first stab at specifying their own goals. Is that what you meant? I think it would be useful to define them in the introduction.

3. I don't understand how an agent that "[lacks] any capacity to press its shutdown button" could have any shutdown ability. It seems like a contradiction, unless you mean "any capacity to directly press its shutdown button".

4. What's the "default value function" and the "normal utility function" in "Optimisation incentive"? Are they clearly defined in the literature?

5. "Worse still... for any action..." -> if you choose b as some action with a bad corrigibility property, it seems reasonable that it can be better than most actions on v_N + v_S (for instance if b is the argmax). I don't see how that's a "worse still" scenario; it seems plausible and normal.

6. "From this reasoning, we conclude" -> are you inferring things from some hypothetical b that would satisfy all the things you mention? If that's the case, I would need an example to see that it's indeed possible. Even better would be a proof that you can always find such a b.

7. "it is clear that we could in theory find a θ" -> could you expand on this?

8. "Given the robust optimisation incentive property, it is clear that the agent may score very poorly on U_N in certain environments." -> again, can you expand on why it's clear?

9. In the appendix, in your four-line inequality, do you assume that U_N(a_s) is non-negative (from line 2 to 3)? If yes, why?

Considerateness in OpenAI LP Debate

2019-03-12T19:05:27.643Z · score: 8 (3 votes)
Comment by mtrazzi on Renaming "Frontpage" · 2019-03-09T09:26:02.764Z · score: 5 (3 votes) · LW · GW

Name suggestions: "approved", "favored", "Moderators' pick", "high [information] entropy", "original ideas", "informative", "mostly ideas".

More generally, I'd recommend that each category have a name that bluntly states what the filter does (e.g. if it only uses karma as a filter, say "high karma").

Comment by mtrazzi on Alignment Research Field Guide · 2019-03-08T21:57:11.859Z · score: 43 (13 votes) · LW · GW

Hey Abram (and the MIRI research team)!

This post resonates with me on so many levels. I vividly remember the Human-Aligned AI Summer School, where you used to be a "receiver" and Vlad a "transmitter" when talking about "optimizers". Your "document" especially resonates with my experience running an AI Safety Meetup (Paris AI Safety).

In January 2019, I organized a Meetup about "Deep RL from human preferences". Essentially, the resources were ordered by difficulty, so you could discuss the 80k podcast, the OpenAI blog post, the original paper, or even a recent relevant paper. Even though the participants were "familiar" with RL (because they were used to seeing "RL" written in blogs or hearing people say "RL" in podcasts), none of them could explain to me the core structure of an RL setting (i.e. that an RL problem would need at least an environment, actions, etc.)
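For what it's worth, here is the kind of minimal sketch I had in mind — a toy example of my own (not from the meetup resources) showing the core pieces any RL problem needs: an environment, actions, rewards, and an agent acting in a loop:

```python
import random

class CoinFlipEnv:
    """One-state toy environment: guess a coin flip, +1 reward if correct."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def step(self, action):  # action is "heads" or "tails"
        outcome = self.rng.choice(["heads", "tails"])
        return 1 if action == outcome else 0  # the reward signal

# A random policy interacting with the environment for 1000 steps.
env = CoinFlipEnv()
policy_rng = random.Random(1)
total_reward = sum(env.step(policy_rng.choice(["heads", "tails"]))
                   for _ in range(1000))
# A random policy should win about half the flips.
```

Even this degenerate example has the full environment/action/reward structure; "Deep RL from human preferences" then replaces the hardcoded reward with one learned from human comparisons.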

The boys were getting hungry (Abram is right, $10 of chips is not enough for 4 hungry men between 7 and 9pm), when in the middle of a monologue ("in RL, you have so-and-so, and then it goes like so on and so forth..."), I suddenly realized that I was talking to more-than-qualified attendees (I was lucky to have a PhD candidate in economics, a teenager who used to do international olympiads in informatics (IOI), and a CS PhD) who lacked the necessary RL procedural knowledge to ask non-trivial questions about "Deep RL from human preferences".

That's when I decided to change the logistics of the Meetup to something much closer to what is described in "You and Your Research". I started thinking about what they would be interested in knowing. So I started telling the brilliant IOI kid about this MIRI summer program, how I applied last year, etc. One thing led to another, and I ended up asking what Tsvi had asked me one year ago for the AISFP interview:

If one of you was the only Alignment researcher left on Earth, and it was forbidden to convince other people to work on AI Safety research, what would you do?

That got everyone excited. The IOI boy took the black marker and started doing math on the question, as a transmitter: "So, there is a probability p_0 that AI researchers will solve the problem without me, and p_1 that my contribution will be neg-utility, so if we assume this and that, we get so-and-so."

The moment I asked questions I was truly curious about, the Meetup went from a polite gathering to the most interesting discussion of 2019.

Abram, if I were in charge of all agents in the reference class "organizer of Alignment-related events", I would tell instances of that class with my specific characteristics two things:

1. Come back to this document before and after every Meetup.

2. Please write below (in this thread or in the comments) about the experience running an Alignment think tank that resonates the most with the above "document".

Meditations on Territory

2019-03-04T22:13:36.614Z · score: 3 (2 votes)

Treacherous Turn, Simulations and Brain-Computer Interfaces

2019-02-25T15:49:44.375Z · score: 17 (10 votes)
Comment by mtrazzi on Greatest Lower Bound for AGI · 2019-02-05T23:14:48.666Z · score: 7 (3 votes) · LW · GW

I intuitively agree with your answer. Avturchin also commented saying something close (he said 2019, but for different reasons). Therefore, I think I might not be communicating my confusion clearly.

I don't remember exactly when, but there were some debates between Yann LeCun and AI Alignment folks in a Fb group (maybe "AI Safety Discussion (Open)", a few months ago). What struck me was how confident LeCun was about long timelines. I think, for him, the 1% would be in at least 10 years. How do you explain that someone who has access to private information (e.g. at FAIR) might have timelines so different from yours?

Meta: Thanks for expressing clearly your confidence levels through your writing with "hard", "maybe" and "should": it's very efficient.

EDIT: LeCun thread:

Comment by mtrazzi on Greatest Lower Bound for AGI · 2019-02-05T23:06:19.435Z · score: 4 (3 votes) · LW · GW

Could you detail Gott's equation a bit more? I'm not familiar with it.

Also, do you think that those 62 years are meaningful if we think about AI winters or exponential technological progress?

PS: I think you commented instead of giving an answer (they're different things in question posts)

Greatest Lower Bound for AGI

2019-02-05T20:17:24.675Z · score: 9 (5 votes)
Comment by mtrazzi on If You Want to Win, Stop Conceding · 2018-11-23T23:17:52.804Z · score: 5 (2 votes) · LW · GW

Thanks for the post!

It resonates with some experience I had in playing the game of go at a competitive level.

Go is a perfect information game but it's very hard to know exactly what will be the outcome of a "fight" (you would need to look up to 30 moves ahead in some cases).

So when the other guy kills your group of stones after a "life or death" scenario because he had a slight advantage in the fight, it feels like he got lucky, and most people have really bad thoughts and just give up.

Once, I created an account with the bio "I don't resign" to see what would happen if I forced myself not to concede and to keep playing after a big loss. It went surprisingly well, and I even went on to play the highest-ranked player connected to the server. By that point, I had completely lost the game and there were 100+ people watching, so I just resigned.

Looking back, it definitely helped me to keep fighting even after a big loss, and to stop the mental chatter. However, there's a trade-off between the time gained by correctly estimating the probability of winning and resigning when it's too low, and the mental energy gained from not resigning (minus the fact that your opponent may be pretty pissed off).

Comment by mtrazzi on Introducing the AI Alignment Forum (FAQ) · 2018-10-31T11:49:06.596Z · score: 3 (2 votes) · LW · GW

(the account databases are shared, so every LW user can log in on alignment forum, but it will say "not a member" in the top right corner)

I am having some issues in trying to log in from a github-linked account. It redirects me to LW with an empty page and does nothing.

Comment by mtrazzi on noticing internal experiences · 2018-10-16T11:37:13.921Z · score: 2 (2 votes) · LW · GW

This website is designed to make you write about three morning pages every day.

I've used it for about two years and wrote ~200k words.

Really recommend it for forming a habit of daily free writing.

Comment by mtrazzi on Open Thread October 2018 · 2018-10-14T20:55:51.056Z · score: 2 (2 votes) · LW · GW

Same issue here with the <a class="users-name" href="/users/mtrazzi">Michaël Trazzi</a> tag. The e in "ë" is larger than the "a" (here is a picture).

The bug seems to come from font-family: warnock-pro,Palatino,"Palatino Linotype","Palatino LT STD","Book Antiqua",Georgia,serif;" in .PostsPage-author (in <style data-jss="" data-meta="PostsPage">).

If I delete this font-family line, the font changes but the "ë" (and any other letter with accent) appears to have the correct size.

Open Thread October 2018

2018-10-02T18:01:05.416Z · score: 13 (3 votes)
Comment by mtrazzi on A Dialogue on Rationalist Activism · 2018-09-11T09:11:16.281Z · score: 1 (1 votes) · LW · GW
You: Well.

The "You" should be bold.

Comment by mtrazzi on Formal vs. Effective Pre-Commitment · 2018-09-01T07:38:03.631Z · score: 3 (2 votes) · LW · GW

typo: "Casual Decision Theory"

Comment by mtrazzi on Bottle Caps Aren't Optimisers · 2018-08-31T20:15:01.901Z · score: 8 (4 votes) · LW · GW

Let me see if I got it right:

  1. Defining optimizers as unpredictable processes maximizing an objective function does not take into account algorithms that we can compute

  2. Satisfying the property P "give the objective function higher values than an inexistence baseline" is not sufficient:

  • the lid satisfies (P) with "water quantity in bottle" but is just a rigid object that some optimizer put there. However, it's not the best counter-example, because it's not a Yudkowskian optimizer.
  • if a liver didn't exist or did other random things then humans wouldn't be alive and rich, so it satisfies (P) with "money in bank account" as the objective function. However, the better way to account for its behaviour (cf. Yudkowskian definition) is to see it as a sub-process of an income maximizer created by evolution.
  3. One property that could work: having a step in the algorithm that provably increases the objective function (e.g. gradient ascent).
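To illustrate that last property, here is a small sketch of my own (with an arbitrary concave objective, not anything from the post) where each gradient-ascent step provably increases the objective, given a small enough learning rate:

```python
# Toy sketch: each gradient-ascent step increases a smooth concave
# objective when the learning rate is small enough (here lr < 1/L,
# with L = 2 the Lipschitz constant of the gradient).
def f(x):            # objective, maximized at x = 3
    return -(x - 3.0) ** 2

def grad_f(x):
    return -2.0 * (x - 3.0)

x, lr = 0.0, 0.1
values = [f(x)]
for _ in range(50):
    x += lr * grad_f(x)      # the "provably increasing" step
    values.append(f(x))

# The objective is monotonically non-decreasing along the trajectory,
# and x converges towards the maximizer at 3.
assert all(b >= a for a, b in zip(values, values[1:]))
```

The point is that the "optimizer-ness" here lives in the update rule itself, not in any comparison to a nonexistence baseline.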

Properties I think are relevant:

  • intent: the lid did not "choose" to be there, humans did
  • doing something that the outer optimizer cannot do "as well" without using the same process as the inner optimizer: it would be very tiring for humans to use our hands as lids. Humans cannot play Go as well as AlphaZero without actually running the algorithm.
Comment by mtrazzi on HLAI 2018 Field Report · 2018-08-29T13:44:45.709Z · score: 4 (3 votes) · LW · GW

it feels wrong to call other research dangerous, especially given its enormous potential for good.

I agree that calling 99.9% of AI research "dangerous" and AI Safety research "safe" is not a useful dichotomy. However, I consider AGI companies/labs and people focusing on implementing self-improving AI/code synthesis extremely dangerous. Same for any breakthrough in general AI, or anything that greatly shortens the AGI timeline.

Do you mean that some AI research has positive expected utility (e.g. in medicine) and should not be called dangerous because the good it produces compensates for the increased AI-risk?

Comment by mtrazzi on HLAI 2018 Field Report · 2018-08-29T13:06:21.538Z · score: 12 (3 votes) · LW · GW

outside that bubble people still don't know or have confused ideas about how it's dangerous, even among the group of people weird enough to work on AGI instead of more academically respectable, narrow AI.

I agree. I run a local AI Safety Meetup, and it's frustrating to see that those who understand the discussed concepts best consider Safety way less interesting/important than AGI capabilities research. I remember someone saying something like "Ok, this Safety thing is kind of interesting, but who would be interested in working on real AGI problems?" and the other guys nodding. What they say:

  • "I'll start an AGI research lab. When I feel we're close enough to AGI I'll consider Safety."
  • "It's difficult to do significant research on Safety without knowing a lot about AI in general."
Comment by mtrazzi on LW Update 2018-08-23 – Performance Improvements · 2018-08-24T20:36:00.353Z · score: 1 (1 votes) · LW · GW

Bug: On Chrome using a Samsung Galaxy S7/Android 8.0.0 the "click and hold" thing does not work. Same with the "click to see how many people voted".

Book Review: AI Safety and Security

2018-08-21T10:23:24.165Z · score: 54 (30 votes)
Comment by mtrazzi on Building Safer AGI by introducing Artificial Stupidity · 2018-08-14T20:52:30.232Z · score: 1 (1 votes) · LW · GW

Yes, typing mistakes in the Turing Test are an example. They're "artificially stupid" in the sense that you go from perfect typing to imperfect human typing. I guess what you mean by "smart" is an AGI that would creatively make those typing mistakes to deceive humans into believing it is human, instead of having them as some hardcoded feature in a Turing contest.

Comment by mtrazzi on Building Safer AGI by introducing Artificial Stupidity · 2018-08-14T20:07:29.322Z · score: 1 (1 votes) · LW · GW

The points we tried to make in this article were the following:

  • To pass the Turing Test, build chatbots, etc., AI designers make the AI artificially stupid so that it feels human-like. This tendency will only get worse as we interact more with AIs. The problem is that something really "human-like" requires superintelligence, not just AGI.
  • However, we can use this concept of "Artificial Stupidity" to limit the AI in different ways and make it human-compatible (hardware, software, cognitive biases, etc.). We can use several of those sub-human AGIs to design safer AGIs (as you said), or test them in some kind of sandbox environment.
Comment by mtrazzi on Building Safer AGI by introducing Artificial Stupidity · 2018-08-14T19:51:10.578Z · score: 4 (2 votes) · LW · GW

If I understand you correctly, every AGI lab would need to agree not to push the hardware limits too much, even though they would still be incentivized to do so to win some kind of economic competition.

I see it as a containment method for AI Safety testing (cf. last paragraph on the treacherous turn). If there is some kind of strong incentive to have access to a "powerful" safe-AGI very quickly, and labs decide to skip the Safety-testing part, then that is another problem.

Building Safer AGI by introducing Artificial Stupidity

2018-08-14T15:54:33.832Z · score: 8 (4 votes)

Human-Aligned AI Summer School: A Summary

2018-08-11T08:11:00.789Z · score: 44 (13 votes)
Comment by mtrazzi on Human-Aligned AI Summer School: A Summary · 2018-08-10T06:49:26.484Z · score: 3 (3 votes) · LW · GW

Added "AI" to prevent death from laughter.

Comment by mtrazzi on Human-Aligned AI Summer School: A Summary · 2018-08-09T21:09:29.066Z · score: 3 (3 votes) · LW · GW

I agree that the "Camp" in the title was confusing, so I changed it to "Summer School". Thank you!

Comment by mtrazzi on A Gym Gridworld Environment for the Treacherous Turn · 2018-08-02T09:50:58.486Z · score: 1 (1 votes) · LW · GW
a treacherous turn involves the agent modeling the environment sufficiently well that it can predict the payoff of misbehaving before taking any overt actions.

I agree. To be able to make this prediction, it must already know about the preferences of the overseer and know that the overseer would punish unaligned behavior, potentially estimating the punishment reward or predicting the actions the overseer would take. To make this prediction it must therefore have some kind of knowledge about how overseers behave and what actions they are likely to punish. If this knowledge does not come from experience, it must come from somewhere else, maybe from reading books/articles/Wikipedia or observing this behaviour elsewhere, but that is outside of what I can implement right now.

The Goertzel prediction is what is happening here.


It's important to start getting a grasp on how treacherous turns may work, and this demonstration helps; my disagreement is on how to label it.

I agree that this does not correctly illustrate a treacherous turn right now, but it is moving towards it.

Comment by mtrazzi on A Gym Gridworld Environment for the Treacherous Turn · 2018-07-31T12:34:56.968Z · score: 3 (1 votes) · LW · GW

Thanks for the suggestion!

Yes, it learned through Q-learning to behave differently when it had the more powerful weapon, thus undertaking multiple treacherous turns during training. A "continual learning setup" would have it face multiple adversaries/supervisors, so it could learn how to behave in such conditions. Eventually, it would generalize and understand that "when I face this kind of agent that punishes me, it's better to wait for capability gains before taking over". I don't know any ML algorithm that would allow such "generalization", though.

About organic growth: I think that, using only vanilla RL, it would still learn to behave correctly until a certain capability threshold, and then undertake a treacherous turn. So even with N different capability levels, there would still be 2 possibilities: 1) killing the overseer gives the highest expected reward, or 2) the aligned behavior gives the highest expected reward.

Comment by mtrazzi on Saving the world in 80 days: Epilogue · 2018-07-29T00:17:28.235Z · score: 5 (2 votes) · LW · GW

Congrats on your meditation! I remember commenting on your Prologue, about 80 days ago. Time flies!

Good luck with your ML journey. I did Ng's 2011 ML course, which uses Matlab, and Ng's DL specialization. If you want a good grasp of recent ML, I would recommend going directly to the DL specialization. Most of the original course is in the newer one, and the DL specialization uses more recent libraries (tf, keras, numpy).

A Gym Gridworld Environment for the Treacherous Turn

2018-07-28T21:27:34.487Z · score: 60 (24 votes)
Comment by mtrazzi on RFC: Mental phenomena in AGI alignment · 2018-07-06T10:14:33.215Z · score: 4 (2 votes) · LW · GW

Let me see if I got it right:

1) If we design an aligned AGI by supposing it doesn't have a mind, the design will produce an aligned AGI even if it actually possesses a mind.

2) In the case where we suppose AGIs have minds, the methods employed would fail if the AGI doesn't have a mind, because the philosophical methods employed only work if the subject has a mind.

3) The consequence of 1) and 2) is that supposing AGI have minds has a greater risk of false positive.

4) Because of Goodhart's law, behavioral methods are unlikely to produce aligned AGI

5) Past research on GOFAI and the success of applying "raw power" show that using only algorithmic methods for aligning AGI is not likely to work

6) The consequence of 4) and 5) is that the approach supposing AGI do not have minds is likely to fail at producing aligned AI, because it can only use behavioral or algorithmic methods.

7) Because of 6), we have no choice but take the risk of false positive associated with supposing AGI having minds

My comments:

a) The transition between 6) and 7) assumes implicitly that:

(*) P( aligned AGI | philosophical methods ) > P( aligned AGI | behavioral or algorithmic methods )

b) You say that if we suppose the AGI does not have a mind, and treat it as a p-zombie, then the design would work even if it has a mind. Therefore, when supposing that the AGI does not have a mind, there are no design choices that optimize the probability of aligned AGI by assuming it does not possess a mind.

c) You assert that, using philosophical methods (assuming the AGI does have a mind), a false positive would make the method fail, because the methods use the mind hypothesis extensively. I don't see why a p-zombie (which by definition would be indistinguishable from an AGI with a mind) would be more likely to make them fail than an AGI with a mind.

Comment by mtrazzi on RFC: Meta-ethical uncertainty in AGI alignment · 2018-06-12T21:03:31.365Z · score: 4 (2 votes) · LW · GW

As you mentioned, no axiology can be inferred from ontology alone.

Even with meta-ethical uncertainty, if we want to build an agent that takes decisions/actions, it needs some initial axiology. If you include (P) "never consider anything as a moral fact" as part of your axiology, then two things might happen:

  • 1) This assertion (P) stays in the agent without being modified
  • 2) The agent rewrites its own axiology and modify/delete (P)

I see a problem here. If 1) holds, then the agent has considered (P) as a moral fact, which is absurd. If 2) holds, then your agent has lost the meta-ethics principle you wanted it to keep.

So maybe you wanted to put the meta-ethical uncertainty inside the ontology? If this is what you meant, that doesn't seem to solve the axiology problem.

Comment by mtrazzi on Simulation hypothesis and substrate-independence of mental states · 2018-05-30T07:53:35.654Z · score: 3 (1 votes) · LW · GW

Thank you for your article. I really enjoyed our discussion as well.

To me, this is absurd. There must be something other than readability that defines what a simulation is. Otherwise, I could point to any sufficiently complex object and say: "this is a simulation of you". If given sufficient time, I could come up with a reading grid of inputs and outputs that would predict your behaviour accurately.

I agree with the first part (I would say that this pile of sand is a simulation of you). I don't think you could predict any behaviour accurately, though.

  • If I want to predict what Tiago will do next, I don't need just a simulation of Tiago; I need at least some part of the environment. So I would need to find some more sand flying around, and then do more isomorphism tricks to be able to say "here is Tiago, and here is his environment, so here is what he will do next". The more you want to predict, the more information you need from the environment. But the problem is that the more information you take in at the beginning and read out at the end, the more difficult it becomes to find an isomorphism between the two. And it might just be impossible, because most spaces are not isomorphic.
  • There is something to be said about complexity, and about the information that drives the simulation. If you are able to give a precise mapping between sand (or a network of men) and some human-simulation, then this does not mean that the simulation is happening within the sand: it is happening inside the mind doing the computations. In fact, if you understand the causal relationships in the "physical" world, the laws of physics, etc., well enough to precisely build a mapping from this "physical reality" to a pile of sand flying around, then you are, in a sense, simulating it in your brain while doing the computations.
  • Why am I saying "while doing the computations"? Because I believe that there is always someone doing the computations. Your thought experiments are really interesting, and thank you for that. But in the real world, sand does not start flying around in some strange pattern forever without any energy. So when you try to predict things from the mapping of the sand, the energy comes from your brain doing those computations / thought experiments. For the network of men, the energy comes from the powerful king giving precise instructions about what computations the men should do. In your example, we feel that it must not be possible to obtain consciousness from that. But this is because the energy needed to effectively simulate a human brain from computations is huge: the number of "basic arithmetic calculations by hand" needed is far greater than what a handful of men in a kingdom could do in their lifetime, just to simulate something like 100 states of consciousness of the human being simulated.
The simulation may be a way of gathering information about what is rendered, but it can't influence it. This is because the simulation does not create the universe that is being simulated.

Well, I don't think I fully understand your point here. The way I see it, Universe B is inside Universe A. It's kind of a data compression, a low-res universe (like a video game on your TV). So whatever you do inside Universe A that influences the particles of "Universe B" (which is part of the "physical" Universe A) will "influence" Universe B.

So what you're saying is that Universe B kind of exists outside the physical world, in the theoretical world, and so when we modify Universe B (inside Universe A) we are making the "analogy" wrong, and simulating another (theoretical) universe, like a Universe C?

If this is what you meant, then I don't see how it connects to your other arguments. Whenever we give more inputs to a simulated universe, I believe we're adding some new information. If your simulation is a closed one, and we cannot interact with it or add any input, then OK, it's a closed simulation, and you cannot change it from the outside. But if you do have a simulation of a human being and are asking what happens if you torture him, you might want to incorporate some "external inputs" from the torture.

Comment by mtrazzi on [deleted post] 2018-05-12T09:42:15.530Z

You're right. I appreciate the time and effort you put into giving feedback, especially the Google Docs. I think I didn't say it enough, and didn't get to answer your last feedback (will do this weekend).

The question is: are people putting too much effort into giving feedback for small improvements in the writing/posts? If yes, then it feels utterly inefficient to continue giving feedback on, or writing, those daily posts.

I also believe that one can control the time they spend on giving feedback by saying only the most important thing (for instance, Ikaxas saying the bold/underline thing).

I am not sure if this is enough to make daily LessWrong posts consistently better, and more importantly if it is enough to make them valuable/useful for the readers.

I am actively looking for a way to continue posting daily (on Medium or a personal website) and keep getting good feedback without spamming the community. I could request quality feedback (by posting once a week at most) only once in a while and not ask for too much of your time (especially yours, Elo).

Thank you again for your time/efforts, and the feedback you gave in the google docs/comments.

Comment by mtrazzi on [deleted post] 2018-05-12T09:32:15.652Z

I gave some points about the higher quality/low quality debate in my two answers to Viliam, but I will answer more specifically to this here.

The quality of a post is relative to the other posts. Yes, if the other articles are from Scott Alexander, ialdabaoth, sarahconstantin and Rob Bensinger, the quality of my daily posts is quite deplorable, and spamming the frontpage with low-quality posts is not what LW users want.

However, for the last few days, I decided not to publish on the frontpage, and LW even changed the website so that I can't publish on the frontpage. So it's a personal blog post by default, and it will go to the frontpage only if mods/LW users enjoy it and think it's insightful enough.

Are you saying that people might want high quality personal blogs then?

Well, I get why people might be interested in reading personal blogs, and want them to be of high quality. And, because you got to correct some of my posts, I understand the frustration of seeing articles published where there still is a lot of work to do.

However, the LW algorithm is also responsible for this. Maybe it promotes recent posts too much and should highlight the upvoted ones more. Then my posts would never be visible; only those with 20+ upvotes would be visible on the personal blogs page.

I understand why people would prefer an article that took one week to write: short, concise, particularly insightful. I might prefer that as well, and start posting only higher-quality posts here. But I don't agree that people should be discouraged from posting not-fully-thought-out articles on a website where you are able to post personal blog posts.

I think volume is not a problem if the upvote/downvote system and the algorithms are good enough to filter the useful posts for the readers. People should not filter themselves and keep articles they enjoy less than Scott Alexander's (but still find insightful) to themselves.

Comment by mtrazzi on [deleted post] 2018-05-12T09:15:32.193Z
So, twelve articles, one of them interesting, three or four have a good idea but are very long, and the rest feels useless.

I appreciate that you took the time to read all of them (or enough to comment on them). I also feel some are better written than others, and I was more inspired for some. From what I understood, you want the articles to be "useful" and "not too long". I understand why you would want that (maximize the (stuff learned)/(time spent learning) ratio). I used to write on Medium, where the read ratio of posts decreases significantly with the length of the post. This pushed me to write shorter and shorter posts if I wanted to be read entirely. I wanted to try LW because I imagined people here would have longer attention spans and could focus on philosophical/mathematical thinking. However, if you're saying I'm being "too long with very low density of ideas", I understand why this could be infuriating.

I typically do not downvote the "meh" articles, but that's under assumptions that they don't appear daily from the same author

I get your point, and it makes sense with what you said in the first comment. However, I don't feel comfortable with people downvoting "meh" articles because of the author (even when they appear daily). I would prefer a website where people rate articles independently of who the author is, and then check out their other stuff.

My aggregate feedback would be: You have some good points. But sometimes you just write a wall of text.

Ok. So I should be more clear/concise/straight-to-the-point, gotcha.

And I suspect that the precommitment to post an article each day could be making this a lot worse. In a different situation, such as writing for an online magazine which wants to display a lot of ads, writing a lot of text with only a few ideas would be a good move; here it is a bad move.

Could you be more specific about what you think my move would be? For an online magazine, getting the maximum number of clicks/views to display more ads makes sense: lots of text with lots of ads, and just enough ideas to ensure the reader keeps seeing ads.

But what about LW? My move here was simple: understand AI Safety better by forcing myself to crystallize ideas related to the field daily, on a website with great feedback/discussions and low tolerance for mistakes. For now, the result (in the discussions) is, overall, satisfying, and I feel that people here seem to enjoy AI Safety stuff.

More generally, I think the fact that I generate 10% of the headers, or that you click on all my articles, may be correlated with factors other than my posting daily, such as:

  • The LW algorithm promotes them
  • Your "Michaël Trazzi" filter (you need one, because you keep seeing my headers) is not tuned correctly: you still seem to be reading my posts, even if only 1/12 felt useful (or maybe you just read them to comment on this post?).

This comment is already long (sorry for the wall of text), so I will say more about the Meta LW high/low quality debate on Elo's comment below.

Comment by mtrazzi on [deleted post] 2018-05-12T08:48:25.192Z

Thank you Viliam for your honest feedback.

I think you're making some good points, but you're ignoring (in your comment) some aspects.

"do I want this kind of article, from the same author, to be here, every day?". And the answer is "hell no".

So what you're saying is "whenever deciding to upvote or downvote, I decide whether I want more articles like this or not. But because you're posting every day, when I am deciding whether or not to downvote, I am deciding if I want an article every single day and the answer to this is no".

I understand the difference in choice here (a choice for every article, instead of just for one). I assumed that on LW people could think about posts independently, and could downvote a post and upvote another from the same author, saying what felt useful or not, even if it is daily. I understand that you just want to say "no" to the article, to say "no" to the series, and this is even more true if the ratio of good stuff is the one you mention at the end.

It is easier to just ignore a one-off mistake than to ignore a precommitment to keep doing them every day.

What would be the mistake here? From what I understand, when reading an article and seeing a mistake, the mistake is "multiplied" by the number of times it could happen again in other articles, so every tiny mistake becomes important? If I got you right, I think that by writing daily, those little mistakes (if easy to correct) could be corrected quickly by commenting on a post, and I would take that into account in the next posts. A short feedback loop could quickly improve the quality of the posts. However, I understand that people might not want LW to be an error-tolerant zone, but would prefer a performance zone.

And... you are polluting this filter. Not just once in a while, but each day. You generate more than 10% of headers on this website recently.

I had not thought about it in terms of daily % of the website's headers; interesting point of view. I also use Hacker News as a filter (for other interests), and LW is a better option for the interests I mentioned in my posts. I think the real difference is the volume of posts on Hacker News/Reddit/LW. It is always a tradeoff between being in a pool of hundreds of high-quality posts (more people reading, but more choices for them), and a pool of only a dozen even-higher-quality posts but with less traffic.

The Multiple Names of Beneficial AI

2018-05-11T11:49:51.897Z · score: 17 (6 votes)
Comment by mtrazzi on Saving the world in 80 days: Prologue · 2018-05-10T09:51:49.186Z · score: 28 (7 votes) · LW · GW


The energy from this post is real.

You, sir, just bootstrapped my motivation so here are a few techniques/experiences:

  • Two years ago I wanted to study hard for a really tough exam. What I did at the time: I published a status on Facebook saying that I would study 14 hours a day for the next 14 days with my best friend, and that if I didn't do it I would pay him 100 euros + give him my favorite sweatshirt. I also published a post on stickk to make the payment official. At that time, I was also waking up every day before 4:30am for another challenge (similar setting). What I learned: With enough accountability (from friends, colleagues, family), I can study hard, wake up very early, and do impressive things (when I think about it now, I really had a shit ton of energy at that time). However, mental/physical stretches alone don't necessarily imply that you will achieve something you find meaningful. Yes, I was really productive during this 14-day sprint. But I failed my exams (my body was exhausted and I got sick two days before) and I didn't even learn anything useful (I studied to pass the exam, and didn't study the material deeply). Practical Advice: 16 hours a day is a long period of time. You might want to shorten this to maybe 10, or 12 max. I know, when people told me this three years ago, I also thought they were not motivated enough. They were right. Please, try to spend at least 2-4 hours a day doing some really relaxing stuff: go outside, meet people, do something where you're truly uninterested in the results, just wanting to chill. This relaxing time is the most important time of the day, and no, 'I'll move on to living maintenance or meditation which I consider mentally relaxing.' doesn't seem to me like enough relaxation. Meditation is a mental stretch, and living maintenance (sleeping, showering, eating, hygiene, etc.) does make you more relaxed, but is not really an activity where you can immerse yourself, meet people, talk about your emotions and let off steam. In particular, try not to eat or do living maintenance alone.
  • I used to write down some great plan (like you just did) every month or so, and gave up after one week every time. I remember telling my friend that I would work more, code at least 4 hours a day, etc. None of this worked for me (in the long term). What I learned: whenever I make a plan (e.g. work 4 hours a day), a) I kind of force myself to do something I don't really want to do. Another way of putting this: I like the result, but I don't like the process (see this book). So, in the long term, I always end up quitting. Furthermore, b) I am always overly optimistic, forgetting life's hazards (planning fallacy). 4 hours a day (in my example) is a lot, and if you're busy all day with something else, you might not even have the time to do it (yesterday's scenario ("So I'm officially starting tomorrow") might happen again). Practical advice: About three weeks ago, I started being absurdly productive, by assigning a color to every 15 minutes of my life on a Google Sheet. For the last five years, I have tried to boost my productivity, and this is the technique that has worked best for me so far. Why this works: I don't force myself to do anything. I know what is important to me (writing and seeing people), and, just by looking at the colors, I know whether I have been doing random unimportant stuff or things that I value. In other words, I have enough flexibility to do whatever I like, but it also gives me enough motivation to write for hours, because I emotionally connect with my time. So, as practical advice, I would say: try to insert more flexibility. This is especially true given that you said "I'm definitely shooting for sustainability here, so I'm trying to figure out my limits without burning out"

Now, here are some questions about your program:

  • What exactly are you trying to achieve? Ok, you'll be doing some AI Safety reading, and some tensorflow. But like, what is your deeper motivation? Can you state your life-statement in a sentence? You commented on the AISF program in June, do you want to prepare for that? Can you give some more info about "someone to protect"? Feedback: I think your post definitely needs a longer introduction (with a real paragraph, not bullet points) about what you're doing and why you're doing it.
  • "Create a more realistic model of what I can do/ of my limits" : what you can do in AI Safety? The impact you can have in the field? Your limits as a Computer Scientist/AI Researcher? Or, are you trying to grasp the limits of your productivity? Your biological limits?
  • "Radically improve my knowledge of the field & various maths": could you be more specific? What exactly do you feel you're lacking? What would you want to study more? Is there some particular sub-domain that interests you most?

This is quite a long comment, because your post moved me. I also want to deepen my understanding of AI Safety, and also love to challenge my productivity. I will happily comment on your posts. I have been doing a series of daily Less Wrong posts for 12 days in a row, sometimes about AI Safety, sometimes about general AI/Philosophy/Decision Theory, and the energy in this post motivated me to keep writing and to keep pushing my limits. See you on the other side!

Talking about AI Safety with Hikers

2018-05-10T06:38:26.620Z · score: 8 (4 votes)
Comment by mtrazzi on Better Decisions at the Supermarket · 2018-05-09T17:31:52.447Z · score: 3 (1 votes) · LW · GW


Comment by mtrazzi on Is epistemic logic useful for agent foundations? · 2018-05-09T09:33:50.103Z · score: 7 (2 votes) · LW · GW

Yes. I had a course on Logic and Knowledge Representation last semester (October->January). In parallel, I attended an Autumn School on AI in late October, which included two 2h courses on Epistemic Logic. The speaker went super fast, so those 4 hours were ultra-productive (here are my notes in French). However, I did not fully understand everything, so I was happy to make my knowledge more solid with practical exercises/exams/homework etc. during my Logic and Knowledge Representation course. The two approaches (autumn school and semester course) were complementary and gave me a good grasp of logic in general, and epistemic logic in particular.

This semester (February->May), I had a course on Multi-Agent Systems, and knowing about epistemic logic, and more generally modal logic, was handy. When the environment of a robot changes, it needs to take this change into account and integrate it into its representation of the world. When two agents communicate, having a representation of the other agent's knowledge is essential to avoid sending redundant or, worse, contradictory information.

A great part of Multi-Agent Systems is about communication, or how to harmonize global knowledge, so knowing about epistemic logic is advantageous. In this article I talk about the gossip problem in the context of a project we had to do for a Multi-Agent Systems course. The teacher who introduced me to the gossip problem was the same one who taught the Modal Logic/Epistemic Logic course. Epistemic Logic is useful for getting the full picture of communication protocols, and the lower bounds, in bits of information, necessary to communicate secrets.

A little anecdote to end this comment: last week a mathematician/logician came to a Meetup I organized on AI Safety. At the beginning, it was only the two of us, and he explained how he went from logic to AI. "You know, all my life I have been studying very theoretical problems. To be very specific, group theory was the most applied math I have ever done. But, now that I study AI, having studied intractable/undecidable problems for decades, I know almost instantaneously which theory will work for AGI and which won't." We ended up having a discussion about logic and knowledge representation. We could not have had this chat without me having taken some courses on epistemic logic.

Applied Coalition Formation

2018-05-09T07:07:42.014Z · score: 3 (1 votes)
Comment by mtrazzi on Better Decisions at the Supermarket · 2018-05-08T14:39:41.991Z · score: 3 (1 votes) · LW · GW

Amazing work! I accepted most of your corrections at the beginning, and left some replies where I disagreed/was unsure.

I would need a bit less than an additional hour to correct everything and write something that satisfies me (given your comments), so I might do it later or another day (writing something for today is my top priority for now).

Comment by mtrazzi on Better Decisions at the Supermarket · 2018-05-08T14:37:21.878Z · score: 3 (1 votes) · LW · GW

Thank you for your feedback. I used to write Medium articles for publications (e.g. this one). For some publications, the guideline was to use bold at least once every paragraph.

When a friend of mine (LW reader) read one of those articles, he gently commented that he felt he was reading buzzfeed (overly distracting formatting).

That's why I tried to switch to underline/italic in my writing (besides, I have a general feeling of simplicity/minimalism here).

Will edit this post with a better formatting soon (cf. Elo's google doc below)

EDIT: just realized there was only bold and italic in Medium, no underline...

Comment by mtrazzi on Beliefs: A Structural Change · 2018-05-08T14:31:36.775Z · score: 4 (2 votes) · LW · GW

Thank you. Will try to give more day-to-day rationality applications if I can.

EDIT: about the "write more things down", I think writing LessWrong-specific stuff (like when your beliefs change) might prove useful. However, maybe just a lot of personal journaling or thought-crystallization-on-the-internet is enough.

Comment by mtrazzi on Beliefs: A Structural Change · 2018-05-08T14:29:18.307Z · score: 3 (1 votes) · LW · GW

Interesting comment.

I feel that what shapes the behaviour is not the belief in itself, but what this belief implies.

It's more an empirical law like "This guy believes in A, so it's improbable that he also believes in B, given that he is smart enough not to hold contradictory views" (e.g. Solipsism and Utilitarianism).

Better Decisions at the Supermarket

2018-05-07T22:32:00.723Z · score: 0 (7 votes)
Comment by mtrazzi on AI Summer Fellows Program · 2018-05-07T18:24:16.436Z · score: 6 (2 votes) · LW · GW

Dear MIRI Team,

You mentioned the 25th of April that you were still accepting applications.

I submitted an application to your AI Summer Fellows Program about one week ago.

I would like to know (before the 14th of May if possible) whether my application was accepted (as part of a "finalist" group?) or did not fit your expectations.

Best Regards,

[PS: If anyone reading this has already received an answer, would it be possible to let me know? Thank you]

Beliefs: A Structural Change

2018-05-06T13:40:30.262Z · score: 9 (5 votes)
Comment by mtrazzi on Are you Living in a Me-Simulation? · 2018-05-04T14:50:35.473Z · score: 2 (1 votes) · LW · GW
Aside from that, you seem to think that when I am talking about halting a sim, I am emulating some gradual process like falling asleep. I'm not.

I was not thinking that you were talking about a gradual process.

I think you are just conflating consciousness (conscious experience) and sense-of-self. It is quite possible to have the one without the other, e.g. severe amnesiacs are not p-zombies.

I agree that I am not being clear enough (with myself and with you) and appear to be conflating two concepts. With your example of amnesiacs and p-zombies, two things come to mind:

1) p-zombies: when talking about ethics (for instance in my Effective Egoist article) I was aiming at qualia, instead of just simulated conscious agents (with conscious experience, as you say). To come back to your first comment, I wanted to say that "identity and continuity of consciousness" contribute to qualia, and make p-zombies less probable.

2) amnesiacs: in my video game I don't want to play with a world full of amnesiacs. If whenever I ask questions about their past they're being evasive, it does not feel real enough. I want them to have some memories. Here is a claim:

(P) "For memories to be consistent, the complexity needed would be the same as the complexity needed to emulate the experience which would produce the memory"

I am really unsure about this claim (one could produce fake memories just good enough for people not to notice anything. We don't have great memories ourselves). However, I think it casts light on what I wanted to express with "The question is always how do you fake it." Because it must be real/complex enough for them not to notice anything (and the guy in the me-sim. too) but also not too complex (otherwise you could just run full-simulations).

Comment by mtrazzi on Should an AGI build a telescope to spot intergalactic Segways? · 2018-05-04T12:43:48.326Z · score: 2 (1 votes) · LW · GW

Interesting question! I don't have any clue. Maybe you could answer your own question, or give more information about those stories or your work on prehistorical technology?

Comment by mtrazzi on [deleted post] 2018-05-04T12:38:44.078Z

Yes, I agree that cost-effectiveness does not mean me-simulations would be useful.

What is needed are specific/empirical reasons why such a posthuman civilization would want to run those me-simulations, which I haven't given in this article. However, I tried to give some reasons in the next post (where you also commented).

Comment by mtrazzi on Are you Living in a Me-Simulation? · 2018-05-04T12:33:48.774Z · score: 2 (1 votes) · LW · GW
humans don't seem to have continuity of consciousness, in that we sleep

Yes, humans do sleep. Let's suppose that consciousness "pauses" or "vanishes" during sleep. Is that how you would define a discontinuity in consciousness? An interval of time delta_t without consciousness separating two conscious processes?

Bergson defines in Time and Free Will: An Essay on the Immediate Data of Consciousness duration as inseparable from consciousness. What would it mean to change consciousness instantaneously with teleportation? Would we need a minimum delta_t for it to make sense (maybe we could infer it from physical constraints given by general relativity?).

Also, it seems plausible that you could fake the subjective continuity of consciousness in a sim.

The question is always how do you fake it. Assuming physicalism, there would be some kind of threshold of cerebral activity which would lead to consciousness. At what point in faking "the subjective continuity of consciousness" do we reach this threshold?

I think my intuition that (minimum cerebral activity + continuity of memories) is what leads to human-like "consciousness" comes from the first season of Westworld, where Maeve, a host, becomes progressively self-aware by trying to connect her memories during the entire season (same with Dolores).

Comment by mtrazzi on AGI Safety Literature Review (Everitt, Lea & Hutter 2018) · 2018-05-04T11:28:54.984Z · score: 2 (1 votes) · LW · GW

Thank you for the link.

Few questions that come to mind:

  • What would be the differences / improvements from the "Concrete problems in AI Safety" paper?
  • What would be the most important concrete problems to work on (for instance for a thesis)?
  • More generally, does anyone know if someone already made some kind of graph of dependencies (e.g. this problem must be solved before that one)?
Comment by mtrazzi on Are you Living in a Me-Simulation? · 2018-05-04T06:01:54.043Z · score: 1 (1 votes) · LW · GW

Thank you for reading me.

1- In this post I don't really mention "non-me-simulations". I try to compare the probability of only one full-time conscious being (a me-simulation) to what Bostrom calls ancestor-simulations, i.e. full-scale simulations where one could replay "the entire mental history of humankind".

For any simulation consisting of N individuals (e.g. N = 7 billion), there could in principle exist simulations where 0, 1, 2, ... or N of those individuals are conscious.

When the number k of conscious individuals satisfies k << N, I call the simulation selective.

I think your comment points out the following apparent conjunction fallacy: I am trying to estimate the probability of the event "simulation of only one conscious individual" instead of the more probable "simulation of a limited number of individuals k << N" (first problem).

The point I was trying to make is the following: 1) ancestor-simulations (i.e. full-scale and computationally intensive simulations run to understand our ancestors' history) would be motivated by more and more evidence of a Great Filter behind the posthuman civilization; 2) the need for me-simulations (which would be the most probable type of selective simulation, because they only need one player (e.g. a guy in his spaceship)) does not appear to rely on the existence of a Great Filter behind the posthuman civilization. They could be cost-efficient single-consciousness simulations, run for fun or as something prisoners are condemned to.

I guess the second problem with my argument for the probability of me-simulations is that I don't give any probability of being in a me-simulation, whereas in the original simulation argument, the strength of Bostrom's argument is that whenever an ancestor-simulation is run, 100 billion conscious lives are created, which greatly increases the probability of being in such a simulation. Here, I could only estimate the cost-effectiveness of me-simulations in comparison with ancestor-simulations.

2- I think you are assuming I believe in Utilitarianism. Yes, I agree that if I am Utilitarian I may want to act altruistically, even with some very small non-zero probability of being in a non-me-simulation or in reality.

I already answered this question yesterday in the effective egoist post (cf. my comment to Ikaxas), and I am realizing that my answer was wrong because I didn't assume that other people could be full-time conscious.

My argument (supposing I am Utilitarian, for the sake of argument) was essentially that if I had 10$ in my pocket and wanted to buy myself an ice cream (utility of 10 for me, let's say), I would need to provide a utility of 10*1000 to someone full-time conscious to consider giving them the ice cream (their utility would rise to 10,000, for instance). In the absence of some utility monster, I believe this case to be extremely unlikely, and I would end up eating ice creams all by myself.

[copy-paste from yesterday's answer to Ikaxas] In practice, I don't deeply share the Utilitarian view. To describe it shortly, I believe I am a Solipsist who values the perception of complexity. So I value my own survival, because I may not have any proof of any kind of the complexity of the Universe if I cease to exist, and also the survival of Humanity (because I believe humans are amazingly complex creatures), but I don't value the positive subjective perceptions of other conscious human beings. I value my own positive subjective perceptions because they maximize my utility function of maximizing my perception of complexity.

Anyway, I don't want to enter the debate of highly-controversial Effective Egoism inside what I wanted to be a more scientific probability-estimation post about a particular kind of simulation.

Thank you for your comment. I hope I answered you well. Feel free to ask any other clarification or point out to other fallacies in my reasoning.

Are you Living in a Me-Simulation?

2018-05-03T22:02:03.967Z · score: 6 (5 votes)
Comment by mtrazzi on Warrior rationalists · 2018-05-03T13:03:38.419Z · score: 1 (1 votes) · LW · GW

As a general feeling, I am very confused because I don't know if this is a joke, or if you are really blaming the people you mention.

Tell me if I get it right:

1) There are other sets of qualities than the list you establish. Hence, it is a shame that people from the community only have those qualities.

2) In general, activities involving sports, martial arts or anything that strengthens the body/survival skills are valuable.

3) Problem: people intellectually flex without being neither "tough" nor "wise".

My questions:

  • Assuming the people you mention all exhibit certain qualities from your list, what would be the cause?
  • What exactly would a rationalist gain from martial/survival skills? Would it be different from what an average "unfit white male", as you describe it, would gain?
  • What exactly makes you think that people flex? Do you have specific examples?
Comment by mtrazzi on [deleted post] 2018-05-03T09:33:40.466Z

First, let me thank you for taking the time of writing down the premises/arguments. I think you summarized the argumentation sufficiently well to allow a precise discussion.

I - "Premise 2) is false"

Ancestor simulations are defined in the simulation argument as "A single such computer could simulate the entire mental history of humankind (call this an ancestor-simulation) [...]". I agree with you that the argument deals with the probability of ancestor-simulations, and not first-person simulations. Therefore, I cannot infer the probability of a first-person simulation from the simulation argument.

However, such simulations are mentioned here:

"In addition to ancestor-simulations, one may also consider the possibility of more selective simulations that include only a small group of humans or a single individual. The rest of humanity would then be zombies or “shadow-people” – humans simulated only at a level sufficient for the fully simulated people not to notice anything suspicious. It is not clear how much cheaper shadow-people would be to simulate than real people. It is not even obvious that it is possible for an entity to behave indistinguishably from a real human and yet lack conscious experience. Even if there are such selective simulations, you should not think that you are in one of them unless you think they are much more numerous than complete simulations. There would have to be about 100 billion times as many “me-simulations” (simulations of the life of only a single mind) as there are ancestor-simulations in order for most simulated persons to be in me-simulations."

In short, I should only believe that I live in a me-simulation if I think me-simulations are 100 billion times more numerous than ancestor-simulations.

Let me try to estimate the probability of those me-simulations, and compare it with the probability of ancestor simulations.

First, I claim that one of these two assertions is true (assuming the existence of post-humans who can run ancestor-simulations):

i) Post-humans will run ancestor-simulations because they don't observe other forms of (conscious) intelligence in the universe (Fermi Paradox), and are therefore trying to understand whether there was indeed a great filter before them.
ii) Post-humans will observe that other forms of conscious intelligence are abundant in the universe, and will have little interest in running ancestor-simulations.

Second, even with the premise of physical consciousness, I claim that me-simulations could be made at least 100 billion times less computationally expensive than full simulations. Here are my reasons to believe so:

1) Even though it would be necessary to generate consciousness to mimic human processes, it would only be necessary for the humans you directly interact with, so maybe 10 hours a day of human consciousness other than yours.

2) The physical volume needed for a me-simulation would be at most the size of your room (about 20 square meters * the height of your room). If you are in a room, this is trivially true; if you are in the outer world, I believe you are less aware of the rest of the physical world, so the "complexity of reality" necessary for you to believe the world is real is about the same as if you were in your room. However, Earth's surface is about 500 million km squared, so 2.5 * 10^13 times greater than 20 m². It follows that it would be at least 100 billion times less computationally intensive to run a me-simulation, assuming the ancestor-simulation simulates at least the same height.

3) You would only need to run one ancestor-simulation to generate an infinitely large number of me-simulations: if you know the environment and have in memory how the conscious humans behaved, you can easily run a me-simulation where the other characters are just replays of what they did in the past (when they are in your focus), but only one person (or a small number of people) is conscious. A bit like in Westworld: there are some plots where the robots are really convincing, but in general they are not.
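
The area comparison in point 2) can be checked with a quick back-of-the-envelope calculation (a rough sketch only; it takes the 500 million km² figure from above and assumes, as in the argument, that simulation cost scales with simulated surface area, with the same height in both cases):

```python
# Rough check of the cost ratio claimed in point 2).
# Assumption: simulation cost scales with the simulated surface area
# (the same height is simulated in both cases, so it cancels out).

ROOM_AREA_M2 = 20.0                  # area of a typical room, in m^2
EARTH_SURFACE_KM2 = 500e6            # Earth's surface, ~500 million km^2
EARTH_SURFACE_M2 = EARTH_SURFACE_KM2 * 1e6

ratio = EARTH_SURFACE_M2 / ROOM_AREA_M2
print(f"{ratio:.1e}")  # 2.5e+13, well above the 10^11 (100 billion) threshold
```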

I am only starting to answer your comment and have already written a lot, so I might just create a post about selective simulations later today. If so, you could reply to this part there.

II - "the disagreement between yourself and effective altruists isn't normative, it's empirical"

I think I understand your point. If I got it right, you are saying that I am not contradicting Effective Altruism, but only applying empirical reasoning to EA's principles? If so, I agree with your claim. I guess I tried to apply Effective Altruism's principles, and in particular the Utilitarian view (which might be controversial, even inside EA, I don't know), to the described world (a video-game-like life) to show that it results in what I called ethical egoism.

If, counterfactually, there were other conscious beings in the world, would you think that they also had moral worth?

I don't deeply share the Utilitarian view. To describe it shortly, I believe I am a Solipsist who values the perception of complexity. So I value my own survival, because I may not have any proof of any kind of the complexity of the Universe if I cease to exist, and also the survival of Humanity (because I believe humans are amazingly complex creatures), but I don't value the positive subjective perceptions of other conscious human beings. I value my own positive subjective perceptions because they maximize my utility function of maximizing my perception of complexity.

III - "I think that it's still often worth it to act altruistically"

Let's suppose I had answered yes.

To come back to what we said about physicalism and the simulation of other conscious agents in a first-person view. You said:

"[...] even in a first-person simulation, the people you were interacting with would be conscious as long as they were within your frame of awareness (otherwise the simulation couldn't be accurate), it's just that they would blink out of existence once they left your frame of awareness."

I claim that even though they would be conscious within the "frame of awareness", they would not deserve any altruism, even considering expected value. The reason is that if you give them only sporadic consciousness, it greatly lacks what I consider a conscious human's subjective experience. In particular, if the other simulated humans I interact with do not have any continuity in their consciousness, and the rest is just false memories (e.g. Westworld), I would give a much greater value to my own subjective experience (at least 1000 times more valuable, I would say).

So if I have 10$ in my pocket, I would still use it to buy myself an ice cream, and I would not buy one for some random guys in the street, even if my 10$ could buy them 100 ice creams (but I might hesitate at something like 10,000, for instance). The issue here is the probability of consciousness. If I assume there is a 1/1,000,000 chance someone is conscious and that I value my subjective experiences 10,000 times more than theirs, I would need to be able to buy something like 10,000 * 1,000,000 = 10,000,000,000 ice creams (for more people than there are on Earth) to not buy myself one.
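
The arithmetic above can be written out as a small expected-value sketch (the 10,000 value ratio and the 1/1,000,000 consciousness probability are the illustrative numbers from the paragraph, not claims about the real world):

```python
# Expected-value sketch of the ice cream example (illustrative numbers only).
MY_UTILITY = 10                  # utility I get from one ice cream
VALUE_RATIO = 10_000             # how much more I value my own experience
P_CONSCIOUS = 1 / 1_000_000      # assumed chance a random stranger is conscious

# Expected utility (to me) of giving a stranger one ice cream:
eu_stranger = MY_UTILITY * P_CONSCIOUS / VALUE_RATIO

# Number of stranger ice creams needed to match one ice cream for myself:
break_even = MY_UTILITY / eu_stranger
print(round(break_even))  # 10000000000, i.e. 10 billion
```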

Anyway, I am very glad you clarified my hypothesis with your comment, asked for clarification and objected courteously. My post was not explicit at all and lacked the rational/detailed arguments that you provided. Answering you made me think a lot. Thank you.

Feel free to let me know what you think. PS: I might do a full post on the probability of me-simulations in less than 10h as I said above.

A Logician, an Entrepreneur, and a Hacker, discussing Intelligence

2018-05-01T20:45:58.143Z · score: 11 (9 votes)
Comment by mtrazzi on [deleted post] 2018-04-30T15:22:50.684Z

Yes, basically communicating is understanding how information is encoded in the receiver's model of reality and encoding your message without hurting their feelings (if not necessary).

Here is my feedback on your post:

Tact: Interesting and funny!

Nerd speaking: I think I had already understood your point from the tact filters quote, so what came after was redundant.

More Tact: But I nonetheless liked the diagrams.

Comment by mtrazzi on [deleted post] 2018-04-30T15:12:02.969Z
Thanks again for this piece. I'll follow your daily posts and comment on them regularly!

Very kind of you!

if an AGI could simulate quasi-perfectly a human brain, with human knowledge encoded inside, would your utility function be satisfied?

Interesting thought experiment. I would say no, but it depends on how it simulates the human brain.

If brain scans allowed us to quasi-perfectly get the position of every cell in a brain, and we knew how to model their interactions, we could have an electric circuit without knowing much about the information inside it.

So we would have the code, but nothing about the meaning, so it would not "understand how the knowledge is encoded".

is the goal of understanding all there is to the utility function? What would the AGI do, once able to model precisely the way humans encode knowledge? If the AGI has the keys to the observable universe, what does it do with it?

You're absolutely right in the sense that it does not constitute a valid utility function for the alignment problem, or for anything really useful, if it is the one used for the final goal.

My point was that Yudkowsky showed how the encoding utility function was limited because of the simple way of maximizing it, but if we changed it to my "understanding the encoding" version, it could be much more interesting, enough to lead to AGI.

Once the AGI knows how humans encode knowledge, it can basically have at least the same model of reality as humans (assuming it has the same input sensors (e.g. eyes, ears, etc.), which is not very difficult) by encoding knowledge the same way. And then, because it is in silico (and not in a biological prison), it can do everything humans do but much faster (basically an ASI).

I guess if it reaches this point, it would be the same as humans living for thousands of subjective years, and so it would be able to do whatever humans would believe useful if we were given more time to think about it.

Should we implement ought statements inside it in addition to the "understanding the encoding" utility function? Or the "faster human" is enough? I don't know.

Comment by mtrazzi on Should an AGI build a telescope to spot intergalactic Segways? · 2018-04-29T11:31:50.111Z · score: 2 (1 votes) · LW · GW

Thank you for your well-formulated comment. I agree that more details/precision could be much appreciated.

I am confused by the title, and the conclusion.

Not understanding the title and the conclusion is a natural/expected reaction. I had wanted to write this Meetup summary for a long time and only thought of this funny headline for a title, and I guess the conclusion might seem like a weird way to land back on one's feet. I was also short on time, so I had to be overly implicit. I will nonetheless try to answer your comment as best I can.

If an ASI sees a Segway, a single time, would it be able to infer what is does, what's it for, how to build it, etc.? I think so! The purpose of one-shot learning models is to provide a context, a structure, that can be augmented with a new concept based on a single example. This is far simpler than coming up with said new concept from scratch.

I also think so! I totally agree that providing a structure/context is much simpler than truly innovating by creating a completely new idea (such as general relativity for Einstein).

See, on efficient use of sensory data, That Alien Message.

Totally relevant reference, thank you.

I interpret your post as « no, an ASI shouldn't build the telescope, because it's a waste of resources and it wouldn't even need it » but I'm not sure this was the message you wanted to send.

I think I was not clear enough about the message. Thank you for asking for clarifications.

Actually, I believe the ASI should build the telescope (and it might not even be a waste of resources if it knows physics well enough to optimize it in a smart way).

The Segway is not, in itself, a complicated engineering product. An ASI could, in principle, generalize the concept of a Segway from seeing it only once (as you mentioned) and understand the usage humans would have of it (if it had some prior knowledge about humans, of course).

What I meant by "Intergalactic Segway" is an ad hoc engineering product made by some strange intergalactic empire we have never met. Segways seem really convenient for humans, but only because they fit our biological bodies, which are very specific and were shaped by natural selection (which, in turn, was shaped by conditions on planet Earth).

I believe aliens might have different needs and engineering constraints, that they would end up building "Intergalactic Segways" suited to those needs, and that we would not have a single clue what those "Intergalactic Segways" even look like.

Furthermore, even if it were more resource-efficient for the ASI to generate 10^30 simulations of the universe to learn how other aliens behave, I think that is not enough.

I think the search space for alien civilizations (if we assume that human-level-intelligence civilizations are rare in the universe) is huge, that running sufficiently precise physical simulations over this incredibly vast space would prove impossible, and that building a telescope (or simply sending von Neumann probes to the edges of the observable universe) would be the only efficient solution.

This is all I have to say for now (I have not thought about it further).

If you have more criticism/questions, I would be happy to discuss further.

Should an AGI build a telescope to spot intergalactic Segways?

2018-04-28T21:55:15.664Z · score: 14 (4 votes)