Wanting to Succeed on Every Metric Presented 2021-04-12T20:43:01.240Z
Using GPT-N to Solve Interpretability of Neural Networks: A Research Agenda 2020-09-03T18:27:05.860Z
What's a Decomposable Alignment Topic? 2020-08-21T22:57:00.642Z
Mapping Out Alignment 2020-08-15T01:02:31.489Z
Writing Piano Songs: A Journey 2020-08-10T21:50:25.099Z
Solving Key Alignment Problems Group 2020-08-03T19:30:45.916Z
No Ultimate Goal and a Small Existential Crisis 2020-07-24T18:39:40.398Z
Seeking Power is Often Convergently Instrumental in MDPs 2019-12-05T02:33:34.321Z
"Mild Hallucination" Test 2019-10-10T17:57:42.471Z
Finding Cruxes 2019-09-20T23:54:47.532Z
False Dilemmas w/ exercises 2019-09-17T22:35:33.882Z
Category Qualifications (w/ exercises) 2019-09-15T16:28:53.149Z
Proving Too Much (w/ exercises) 2019-09-15T02:28:51.812Z
Arguing Well Sequence 2019-09-15T02:01:30.976Z
Trauma, Meditation, and a Cool Scar 2019-08-06T16:17:39.912Z
Kissing Scars 2019-05-09T16:00:59.596Z
Towards a Quieter Life 2019-04-07T18:28:15.225Z
Modelling Model Comparisons 2019-04-04T17:26:45.565Z
Formalizing Ideal Generalization 2018-10-29T19:46:59.355Z
Saving the world in 80 days: Epilogue 2018-07-28T17:04:25.998Z
Today a Tragedy 2018-06-13T01:58:05.056Z
Trajectory 2018-06-02T18:29:06.023Z
Gaining Approval: Insights From "How To Prove It" 2018-05-13T18:34:54.891Z
Saving the world in 80 days: Prologue 2018-05-09T21:16:03.875Z
Mental TAPs 2018-02-08T17:26:36.774Z


Comment by elriggs on Going Out With Dignity · 2021-07-11T22:03:28.521Z · LW · GW

High-detachment is great!...for certain situations at certain times. I really enjoy Rob Burbea's "Seeing That Frees" meta-framework regarding meditation techniques/viewpoints: they are tools to be picked up and put down. If viewing the world in complete acceptance helps your suffering in that moment, then great! But you wouldn't want to do this all the time; eating and basic hygiene are actions of non-acceptance at a certain conceptual level. Same with impermanence and no-self. Your math friend may be open to that book recommendation.

TurnTrout argues that Tsuyoku Naritai is not it, and maybe he is right. I do not know what the correct emotion feels like, but I think maybe DF knew.

I've had a similar experience with feeling Tsuyoku Naritai, but it being a temporary state (a few hours or days at a time maybe). I'm currently extremely interested in putting on different mindsets/perspectives for different purposes. An example is purposely being low-energy for sleeping and high-energy for waking up (as in purposely cultivating a "low-energy" mindset and having a TAP to tune into that mindset when you're trying to sleep). In this case, Tsuyoku Naritai may be good for lighting the fire for a day or two to get good habits started. But I think people may use unnecessary muscles when tuning into this mindset, causing it to be tense, headache-inducing, and aversive.

This is speculation though, but I am, again, very interested in this topic and discussing it more. Feel free to dm me as well if you'd like to chat or call.

Comment by elriggs on Today a Tragedy · 2021-06-10T09:27:44.628Z · LW · GW

Hey Will, 

I'm a couple days early, but I just woke up from a dream where I was doing a Duo event with the Cat in the Hat and was thinking of you.

You remember that one year we did a Duo event but as fillers? There were other filler groups and the judges were all supposed to mark us last below the actual Duo groups. But we thought, if there are multiple filler groups, the judge will have to rank them relative to each other, why not be the best filler?

So we made a skit out of being the filler group, asking the judge to rate us below any actual Duo pairs, but above all the fillers. I remember telling a few Chuck Norris jokes as well; it was really fun! Our plan worked a little too well though, haha, and we ended up progressing to Semi-Finals, then the Finals the next day. 

I remember our hubris at thinking we could prepare a working piece by the next day (we did Pinocchio!) and try to beat everyone else who had been practicing all year. We asked our parents to drop us off early the next day so we could practice, but we were the first ones there so the doors were locked, so my dad stayed to watch us. It was crazy, but we ended up running through the new piece 10-20 times really quick, memorizing (approximations of) all the lines, creating the choreography, and actually having a decent duo piece after a couple hours!

We still got 6/6 place as we should! Hahahahaha. But it was really cool to actually try to learn a piece really quickly, to have someone else with me as we flew too close to the sun. I remember the exact room we performed it in too, and I remember us reenacting the "stuck in the whale" scene. It was really fun, hahaha. 

I'm currently on day 4 of an intensive math study retreat. It's really slow at times and my eyes definitely glaze over at times. Your death made it much easier to ignore the unimportant. I almost quit my job when you died! Though, if I had quit, I would still have two working eyes; hindsight is 20/20, hahaha. 

Dang, I'm back in our hometown, and it would've been nice to catch up with you.

It really sucks that you're gone,


Comment by elriggs on Quick examination of miles per micromort for US drivers, with adjustments for safety-increasing behavior · 2021-04-21T14:02:51.795Z · LW · GW

Driving at night is not just about your own tiredness/circadian rhythm, there are other people driving tired and drunk. 

In my college town, there's a 1/4 mile long plastic-fenced in road, leading to rental houses. Every Thursday-Sunday night, someone will run their car into this fence, leaving broken fence marks the next morning. 

Comment by elriggs on Networks of Meaning · 2021-04-17T21:33:27.131Z · LW · GW

Conspiracy theories are usually represented with a large number of connections (and a distrust of those in power). Notably, I love Scott Alexander's many self-created nonsense connections (see all of Unsong), which still end up evoking this sense of importance even though I know it's fiction.

I'm glad you homed in on coherence, existential mattering, and purpose because there are an infinite number of connections between things that feel unmeaningful (e.g. the grass and my mouse pad are both green, it is hot outside my door and also hot a few feet to the right of my door, etc.). Homing in on what specific properties make a connection feel meaningful seems interesting (as well as looking at the existing literature and listing specific, real-life examples, but that's just my personal preference).

The strong emotion causing meaning (as opposed to a connection evoking meaning) was interesting, though couldn't you say that specific connections cause strong emotions? For example, someone making fun of something I strongly identify with ("All your actions are selfish!") as opposed to something I don't really care about ("You're a bad tuba player!") affects me differently; I could describe each activity as weaker and stronger "connections" to myself. 

A specific strong emotion that doesn't quite fit is experiencing jhana, which I could describe as a meditative flow state that feels really good. It felt important and meaningful, though part of that is I had a pre-existing model of what "jhana" was and what it may mean. Specifically, I thought it meant that the rest of the crazy-sounding meditation claims like infinite happiness and willpower are way more likely to be true.

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-17T18:43:07.013Z · LW · GW

I'm currently interested in the idea of "the physical sensation correlation of different mental states", like becoming intimately aware of the visceral, physical felt sense of being stressed or triggered, or relaxed and open, or having a strong sense of self or a small sense of identity, or a strong emotion in physical sensations only or a strong emotion with a story and sense of self attached or...

Specifically practicing this would look like paying attention to your body's felt sense while doing [thing] in different ways (like interacting with your emotions using different system's techniques). Building this skill will create higher quality feedback from your body's felt-sense, allowing a greater ability to identify different states in the wild. This post's idea of hijacked values and your comment point to a specific feeling attached to hijacked values. 

This better bodily intuition may be a more natural, long term solution to these types of problems than what I would naively come up with (like TAPs or denying the part of me that actually wants the "bad" thing)

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-14T03:19:49.886Z · LW · GW

There is something here along the lines of "becoming skilled at a thing helps you better understand the appeal (and costs) of being skilled at other things". It's definitely not the only thing you need because I've been highly skilled at improv piano, but still desired these other things.

What I want to point out in the post is the disconnect between becoming highly skilled and what you actually value. It's like eating food because it's popular as opposed to actually tasting it and seeing if you like that taste (there was an old story here on LW about this, I think). 

Making the cost explicit does help ("it would take decades to become a grandmaster"), but there can be a lack of feedback on why becoming a national master sounds appealing to you. Like the idea of being [cool title] sounds appealing, but is the actual, visceral, moment-to-moment experience of it undeniably enjoyable to you? (in this case, you can only give an educated guess until you become it, but an educated guess can be good enough!)

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-13T19:47:05.706Z · LW · GW

Oh! That makes sense as a post on its own.

A list of pros and cons of current rationalist techniques could then be compared to your ideal version of rationality to see what's lacking (or to point out holes in the "ideal version"). Also, "current rationality techniques" is ill-defined in my head and the closest I can imagine is the CFAR manual, though that is not the list I would've made.

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-13T19:19:27.540Z · LW · GW

No, which is part of the point.

I don't know what point you're referring to here. Do you mean that listing specific skills of rationality is bad for systematized winning?

I also want to wrangle more specifics from you, but I can just wait for your post:)

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-13T17:24:13.347Z · LW · GW

Regarding "problems we don't understand", you pointed out an important meta-systematic skill: figuring out when different systems apply and don't apply (by applying new systems learned to a list of 20 or so big problems). 

The new post you're alluding to sounds interesting, but rationality is a loaded term. Do you have specific skills of rationality in mind for that post?

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-13T17:18:36.505Z · LW · GW

Your bulleted self-inquiries are very useful. These seem like more playful questions that I would feel comfortable asking someone else if I felt they were being hijacked by a metric/scaling (where a more naive approach could come across as judgmental and untactful). 

Not all of your questions fit every situation of course, but that's not the point! Actually, I want to try out a few examples:

Long-distance running

What would it be like to be very skilled? I would be much fitter than I am now, so I'd be less winded when doing other things. I feel like there's a bragging angle, but who likes a bragger?

What would suck? The long practice hours, I will likely be more prone to injuries and joint problems.

What's the good part of training to be a skilled runner? Consistently being outside would be nice. I think I would feel better after training.

What would be the bad part of training? That out-of-breath feeling and burning muscles are uncomfortable.

Are there people who aren't skilled long distance runners, but are still better in a meaningful way? Swimmers are very fit, have greater upper body strength, and aren't as prone to injuries (though looking it up, they do suffer shoulder injuries)

AI Alignment Researcher

What would it look like to be successful? Being paid to do research full time. Making meaningful contributions that reduce x-risk. Having lots of smart people who will listen and give you feedback. Having a good understanding of lots of different, interesting topics.

What would suck about it? Maybe being in an official position will cause counter-productive pressure/responsibility to make meaningful contributions. I will be open to more criticism. I may feel responsible and slightly helpless regarding people who want to work on alignment, but have trouble finding funding.

What would be great about the process of becoming successful? Learning interesting subjects and becoming better at working through ideas. Gaining new frames to view problems. Meeting new people to discuss interesting ideas and "iron sharpening iron". Knowing I'm working on something that feels legitimately important.

What would suck about the process? The context-loading of math texts is something to get used to. There's a chance of failure due to lack of skill or not knowing the right people. There is no road map to guarantee success, so there is a lot of uncertainty on what to do specifically. 

Any people who are also great but not successful Alignment researchers? There are people who are good at communicating these ideas with others (for persuasion and distillation), or who work at machine learning jobs and will be in good positions of power for AI safety concerns. There are also other x-risks to work on out there and EA fields that also viscerally feel important.

I'll leave it here due to time, but I think I would add "How could I make the process of getting good more enjoyable?" and making explicit what goals I actually care about.

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-13T12:24:24.325Z · LW · GW

"Rationality" was a vague metric for me when I first started reading the sequences. Breaking it down into clear skills (taking ideas seriously, noticing confusion, "truth" as predictive accuracy, etc) with explicit benefits and common pitfalls would be useful.

Once you nail down what metrics you're talking about when you say "rationality", I believe the costs and benefits of investing in becoming more rational will be clearer. 

Feel free to brainstorm as replies to this comment, I would enjoy a full post on the subject.

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-13T07:12:09.527Z · LW · GW

If the details are available within you

Maybe! One framing is: I expected "great accomplishments that people I admire say is good" to make me very happy or very liked, but reality was not as great, even negative sometimes. This pattern was hidden because:

  •  I wasn't explicit with my expectations - if I was clear with how happy all A's would make me and paid attention when I did get an A, I would realize the disconnect sooner. 
    • Related: making explicit the goals that all A's help me with (seriously considering why it matters in fine-grained detail) would've been much more aligned with my goals than the proxy "get all A's". This serious analysis is not something I really did, but rationality skills of thinking an idea through while noticing confusions helped (I include focusing here)
  • I was in love with the idea of e.g. running a marathon and didn't pay attention to how accomplishing it actually made me feel in physical sensations, or how the process I went about achieving that goal made me feel in physical sensations. This even happened with food! I used to eat a box of Zebra cakes (processed pastry cakes) on my drive home, but one time I decided to actually taste it instead of eating while distracted (inspired by mindful eating meditation). It was actually kind of gross and waxy and weirdly sweet, and I haven't eaten more than a few of them these past several years.

I liked that you provided a lot of examples!

Thanks! Real life examples keep me honest. I was even thinking of your post, specifically the image of you scrambling to maintain and improve 10+ skills. How would you answer your own question?

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-13T00:22:49.467Z · LW · GW

For the gamification one, they tend to involve a bunch of open loops that leave you wanting resolution (the cliffhanger is a great example). This causes thoughts regarding the loop to come up spontaneously. In context, the loops aren't that important, but locally, they may appear more important (like being angry at a loved one for interrupting you or preventing you from finishing a show/chapter/etc). I think being triggered in general counts here. Typical antidotes are the traditional "take a walk" and, regarding meditation, better awareness and capacity to let go (I'm not arguing here that meditation works, but I may write a post on that)

This is different than cults and abusive relationships, where there is a strong motivation to leave your normal environment (the type of abusive relationship I have in mind is "you can't see your friends anymore"), making the local rewards and punishments more salient as time goes by. I may even include drugs w/ withdrawals here. The usual solution is leaving those environments for healthier ones to compare against, though this happens in transitions due to ideas coupling (bucket errors). [This feels unsubstantiated to me and would benefit from more specific examples]. The gamification one had two answers: "change environment" or "change your relationship to the environment". There may be some situations where you're forced in a horrible environment and your only choice is to change your relationship to your environment, but this would require some high-level meditation insights in my opinion. "Leaving" seems the most actionable response. Maybe "recognizing" you're in a cult is an important vein for future thought. 

The identity-based ones will cause ignoring of/flinching from incompatible thoughts. This may benefit from becoming more sensitive to subtle thoughts you typically ignore (noticing confusions was a similar process for me). I feel like meditating relates, but I'm unsure of the mechanism. It's mumble mumble everything is empty mumble.

There's also a thread on "horrible events cause you to realize what's important" to look into.

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-12T22:48:54.238Z · LW · GW

Thanks! Changed to "social appraisals". Someone's opinion is definitely a loaded term which may lead to pattern matching. I'm also fine with more novel phrasing since it's explained immediately after.

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-12T22:43:20.792Z · LW · GW

I may have been misleading, but my point is not about tradeoffs, but about not pursuing things that you don't actually care about upon reflection. 

Thanks for bringing this up. I believe explicitly stating tradeoffs is important because you may then realize that you actually don't care about them. For example, I don't actually care about being "enlightened" or reaching stage 10 in TMI (though I thought I did). I would have come to a better conclusion and had better meditation sessions earlier if I made the metrics I care about explicit. 

[Though, this isn't true for looking cool dancing or eating new foods because I don't know if I like them until it happens]

Comment by elriggs on Wanting to Succeed on Every Metric Presented · 2021-04-12T21:23:10.254Z · LW · GW

This post's purpose is only to point out the pattern and nudge basic self-reflection, and that it is sometimes enough to solve the problem. It doesn't solve all problems regarding hijacked values, which is what I'm currently trying to find a good-enough solution to (or create a good-enough taxonomy of types of hijacked values and heuristics).

For example, some of these are identity based. I saw myself as a hard worker, so I worked hard at every school assignment, even when it wasn't at all necessary. 

Others are gamification-y (like a video game): 100%-ing a game, reaching a certain rank/level, daily rewards!, or recommended videos/articles, or a cliff-hanger in a story with the next chapter available.

Others are extreme social incentive-y, such as cults, abusive relationships, and multi-level marketing, where local rewards and punishments become more salient than what you used to value (or would value if you left that environment for a few years).

I'm currently not in love with these divisions. A better framing (according to the metric of achieving your goals better) would make it clear what action to take in each situation to better know your true values.

Comment by elriggs on Today a Tragedy · 2021-04-11T22:48:53.254Z · LW · GW

Happy belated birthday, brother. My grandfather got married yesterday, so I was away from my laptop. You've missed a lot. I think it would've been real fun to discuss Gamestop stocks with you; I think you would've invested!

I've almost finished grad school as well and intend to study math for a bit on my own. You got to experience life after college for a bit; I wonder how your business and overall life would've gone since. I think we would've argued about vaccines, lockdowns, and masks when we saw each other. I remember arguing with you about politics and religion, but I don't remember us ever getting mad at each other (except freshman year, but that was over a girl, haha).

I don't even know where your grave is, but I'll continue to talk to you here.

~Take care

Comment by elriggs on Trauma, Meditation, and a Cool Scar · 2021-01-11T03:49:49.962Z · LW · GW

Looking back, this all seems mostly correct, but it's missing a couple of assumed steps.

I've talked to one person since about their mild anxiety talking to certain types of people; I found two additional steps that helped them.

  1. Actually trying to become better
  2. Understanding that their reaction is appropriate for some situations (like the original trauma), but it's overgeneralized to actually safe situations.

These steps are assumed in this post because, in my case, it's obvious I'm overreacting (there's no drone) and I understand PTSD is common and treatable. Step 2 is very much Kaj Sotala's Internal Family Systems post, while this post is mainly about accessing lower-level sensory information about the trauma-reaction.

Comment by elriggs on Reframing Impact · 2020-12-09T23:29:51.567Z · LW · GW

This post (or sequence of posts) not only gave me a better handle on impact and what that means for agents, but it also is a concrete example of de-confusion work. The execution of the explanations gives an "obvious in hindsight" feeling, with "5-minute timer"-like questions which pushed me to actually try and solve the open question of an impact measure. It's even inspired me to apply this approach to other topics in my life that had previously confused me; it gave me the tools and a model to follow.

And, the illustrations are pretty fun and engaging, too.

Comment by elriggs on Using GPT-N to Solve Interpretability of Neural Networks: A Research Agenda · 2020-09-05T20:44:06.743Z · LW · GW

I’m expecting either (1) A future GPT’s meta-learning combined with better prompt engineering will be able to learn the correct distribution and find the correct distribution, respectively. Or (2) curating enough examples will be good enough (though I’m not sure if GPT-3 could do it even then).

Comment by elriggs on Using GPT-N to Solve Interpretability of Neural Networks: A Research Agenda · 2020-09-04T15:15:07.713Z · LW · GW

I also expect it to be harder, and a 10-30% chance that it will require some new insight seems reasonable.

Comment by elriggs on What's a Decomposable Alignment Topic? · 2020-08-25T01:51:46.026Z · LW · GW

b) seems right. I'm unsure what (a) could mean (not much overhead?).

I find it confusing to think about decomposability w/o considering the capabilities of the people I'm handing the tasks off to. I would only add:

By "smart", assume they can notice confusion, google, and program

since that makes the capabilities explicit.

Comment by elriggs on What's a Decomposable Alignment Topic? · 2020-08-24T22:16:10.791Z · LW · GW

If you only had access to people who can google, program, and notice confusion, how could you utilize that to make conceptual progress on a topic you care about?

Decomposable: Make a simple first person shooter. Could be decomposed into creating asset models, and various parts of the actual code can be decomposed (input-mapping, getting/dealing damage).

Non-decomposable: Help me write an awesome piano song. Although this can be decomposed, I don't expect anyone to have the skills required (and acquiring the skills requires too much overhead).

Let's operationalize "too much overhead" to mean "takes more than 10 hours to do useful, meaningful tasks".

Comment by elriggs on What's a Decomposable Alignment Topic? · 2020-08-23T23:24:58.839Z · LW · GW

The first one. As long as you can decompose the open problem into tractable, bite-sized pieces, it's good.

Vanessa mentioned some strategies that might generalize to other open problems: group decomposition (we decide how to break a problem up), programming to empirically verify X, and literature reviews.

Comment by elriggs on What's a Decomposable Alignment Topic? · 2020-08-22T03:27:44.221Z · LW · GW

I don't know (partially because I'm unsure who would stay and leave).

If you didn't take math background into consideration and wrote a proposal (saying "requires background in real analysis" or ...), then that may push out people w/o that background but also attract people with that background.

As long as pre-reqs are explicit, you should go for it.

Comment by elriggs on Writing Piano Songs: A Journey · 2020-08-15T00:07:19.153Z · LW · GW

I tend to write melodies in multiple different ways:

1. Hearing it in my head, then playing it out. It's very easy to generate (like GPT but with melodies), but transcribing is very hard! The common advice is to sing it out, and then match it with the instrument. This is exactly what you did with whistling. If I don't record it, I will very often not remember it at all later; very similar to forgetting a dream. When I hear someone else's piano piece (or my own recorded), I will often think "I would've played that part differently" which is the same as my brain predicting a different melody.

2. "Asemic playing" (thanks for the phrase!) - I've improv-ed for hundreds of hours, and I very often run into playing similar patterns when I'm in similar "areas" such as playing the same chord progression. I'll often have (1) melodies playing in my head while improvising, but I will often play the "wrong" note and it still sounds good. Over the years, I've gotten much better at remembering melodies I just played (because my brain predicts that the melody will repeat) and playing the "correct" note in my head on the fly.

3. Smashing "concepts" into a melody:

  • What if I played this melody backwards?
  • Pressed every note twice?
  • Held every other note a half-note longer?
  • Used a different chord progression (so specific notes of the melody need to change to harmonize)
  • Taking a specific pattern of a melody, like which notes it uses, and playing new patterns there.
  • Taking a specific pattern of a melody, like the rhythm between the notes (how long you hold each note, including rests) and applying it to other melodies.
  • Taking a specific pattern of a melody, like the exact rhythm and relative notes, and starting on a different note (then continuing to play the same notes, relatively)
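
Several of the transformations listed above are mechanical enough to sketch in code. This is only an illustration under assumptions that are not in the original comment: a melody is represented as a list of (MIDI pitch, duration in beats) pairs, a half note is taken to be 2 beats, "every other note" is read as the 2nd, 4th, ... notes, and all function names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (MIDI pitch number, duration in beats).
Note = Tuple[int, float]
Melody = List[Note]


def backwards(melody: Melody) -> Melody:
    """Play the melody backwards (reverse the note order)."""
    return list(reversed(melody))


def press_twice(melody: Melody) -> Melody:
    """Press every note twice in a row."""
    return [note for note in melody for _ in range(2)]


def lengthen_every_other(melody: Melody, extra: float = 2.0) -> Melody:
    """Hold the 2nd, 4th, ... notes longer (default: a half note, assumed 2 beats)."""
    return [
        (pitch, dur + extra) if i % 2 == 1 else (pitch, dur)
        for i, (pitch, dur) in enumerate(melody)
    ]


def transpose(melody: Melody, semitones: int) -> Melody:
    """Start on a different note while keeping the same relative notes and rhythm."""
    return [(pitch + semitones, dur) for pitch, dur in melody]


def apply_rhythm(melody: Melody, rhythm_source: Melody) -> Melody:
    """Keep this melody's pitches but borrow the durations of another melody."""
    return [(pitch, dur) for (pitch, _), (_, dur) in zip(melody, rhythm_source)]


melody = [(60, 1.0), (62, 1.0), (64, 2.0)]  # C, D, E
print(backwards(melody))
print(transpose(melody, 2))
```

The chord-progression idea is left out on purpose, since deciding which notes must change to harmonize is a musical judgment rather than a fixed rule.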

Comment by elriggs on Solving Key Alignment Problems Group · 2020-08-08T17:13:05.149Z · LW · GW

Thanks for reaching out. I've sent you the links in a DM.

I would like to be listed in the list of various AI Safety initiatives.

I'm looking forward to this month's AI Safety discussion day (I saw yours and Vanessa's post about it in Diffractor's Discord).

I'll start reading others' maps of Alignment in a couple days, so I would appreciate the link from FLI; thank you. Gyrodiot's post has several links related to "mapping AI", including one from FLI (Benefits and Risks of AI), but it seems like a different link than the one you meant.

Comment by elriggs on Solving Key Alignment Problems Group · 2020-08-04T11:59:35.801Z · LW · GW

It's not clear in the OP, but I'm planning on a depth-first search as opposed to breadth. Week 2-XX will focus on a singular topic (like turntrout's impact measures or johnswentworth's abstractions).

I am looking forward to disjunctive maps though!

Comment by elriggs on No Ultimate Goal and a Small Existential Crisis · 2020-07-27T14:22:47.561Z · LW · GW

But how do you verify that? What does it mean (to you) to become more conscious of it?

Comment by elriggs on Become a person who Actually Does Things · 2020-07-26T21:59:40.353Z · LW · GW

I'm very confused about your interpretation of the post. I read the post as saying:

Most people have too high of a risk/reward threshold for action (it has to be the perfect opportunity to act). Having a lower threshold leads to much more rewards. To become that person, install the TAP to notice when a problem shows up now and try to fix it now. Being that type of person increases the chances of finding golden opportunities/ black swans.

But I've also installed this habit before (noticing the risks were much smaller in reality than in my head!), so maybe that's why the purpose/message was clear to me?

My personal standard for LW posts would prefer more specific examples, so that it's more fun, clear, and vivid in my mind.

What benefits do you think this post would gain if it fit your standard of (1-4) in your comment?

Comment by elriggs on No Ultimate Goal and a Small Existential Crisis · 2020-07-26T16:03:35.887Z · LW · GW

I didn't mean to come across as "not knowing what I want at all", but it's more like your last paragraphs on uncertainty (I've added a tl;dr at the beginning to help clarify).

1) I have my values, but they are not completely coherent, and I don't know their extrapolation... Give me immortality, food, and books, and I will gradually find out what else do I want

Thanks to your comment, I think I understand the question I want to ask: What sensations/ feelings do you experience that you use to know "this is what I value"?

Comment by elriggs on No Ultimate Goal and a Small Existential Crisis · 2020-07-26T15:47:58.386Z · LW · GW

It doesn't make sense to me either, but how do you specifically know you're on the right track? What specific qualia do you experience?

Comment by elriggs on No Ultimate Goal and a Small Existential Crisis · 2020-07-24T23:10:00.075Z · LW · GW

I found this absurdly hilarious; loved the punchline. Thank you.

But then, how do you decide what to do?

Comment by elriggs on No Ultimate Goal and a Small Existential Crisis · 2020-07-24T22:34:47.785Z · LW · GW

I think you're saying that there are two different relationships to goals (lighthouses and final destinations).

Could you give an example of a goal you used to treat as a final destination, but you now treat as a lighthouse? And in what way is it better?

Comment by elriggs on No Ultimate Goal and a Small Existential Crisis · 2020-07-24T22:30:09.716Z · LW · GW

Thanks for all the links! It will take some time to read through them, but it's good to have them all in one place:)

I especially appreciate you writing out your current views explicitly. Your view seems to boil down to useful heuristics/strategies, but it doesn't explain why they're good. For example:

Following these strategies for setting goals, I've noticed myself and those close to me being happier, and I haven't burnt out doing this so it's more sustainable.

Comment by elriggs on No Ultimate Goal and a Small Existential Crisis · 2020-07-24T22:14:53.593Z · LW · GW

I like the "how to weigh multiple values" frame.

But to use the initial argument in the post, why should pleasure and pain be the ultimate metric?

Comment by elriggs on No Ultimate Goal and a Small Existential Crisis · 2020-07-24T22:00:24.458Z · LW · GW

Thanks for the link! I've just read ~12 of them, and I think I've read them before. They sort of touched on the subject, but do you have a specific article(s) in the sequence that speak on it more directly?

Comment by elriggs on Meta-preferences are weird · 2020-07-18T20:08:30.085Z · LW · GW

I appreciate the write-up!

Explaining 3 possible meanings of "meta-preferences" was insightful and rang true to me.

I was very confused about the lake/ocean distance metaphor, but I think I've got it now (?) Coordinates represent preferences and an arrow/gradient represents meta-preferences. Ex. I want to smoke (0,1), but I want to quit smoking [wanting to move from (0,1) --> (0,-1)].

I took suicide as a "discontinuous jump" in the OP to mean "a preference to have no preference". This is a large jump, but it's not how I interpret suicide. How is it even a meta-preference? "I prefer to be hurt/depressed/a burden, but I have a meta-preference to not be"?

Comment by elriggs on Does equanimity prevent negative utility? · 2020-06-17T02:56:11.637Z · LW · GW

Based on your comment on Ricraz's answer, "something that is bad for me", I will make a guess at what you mean. Let me know if it answers your question.

Objectively (outside-perspective):

"Bad" requires defining. Define the utility function, and the answer falls out.

Depending on your goals and the context of being hurt, it might be negative, positive, or a mix of both! (ex. being unintentionally burned while cooking, being a masochist, and being burned to protect a clumsy loved one, respectively)


If you mean negative utility as in the negative valence of an observation, then I would argue that negative valence is a signal telling you how well you're achieving a goal. (this is from Kaj's Non-mystical sequence)

From a multi-agent view, you may have an agent giving you valence on how well you're doing at a goal (say a video game). If you're really invested in the game, you might fuse with that sub-agent (identify that with a "self" tag), and suffer when you fail at the game. If you're separated from the game, you can still receive information about how well you're doing, but you don't suffer.

The more equanimity you have (you're okay with things as they are), the less you personally suffer. Though you can still be aware of the negative/positive signal of valence.

Comment by elriggs on Today a Tragedy · 2020-06-13T01:43:53.703Z · LW · GW

I think I've convinced my girlfriend that it's okay for me to be sad because of what happened to you. She used to try to cheer me up, but I would tell her that it's okay for me to be sad. It sucks and it's okay if I act like it sucks.

I had honestly thought the day was July 15th. Then I saw my calendar and saw that it was today. As soon as I noticed, I started watching youtube, I guess to distract myself. When I stopped, it all just weighed on me again.

It's hard to accept your death. You had your goals and friends and all of your expectations, and then it just ended.

Your sister still posts for you on Facebook. Your friends still think of you.

I remember you. I miss you man

Comment by elriggs on Corrigibility as outside view · 2020-05-12T17:27:01.006Z · LW · GW

Okay, the outside view analogy makes sense. If I were to explain it to me, I would say:

Locally, an action may seem good, but looking at the outside view, drawing information from similar instances of my past or other people like me, that same action may seem bad.

In the same way, an agent can access the outside view to see if its action is good by drawing on similar instances. But how does it get this outside view information? Assuming the agent has a model of human interactions and a list of “possible values for humans”, it can simulate different people with different values to see how well it has learned their values by the time it’s considering a specific action.

Consider the action “disable the off-switch”. The agent simulates itself interacting with Bob, who values long walks on the beach. By the time it considers the disable action, it can check its simulated self’s prediction of Bob’s values. If the prediction is “Bob likes long walks on the beach”, then that’s an update towards doing the disable action. If it’s a different prediction, that’s an update against the disable action.

Repeat 100 times for different people with different values and you’ll have a better understanding of which actions are safe or not. (I think a picture of a double-thought bubble like the one in this post would help explain this specific example.)
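Here is a rough Python sketch of that loop, to make the idea concrete. Everything in it is an assumption for illustration: the list of values, the `simulate_value_learning` learner (a simple majority vote over noisy observations), and the `outside_view_score` name are all made up, and none of it comes from the post. It only shows the shape of "simulate many people, check whether the simulated self predicted their values correctly, and treat the hit rate as evidence for or against the action".

```python
import random

# Hypothetical list of "possible values for humans" (illustrative only).
VALUES = ["long walks on the beach", "gardening", "chess", "cooking"]


def simulate_value_learning(true_value, n_observations, noise, rng):
    """Toy stand-in for the agent's simulated self learning one person's value.

    Each observation reveals the true value, except with probability `noise`
    it is replaced by a random value. The prediction is the most frequently
    observed value.
    """
    counts = {}
    for _ in range(n_observations):
        if rng.random() >= noise:
            observed = true_value
        else:
            observed = rng.choice(VALUES)
        counts[observed] = counts.get(observed, 0) + 1
    if not counts:
        # No interaction yet, so the simulated self can only guess.
        return rng.choice(VALUES)
    return max(counts, key=counts.get)


def outside_view_score(n_observations, noise, n_people=100, seed=0):
    """Fraction of simulated people whose value was predicted correctly
    by the time the agent considers the action."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_people):
        true_value = rng.choice(VALUES)  # e.g. Bob values long walks on the beach
        predicted = simulate_value_learning(true_value, n_observations, noise, rng)
        correct += predicted == true_value
    return correct / n_people


if __name__ == "__main__":
    # A high score is an update towards the action, a low score an update against it.
    print(outside_view_score(n_observations=5, noise=0.0))
    print(outside_view_score(n_observations=1, noise=0.9))
```

With no noise the simulated self always recovers the true value, so the score is 1.0; with few, noisy observations the score drops towards chance, which in this toy setup would count as an update against taking the action.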

Comment by elriggs on Meditation: the screen-and-watcher model of the human mind, and how to use it · 2020-05-03T02:06:00.800Z · LW · GW

I pattern match this to the Buddhist idea of interdependence, where what you are is reliant on the environment and the environment is reliant on you (or embedded agency).

Comment by elriggs on My experience with the "rationalist uncanny valley" · 2020-04-24T02:40:38.855Z · LW · GW

If I understand you right, you value some things (finding them meaningful) because you robustly value them regardless of circumstances (like I value human life regardless of whether I had coffee this morning). Is this correct?

But you also mentioned that this only accounts for some values, and other things you value and find meaningful aren’t robust?

Comment by elriggs on Today a Tragedy · 2020-04-11T00:03:46.650Z · LW · GW

Happy Birthday Will,

I remember in 9th grade you started dating my ex right after consoling me. I was so mad! Haha. I never told you this, but me and the others on our forensics team saw y’all just sitting, holding hands, and having a good time, and Jennifer suggested that me and her hold hands and sit next to y’all giggling.

I said no, though it would’ve made a better story if I went through with it, haha. I think we started getting along again after she moved, although I can’t remember saying anything mean to you because of it.

I’m not sure she knows what happened to ya. I know y’all kept in touch when she moved, and maybe she checks Facebook more than I do.

Anyways, a lot of us are back home cause of the Coronavirus, and I would love to be able to give you a call and see how your life’s progressed these past few years.

Love you Will,


Comment by elriggs on Link Retrospective for 2020 Q1 · 2020-04-10T01:55:46.637Z · LW · GW

Thanks for the links and I hope you post another next quarter!

Comment by elriggs on "No evidence" as a Valley of Bad Rationality · 2020-04-01T21:20:41.390Z · LW · GW

Correct, favoring either hypothesis H or NOT H simply because you labeled it the "null hypothesis" is bad. Both are equally bad when you don't have evidence either way.

In this case, intuition favors "more chemo should kill more cancer cells", and intuition counts as some evidence. The doctor ignores intuition (which is the only evidence we have here) and favors the opposite hypothesis because it's labeled "null hypothesis".

Comment by elriggs on Attainable Utility Preservation: Scaling to Superhuman · 2020-02-27T18:26:33.926Z · LW · GW

Thanks for the link (and the excellent write-up of the problem)!

Regarding the setting, how would the agent gain the ability to create a sub-agent, roll a rock, or limit its own abilities initially? Throughout AUP, you normally start with a high penalty for acquiring power, and then you scale it down to reach reasonable, non-catastrophic plans, but your post begins with having higher power.

I don't think AUP prevents abuse of power you currently have (?), but rather prevents gaining that power in the first place.

Comment by elriggs on Attainable Utility Preservation: Scaling to Superhuman · 2020-02-27T12:46:15.577Z · LW · GW

I expect AUP to fail in embedded agency problems (which I interpret the subagent problem as falling under). Do you expect it to fail in other areas?

Comment by elriggs on Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think · 2020-01-13T08:20:43.566Z · LW · GW

I realized afterwards that only “not sharing others’ secrets” is an example of “it’s ethical to lie if someone asks a direct question”. The other two were more “don’t go out of your way to tell the whole truth in this situation (but wait for a better situation)”.

I do believe my ethics is composed of wanting what’s “best” for others and truthful communication is just an instrumental goal.

If I had to blatantly lie every day, so that all my loved ones could be perfectly healthy and feel great, I would lie every day.

I don’t think anyone would terminally value honesty (in any of its forms).

Comment by elriggs on Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think · 2020-01-13T07:47:58.526Z · LW · GW

Thanks for the clarification.

For me the answer is no, I don’t believe it’s ethically mandatory to share all information I know with everyone if they happen to ask the right question. I can’t give a complete formalization of why, but three specific situations are 1) keeping someone else’s information secret, 2) when I predict the other person will assume harmful implications that aren’t true, and 3) when the other person isn’t in the right mind to hear the true information.

Ex for #3: you would like your husband to change more diapers and help clean up a little more before they leave work every day, but you just thought of it right when he came home from a long work day. It would be better to wait to give a criticism when you’re sure they’re in a good mood.

An example for #2: I had a friend who had positive thoughts towards a girl who wasn’t his girlfriend. He was confused about this and TOLD HIS GIRLFRIEND WHEN THEY WERE DATING LONG DISTANCE. The two girls have had an estranged relationship for years since.

If I were my friend, I would understand that positive thoughts towards a pretty girl my age don’t imply that I am required to romantically engage her. Telling my girlfriend about these thoughts might be truthful and honest, but it would likely cause her to feel insecure and jealous, even though she has nothing to worry about.