Posts

Double's Shortform 2024-03-11T05:57:35.781Z
Should an undergrad avoid a capabilities project? 2023-09-12T23:16:39.817Z
Don't Jump or I'll... 2023-03-02T02:58:43.058Z
Gatekeeper Victory: AI Box Reflection 2022-09-09T21:38:39.218Z
AI Box Experiment: Are people still interested? 2022-08-31T03:04:38.518Z
If you know you are unlikely to change your mind, should you be lazier when researching? 2022-08-21T17:16:53.343Z
Florida Elections 2022-08-13T20:10:01.023Z
We Need a Consolidated List of Bad AI Alignment Solutions 2022-07-04T06:54:36.541Z

Comments

Comment by Double on Acting Wholesomely · 2024-03-12T19:16:04.832Z · LW · GW

Sorry, I’ll be doing multiple unwholesome things in this comment.

For one, I’m commenting without reading the whole post. I was expecting it to be about something else and was disappointed. The conception of wholesomeness as “considering a wider perspective for your actions” is not very interesting. Everyone considers a wider perspective to be valuable, and nobody already takes that more seriously than EAs do.

The conception of wholesomeness I was hoping you’d write about (let’s call it wholesomeness2 for distinction from your wholesomeness) is a type of prestige. Prestige is high status freely conferred by the beneficiaries of the prestigious. Contrast with dominance, which is demanded with force.

It’s hard to pin down, but I think I’d say that wholesomeness2 is a reputation for not being evil. Clearly, it would be good for EA’s ability to do good if EAs had wholesomeness2. On top of that, if actions that are not wholesome2 tend to be bad and actions that are wholesome2 tend to be good, then wholesomeness2 is a good heuristic. (Although the tails come apart, as they always do: https://slatestarcodex.com/2018/09/25/the-tails-coming-apart-as-metaphor-for-life/ )

If someone has wholesomeness2, then people will assume mistakes rather than malice, will defend the wholesome2 person from attack, and help the wholesome2 when they are in need.

I was hoping your post would be about how to be wholesome2. Here are my thoughts:

Incapable of plotting: dogs and children are wholesome because they don’t have the capacity to be evil.

Wholesomeness2 chains: since candy is associated with children, who are wholesome2, associating yourself with candy can increase your wholesomeness2.

Generating warm-fuzzies: the Make a Wish Foundation is extremely wholesome2, while deworming is not. When someone (like an EA) “attacks” Make a Wish by saying it doesn’t spend its funds in a way that helps many people much compared to alternatives, everyone will come to Make a Wish‘s defense.

Vibes: “wholesome2 goths” feels like an oxymoron. The goth aesthetic is contrary to the idea of being not evil, even though the goths themselves are usually nice people. If you call one “wholesome”, they might even get upset at you.

Actually being not evil: It doesn’t matter how wholesome2 he was before; Bill Cosby lost all his wholesome2 when the world found out he was evil. Don’t be Bill Cosby.

I’d appreciate comments elaborating and adding to this list.

….

By analyzing the concept like this, I lost some wholesomeness2, because I have shown that I have the capacity and willingness to gain wholesomeness2 independent of whether I’m really plotting something evil. I’d argue that I’m just not very willing to self-censor, so you should trust me more instead of less… but that is exactly what an unwholesome2 individual would do.

EA will have some trouble gaining wholesomeness2 because it tends to seek power and has the intelligence and agency needed to be evil.

Comment by Double on Double's Shortform · 2024-03-12T17:08:19.392Z · LW · GW

Plenty of pages get the bare minimum. The level of detail on the e/acc page (e.g. including the emoji associated with the movement) makes me think that it was edited by an e/acc. The EA page must have been edited by an e/acc as well, since it includes “opposition to e/acc”, but other than that it seems like it was written by someone unaffiliated with either (modulo my changes). We could probably check the history of the pages to resolve our speculation.

Comment by Double on Double's Shortform · 2024-03-11T05:57:35.888Z · LW · GW

It is worrying that the Wikidata page for e/acc is better than the pages for EA and for Less Wrong. I just added the previously absent "main subject" statements to the EA page.

Looks like a Symbolic AI person has gone e/acc. That's unfortunate, but rationalists have long known that the world would end in SPARQL.
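
For reference, here is a minimal sketch (mine, not from the original comments) of how one might check which "main subject" (P921) statements the EA item carries, using Wikidata's public SPARQL endpoint. The exact English label match is an assumption on my part; the endpoint and the property are real.

```python
# Minimal sketch: list the "main subject" (P921) values on the Wikidata item
# labelled "effective altruism". The label match is an assumption; verify it
# against the live item (several items could in principle share that label).
import requests

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?subject ?subjectLabel WHERE {
  ?item rdfs:label "effective altruism"@en ;   # find the item by its English label (assumed exact)
        wdt:P921 ?subject .                    # P921 = "main subject"
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

response = requests.get(
    SPARQL_ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "wikidata-main-subject-check/0.1 (example)"},  # Wikidata asks for a descriptive UA
)
response.raise_for_status()
for row in response.json()["results"]["bindings"]:
    print(row["subjectLabel"]["value"])
```

Checking who actually made the edits can be done from each item's "View history" tab on wikidata.org, so the speculation above is resolvable.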

Comment by Double on A Longlist of Theories of Impact for Interpretability · 2024-01-07T06:23:18.677Z · LW · GW

I’d call that “underselling it”! Your description of Microscope AI may be accurate, but even I didn’t realize you meant “supercharging science”, and I was looking for it in the list!

Comment by Double on A Longlist of Theories of Impact for Interpretability · 2024-01-01T23:40:58.192Z · LW · GW

This is a great reference for the importance of, and excitement around, Interpretability.

I just read this for the first time today. I’m currently learning about Interpretability in hopes I can participate, and this post solidified my understanding of how Interpretability might help.

The whole field of Interpretability is a test of this post. Some of the theories of change won’t pan out. Hopefully many will. Perhaps more theories not listed will be discovered.

One idea I’m surprised wasn’t mentioned is the potential for Interpretability to supercharge all of the sciences by allowing humans to extract the things that machine learning models discovered in order to make their predictions. I remember Chris Olah being excited about this possibility on the 80k Podcast, and that excitement meme has spread to me. Current AIs know so much about how the world works, but we can only use that knowledge indirectly, through their black-box interface. I want that knowledge for myself and for humanity! This is another incentive for Interpretability, and although it isn’t a development that clearly leads to “AI less likely to kill us”, it will make humanity wiser, more prosperous, and on more even footing with the AIs.

Nanda’s post probably deserves a spot in a compilation of Alignment plans.

Comment by Double on Here's the exit. · 2023-12-20T05:40:49.504Z · LW · GW

I'm glad you enjoyed my review! Real credit for the style goes to whoever wrote the blurb that pops up when reviewing posts; I structured my review off of that.

When it comes to "some way of measuring the overall direction of some [AI] effort," conditional prediction markets could help. "Given I do X/Y, will Z happen?" Perhaps some people need to run a "Given I take a vacation, will AI kill everyone?" market in order to let themselves take a break.

What would be the next step to creating a LessWrong Mental Health book?

Comment by Double on Here's the exit. · 2023-12-19T06:13:13.186Z · LW · GW

Ideally reviews would be done by people who read the posts last year, so they could reflect on how their thinking and actions changed. Unfortunately, I only discovered this post today, so I lack that perspective.

Posts relating to the psychology and mental well-being of LessWrongers are welcome, and I feel like I take a nugget of wisdom from each one (but always fail to import the entirety of the wisdom the author is trying to convey).

 
The nugget from "Here's the exit" that I wish I had read a year ago is "If your body's emergency mobilization systems are running in response to an issue, but your survival doesn't actually depend on actions on a timescale of minutes, then you are not perceiving reality accurately." I panicked when I first read Death with Dignity (I didn't realize it was an April Fools Joke... or was it?). I felt full fight-or-flight when there wasn't any reason to do so. That ties into another piece of advice that I needed to hear, from Replacing Guilt: "stop asking whether this is the right action to take and instead ask what’s the best action I can identify at the moment." I don't know if these sentences have the same punch when removed from their context, but I feel like they would have helped me. This wisdom extends beyond AI Safety anxiety and generalizes to all irrational anxiety. I expect that having these sentences available to me will help me calm myself next time something raises my stress level.

I can't speak to the rest of the wisdom in this post. “Thinking about a problem as a defense mechanism is worse (for your health and for solving the problem) than thinking about a problem not as a defense mechanism” sounds plausible, but I can’t say much about its veracity or its applicability.

I would be interested to see research done to test the claim. Does increased sympathetic nervous system activation cause decreased efficacy? A correlational study could classify people in AI safety by (self reported?) efficacy and measure their stress levels, but causation is always trickier than correlation. 

A flood of comments criticized the post, especially for typical-minding. The author responded with many comments of their own, some of which received many upvotes and agreements and some of which received many dislikes and disagreements. A follow up post from Valentine would ideally address the criticism and consolidate the valid information from the comments into the post.

A sequence or book compiled from the wisdom of many LessWrongers discussing their mental health struggles and discoveries would be extremely valuable to the community (and to me, personally) and a modified version of this post would earn a spot in such a book.

Comment by Double on Gemini 1.0 · 2023-12-07T23:27:37.741Z · LW · GW

Liv Boeree: This is pretty nuts, looks like they’ve surpassed GPT4 on basically every benchmark… so this is most powerful model in the world?! Woweee what a time to be alive.

Link doesn't work. Maybe she changed her mind?

Comment by Double on Gemini 1.0 · 2023-12-07T23:14:01.499Z · LW · GW

Comment by Double on Hammers and Nails · 2023-07-17T04:15:22.213Z · LW · GW

Hammer: when there’s low downside, you’re free to try things. (Yeah, this is a corollary of expected utility maximization that seems obvious, but I still feel like I needed to explicitly and recently learn it; a toy sketch of the arithmetic follows the list.) Ten examples:

  1. Spend a few hours on a last-minute scholarship application.
  2. Try out dating apps a little (no luck yet, still looking into more effective use. But I still say that trying it was a good choice.)
  3. Call friends/parents when feeling sad.
  4. Go to an Effective Altruism retreat for a weekend.
  5. Be (more) honest with friends.
  6. Be extra friendly in general.
  7. Show more gratitude (inspired by “More Dakka”, which I read thanks to the links at the top of this post).
  8. Spend a few minutes writing a response to this post so that I can get practice with the power of internalizing ideas.
  9. When headache -> Advil and hot shower. It just works. Why did I keep just waiting and hoping the headache would go away on its own? Takes a few seconds to get some Advil, and I was going to shower anyways. It’s a huge boost to my well-being and productivity with next to no cost.
  10. Ask questions. It seriously seems like I ask >50% of the questions in whatever room I’m in, and people have thanked me for this. They were ashamed or embarrassed to ask questions or something? What’s the downside?
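
Since the hammer is framed as a corollary of expected utility maximization, here is the toy sketch of the arithmetic promised above. The numbers are hypothetical (mine, not the post's): when the downside of a failed attempt is near zero, even a small chance of success makes trying worthwhile.

```python
# Toy sketch (hypothetical numbers): "low downside, just try it" as expected value.
def expected_value(p_success: float, gain: float, loss: float) -> float:
    """Expected utility of trying: succeed with probability p_success, else pay the loss."""
    return p_success * gain - (1 - p_success) * loss

# A 5% shot at a $1,000 scholarship vs. a few hours of effort valued at ~$50:
print(expected_value(0.05, 1000, 50))   # 2.5 > 0, so applying is worth it
print(expected_value(0.05, 1000, 500))  # -425 < 0, a high downside changes the call
```
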
Comment by Double on Don't Jump or I'll... · 2023-03-02T17:04:24.647Z · LW · GW

I hadn’t considered this. You point out a big flaw in the neighbor’s strategy. Is there a way to repair it?

Comment by Double on Don't Jump or I'll... · 2023-03-02T16:11:32.902Z · LW · GW

I only have second-hand descriptions of suicidal thought processes, but I’ve heard from some who say they had become convinced that their existence was a net negative for the world and the people they care about, and that they came to their decision to commit suicide from a sort of (misguided) utilitarian calculation. I tried to give the man this perspective rather than the apathetic perspective you suggest. There’s diversity in the psychology of suicidal people. Do no suicidal people (or sufficiently few) have this utilitarian type of psychology?

Comment by Double on Don't Jump or I'll... · 2023-03-02T16:00:28.973Z · LW · GW

I’m glad you enjoyed it! I had heard of people making promises similar to your Trump-donation one. The idea for this story came from applying that idea to the context of suicide prevention. The part about models is my attempt to explain my (extremely incomplete grasp of) Functional Decision Theory in the context of a story. https://www.lesswrong.com/tag/functional-decision-theory

Comment by Double on Voting Results for the 2021 Review · 2023-02-01T19:26:55.710Z · LW · GW

4/8 of Eliezer Yudkowsky's posts in this list have a minus 9. Compare this with 1/7 for duncan_sabien, 0/6 for paulfchristiano, 0/5 for Daniel Kokotajlo, or 0/3 for HoldenKarnofsky. I wonder why that is.

Comment by Double on Against neutrality about creating happy lives · 2023-01-11T03:04:42.058Z · LW · GW

On one level, the post used a simple but emotionally and logically powerful argument to convince me that the creation of happy lives is good. 

On a higher level, I feel like I switch positions of population ethics every time I read something about it, so I am reluctant to predict that I will hold the post's position for much time. I remain unsettled that the field of population ethics, which is central to long-term visions of what the future should look like, has so little solid knowledge. My thinking, and therefore my actions, will remain split among the convincing population ethics positions.

This sequence made me doubt the soundness of philosophical arguments founded on what is "intuitive" (which this post very much relies upon). I don't know how someone might go about doing population ethics from a psychology point of view, but the post's subtitles "Preciousness," "Gratitude," and "Reciprocity" give some clues.

A testable aspect of the post would be to find out if the responses to the Wilbur and Michael thought experiments are universal. Also, I'd be interested to know how many of the people who read this post in 2021 (and have interacted with population ethics since then) maintain their position.

Carlsmith should follow up with his take on the Repugnant Conclusion. The Repugnant Conclusion is the central question of population ethics, so excluding it from this post is a major oversight.

Notes: The "famously hard" link is broken.

Comment by Double on Evanston, IL – ACX Meetups Everywhere 2022 · 2022-10-02T01:00:45.045Z · LW · GW

He has shown up.

Comment by Double on Evanston, IL – ACX Meetups Everywhere 2022 · 2022-10-02T00:53:52.892Z · LW · GW

I’m here with a few others in a booth near the door. We haven’t seen Uzair.

Comment by Double on Gatekeeper Victory: AI Box Reflection · 2022-09-11T23:00:56.603Z · LW · GW

Yes, it is. I wanted to win, and there is no rule against “going against the spirit” of AI Boxing.

I think about AI Boxing in the frame of Shut Up and Do the Impossible, so I didn’t care that my solution doesn’t apply to AI Safety. Funnily enough, that makes me an example of incorrect alignment.

Comment by Double on If you know you are unlikely to change your mind, should you be lazier when researching? · 2022-08-22T02:27:03.754Z · LW · GW

I have spent many hours on this, and I have to make a decision within two days. There's always the possibility that there is more important information to find, but even if I stayed up all night and did nothing else, I would not be able to read all of the websites, news articles, opinion pieces, and social media posts relating to the candidates. Research costs resources! I suppose what I'm asking for is a way of knowing when to stop looking for more information. Otherwise I'll keep trying possibility 2 over and over and end up missing the election deadline!

Comment by Double on Florida Elections · 2022-08-14T22:13:34.319Z · LW · GW

Thanks for the response. Those are fair reasons. I should have contributed more.

The LessWrong community is big and some are in Florida. If anyone had interesting things to share about the election I wanted to encourage them to do so.

Comment by Double on Florida Elections · 2022-08-14T21:58:23.552Z · LW · GW

I guess that makes sense, but very rarely is there a post that appeals to EVERYONE. A better system would be for people to be able to seek out the content that interests them. If something doesn’t interest you, then you move on.

Comment by Double on Florida Elections · 2022-08-13T23:36:15.486Z · LW · GW

Those are interesting questions! Perhaps you should make your own post instead of using mine to get more of an audience.

Expressing disapproval of both candidates by e.g. voting for Harambe makes sense, but I think that voting for bad policies is a bad move because “obvious” things aren’t obvious to many people, and voting for bad candidates (as opposed to joke candidates) makes their policies more mainstream and more likely to be adopted by candidates with a real chance of winning.

Why do you think my post is being shot down?

Comment by Double on We Need a Consolidated List of Bad AI Alignment Solutions · 2022-07-05T01:14:04.429Z · LW · GW

AI safety research has been groping in the dark, and half-baked suggestions for new research directions are valuable. It isn't as though we've made half of a safe AI. We haven't started, and all we have are ideas.

Comment by Double on We Need a Consolidated List of Bad AI Alignment Solutions · 2022-07-05T01:10:53.536Z · LW · GW

I think that a problem with my solution is that it's unclear how the AI can "understand" the behaviors and thought processes of a "more powerful agent." If you know what someone smarter than you would think, then you are simply that smart. If we abstract the specific more-powerful-agent's thoughts away, then we are left with Kantian ethics, and we are back where we started, trying to put ethics/morals into the AI.

 

It's a bit rude to call my idea so stupid that I must not have thought about it for more than five minutes, but thanks for your advice anyways. It is good advice. 

Comment by Double on We Need a Consolidated List of Bad AI Alignment Solutions · 2022-07-04T04:10:42.616Z · LW · GW

The AI Box:

A common idea is to put the AI in a "box" where it can only interact with the world by talking to a human. This doesn't work for a few reasons:

The AI would be able to convince the human to let it out.

The human wouldn't know the consequences of their actions as well as the AI.

Removing capabilities from the AI is not a good plan because the point is to create a useful AI. Importantly, the AI should be able to stop all dangerous AIs from being created, which a boxed AI cannot do.

Comment by Double on How Common Are Science Failures? · 2022-04-24T19:13:06.651Z · LW · GW

Thanks. That fits the first three criteria well, but there is still controversy about many of the results, so maybe not the fourth one yet.

Comment by Double on What Is Signaling, Really? · 2022-04-24T02:24:05.073Z · LW · GW

This sentence is a HUGE RED FLAG: “it shattered my illusion that I mostly avoid thinking about class signals, and instead convinced me that pretty much everything I do from waking up in the morning to going to bed at night is a class signal.”

If signaling can explain everything, then it is in the same category as Freudian psychoanalysis—unfalsifiable and therefore useless.

The idea that signaling explains everything leads to the idea that “people who say that they don’t bother with signaling and don’t use the symbols available to them are REALLY just signaling that they are the kind of person who can afford to not care about signaling.”

This is not the conclusion of a respectable theory; this is mental gymnastics. Having a theory that can explain anything is identical to having no clue.

I’ll admit that this post is the extent of my knowledge of signaling, so others might have fleshed-out the theory to the point that it can make predictions, but this essay was too much representativeness heuristic and not enough evidence.

Comment by Double on How Common Are Science Failures? · 2022-04-14T02:17:15.740Z · LW · GW

Come back! I don’t know what you are referencing!