Posts

Distilling and approaches to the determinant 2022-04-06T06:34:09.553Z
How can a layman contribute to AI Alignment efforts, given shorter timeline/doomier scenarios? 2022-04-02T04:34:47.154Z
[Link] sona ike lili 2022-04-01T18:57:06.373Z
Blatant Plot Hole in HPMoR [Spoilers] 2022-04-01T16:01:45.594Z
AprilSR's Shortform 2022-03-23T18:36:27.608Z
Arguments are good for helping people reason about things 2022-03-11T23:02:33.158Z
What (feasible) augmented senses would be useful or interesting? 2021-03-06T04:28:13.320Z

Comments

Comment by AprilSR on Is being a trans woman (or just low-T) +20 IQ? · 2024-04-25T07:47:55.861Z · LW · GW

All the smart trans girls I know were also smart prior to HRT.

Comment by AprilSR on ProjectLawful.com: Eliezer's latest story, past 1M words · 2024-01-18T23:22:10.914Z · LW · GW

I feel like Project Lawful, as well as many of Lintamande's other glowfic since then, has given me a much deeper understanding of... a collection of virtues including honor, honesty, trustworthiness, etc., which I now mostly think of collectively as "Law".

I think this has been pretty valuable for me on an intellectual level—I think, if you show me some sort of deontological rule, I'm going to give a better account of why/whether it's a good idea to follow it than I would have before I read any glowfic.

It's difficult for me to separate how much of that is due to Project Lawful in particular, because ultimately I've just read a large body of work, all of which served as training data for a particular sort of thought pattern which I've since learned. But I think this particular fragment of the rationalist community has given me some valuable new ideas, and it'd be great to figure out a good way of acknowledging that.

Comment by AprilSR on Gender Exploration · 2024-01-14T23:17:48.605Z · LW · GW

i think they presented a pretty good argument that it is actually rather minor

Comment by AprilSR on Staring into the abyss as a core life skill · 2023-12-21T19:10:37.159Z · LW · GW

While the idea that it's important to look at the truth even when it hurts isn't revolutionary in the community, I think this post gave me a much more concrete model of the benefits. Sure, I knew about the abstract arguments that facing the truth is valuable, but I don't know if I'd have identified it as an essential skill for starting a company, or identified the failure to do so as a critical component of staying in a bad relationship. (I think my model of bad relationships was that people knew leaving was a good idea, but were unable to act on that information—but in retrospect, inability to even consider it totally might be what's going on some of the time.)

Comment by AprilSR on Alignment work in anomalous worlds · 2023-12-17T08:09:25.806Z · LW · GW

So if a UFO lands in your backyard and aliens ask you if you want to go on a magical (but not particularly instrumental) space adventure with them, I think it's reasonable to very politely decline, and get back to work solving alignment.

I think I'd probably go for that, actually, if there isn't some specific reason to very strongly doubt it could possibly help? It seems somewhat more likely that I'll end up decisive via space adventure than by mundane means, even if there's no obvious way the space adventure will contribute.

This is different if you're already in a position where you're making substantial progress though.

Comment by AprilSR on AI Views Snapshots · 2023-12-13T00:50:03.177Z · LW · GW

Here's mine

Comment by AprilSR on [deleted post] 2023-12-08T09:51:30.756Z

nonetheless, i think the analogy is still suggestive that an AI selectively shaped for whatever objective might end up deliberately maximizing something else

Comment by AprilSR on LessWrong Has Agree/Disagree Voting On All New Comment Threads · 2023-12-08T04:27:28.296Z · LW · GW

i think, in retrospect, this feature was a really great addition to the website.

Comment by AprilSR on AI Alignment Breakthroughs this week (10/08/23) · 2023-10-09T08:12:57.506Z · LW · GW

This post introduced me to a bunch of neat things, thanks!

Comment by AprilSR on Sam Altman's sister, Annie Altman, claims Sam has severely abused her · 2023-10-08T22:50:33.363Z · LW · GW

There are several comments "suggesting that maybe the cause is mental illness".

Comment by AprilSR on Evaluating the historical value misspecification argument · 2023-10-05T19:39:18.376Z · LW · GW

But personally, I think having such a standard is both unreasonable and inconsistent with the implicit standard set by essays from Yudkowsky and other MIRI people.

I think this is largely coming from an attempt to use approachable examples? I could believe that there were times when MIRI thought that even getting something as good as ChatGPT might be hard, in which case they should update, but I don't think they ever believed that something as good as ChatGPT is clearly sufficient. I certainly never believed that, at least.

Comment by AprilSR on Petrov Day Retrospective, 2023 (re: the most important virtue of Petrov Day & unilaterally promoting it) · 2023-09-28T18:16:00.842Z · LW · GW

Yes, we told everyone they were in the minority. It's a "game".

 

I think this is bad. I mean, it's not that big a deal, but I generally speaking expect messages I receive from The LessWrong Team not to contain falsehoods.

Comment by AprilSR on Petrov Day Retrospective, 2023 (re: the most important virtue of Petrov Day & unilaterally promoting it) · 2023-09-28T18:14:16.997Z · LW · GW

Hmm.

I don't think "Avoiding actions that noticeably increase the chance civilization is destroyed" is necessarily the most practically-relevant virtue, for most people, but it does seem to me like it's the point of Petrov Day in particular. If we're recognizing Petrov as a person, I'd say that was Petrov's key virtue.

Or maybe I'd say something like "not doing very harmful acts despite incentives to do so"—I think "resisting social pressure" isn't quite on the mark, but I think it is important to Petrov Day that there were strong incentives against what Petrov did.

I think other virtues are worth celebrating, but I think I'd want to recognize them on different holidays.

Comment by AprilSR on UDT shows that decision theory is more puzzling than ever · 2023-09-14T04:52:03.979Z · LW · GW

I mean, that's a thing you might hope to be true. I'm not sure if it actually is true.

Comment by AprilSR on UDT shows that decision theory is more puzzling than ever · 2023-09-14T03:50:29.730Z · LW · GW

I think, if you had several UDT agents with the same source code, and then one UDT agent with slightly different source code, you might see the unique agent defect.

I think the CDT agent has an advantage here because it is capable of making distinct decisions from the rest of the population—not because it is CDT.

Comment by AprilSR on UDT shows that decision theory is more puzzling than ever · 2023-09-14T03:46:56.574Z · LW · GW

I'm not sure "original instantiation" is always well-defined

Comment by AprilSR on UDT shows that decision theory is more puzzling than ever · 2023-09-14T03:45:36.460Z · LW · GW

I think personally I'd say it's a clear advancement—it opens up a lot of puzzles, but the naïve intuition corresponding to it still seems more satisfying than CDT or EDT, even if a full formalization is difficult.

(Not to comment on whether there might be a better communications strategy for getting the academic community interested.)

Comment by AprilSR on Should an undergrad avoid a capabilities project? · 2023-09-13T07:22:20.444Z · LW · GW

Provided that you make sure you don't publish some massive capabilities progress—which I think is pretty unlikely for most undergrads—I think the benefits from having an additional alignment-conscious person with relevant skills probably outweigh the very marginal costs of tiny incremental capabilities ideas.

Comment by AprilSR on Sharing Information About Nonlinear · 2023-09-08T02:53:04.313Z · LW · GW

I think a lot of travel expenses?

Comment by AprilSR on Sharing Information About Nonlinear · 2023-09-08T00:58:44.368Z · LW · GW

I was confused by the disagree votes on this comment, so I looked—the comment in question is highest on the default "new and upvoted" sorting, but it isn't highest on the "top" sorting.

Comment by AprilSR on AprilSR's Shortform · 2023-09-07T20:11:38.361Z · LW · GW

I'm more confident that we should generally have norms prohibiting using threats of legal action to prevent exchange of information than I am of the exact form those norms should take. But to give my immediate thoughts:

I think the best thing for Alice to do if Bob is lying about her is to just refute the lies. In an ideal world, this is sufficient. In practice, I guess maybe it's insufficient, or maybe refuting the lies would require sharing private information, so if necessary I would next escalate to informing forum moderators, presenting evidence privately, and requesting a ban.

Only once those avenues are exhausted might I consider threatening a libel suit acceptable.

I do notice now that the Nonlinear situation in particular is impacted by Ben Pace being a LessWrong admin—so if step 1 doesn't work, step 2 might have issues, and escalating to step 3 might therefore be acceptable sooner than usual.

Concerns have been raised that there might be some sort of large first-mover advantage. I'm not sure I buy this—my instinct is that the Nonlinear cofounders are just bad-faith actors making any arguments that seem advantageous to them (though out of principle I'm trying to withhold final judgement). That said, I could definitely imagine deciding in the future that this is a large enough concern to justify weaker norms against rapid escalation.

Comment by AprilSR on Sharing Information About Nonlinear · 2023-09-07T18:16:54.339Z · LW · GW

Kudos for doing the exercise!

Comment by AprilSR on Sharing Information About Nonlinear · 2023-09-07T17:47:27.535Z · LW · GW

I think a comment "just asking for people to withhold judgement" would not be especially downvoted. I think the comments in which you've asked people to withhold judgement include other incredibly toxic behavior.

Comment by AprilSR on AprilSR's Shortform · 2023-09-07T17:33:46.863Z · LW · GW

I think we should have a community norm that threatening libel suits (or actually suing) is incredibly unacceptable in almost all cases—I'm not sure what the exact exceptions should be, but maybe it should require "they were knowingly making false claims."

I feel unsure whether it would be good to enforce such a norm regarding the current Nonlinear situation because there wasn't common knowledge beforehand and because I feel too strongly about this norm to not be afraid that I'm biased (and because hearing them out is the principled thing to do). But I think building common knowledge of such a norm would be good.

Comment by AprilSR on Sharing Information About Nonlinear · 2023-09-07T17:26:59.298Z · LW · GW

While I guess I will be trying to withhold some judgment out of principle, I legitimately cannot imagine any plausible context that would make this any different.

Comment by AprilSR on Bogdan Ionut Cirstea's Shortform · 2023-08-07T22:12:57.831Z · LW · GW

I don't think having a beauty-detector that works the same way humans' beauty-detectors do implies that you care about beauty?

Comment by AprilSR on Bogdan Ionut Cirstea's Shortform · 2023-08-04T03:12:27.803Z · LW · GW

hmm. i think you're missing eliezer's point. the idea was never that AI would be unable to identify actions which humans consider good, but that the AI would not have any particular preference to take those actions.

Comment by AprilSR on The "spelling miracle": GPT-3 spelling abilities and glitch tokens revisited · 2023-08-01T02:56:14.162Z · LW · GW

There are definitely also many misspellings in the training data that were never corrected, which it nonetheless needs to make sense of.

Comment by AprilSR on AprilSR's Shortform · 2023-07-27T18:09:02.387Z · LW · GW

Does anyone know if there is a PDF version of the Sequence Highlights anywhere? (Or any ebook format is fine probably.)

Comment by AprilSR on Is the Endowment Effect Due to Incomparability? · 2023-07-21T18:22:36.347Z · LW · GW

...are they trading with, like, a vending machine, rather than with each other?

Comment by AprilSR on Is the Endowment Effect Due to Incomparability? · 2023-07-18T08:40:24.521Z · LW · GW

I'm confused by the pens and mugs example. Sure, if only 10 of the people who got mugs would prefer a pen, then that means that at most ten trades should happen—once the ten mug-receiving pen-likers trade, there won't be any other mug-owners willing to trade? So don't you get 20 people trading, 20%, not 50%?

Comment by AprilSR on Why Yudkowsky Is Wrong And What He Does Can Be More Dangerous · 2023-06-06T20:34:03.746Z · LW · GW

AI systems, such as Large Language Models (LLMs), are trained on human data and designed by human engineers. It's impossible for them to exceed the bounds of human knowledge and expertise, as they're inherently limited by the information they've been exposed to.

Maybe, on current algorithms, LLMs run into a plateau around the level of human expertise. That does seem plausible. But it is not because being trained on human data necessarily limits you to human level!

Accurately predicting human text is much harder than just writing stuff on the internet. If GPT were to perfect the skill it is being trained on, it would have to be much smarter than a human!

Comment by AprilSR on An Impossibility Proof Relevant to the Shutdown Problem and Corrigibility · 2023-05-02T21:56:19.984Z · LW · GW

Even if shutdown in particular isn't something we want it to be indifferent to, I think being able to make an agent indifferent to something is very plausibly useful for designing it to be corrigible?

Comment by AprilSR on How can one rationally have very high or very low probabilities of extinction in a pre-paradigmatic field? · 2023-05-01T06:29:19.176Z · LW · GW

I don't think you can have particularly high confidence one way or the other without just thinking about AI in enough detail to have an understanding of the different ways that AI development could end up shaking out. There isn't a royal road.

Both the "doom is disjunctive" and "AI is just like other technologies" arguments really need a lot more elaboration to be convincing, but—personally I find the argument that AI is different from other technologies pretty obvious and I have a hard time imagining what the counterargument would be.

Comment by AprilSR on GPTs are Predictors, not Imitators · 2023-04-08T21:24:17.204Z · LW · GW

I can imagine a world where LLMs tend to fall into local maxima where they get really good at imitation or simulation, and then they plateau (perhaps only until their developers figure out what adjustments need to be made). But I don't have a good enough model of LLMs to be very sure whether that will happen or not.

Comment by AprilSR on Eliezer on The Lunar Society podcast · 2023-04-08T05:56:52.148Z · LW · GW

I really like the "when you don't have a good detailed model you need to figure out what space you should have the maximum entropy distribution over" framing

Comment by AprilSR on Abuse in LessWrong and rationalist communities in Bloomberg News · 2023-03-09T06:54:41.385Z · LW · GW

I think abuse issues in rationalist communities are worth discussing, but I don't think people who have been excluded from the community for years are a very productive place to begin such a discussion.

Comment by AprilSR on Project "MIRI as a Service" · 2023-03-09T06:32:11.132Z · LW · GW

This feels worth trying to me

Comment by AprilSR on The Kids are Not Okay · 2023-03-08T20:42:44.910Z · LW · GW

I do know that I want my own children to stay off social media, and minimize their ownership and use of smart phones, for as long as they possibly can. And that I intend to spend quite a lot of my available points, if needed, to fight for this. And that if I was running a school I’d do my best to shut the phones down during school hours.

(...)

We can also help this along by improving alternatives to phone use. If children aren’t allowed to go places without adults knowing, or worse adults driving them and coming along and watching them, what do you think they are going to do all day? What choices do they have?


I'm not certain whether my intuition should be trusted here, since this is definitely the kind of thing my brain would form a habit of rationalizing about. But my guess is that I would've been way worse off without phones/social media/stuff. I didn't really have any great alternatives to socializing on the internet—the only people I ever interacted with in person were devout Christians.

So I tentatively think it might be better to really focus on the improving-alternatives part first? I'm sure I would've been much better off if I had good in-person friends, but I don't think not having access to social media would have really helped with that—it'd just have meant I wouldn't have had any good friends at all.

(I would expect Zvi in particular has good enough parenting skills to not run into that. But I know a lot of people with terrible parents who think they can fix the problem just by monitoring their children's access to technology, which seems to me like it would be terrible for them? So I worry about how good it is as general advice.)

Comment by AprilSR on GÖDEL GOING DOWN · 2023-03-07T06:17:12.330Z · LW · GW

I don't think you understand what mathematicians mean by the word "complete." It means that every statement which can be expressed in the system can either be proven or disproven within it (or something similar).
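(For reference, one standard formalization—my own gloss, not part of the original comment: a theory $T$ is complete iff for every sentence $\varphi$ in its language,

$$T \vdash \varphi \quad \text{or} \quad T \vdash \neg\varphi.$$

Gödel's first incompleteness theorem then says that no consistent, recursively axiomatizable theory containing enough arithmetic can satisfy this.)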

Comment by AprilSR on Acausal normalcy · 2023-03-04T21:44:26.566Z · LW · GW

A (late) section of Project Lawful argues that there would likely be acausal coordination to avoid pessimizing the utility function (of anyone you are coordinating with), as well as perhaps to actively prevent utility function pessimization.

Comment by AprilSR on Help kill "teach a man to fish" · 2023-03-01T19:34:01.434Z · LW · GW

but cannot afford the boat

this already seems pretty good imo

Comment by AprilSR on a narrative explanation of the QACI alignment plan · 2023-02-22T19:31:35.520Z · LW · GW

I don't think security mindset means "look for flaws." That's ordinary paranoia. Security mindset is something closer to "you better have a really good reason to believe that there aren't any flaws whatsoever." My model is something like "A hard part of developing an alignment plan is figuring out how to ensure there aren't any flaws, and coming up with flawed clever schemes isn't very useful for that. Once we know how to make robust systems, it'll be more clear to us whether we should go for melting GPUs or simulating researchers or whatnot."

That said, I have a lot of respect for the idea that coming up with clever schemes is potentially more dignified than shooting everything down, even if clever schemes are unlikely to help much. I respect carado a lot for doing the brainstorming.

Comment by AprilSR on a narrative explanation of the QACI alignment plan · 2023-02-21T20:46:40.934Z · LW · GW

I mostly expect by the time we know how to make a seed superintelligence and give it a particular utility function... well, first of all the world has probably already ended, but second of all I would expect progress on corrigibility and such to have been made and probably to present better avenues.

If Omega handed me aligned-AI-part-2.exe, I'm not quite sure how I would use it to save the world? I think probably trying to just work on the utility function outside of a simulation is better, but if you are really running out of time then sure, I guess you could try to get it to simulate humans until they figure it out. I'm not very convinced that referring to a thing a person would have done in a hypothetical scenario is a robust method of getting that to happen, though?

Comment by AprilSR on a narrative explanation of the QACI alignment plan · 2023-02-20T06:11:26.134Z · LW · GW

I have a pretty strong heuristic that clever schemes like this one are pretty doomed. The proposal seems to lack security mindset, as Eliezer would put it. 

The most immediate/simple concrete objection I have is that no one has any idea how to create aligned-AI-part-2.exe? I don't think figuring out what we'd do if we knew how to make a program like that is really the difficult part here.

Comment by AprilSR on devansh's Shortform · 2023-01-15T06:02:52.666Z · LW · GW

CDT gives in to blackmail (such as the basilisk), whereas timeless decision theories do not.
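A toy sketch of the asymmetry (purely illustrative, not from the original comment; it assumes a blackmailer that accurately predicts the victim's policy and only threatens when it expects to be paid):

```python
# Toy model (illustrative only): the blackmailer threatens the victim only if
# it predicts the victim would pay once threatened.

PAY_COST = 10      # cost of paying the blackmailer
PUNISHMENT = 100   # cost of refusing after a threat has been made

def cdt_policy(threatened: bool) -> str:
    # CDT conditions on the threat already existing: paying (-10) beats
    # refusing (-100), so it pays.
    return "pay" if threatened else "ignore"

def udt_policy(threatened: bool) -> str:
    # An updateless / "timeless" agent picks the policy that does best over
    # the whole game, including the blackmailer's prediction of that policy:
    # a never-pay policy means an accurate blackmailer never bothers to threaten.
    return "refuse"

def play(policy) -> int:
    threatens = policy(True) == "pay"   # blackmailer simulates the victim
    if not threatens:
        return 0
    return -PAY_COST if policy(True) == "pay" else -PUNISHMENT

print("CDT payoff:", play(cdt_policy))   # -10: gets threatened and pays up
print("UDT payoff:", play(udt_policy))   #   0: never gets threatened
```

Running it gives -10 for the CDT agent (it gets threatened and pays) and 0 for the updateless agent (it never gets threatened in the first place).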

Comment by AprilSR on Why The Focus on Expected Utility Maximisers? · 2022-12-27T16:51:52.734Z · LW · GW

My personal suspicion is that an AI being indifferent between a large class of outcomes matters little; it's still going to absolutely ensure that it hits the Pareto frontier of its competing preferences.

Comment by AprilSR on How Death Feels · 2022-12-21T21:34:20.665Z · LW · GW

Have you read / are you interested in reading Project Lawful? It eventually explores this topic in some depth—though mostly after a million words of other stuff.

Comment by AprilSR on X-risk Mitigation Does Actually Require Longtermism · 2022-11-15T00:01:03.768Z · LW · GW

I think "existential risk" is a bad name for a category of things that isn't "risks of our existence ending."

Comment by AprilSR on Remember to translate your thoughts back again · 2022-11-03T06:24:38.563Z · LW · GW

I mostly think the phrase "psychologically addictive" is way less clear than it needs to be to communicate much to me.

I think I would write the paragraph as something vaguely like:

"The physiological withdrawal symptoms of Benzodiazepines can be avoided—but often people have a bad time coming of Benzodiazepines because they start relying on them over other coping mechanisms. So doctors try to avoid them."

It seems possible to come up with something that is both succinct and actually communicates the gears.