In case it's a helpful data point: lines of reasoning sorta similar to the ones around the infohazard warning seemed to have interesting and intense psychological effects on me one time. It's hard to separate out from other factors, though, and I think it had something to do with the fact that lately I've been spending a lot of time learning to take ideas seriously on an emotional level instead of only an abstract one.
I mostly think it's too loose a heuristic and that you should dig into more details
Some of the probability questions (many worlds, simulation) are like... ontologically weird enough that I'm not entirely certain it makes sense to assign probabilities to them? It doesn't really feel like they pay rent in anticipated experience?
I'm not sure "speaking the truth even when it's uncomfortable" is the kind of skill it makes sense to describe yourself as "comfortable" with.
I think it's pretty good to keep in mind that heliocentrism is, strictly speaking, just a change in which coordinate system you use, but it is legitimately a much more convenient coordinate system.
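To spell that out with a minimal sketch (the notation here is mine, not anything from the original discussion): the geocentric position of a planet at time $t$ is just its heliocentric position shifted by Earth's, so the two frames carry exactly the same information:

$$\vec{r}_{\text{planet/Earth}}(t) = \vec{r}_{\text{planet/Sun}}(t) - \vec{r}_{\text{Earth/Sun}}(t)$$

The convenience is all on the right-hand side: each heliocentric term is, to a good approximation, a fixed ellipse, while the left-hand side traces out the looping retrograde paths that used to need epicycles to describe.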
Switch to neuroscience. I think we have an innate “sense of sociality” in our brainstem (or maybe hypothalamus), analogous to how (I claim) fear-of-heights is triggered by an innate brainstem “sense” that we’re standing over a precipice.
I think lately I've noticed that how much written text triggers this for me varies a bit over time?
...Does that hold together as a potential explanation for why our universe is so young? Huh.
I think my ideal is to lean into weirdness in a way that doesn't rely on ignorance of normal conventions
For a while I ended up spending a lot of time thinking about specifically the versions of the idea where I couldn't easily tell how true they were... which I suppose I do think is the correct place to be paying attention to?
I think there is rather a lot of soap to be found... but it's very much not something you can find by taking official doctrine as an actual authority.
That does seem likely.
There's a complication where sometimes it's very difficult to get people not to interpret things as an instruction. "Confuse them" seems to work, I guess, but it does have drawbacks too.
I don't really have a good idea of the principles, here. Personally, whenever I've made a big difference in a person's life (and it's been obvious to me that I've done so), I try to take care of them as much as I can and make sure they're okay.
...However, I have run into a couple of issues with this. Sometimes someone or something takes too much energy, and some distance is healthier. I don't know how to judge this other than by intuition, but I think I've gone too far before?
And I have no idea how much this can scale. I think I've had far bigger impacts than I've intended, in some cases. One time I had a friend who was really in trouble and I had to go to pretty substantial lengths to get them to a better place, and I'm not sure all versions of them would've endorsed that, even if they do now.
...But, broadly, "do what you can to empower other people to make their own decisions, when you can, instead of trying to tell them what to do" does seem like a good principle, especially for the people who have more power in a given situation? I definitely haven't treated this as an absolute rule, but in most cases I'm pretty careful not to stray from it.
I don't really think money is the only plausible explanation, here?
I think the game is sufficiently difficult.
I read this post several years ago, but I was... basically just trapped in a "finishing high school and then college" narrative at the time, and it didn't really seem like I could use this idea to actually make any changes in my life... And then a few months ago, as I was finishing up my last semester of college, I sort of fell head first into Mythic Mode without understanding what I was doing very much at all.
And I'd say it made a lot of things better, definitely—the old narrative was a terrible one for me—but it was rocky in some ways, and... like, obviously thoughts like "confirmation bias" etc were occurring to me, but "there are biases involved here" doesn't, really, in and of itself tell you what to do?
It would make sense if there's some extent to which everyone who spent the first part of their life following along with a simple "go to school and then get a job i guess" script is going to have a substantial adjustment period once they start having some more interesting life experiences, but... also seems plausible that if I'd read a lot more about this sort of thing I'd've been better equipped.
To have a go at it:
Some people try to implement a decision-making strategy that's like, "I should focus mostly on System 1" or "I should focus mostly on System 2." But this isn't really the point. The goal is to develop an ability to judge which scenarios call for which types of mental activity, and to be able to combine System 1 and System 2 fluidly as needed.
I, similarly, am pretty sure I had a lot of conformist-ish biases that prevented me from seriously considering lines of argument like this one.
Like, I'm certainly not entirely sure how strong this (and related) reasoning is, but it's definitely something one ought to seriously think about.
This post definitely resolved some confusions for me. There are still a whole lot of philosophical issues, but it's very nice to have a clearer model of what's going on with the initial naïve conception of value.
I do actually think my practice of rationality benefited from spending some time seriously grappling with the possibility that everything I knew was wrong. Like, yeah, I did quickly reaccept many things, but it was still a helpful exercise.
This feels more like an argument that Wentworth's model is low-resolution than that he's actually misidentified where the disagreement is?
Huh. I... think I kind of do care terminally? Or maybe I'm just having a really hard time imagining what it would be like to be terrible at predicting sensory input without this having a bunch of negative consequences.
you totally care about predicting sensory inputs accurately! maybe mostly instrumentally, but you definitely do? like, what, would it just not bother you at all if you started hallucinating all the time?
Probably many people who are into Eastern spiritual woo would make that claim. Mostly, I expect such woo-folk would be confused about what “pointing to a concept” normally is and how it’s supposed to work: the fact that the internal concept of a dog consists of mostly nonlinguistic stuff does not mean that the word “dog” fails to point at it.
On my model, koans and the like are trying to encourage a particular type of realization or insight. I'm not sure whether the act of grokking an insight counts as a "concept", but it can be hard to clearly describe an insight in a way that actually causes it? But that's mostly deficiency in vocab plus the fact that you're trying to explain a (particular instance of a) thing to someone who has never witnessed it.
Robin Hanson has written about organizational rot: the breakdown of modularity within an organization, in a way which makes it increasingly dysfunctional. But this is exactly what coalitional agency induces, by getting many different subagents to weigh in on each decision.
I speculate (loosely based on introspective techniques and models of human subagents) that the issue isn't exactly the lack of modularity: when modularity breaks down over time, this leads to subagents competing to find better ways to work around the modularity, and creates more zero-sum-ish dynamics. (Or maybe it's more that techniques for working around modularity can produce an inaction bias?) But if you intentionally allow subagents to weigh in, they may be more able to negotiate and come up with productive compromises.
I think I have a much easier time imagining a 3D volume if I'm imagining, like, a structure I can walk through? Like I'm still not getting the inside of any objects per se, but... like, a complicated structure made out of thin surfaces that have holes in them or something is doable?
Basically, I can handle 3D, but I won't by default have all the 3Dish details correct unless I meaningfully interact with the full volume of the object.
This does necessitate that the experts actually have the ability to tell when an argument is bad.
All the smart trans girls I know were also smart prior to HRT.
I feel like Project Lawful, as well as many of Lintamande's other glowfic since then, has given me a much deeper understanding of... a collection of virtues including honor, honesty, trustworthiness, etc., which I now mostly think of collectively as "Law".
I think this has been pretty valuable for me on an intellectual level—I think, if you show me some sort of deontological rule, I'm going to give a better account of why/whether it's a good idea to follow it than I would have before I read any glowfic.
It's difficult for me to separate how much of that is due to Project Lawful in particular, because ultimately I've just read a large body of work, all of which served as training data for a particular sort of thought pattern that I've since learned. But I think this particular fragment of the rationalist community has given me some valuable new ideas, and it'd be great to figure out a good way of acknowledging that.
i think they presented a pretty good argument that it is actually rather minor
While the idea that it's important to look at the truth even when it hurts isn't revolutionary in the community, I think this post gave me a much more concrete model of the benefits. Sure, I knew about the abstract arguments that facing the truth is valuable, but I don't know if I'd have identified it as an essential skill for starting a company, or identified the inability to face the truth as a critical component of staying in a bad relationship. (I think my model of bad relationships was that people knew leaving was a good idea, but were unable to act on that information—but in retrospect, inability to even consider it totally might be what's going on some of the time.)
So if a UFO lands in your backyard and aliens ask you if you want to go on a magical (but not particularly instrumental) space adventure with them, I think it's reasonable to very politely decline, and get back to work solving alignment.
I think I'd probably go for that, actually, if there isn't some specific reason to very strongly doubt it could possibly help? It seems somewhat more likely that I'll end up decisive via space adventure than by mundane means, even if there's no obvious way the space adventure will contribute.
This is different if you're already in a position where you're making substantial progress though.
nonetheless, i think the analogy is still suggestive that an AI selectively shaped for whatever might end up deliberately maximizing something else
i think, in retrospect, this feature was a really great addition to the website.
This post introduced me to a bunch of neat things, thanks!
There are several comments "suggesting that maybe the cause is mental illness".
But personally, I think having such a standard is both unreasonable and inconsistent with the implicit standard set by essays from Yudkowsky and other MIRI people.
I think this is largely coming from an attempt to use approachable examples? I could believe that there were times when MIRI thought that even getting something as good as ChatGPT might be hard, in which case they should update, but I don't think they ever believed that something as good as ChatGPT is clearly sufficient. I certainly never believed that, at least.
Yes, we told everyone they were in the minority. It's a "game".
I think this is bad. I mean, it's not that big a deal, but I generally speaking expect messages I receive from The LessWrong Team to not tell falsehoods.
Hmm.
I don't think "Avoiding actions that noticeably increase the chance civilization is destroyed" is necessarily the most practically relevant virtue for most people, but it does seem to me like it's the point of Petrov Day in particular. If we're recognizing Petrov as a person, I'd say that was Petrov's key virtue.
Or maybe I'd say something like "not doing very harmful acts despite incentives to do so"—I think "resisting social pressure" isn't quite on the mark, but I think it is important to Petrov Day that there were strong incentives against what Petrov did.
I think other virtues are worth celebrating, but I think I'd want to recognize them on different holidays.
I mean, that's a thing you might hope to be true. I'm not sure if it actually is true.
I think, if you had several UDT agents with the same source code, and then one UDT agent with slightly different source code, you might see the unique agent defect.
I think the CDT agent has an advantage here because it is capable of making distinct decisions from the rest of the population—not because it is CDT.
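To make that concrete, here's a toy sketch (my own construction, in Python; the public-goods framing, the payoffs, and the nine-copies-plus-one setup are all assumptions I'm making for illustration, not anything from the post):

```python
# Toy illustration (my own construction): an N-player public-goods game.
# Nine agents share identical source code and pick the action that is best
# given that every copy of them plays it; one distinct agent treats the
# copies' action as fixed and simply best-responds to it.

N = 10    # nine identical copies plus one agent with different source code
R = 3.0   # public-goods multiplier: contributions are scaled by R and split

def payoff(my_contribution, total_contributions):
    # Keep whatever you didn't contribute, plus an equal share of the pot.
    return (1 - my_contribution) + R * total_contributions / N

# The copies compare "all nine of us contribute" against "none of us do".
# (The lone agent's contribution adds the same constant to both branches,
# so it doesn't affect this comparison and is omitted here.)
copies_choice = "C" if payoff(1, 9) > payoff(0, 0) else "D"

# The distinct agent -- CDT, or a copy whose source differs enough that its
# decision isn't correlated with the others' -- best-responds to the copies.
copies_total = 9 if copies_choice == "C" else 0
lone_choice = "C" if payoff(1, copies_total + 1) > payoff(0, copies_total) else "D"

copy_c = 1 if copies_choice == "C" else 0
lone_c = 1 if lone_choice == "C" else 0
total = 9 * copy_c + lone_c
print(f"copies play {copies_choice}: each earns {payoff(copy_c, total):.2f}")
print(f"lone agent plays {lone_choice}: earns {payoff(lone_c, total):.2f}")
# With R=3 and N=10: the copies contribute (2.70 each) while the lone agent
# defects (3.70). Nothing in the lone agent's reasoning is specific to CDT;
# being the one agent whose choice isn't tied to the group is what pays.
```

The point is just that any agent whose decision isn't logically tied to the copies' decision gets to best-respond to them, and that's where the extra payoff comes from.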
I'm not sure "original instantiation" is always well-defined
I think personally I'd say it's a clear advancement—it opens up a lot of puzzles, but the naïve intuition corresponding to it still seems more satisfying than CDT or EDT, even if a full formalization is difficult.
(Not to comment on whether there might be a better communications strategy for getting the academic community interested.)
Provided that you make sure you don't publish some massive capabilities progress—which I think is pretty unlikely for most undergrads—I think the benefits from having an additional alignment-conscious person with relevant skills probably outweigh the very marginal costs of tiny incremental capabilities ideas.
I think a lot of travel expenses?
I was confused by the disagree votes on this comment, so I looked—the comment in question is highest on the default "new and upvoted" sorting, but it isn't highest on the "top" sorting.
I'm more confident that we should generally have norms prohibiting using threats of legal action to prevent exchange of information than I am of the exact form those norms should take. But to give my immediate thoughts:
I think the best thing for Alice to do if Bob is lying about her is to just refute the lies. In an ideal world, this is sufficient. In practice, I guess maybe it's insufficient, or maybe refuting the lies would require sharing private information, so if necessary I would next escalate to informing forum moderators, presenting evidence privately, and requesting a ban.
Only once those avenues are exhausted might I consider threatening a libel suit acceptable.
I do notice now that the Nonlinear situation in particular is impacted by Ben Pace being a LessWrong admin—so if step 1 doesn't work, step 2 might have issues, and escalating to step 3 might be acceptable sooner than usual.
Concerns have been raised that there might be some sort of large first-mover advantage. I'm not sure I buy this—my instinct is that the Nonlinear cofounders are just bad-faith actors making any arguments that seem advantageous to them (though out of principle I'm trying to withhold final judgement). That said, I could definitely imagine deciding in the future that this is a large enough concern to justify weaker norms against rapid escalation.
Kudos for doing the exercise!
I think a comment "just asking for people to withhold judgement" would not be especially downvoted. I think the comments in which you've asked people to withhold judgement include other incredibly toxic behavior.