I have an irrational preference
If your utility function puts more weight on you knowing things than most people's utility functions do, that is not an irrationality.
It's "101"? I searched the regular internet to find out, but I got some yes's and some no's, which I suspect were just due to different definitions of intelligence.
It's controversial?? Has that stopped us before? When was it done to death?
I'm just confused, because if people downvote my stuff, they're probably trying to tell me something, and I don't know what it is. So I'm just curious.
Thanks. By the way, do you know why this question is getting downvoted?
I already figured that. The point of this question was to ask if there could possibly exist things that look indistinguishable from true alignment solutions (even to smart people), but that aren't actually alignment solutions. Do you think things like this could exist?
By the way, good luck with your plan. Seeing people actively go out and do actually meaningful work to save the world gives me hope for the future. Just try not to burn out. Smart people are more useful to humanity when their mental health is in good shape.
- Yes, human intelligence augmentation sounds like a good idea.
- There are all sorts of "strategies" (turn it off, raise it like a kid, disincentivize changing the environment, use a weaker AI to align it) that people come up with when they're new to the field of AI safety, but that are ineffective, and their ineffectiveness is only obvious to, and explainable by, people who specifically know how AI behaves. Suppose there are strategies whose ineffectiveness is only obvious to, and explainable by, people who know far more about decisions, agents, and optimal strategies than humanity has figured out thus far. (Analogy: a society that only knows basic arithmetic could reasonably stumble upon and understand the Collatz conjecture, and yet, with all our mathematical development, we can't prove it. In the same way, we could reasonably stumble upon an "alignment solution" that we can't show wouldn't work, because showing that would take a much deeper understanding of these kinds of situations.)
- If the solution to alignment were simple, we would have found it by now. Humans are far from simple, human brains are far from simple, human behavior is far from simple. That there is one simple thing from which all of our values come, or a simple way to derive such a thing, just seems unlikely.
Uh, this is a human. Humans find it much harder to rationalize away the suffering of other humans, compared to rationalizing animal suffering.
And the regular, average people in this future timeline consider stuff like this ethically okay?
hack reality via pure math
What - exactly - do you mean by that?
The above statement could be applied to a LOT of other posts too, not just this one.
How were these discovered? Slow, deliberate thinking, or someone trying some random thing to see what it does and suddenly the AI is a zillion times smarter?
I certainly believe he could. After reading Tamsin Leake's "everything is okay" (click the link if you dare), I felt a little unstable, and felt like I had to expend deliberate effort to not think about the described world in sufficient detail in order to protect my sanity. I felt like I was reading something that had been maximized by a semi-powerful AI to be moving, almost infohazardously moving, but not quite; that this approached the upper bound of what humans could read while still accepting the imperfection of their current conditions.
utopia
It's a protopia. It is a world better than ours. It is not perfect. It would be advisable to keep this in mind. dath ilan likely has its own, separate problems.
And I’m not even mentioning the strange sexual dynamics
Is this a joke? I'm confused.
yeah, the moment i looked at the big diagram my brain sort of pleasantly overheated
I think the flaw is that he claims this:
No one begins to truly search for the Way until their parents have failed them, their gods are dead, and their tools have shattered in their hand.
I think that these three things are not things that cause a desire for rationality, but things that rationality makes you notice.
why is this so downvoted? just curious
If I am not sufficiently terrified by the prospect of our extinction, I will not take as many steps to try to reduce its likelihood. If my subconscious does not internalize this sufficiently, I will not be as motivated. Said subconscious happiness affects my conscious reasoning without me consciously noticing.
Harry's brain tried to calculate the ramifications and implications of this and ran out of swap space.
this is very relatable
That's a partial focus.
particularly girls
why!?
i'd pick dust & youtube. I intrinsically value fairness
The YouTube is pure happiness. The sublimity is some happiness and some value. Therefore I choose the sublimity, but if it were "Wireheading vs. YouTube", or "Sublimity vs. seeing a motivational quote", I would choose the YouTube or the motivational quote, because I intrinsically value fairness.
Ok, yeah, I don't think the chances are much smaller than one in a million. But I do think the chances are not increased much by cryonics. Here, let me explain my reasoning.
I assume that eventually, humanity will either fall into a topia (Tammy's definition) or go extinct. Given that it does not go extinct, it will spend a very long amount of subjective time, possibly infinite, in said topia. In the event that this is some sort of brilliant paradise of maximum molecular fun where I can make stuff for eternity, we can probably reconstruct a person solely from the little bits of information they left behind (like how we can reconstruct Proto-Indo-European from the traces it left in our modern languages), so I consider the slight improvement in revival chances negligible, even when weighed against the massive length of time (possibly infinite, which is why this is a Pascal's mugging) I would be living in such a world.
(Besides, the infiniteness is balanced out by the slightly increased chances of experiencing maximally horrible agony like in WYS.)
There's also a chance that we figure out how to revive frozen people before reaching a topia, but that chance seems pretty low (and even then, it's completely nullified by the maybe-infinity we might end up spending in a topia).
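To make the shape of that reasoning explicit, here's a very rough sketch of the comparison I have in mind (the symbols are just placeholders, not numbers I actually have):

$$\Delta EV_{\text{cryonics}} \approx \underbrace{(p_{\text{cryo}} - p_{\text{recon}})}_{\text{small}} \cdot U_{\text{topia}} + \underbrace{(q_{\text{cryo}} - q_{\text{recon}})}_{\text{small}} \cdot U_{\text{agony}}$$

Here $p$ is the chance of waking up in the good topia (with cryonics vs. with reconstruction-from-leftover-information alone), $q$ is the chance of the maximally-bad outcome, and the two $U$ terms are possibly infinite with opposite signs, which is the "balanced out" part above.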
I could have completely flawed logic in my head. I'm sorta new to all this "thinking about the long term future" stuff you guys really like doing. Please correct me because I'm probably wrong.
What's the meaning of life?
There is none. And that's the best thing ever, because it means there's no big crazy one true meaning that we all have to follow. We can do whatever we want.
Then what is the unit of Effort? Any ideas?
could this mean someone's physical appearance could be infohazardous? If I believe that looking at someone will cause my terminal goals to be modified, via limerence, into wanting to be with them, then I won't look, because I want to spend my time making cool things and reducing the probability that we all die. If I suddenly end up caring less about those things and just care about passing on my genes or whatever, that increases the odds that we all die, as well as decreasing the number of cool things in the world leading up to that point.
apart from that, this is a good post
I've aimed to have this read equally well whether or not you like him.
hover to invoke Crocker's Rules:
you failed miserably
so, don't donate to people who will take my money and go buy OpenAI more supercomputers while thinking that they're doing a good thing?
and even if I do donate to some people who work on alignment, they might publish it and make OpenAI even more confident that by the time they finish we'll have it under control?
or some other weird way donating might increase P(doom) that I haven't even thought of?
that's a good point
now i really don't know what to do
Screw that. That's just stupid. Delete it without a qualm.
Nope. No no no. Nononononono. Our happiness baselines are part of us. Those with high happiness baselines have less utility in other ways, because they don't need to look for things that would make them happier. Those with low happiness baselines have less utility simply by having a lower happiness baseline. It's part of who we are. You are welcome to delete it in your own brain without a qualm, but I'm fine with my set level of happiness. The lowness of my baseline is what makes me create, what makes me think of interesting ideas for things to do.
Have you heard of the language Toki Pona? It forces you to taboo your words by virtue of only containing 120-ish words. It was invented by a linguist named Sonja Lang, who was depressed and wanted a language that would force her to break her thoughts into manageable pieces. I'm fluent in it and can confirm that speaking it can get rid of certain confusions like this, but it also creates other, different confusions. [mortal, not-feathers, biped] has 3 confusions in it while [human] only has 1; tabooing a word splits the confusion into 3 pieces. If we said [mortal, not-feathers, biped] instead of [human], that could result in ambiguities around bipedal-ness (what about creatures that are observed to sometimes walk on 2 legs and sometimes on 4?), lack of feathers (do porcupine quills count?), and mortality (I forgot where I read this, or whether it's true, but apparently there are some microorganisms that can be reanimated by other microorganisms).