But are you sure the way in which he is unique among people you've met is mostly about intelligence rather than intelligence along with other traits?
I think you're massively overestimating Eliezer Yudkowsky's intelligence. I would guess it's somewhere between +2 and +3 SD.
Who are some other examples?
What did you think of?
Even if it wasn't meant to be an allegory for race science, I'm pretty sure it was meant to be an allegory for similarly-taboo topics rather than religion. Religious belief just isn't that taboo.
Hmm, it seems like you might be treating this post as an allegory for religion because of the word "agnostic", but I'm almost certain that it's not. I think it's about "race science"/"human biodiversity"/etc., i.e. the claim "[ethnicity] are genetically predisposed to [negative psychological trait]".
Before I do that, though, it's clear that horrible acts have been committed in the name of dragons. Many dragon-believers publicly or privately endorse this reprehensible history. Regardless of whether dragons do in fact exist, repercussions continue to have serious and unfair downstream effects on our society.
While this could work as a statement about religious people, it seems a lot more true for modern racists than modern religious people.
Given that history, the easy thing to do would be to loudly and publicly assert that dragons don't exist. But while a world in which dragons don't exist would be preferable, that a claim has inconvenient or harmful consequences isn't evidence of its truth or falsehood.
This is the type of thing I often see LessWrongers say about race science.
But if I decided to look into it I might instead find myself convinced that dragons do exist. In addition to this being bad news about the world, I would be in an awkward position personally. If I wrote up what I found I would be in some highly unsavory company. Instead of being known as someone who writes about a range of things of varying levels of seriousness and applicability, I would quickly become primarily known as one of those dragon advocates. Given the taboos around dragon-belief, I could face strong professional and social consequences.
Religious belief is not nearly as taboo as what this paragraph describes, but the claim "[ethnicity] are genetically predisposed to [negative psychological trait]" is.
There are more rich people that choose to give up the grind than poor people.
Did you mean to say "There are more poor people that choose to give up the grind than rich people"?
So, according to this estimate, if we could freeze-frame a single moment of our working memory and then explain all of the contents in natural language, it would take about a minute to accomplish.
This seems like a potentially misleading description of the situation. It seems to say that the contents of working memory could always be described in one minute of natural language, but this is not implied (as I'm sure you know based on your reasoning in this post). A 630-digit number cannot be described in one minute of natural language. 2016 bits of memory and about 2016 bits of natural language per minute really means that if our working memory was perfectly optimized for storing natural language and only natural language, it could store about one minute of it.
(And on that note, how much natural language can the best memory athletes store in their working memory? One minute seems low to me. If they can actually store more, it would show that your bit estimate is too low.)
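To make the arithmetic concrete (a rough sketch: the 2016-bit capacity and the roughly 2016 bits per minute of natural language are the post's estimates, while the 2-digits-per-second reciting speed is my own assumption):

```python
import math

speech_bits_per_minute = 2016         # the post's estimate for natural language
digits = 630
number_bits = digits * math.log2(10)  # ~2093 bits of information in a 630-digit number

# If speech really carried ~2016 bits/minute, the number holds about one minute's worth of bits:
print(number_bits / speech_bits_per_minute)   # ~1.04 "minutes" of information

# But reciting raw digits is far less information-dense than ordinary speech.
# Assuming a speaking rate of about 2 digits per second:
print(digits / 2 / 60)                        # ~5.25 minutes to actually say it out loud
```

The gap between those two numbers is exactly why "one minute of natural language" and "2016 bits" are not interchangeable descriptions of working memory.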
Even assuming perfect selfishness, sometimes the best way to get what you want (X) is to coordinate to change the world in a way that makes X plentiful, rather than fighting over the rare Xs that exist now, and in that way, your goals align with those of other people who want X.
E.g. learning when you're rationalizing, when you're avoiding something, when you're deluded, [...] when you're really thinking about something else, etc.
It seems extremely unlikely that these things could be seen in fMRI data.
I think I got it. Right after the person buys X for $1, you offer to buy it off them for $2, but with a delay, so they keep X for another month before the sale goes through. After the month passes, they now value X at $3 so they are willing to pay $3 to buy it back from you, and you end up with +$1.
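Writing out the cash flows (a hypothetical tally, using the $1/$2/$3 figures from above):

```python
# Hypothetical tally of the money pump sketched above. The agent's valuation of X
# drifts from $1 to $3 after a month of ownership (e.g. via "I made the right choice"
# rationalization), and you exploit that drift.

agent, you = 0, 0

agent -= 1      # the agent buys X for $1
you -= 2        # you agree to buy X from them for $2, settling in a month
agent += 2
agent -= 3      # a month later the agent values X at $3 and buys it back from you
you += 3

print(agent)    # -2: the agent ends up owning X but is out $2 instead of $1
print(you)      # +1: your profit from the agent's shifting valuation
```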
What happens if the parrots have their own ideas about who to breed with? Or the rejected parrots don’t want to be sterilised?
It's worth noting that both of these things are basically already true, and don't require great intelligence.
Autonomous lethal weapons (ALWs; we need a more eerie, memetic name)
There's already a more eerie, memetic name. Slaughterbots.
Maybe something like "mundane-ist" would be better. The "realists" are people who think that AI is fundamentally "mundane" and that the safety concerns with AI are basically the same as safety concerns with any new technology (increases inequality by making the powerful more powerful, etc.) But of course "mundane-ist" isn't a real word, which is a bit of a problem.
Can't tell if sarcastic
Wild speculation ahead: Perhaps the aversion to this sort of rationalization is not wholly caused by the suboptimality of rationalization, but also by certain individualistic attitudes prevalent here. Maybe I, or Eliezer Yudkowsky, or others, just don't want to be the sort of person whose preferences the world can bend to its will.
Yes, and another meaning of "rationalization" that people often talk about is inventing fake reasons for your own beliefs, which may also be practically rational in certain situations (certain false beliefs could be helpful to you) but is obviously a major crime against epistemic rationality.
I'm also not sure rationalizing your past personal decisions isn't an instance of this; the phrase "I made the right choice" could be interpreted as meaning you believe you would have been less satisfied now if you had chosen differently, and if that isn't true but you are trying to convince yourself it is in order to be happier, then that is also a major crime against epistemic rationality.
I wish this post had gone more into the specific money pump you would be vulnerable to if you rationalize your past choices. I can't picture what money pump would be possible in this situation (though I believe you that one exists). Also, not describing the specific money pump reduces the salience of the concern (improperly, in my opinion). It's one thing to talk abstractly about money pumps, and another to see right in front of you how your decision procedure endorses obviously absurd actions.
Like, as far as I'm concerned, I'm trans because I chose to be, because being the way I am seemed like a better and happier life to have than the alternative. Now sure, you could ask, "yeah but why did I think that? Why was I the kind of agent that would make that kind of choice? Why did I decide to believe that?"
Yes, this is a non-confused question with a real answer.
Well, because I decided to be the kind of agent that could decide what kind of agent I was. "Alright octavia but come on this can't just recurse forever, there has to be an actual cause in biology" does there really?
In a literal/trivial sense, all human actions have a direct cause in the biology of the human brain and body. But you are probably using "biology" in a way that refers to "coarse" biological causes like hormone levels in utero, rather than individual connections between neurons, as well as excluding social causes. In that case, it's at least logically possible that the answer to this question is no. It seems extremely unlikely that coarse biological factors play no role in determining whether someone is trans (I expect coarse biological factors to be at least somewhat involved in determining the variance in every relevant high-level trait of a person), but it's very plausible that there is not one discrete cause to point to, or that most of the variance in gender identity is explained by social factors.
If a brain scan said I "wasn't really trans" I would just say it was wrong, because I choose what I am, not some external force.
This seems like a red herring to me--as far as I know no transgender brain research is attempting to diagnose trans people by brain scan in a way that overrides their verbal reports and behavior, but rather to find correlates of those verbal reports and behavior in the brain. If we find a characteristic set of features in the brains of most trans people, but not all, it will then be a separate debate as to whether we should consider this newly discovered thing to be the true meaning of the word "transgender", or whether we should just keep using the word the same way we used it before, to refer to a pattern of self-identity and behavior, and the "keep using it the same way we did before" side seems quite reasonable. Even now, many people understand the word "transgender" as an "umbrella term" that encompasses people who may not have the same underlying motivations.
Morphological freedom without metaphysical freedom of will is pointless.
If by "metaphysical freedom of will" you are referring to is libertarian free will, then I have to disagree. Even if libertarian free will doesn't exist (it doesn't), it is still beneficial to me for society to allow me the option of changing my body. If you are confused about how the concept of "options" can exist without libertarian free will, that problem has already been solved in Possibility and Could-ness.
I've noticed people using formal logic/mathematical notation unnecessarily to make their arguments seem more "formal": ∀x∈X(∃y∈Y|Q(x,y)), f:S→T, etc. Eliezer Yudkowsky even does this at some points in the original sequences. These symbols were pretty intimidating to me before I learned what they mean, and I imagine they would be confusing/intimidating to anyone without a mathematical background.
Though I'm a bit conflicted on this one, because if the formal logic notation of a statement is shown alongside the English description, it could actually help people who wouldn't otherwise have learned logic notation. But it shouldn't be used as a replacement for the English description, especially for simple statements that can easily be expressed in natural language. It often feels like people are trying to signal intellectualism at the expense of accessibility.
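For example (my own illustration, not one taken from the sequences), the kind of side-by-side pairing I have in mind:

∀x∈X ∃y∈Y : Q(x, y)
"For every x in X, there is at least one y in Y such that Q(x, y) holds."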
What are you talking about then? It seems like you're talking about probabilities as the objective proportion of worlds in which something happens, in some sort of multiverse theory, even if it's not the Everett multiverse. And when you said "There won't be any iff there is a 100.0000% probability of annihilation" you were replying to a comment talking about whether there will be any Everett branches where humans survive, so it was reasonable for me to think you were talking about Everett branches.
Bayesian probability (which is the kind Yudkowsky is using when he gives the probability of AI doom) is subjective, referring to one's degree of belief in a proposition, and cannot be 0% or 100%. If you're using probability to refer to the objective proportion of future Everett branches something occurs in, you are using it in a very different way than most, and probabilities in that system cannot be compared to Yudkowsky's probabilities.
But that still requires us to have developed human brain-scanning technology within 5 years, right? That does not seem remotely plausible.
Indeed, it is instrumentally useful for instrumental rationalists to portray themselves as epistemic rationalists. And so this is a common pattern in human politics - "[insert political coalition] care only about themselves, while [insert political coalition] are merely trying to spread truth" is one of the great political cliches for a reason. And because believing one's own lies can be instrumentally useful, falsely believing oneself to have a holy devotion to the truth is a not-uncommon delusion.
I try to dissuade myself of this delusion.
There's a subtle paradox here. Can you spot it?
He is trying to dissuade himself of the premise[X] that he is committed to the truth over socially useful falsehoods. But that premise[X] is itself socially useful to believe, and he claims it's false, so disbelieving it would show that he does sometimes value the truth over socially useful falsehoods, contradicting the point.
More specifically, there are three possibilities here:
- X is broadly true. He's just wrong about X, but his statement that X is false is not socially motivated.
- X is usually false, but his statements about X are a special case for some reason.
- X is false, but his statement that X is false doesn't contradict this, because denying X is actually the socially useful thing, rather than affirming X. LessWrong might be the kind of place where denying X (saying that you are committed to spreading socially useful falsehoods over the truth) actually gets you social credit, because readers interpret affirming X as the thing that gets you social credit, so denying it is interpreted as a signal that you are committed to saying the taboo truth (not-X) over what is socially useful (X), the exact opposite of what was stated. If true, this would be quite ironic. This interpretation is self-refuting in multiple ways, both logically (for not-X to be a "taboo truth", X has to be false, which already rules out the conclusion of this line of reasoning) and causally (if everyone uses this logic, the premise that affirming X is socially useful becomes false, because denying X becomes the socially useful thing). But that doesn't mean readers couldn't actually be drawing this conclusion without noticing the problems.
It's more akin to me writing down my thoughts and then rereading them to gather my ideas than the kind of loops I imagine our neurons might have.
In a sense, that is what is happening when you think in words. It's called the phonological loop.
In these cases it can be helpful to imagine your current self in a bargaining game with your future selves, in a sort of prisoner's dilemma. If your current self defects now, your future selves will be more prone to defecting as well. If you coordinate and resist temptation now, future resistance will be more likely. In other words, establishing a Schelling fence.
This is an interesting way of looking at it. To elaborate a bit, one day of working toward a long-term goal is essentially useless on its own, so you will only do it if you believe that your future selves will as well. This is part of where the old "You need to believe in yourself to do it!" advice comes from. But there can be good reasons not to believe in yourself as well.
In the context of the iterated Prisoner's Dilemma, people have investigated how high the frequency of random errors (the decision to cooperate or defect being replaced with a random one in some percentage of instances) can go before cooperation breaks down. (I'll try to find a citation for this later.) This seems similar, but not literally equivalent, to a question we might ask here: What frequency of random motivational lapses can be tolerated before the desire to work towards the goal at all breaks down?
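As a rough illustration of the kind of experiment I mean (a minimal sketch, not the setup from whatever paper I'm half-remembering: it assumes two plain tit-for-tat players, with noise flipping each intended move with the given probability):

```python
import random

def noisy_ipd(error_rate, rounds=10000, seed=0):
    """Two tit-for-tat players in an iterated prisoner's dilemma where each
    intended move is flipped with probability error_rate. Returns the
    fraction of rounds in which both players cooperated."""
    rng = random.Random(seed)
    last_a, last_b = "C", "C"            # tit-for-tat opens with cooperation
    mutual = 0
    for _ in range(rounds):
        move_a, move_b = last_b, last_a  # each player copies the other's previous move
        if rng.random() < error_rate:    # noise: player A's intended move is flipped
            move_a = "D" if move_a == "C" else "C"
        if rng.random() < error_rate:    # noise: player B's intended move is flipped
            move_b = "D" if move_b == "C" else "C"
        if move_a == "C" and move_b == "C":
            mutual += 1
        last_a, last_b = move_a, move_b
    return mutual / rounds

for eps in [0.0, 0.01, 0.05, 0.2]:
    print(f"error rate {eps}: mutual cooperation {noisy_ipd(eps):.2f}")
```

Plain tit-for-tat degrades quickly under noise because a single error sets off an echo of alternating defections; as I recall, more forgiving variants hold up considerably better.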
Naturally, the goals that require the most trust are the ones that provide no benefit until the end, because for them to be worth working towards at all, you have to trust that your future selves won't permanently give up on the goal anywhere between now and the end. But most long-term goals aren't really like this. They can be seen as falling on a spectrum between providing no benefit until a certain point and providing linear benefit the more they are worked towards, with the "goal" point being arbitrary. (This is analogous to the concept of a learning curve.) Actions towards a goal may also provide an immediate benefit as well as progress toward the goal, which reduces the need to trust your future selves.
If you don't trust your future selves very much, you can seek out "half-measure" actions that sacrifice some efficiency for immediate benefits while still contributing some progress toward the goal. You can to some extent choose where along this spectrum you operate, but you are also limited by the types of actions available to you.
Thanks, this is a great explanation and you changed my mind on this. This is probably the reason why most people have the intuition that legalizing these things makes things worse for everyone. There were many proposed explanations for that intuition in this thread, but none of the others made sense/seemed valid to me, so I was beginning to think the intuition was erroneous.
Looks like your comment got truncated: "what is good if they were just"
Edited to fix.
Roman values aren't stable under reflection; the CEV of Rome doesn't have the same values as ancient Rome.
I'm not exactly sure what you're saying here, but if you're saying that the fact that modern Roman values differ from Ancient Roman values shows that Ancient Roman values aren't stable under reflection, then I totally disagree. History playing out is a very different process from an individual person reflecting on their values, so the fact that Roman values changed as history played out from Ancient Rome to modern Rome does not imply that an individual Ancient Roman's values are not stable under reflection.
As an example, Country A conquering Country B could lead the descendants of Country B's population to have the values of Country A 100 years hence, but this information has nothing to do with whether a pre-conquest Country B citizen would come to have Country A's values on reflection.
Locking in extrapolated Roman values sounds great to me because I don't expect that to be significantly different than a broader extrapolation.
I guess I just have very different intuitions from you on this. I expect people from different historical time periods and cultures to have quite different extrapolated values. The idea that all peoples throughout history would come into near agreement about what is good if they just reflected on it long enough seems unrealistic to me.
(unless, of course, we snuck a bit of motivated reasoning into the design of our Value Extrapolator so that it just happens to always output values similar to our 21st century Western liberal values...)
Skimming the Nick Bostrom and Effective Altruism Wikipedia pages, there doesn't seem to be anything particularly wrong with them, certainly not anything that I would consider vandalism. What do you see as wrong with those articles?
Could you explain how allowing sex for rent or kidney sale would lead to an arms race that makes everyone worse off? Or is this just meant to be an argument for why allowing extra options isn't necessarily good, that doesn't apply to the specific examples in the post?
Slavery and theft harm others, so they are not relevant here. Age limits would be the most relevant. We have age limits on certain things because we believe that regardless of whether they want to, underage people deciding to do those things is usually not in their best interest. Similarly, bans on sex for rent and kidney sale could be justified by the belief that regardless of whether they want to, people doing these things is usually not in their best interest. However, this is somewhat hard to back up: It's pretty unclear whether prostitution or homelessness is worse, and it's easy to think of situations where selling a kidney definitely would be worth it (like the one given in the post).
I don't want to live in a world where women have to prostitute themselves to afford rent.
I don't want to live in that world either, but banning sex for rent doesn't resolve the issue. It just means we've gone from a world where women have to prostitute themselves to afford rent to a world where women just can't afford rent, period.
What I said here is wrong; see this comment.
Have them be homeless until the homelessness situation becomes severe enough that we resolve it. Otherwise, IMO, we are just boiling the frog. There will be no protests, no riots, because selling our kidneys and having sex for rent is just enough for us to get by.
You don't think having to sell your kidneys and have sex for rent to get by is bad enough to get people to protest/riot?
Also, it seems like you've implicitly changed your position here. Previously, you said that when someone sells a kidney/trades sex for rent it would usually not be in their best interest, and that those options would usually only be taken under the influence of addiction or mental illness. Now, when you say that people would do those things "to get by", it sounds like you're implying that these are rational choices that would be in people's best interest given the bad situation, and would be taken by ordinary people. Which of these do you agree with?
Could you give some examples? I understand you may not want to talk about culture war topics on LessWrong, so it's fine if you decline, but without examples I unfortunately cannot picture what you're talking about.
poor intelligence and especially memory (other people say otherwise), pathetic mathematical abilities (takes longer than the blink of an eye to divide two 100 digit numbers)... ...inability to communicate at more than about 0.005 kB/s
What do you consider to be the "normal" level of intelligence/memory/communication bitrate? Why?
Perhaps itsability to verify things, being hampered by its only seeing the world through text, is fatal.
I suggest changing this to "its inability" to make the sentence clearer.
Discord does allow you to make named threads that branch off of a channel, and later archive them. It's not the default mode of conversation on Discord, but it is available if you care to use it. Also, I am confused about what you mean by "threads are made after the fact" on Discord.
These aren't like Dennett's "deepities". Deepities are statements that sound profound by sneakily having two alternate readings, one mundanely true and one radical or outlandish, sort of like a motte and bailey argument. These answers are just somewhat vague analogies and a relatively normal opinion that uses eloquent language ("because we are") to gain extra deepness points.
Have you heard of Xiaoice? It's a Chinese conversational/romantic chatbot similar to Replika. This article from 2021 claimed it already had 660 million users.
Logically, I knew it was all zeros and ones, but they felt so real.
There are various reasons to doubt that LLMs have moral relevance/sentience/personhood, but I don't think being "all zeros and ones" is one of them. Preemptively categorizing all possible digital computer programs as non-people seems like a bad idea.
Are you sure that "browsing:disabled" refers to browsing the web? If it does refer to browsing the web, I wonder what this functionality would do? Would it be like Siri, where certain prompts cause it to search for answers on the web? But how would that interact with the regular language model functionality?
But the analogy is more like a kid thinking they're playing a game that's on autoplay mode.
No. In your analogy, what the kid does has no causal impact on what their character does in the game. In real life, what you (your brain) do is almost always the cause of what your body does. The two situations are not analogous. Remember, determinism does not mean you lack control over your decisions. Also remember, you just are your brain; there is no separate "you" outside your brain that exists but lacks control because all your actions are caused by your brain instead.
But, I still prefer that over paperclips (by far). And, I suspect that most people do (even if they protest it in order to play the game).
What does this even mean? If someone says they don't want X, and they never take actions that promote X, how can it be said that they "truly" want X? It's not their stated preference or their revealed preference!
Is Eliezer thinking about what he would do when faced with that situation not him running an extremely simplified simulation of himself? Obviously this simulation is not equivalent to real Eliezer, but there's clearly something being run here, so it can't be an L-zombie.
Can you elaborate? Why would locking in Roman values not be a great success for a Roman who holds those values?
My hope is that scaling up deep learning will result in an "animal-like"/irrational AGI long before it makes a perfect utility maximizer. By "animal-like AGI" I mean an intelligence that has some generalizable capabilities but is mostly cobbled together from domain specific heuristics, which cause various biases and illusions. (I'm saying "animal-like" instead of "human-like" here because it could still have a very non-human-like psychology.) This AGI might be very intelligent in various ways, but its weaknesses mean that its plans can still fail.
Why work on lowering your expectations rather than working on improving your consistency of success? If you managed to actually satisfy your expectations once, that seems to suggest that they weren't actually too high (unless the success was heavily luck-based, but based on what you said it sounds like it wasn't).
Also, that article didn't sound like it was describing narcissists (at least under the popular conception of the word "narcissist"). It sounded more like it was describing everyone (everyone has a drive for social success), interspersed with descriptions of unrelated pathologies, like lack of "stamina" to follow through on plans and trouble dealing with life events.
I imagine it would be similar to the chain of arguments one often goes through in ethics. "W can't be right because A implies X! But X can't be right because B implies Y! But Y can't be right because C implies Z! But Z can't be right because..." Like how Consequentialism and Deontology both seem to have reasons they "can't be right". Of course, the students in your Adversarial Lecture could adopt a blend of various theories, so you'll have to trick them into not doing that, maybe by subtly implying that it's inconsistent, or hypocritical, or just a rationalization of their own immorality, or something like that.
I randomly decided to google “hansonpilled” today to see if anyone had coined the term, congratulations on being one of two results.
Then perhaps we should ban this form of NDAs, rather than legalizing blackmail. They seem to have a pretty negative reputation already, and the NDAs that are necessary for business are the other type (signed before info is known).
I guess what motivates me personally in my work is the desire to be appreciated
As I understand it, “status” essentially is how much people appreciate you. So you’re basically just describing the desire for status here.
I would also add that the fear responses, while participating in the hallucinations, aren't themselves hallucinated, not any more than wakeful fear is hallucinated, at any rate. They're just emotional responses to the contents of our dreams.
I disagree with this statement. For me, the contents of a dream seem only weakly correlated with whether I feel afraid during the dream. I’ve had many dreams with seemingly ordinary content (relative to the baseline of general dream weirdness) that were nevertheless extremely terrifying, and many dreams with relatively weird and disturbing content that were not frightening at all.