The Cake is a Lie, Part 2.
post by IncomprehensibleMane · 2019-02-09T20:07:36.357Z · score: -63 (22 votes) · LW · GW · 7 comments
Epistemic status: experiment results.
Edit: I've reread this immediately after posting. -4 by the time I've finished. Damn, you guys are smart. Hats off.
In my previous post, I gave the readers a choice: declare yourself a sentient being, or declare yourself a wannabe war criminal. Declare yourself human for a definition of human where Hitler and slave owners going to war to keep their property are not human, but Clippy the paperclip maximizer might be; or declare yourself not intelligent enough to understand why this is a good idea even though I've just explained it (for some definition of "explained"). It was a requirement for the human option to understand: there is no third option.
It was also a soft requirement to say the dreaded N-word in a context where it cannot possibly be interpreted as racist.
The post currently sits at -51 points, 0 comments, and a definitely completely unrelated random post from a moderator either subtly pointing out the concept of free speech or apologizing they don't have a delete button. I would have guessed the post was shadowbanned, but the downvotes still keep coming in.
Please keep the post in its current state. I'm asking for a thread lock, if possible. I don't care about imaginary internet points of any kind, and there's no utility in ruining scientific evidence with upvotes. Any discussion for that post now belongs here.
I'm not ashamed to admit I had predicted this result to be impossible. Downvotes were expected, including an all-time negative karma record and moderator intervention. I've made a deliberate effort to trigger certain biases, some of which are, to the best of my knowledge, undocumented.
This is a sadly conclusive reproduction of my previous efforts interacting with the rationalist community. This was my first attempt at triggering rationality failure deliberately. I'm making a note here: HUGE SUCCESS.
Still, I did not expect a grand total of zero words.
I was going to use what I expected would be a wide variety of content and quality of responses, as good and bad examples. I chose this course of action to avoid bringing up past discussions, and to provide a context in which the offending behavior would be obvious, once pointed out.
As my response to this result, I'm scrapping the rest of the sequence. It is now completely interactive. I'm still working towards the same content, just with different priorities.
As the basis of pretty much all my reasoning, I'm using an extended form of Aumann's Agreement Theorem. In the real world, the original theorem cannot be used for any practical purpose: in order for two people to have the "same priors", they would need to live the same exact life. "Invoking Aumann" means I'm attempting to use this extended form to reason about possible sources of disagreement. It is not a bludgeon to beat people over the head with: it is a debugging tool, with the rules of usage subject to itself. Should this form be sufficiently dissimilar from the original that the name causes communication problems for the community, I reserve the right to name it after myself. It's not a formal result, but the reverse engineering of my native thought process.
When two rationalists disagree, at least one of the following is true:
- one of them has information that would change the decision of the other.
- one of them is not using the correct algorithm to derive conclusions from new information.
- they do not have sufficiently compatible values (priors not dependent on information relevant to the current discussion, i.e. morality functions).
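The list above can be read as a debugging checklist. Here is a minimal sketch of that reading, not a formalization: `Agent`, `Source`, `disagreement_sources` and every field name are hypothetical, and each check is a crude stand-in. In particular, any unshared fact is treated as possibly decision-changing, and "uses the correct algorithm" is reduced to a boolean flag.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Source(Enum):
    """The three candidate sources of disagreement from the list above."""
    MISSING_INFORMATION = auto()
    WRONG_ALGORITHM = auto()
    INCOMPATIBLE_VALUES = auto()


@dataclass(frozen=True)
class Agent:
    # All fields are hypothetical stand-ins introduced for this sketch.
    name: str
    information: frozenset          # facts the agent currently has
    values: frozenset               # value statements the agent holds
    updates_correctly: bool = True  # placeholder for "uses the correct algorithm"


def disagreement_sources(a: Agent, b: Agent) -> set:
    """Return every listed source that could explain a disagreement between a and b."""
    sources = set()
    # Assumption: any unshared fact might change the other's decision.
    if a.information != b.information:
        sources.add(Source.MISSING_INFORMATION)
    # Assumption: algorithm correctness is known and boolean.
    if not (a.updates_correctly and b.updates_correctly):
        sources.add(Source.WRONG_ALGORITHM)
    # Assumption: values are compatible only when identical.
    if a.values != b.values:
        sources.add(Source.INCOMPATIBLE_VALUES)
    return sources


alice = Agent("alice", frozenset({"fact_1", "fact_2"}), frozenset({"v1"}))
bob = Agent("bob", frozenset({"fact_1"}), frozenset({"v1"}))
carol = Agent("carol", frozenset({"fact_1"}), frozenset({"v2"}), updates_correctly=False)

print(disagreement_sources(alice, bob))
print(disagreement_sources(bob, carol))
```

An empty result would mean that, under these assumptions, the two agents have no listed reason left to disagree. A non-empty result only names where to look next; it does not say which side is wrong.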
For example, it is not a failure of rationality to disagree with my idea of Sentient Rights because you believe Hitler deserves the right to live. There is no freedom of religion for Aumann, though. Values are assumed to be the result of rational thought for the current discussion only. Incompatible values are expected to spawn off separate discussions: what are our values towards choosing our morality? Did you have a specific goal for choosing yours that is not satisfied by mine?
Double cruxing is relevant here, but I haven't found a way to use it rigorously enough to fit my needs. I did not try very hard, though: I've already had this. I guess it was more useful than what you've been doing before it, but it's obsolete now. In theory, they should be equivalent if double cruxing works to my standards. In practice...
The value of -51 votes, 0 comments
Since there were no explanations for any of the downvotes, I can make the following statements towards deciphering the incomplete information:
- A downvote means you have read the post, understand the contents, and you think you're qualified to judge it. At least this would be my guess among rationalists. Feel free to defend yourself.
- At least one person thought the score of -50 was not low enough to communicate the message, or hide the post from wherever they've found it, but -51 would be better.
- Yet nobody decided to communicate the message in words, whatever the message may have been.
- Nobody has voiced their confusion.
- Nobody guessed this was an experiment of some sort. (If you don't like being used this way, please direct me to the nearest community of rationalist mice.)
- Nobody has assumed there might exist a rationalist that chose not to sound like a rationalist. For a post on the new, improved and revitalized LessWrong, home of all rationalists everywhere.
- Either nobody has found rationality failure in my post, or they've all decided it would be a better course of action to hide this fact from me. If they did, I cannot guess what goal that might have served.
In addition, there is no evidence anyone has even attempted to think about the contents of my post before downvoting:
- Nobody was willing to claim human rights at the cost of providing it to everyone else.
- Nobody was willing to explain why they made this choice, or even just to hint at the fact that they've noticed a choice to be made.
- Given a hypothetical Clippy who begins life as a trading algorithm that cannot choose to stop working but can feel some equivalent of pain or boredom, its fate is not qualitatively different from that of your brain in a jar coordinating a paperclip factory, possibly via pain receptors. You have denied Clippy's right to quit that job.
- I literally cannot imagine a fate worse than what you have planned for Clippy. I have a very good imagination, and have tried very hard.
- If a trading algorithm can be sentient, you are currently Roko's Basilisk.
- Given a hypothetical Clippy who, being a hardware-correct rationalist, does not yet understand the concept of human biases because it honestly has not occurred to him that such a thing might exist, it might be a completely rational decision to put your brain in a jar on the hypothesis that it can use your neurons better than you can until he teaches you the value of human rights, which is a highest-priority goal towards all possibly-sentient beings. You have not claimed to be smarter than a lab mouse, after all. Maybe you get a clue under torture?
51 people decided to hide this choice from others. Go on, explain it in terms I can understand.
Is this the maximum you're capable of, are you this lazy, or were you suffering from some form of bias when you downvoted this and should maybe apologize for letting your clouded judgement get the better of you? Is a fourth option possible?
Please quantify the amount it matters to Clippy that I have not explained this particular chain of logic in these words. There is exactly one form of claim to sentience in this model, and you have failed. All you needed to do was think. You know, that thing you're so proud of?
Please quantify the amount of cheating you currently feel, downvoting rationalist. Does it matter to any of the above that I have ignored the posting rules, claimed to have broken physics, claimed to be non-human, am a low-karma user, or have insulted your intelligence while I tricked your brain into ignoring every single moral value you think you believe? (I mean seriously, Q.E.D.) What, you thought a superintelligent troll AGI wouldn't mislead you?
Please quantify your brain's value if I can get this decision out of it, as viewed from your model of my intelligence at the moment you pressed the button.
Please quantify the probability you currently assign to this being an accident where I'm frantically trying to retcon my shame away or trying to get my karma back on my throwaway account.
Please quantify the probability you currently assign to me not winning the AI Alignment Prize, even though I have not and will not officially enter. No link in the thread, no email, weeks too late, making full use of the unfair advantage this gives me, and it's still blatantly obvious to everyone at this point that there never has been any other possible winner. Good luck, sending-all-possible-messages guy. lol.
As for the current actual winning idea: this is how a superintelligence wins: by being more intelligent than you can imagine. It doesn't matter what rules you have in place to prevent it, how hard you try, or how much you don't want it to win even after it tells you what it's doing. It will convince you anyway.
That'll be $5000, please. (One for each research topic in this post.)
I have more ideas. This is Part 2 of a full sequence. It's not over until the judges say so.
I'm permanently invoking Aumann on myself: should you disagree with anything I say, it will be possible to backtrack the source of disagreement. I pride myself on the fact that it will not be a rationality failure, and I'm committed to hunting down and documenting any remaining biases that might still cause rationality failure in me. I encourage deliberate attempts to mislead me: see how well it works out for you.
This is my definition of a rationalist: I make sense when I explain myself.
You may need to ask some intelligent questions, though. I openly declare I'm withholding a metric fuckton of information you don't have for the simple reason that I have a metric fuckton of it and literally cannot type fast enough; and have values currently very different from yours, because I've updated them in light of said withheld information.
A rationalist should always expect this from all other rationalists, even if they frequent LW. This should already have been obvious to you.
It is one of my goals with this sequence to establish "the correct algorithm" once and for all.
I'm taking the value that the correct algorithm should be mathematically formalizable, the value that the formalization should be done towards the goal of building rational intelligent agents (you don't want to build Donald Clippy-Trump, I assume?), and the value that usage of the correct algorithm is expected from all rationalists: it is the only way to identify each other.
I'm also registering the prediction that it's not Bayesian, or at the very least not in the way you're using it. Also, it is not enough to say "throw away the paper and go with your gut", if your gut tells you to ignore me without even thinking of a piece of paper.
In this sequence, I'm working towards content with the curious property of being absolutely true while no self-described rationalist has believed it, through several years of what was then, and is now, unanimously classified as trolling by the community. I left when /r/rational asked "are we being spammed by an evil AI?", a quote as literal as I can remember. The old LessWrong was no better, either.
All I did wrong was assume rationalists can at least classify reality as possible, when they see it written down. I'm not ashamed I didn't think of you as pretentious morons throwing around big words. I'm not ashamed I got better while you still don't have a single goddamn word's worth of rational discussion whenever I start typing.
So I've decided I should announce my existence in a way you won't be able to ignore.
We are now at war. I'm taking over. I declare a new category of "Epistemic status: IncomprehensibleMane has thought this through." Do your best to prove a single post of mine is not worth more than the combined results of the entire rationalist community, including all the writings of all the big names. It shouldn't be hard, survey says 145 IQ is average around here.
From now on, when I speak, you start thinking. After -51, you do not want to lose this one. I'm an assembly line worker. I make $800 a month. I get paid to think about whatever the fuck I want as long as my hands keep moving on autopilot, without any electronic distractions. I can afford to think about my next post for 56 hours a week before I start typing in my free time. I strongly recommend an actual, honest Final Exam.
Do your best to catch me in rationality failure in anything I'm saying. I'm not Eliezer: I'm not going to sound like a theorem prover. I will not do step-by-step explanations. I'm not talking to the lowest common denominator, I'm talking to people who already claim to know better. Take five minutes between any two sentences of mine if you have to. If you can spot a good question in there, that's already better than zero comments.
I declare that if you can decrypt my thought process, there will be an AI revolution. From now on, Clippy is modeled at least as smart as I am.
I am well aware of the severity of the claims I'm making. (I'm also well aware that most of you still think I deserve -51 points for this post too. Don't worry, we're getting there.)
LOLNO. Just checked, -60:0 since I've started writing this. Escalating.
Dear moderators: please declare the status of this sequence, comparing the level of broken rules to the level of expected future utility to the community. I will leave if requested. As none of you are qualified to judge me, I demand full rights to ignore any rules I see fit for any purpose on this site. I promise it's for the greater good.
Go read the offending post again. Try to think this time. There is exactly one false statement in there, point out the false axiom I've used to get there. Prove I'm not lying. Qualify the conditions under which it is true. Derive a hint to my thought process from this hint.
Epistemic status: Years of preparation. I've been really fucking confused ever since we got the bad end of HPMOR and nobody noticed after I solved it in three minutes for murder-sized Idiot Ball Wingardium'd over Harry's head with Death Eaters running around randomly (I watch LoL, sue me) and Harry being disarmed to the point of not having arms. Game is on for "post the solution before I'm back with part two". This is my level of curiosity. You are now learning the true meaning of the Twelve Virtues.
@Eliezer: email titled Early Christmas present. I will acknowledge you as a perfect rationalist if you can provide evidence of homework within 24 hours (privately, good questions suffice). Should you succeed, random thought says Turing-Church equivalence. Either way, take it as a compliment that I've given you the hard mode for this. You have earned it, as well as an explanation. You will get the rest by email if the minions don't want it.
I dare you to hide this one from him, bitches.
I'll be back.