This is the first LW census I've taken!
I am personally interested in questions relating to mental/neurological conditions (e.g. depression, autism, ADHD, dyslexia, anxiety, schizophrenia, etc.)
I don't think the specific part of decision theory where people argue over Newcomb's problem is large enough as a field to be subject to the EMH. I don't think the incentives are particularly strong either. I'd compare it to ordinal analysis, a field which does have PhDs but very few experts in general and not many strong incentives. One significant recent result (if the proof works, the ordinal notation in question would be the most powerful one proven well-founded) was done entirely by an amateur building off of work by other amateurs (see the section on the Bashicu Matrix System): https://cp4space.hatsya.com/2023/07/23/miscellaneous-discoveries/
Yes. Well, almost. Schwarz brings up disposition-based decision theory, which appears similar to, though perhaps not identical with, FDT, and every paper I've seen on it appears to defend it as an alternative to CDT. There are some looser predecessors to FDT as well, such as Hofstadter's superrationality, but that's too different imo.
Given Schwarz's lack of reference to any paper describing any decision theory even resembling FDT, I'd wager that FDT's obviousness is only apparent in retrospect.
Whenever I ask questions like "Is this bullshit or not?", I'm not expecting a simple binary yes/no answer; it's shorthand for a question that is similar but longer and harder to word, and that asks for a more complex, more specific answer.
Right now, I'm mostly taking a look at the thing I linked, and if and when (big if) I get far enough, I'll try to get Russian acquaintances to help me look into it further.
Curious about other personal development paradigms/psychotechnologies. So far I've mostly been trying to follow the book The Mind Illuminated and dabbling in feedbackloop-first rationality and tuning cognitive strategies.
Is there anything about those cases that suggests the result should generalize to every decision theorist, or that this is as good a proxy for whether FDT works as the beliefs of earth scientists are for whether the Earth is flat?
For instance, your sample consists of a philosopher not specialized in decision theory, one unaccountable PhD, and a single person who is both accountable and specialized in decision theory. Somehow, I feel as if there is a difference between generalizing from that and generalizing from every credentialed expert one could possibly contact. In any case, it's dubious to generalize from that to "every decision theorist would reject FDT in the same way every earth scientist would reject flat earth", even if we condition on you being totally honest here and having fairly represented FDT to your friend.
I think everyone here would bet $1,000 that if every earth scientist knew about flat earth, they would nearly universally dismiss it (in contrast to debating over it or universally accepting it) without hesitation. However, I would be surprised if you would bet $1,000 that if every decision theorist knew about FDT, they would nearly universally dismiss it.
My claim is that there are not yet people who know what they are talking about, or more precisely, everyone knows roughly as much about what they are talking about as everyone else.
Again, I'd like to know who these decision theorists you talked to were, or at least what their arguments were.
The most important thing here is how you are evaluating the field of decision theory as a whole, how you are evaluating who counts as an expert or not, and what arguments they make, in enough detail that one can conclude that FDT doesn't work without having to rely on your word.
So it's crazy to believe things that aren't supported by published academic papers? I think if your standard for "crazy" is believing something that a couple of people in a field too underdeveloped to be subject to the EMH disagree with, and for which there are merely no papers defending it (not any actively rejecting it), then probably you, and roughly every person on this website, count as "crazy".
Actually, I think an important thing here is that decision theory is too underdeveloped and small to be subject to the EMH, so you can't just say "if this crazy hypothesis is correct, then why hasn't the entire field accepted it, or at least started debating it?" It is simply too small to have fringe positions, as opposed to non-fringe ones.
Obviously, I don't think the above is necessarily true, but I still think you're making us rely too much on your word and personal judgement.
On that note, I think it's pretty silly to call people crazy based on either evidence they have not seen and you have not shown them (for instance, whatever counterarguments the decision theorists you contacted had), or evidence as weak/debatable as the evidence you have put forth in this post, which has come to their attention only now. Were we somehow supposed to know beforehand that your decision theorist acquaintances disagreed?
If you have any papers from academic decision theorists about FDT, I'd like to see them, whether favoring or disfavoring it.
IIRC Soares has a Bachelor's in both computer science and economics and MacAskill has a Bachelor's in philosophy.
The thing I disagree with here most is the claim that FDT is crazy. I do not think it is, in fact, crazy to think it is a good idea to adopt a decision theory whose users generically end up winning in decision problems compared to other decision theories.
I also find it suspicious that that part rests on the opinions of experts we know nothing about, who presumably learned about FDT primarily from you, someone who thinks FDT is bad, and that it sort of assumes MacAskill is any more of an expert on decision theory than Soares. Don't take this as a personal attack; since this is speculation about your intent/mental state, these are just my honest thoughts. At the very least, it would be helpful to read their arguments against FDT.
Even if these were actual experts you talked to, it seems strange to compare a few experts you talked to saying FDT is false with flat earth, a hypothesis which many, many experts from several fields related to the Earth, all accountable, think is false, and which can be disproven by merely thinking about what society and the institution of science would have to look like for it to be true.
For that matter, who even counts as a decision theory expert? What set of criteria are you applying to determine whether someone is an expert on it or not?
(Epistemic status: plausible position that I don't actually believe.) The correct answer to the leg-cutting dilemma is that you shouldn't cut it, because you will end up existing no matter what: Omega has to simulate you to predict your actions, and it's always possible that you're in the simulation. The fact that you always have to be simulated to be predicted dissolves every apparent decision theory paradox, such as not cutting your leg off even when doing so precludes your existence.
I actually initially wrote off psychonetics because of its woo-sounding name. However, upon skimming the thing which I linked, I noticed two things that made me suspect that psychonetics might be worth taking seriously:
First, the safety rules: in particular, discouraging religious or spiritual interpretation, and encouraging you to stop practicing and see a psychiatrist if you start feeling "an otherworldly presence" or feeling as though you have magical powers. Second, the purported benefits are limited in scope. It does not claim to solve all of your physical or mental problems, only to help you process lots of information efficiently.
After reading the linked website in more detail, I continue to think it's worth taking seriously.
Psychonetics still sounds somewhat woo-ish, but not in a way optimized for book sales or anything else I can think of besides it being true. At worst, I think I would be unable to tell apart a world where psychonetics worked and a world where it didn't, just because it doesn't trigger a lot of the usual woo red flags, and there are no experiments I can find confirming whether psychonetics works one way or another. This is what I'd expect a $100 bill on the sidewalk that no one has picked up to look like.
Do you have a link to it or something?
I agree. Right now though, I'm mostly unsure how to act on my knowledge of psychonetics' existence (and the existence of cognitive tuning, for that matter). At least for psychonetics, the sensible first step is probably to just read something on it and maybe distill it in a book review or something. Not sure about cognitive tuning in general since it's not really a thing yet.
If you (or anyone else) are interested in diving into psychonetics or cognitive tuning with me then feel free to contact me, though I can't guarantee anything will come of it because of the turbulent chaos of life (or more realistically my laziness).
I know nothing about naturalism, but cognitive tuning (beyond just cognitive strategies) seems like it's begging to be expanded upon.
I find it funny that GPT-4 finds the need to account for the possibility that the densities of uranium or gold might have changed as of September 2021.
I think it is referring to Gendlin's focusing.
A lot of your arguments boil down to "This ignores ML and prosaic alignment" so I think it would be helpful if you explained why ML and prosaic alignment are important.
So you came up with it yourself?
How did you figure these things out if they were never published on Be Well Tuned?
I think the prior for aliens having visited Earth should be lower, since a priori it seems unlikely to me that aliens would interact with Earth but not to an extent that makes it clear to us that they have. My intuition is that it's probably rare to reach other planets with sapient life before building a superintelligence (which would almost certainly be obvious to us if it did arrive), and even if you do manage to reach other planets with sapient life, I don't think aliens would forgo trying to contact us if they're anything like humans.
I have tried meditation a little bit although not very seriously. Everything I've heard about it makes me think it would be a good idea to do it more seriously.
Not sure how to be weird without also being useless. What does a weird but useful background look like?
Also, I've already been trying to read a lot, but I'm still somewhat dissatisfied with my pace. You mentioned you could read at 3x your previous speed. How did you do that?
I am pretty anxious about posting this, since it's my first post on LessWrong and it's about a pretty confusing topic, but I'm probably not well calibrated on this front, so oh well. Also, thanks to NicholasKross for taking a look at my drafts.
What other advice/readings do you have for optimizing your life/winning/whatever?
I think this depends on whether you use SIA or SSA or some other theory of anthropics.
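For anyone unfamiliar with the distinction, here is the standard toy case I have in mind (a generic illustration of SIA vs. SSA, not specific to the scenario above): a coin flip creates one observer if heads and two observers if tails, and you learn only that you exist.

$$P_{\text{SSA}}(\text{heads}\mid\text{I exist}) = \frac{1/2}{1/2+1/2} = \frac{1}{2}, \qquad P_{\text{SIA}}(\text{heads}\mid\text{I exist}) = \frac{\tfrac{1}{2}\cdot 1}{\tfrac{1}{2}\cdot 1 + \tfrac{1}{2}\cdot 2} = \frac{1}{3}$$

SSA only conditions on being some observer in your reference class, while SIA additionally weights each world by how many observers it contains, so the two theories can disagree given the same evidence.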
I have a strong inside view of the alignment problem and what a solution would look like. The main reason I don't have an equally concrete inside-view AI timeline is that I don't know enough about ML, so I have to defer to get a specific decade. The biggest gap in my model of the alignment problem is what a solution to inner misalignment would look like, although I think it would be something like trying to find a way to avoid wireheading.
I've checked out John Wentworth's study guide before, mostly doing CS50.
Part of the reason I'm considering getting a degree is so I can get a job if I want and not have to bet on living rent-free with other rationalists or something.
The people I've talked to the most have timelines centering around 2030. However, I don't have a detailed picture of why because their reasons are capabilities exfohazards. From what I can tell, their reasons are tricks you can implement to get RSI even on hardware that exists right now, but I think most good-sounding tricks don't actually work (no one expected transformer models to be the closest to AGI in comparison with other architectures) and I think superintelligence is more contingent on compute and training data than they think. It also seems like other people in AI alignment disagree in a more optimistic direction. Now that I think about it though, I probably overestimated how long the timelines of optimistic alignment researchers were so it's probably more like 2040.
The difference between an expected utility maximizer using updateless decision theory and an entity who likes the number 1 more than the number 2, or who cannot count past 1, or who has a completely wrong model of the world which nonetheless makes it one-box, is that the expected utility maximizer using updateless decision theory also wins in scenarios outside of Newcomb's problem, where you may have to choose $2 instead of $1, or count amounts of objects larger than 1, or believe true things. Similarly, an entity that "acts like it has a choice" generalizes well to other scenarios, whereas these other possible entities don't. (See the toy sketch below.)
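To make the generalization point concrete, here's a toy sketch of my own (made-up payoffs and agents; not anyone's formalization of UDT) comparing an agent that picks by expected payoff against an agent that one-boxes only because it always grabs the "one-ish" option:

```python
# Toy illustration (my own, with made-up payoffs): agents that win Newcomb's
# problem "for the wrong reasons" don't generalize to other decision problems.

def newcomb(choice):
    # Omega rewards the one-boxing disposition.
    return 1_000_000 if choice == "one-box" else 1_000

def pick_money(choice):
    return 2 if choice == "$2" else 1

def count_apples(choice):
    # There are three apples; reward counting them correctly.
    return 10 if choice == 3 else 0

def eu_maximizer(problem, options):
    # Picks whichever option the problem pays the most for.
    return max(options, key=problem)

def likes_one(problem, options):
    # One-boxes, but only because it always grabs the first, "one-ish" option.
    return options[0]

problems = [
    (newcomb, ["one-box", "two-box"]),
    (pick_money, ["$1", "$2"]),
    (count_apples, [1, 2, 3]),
]

for name, agent in [("EU maximizer", eu_maximizer), ("likes-the-number-1 agent", likes_one)]:
    total = sum(problem(agent(problem, options)) for problem, options in problems)
    print(f"{name}: total payoff {total}")
# Both agents one-box, but only the EU maximizer also takes the $2 and counts
# the apples correctly: winning generalizes, the accidental disposition doesn't.
```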
- I think getting an extra person to do alignment research can give massive amounts of marginal utility considering how few people are doing it and how it will determine the fate of humanity. We're still in the stage where adding an extra person removes a scarily large amount from p(doom), like up to 10% for an especially good individual person, which probably averages to something much smaller but still scarily large when looking at your average new alignment researcher. This is especially true for agent foundations.
- I think it's very possible to solve the alignment problem. Stuff like QACI, while not a full solution yet, makes me think that this is conceivable and that you could probably find a solution if you threw enough people at the problem.
- I think we'll get a superintelligence at around 2050.
One-boxers win because they reasoned in their heads that one-boxers win because of updateless decision theory or something, so they "should" be one-boxers. The decision is predetermined, but the reasoning acts like it has a choice in the matter (and people who act like they have a choice in the matter win). What carado is saying is that people who act like they can move the realityfluid around tend to win more, just like how people who act like they have a choice in Newcomb's problem and one-box win, even though they don't have a choice in the matter.
I don't think this matters all that much. In Newcomb's problem, even though your decision is predetermined, you should still want to act as if you can affect the past, specifically Omega's prediction.
I don't believe something can persuade generals to go to war in a short period of time, just because it's very intelligent.
A few things I've seen give pretty worrying lower bounds for how persuasive a superintelligence would be:
- How it feels to have your mind hacked by an AI
- The AI in a box boxes you (content warning: creepy blackmail-y acausal stuff)
Remember that a superintelligence will be at least several orders of magnitude more persuasive than character.ai or Stuart Armstrong.
Formal alignment proposals avoid this problem by doing metaethics, mostly something like determining what a person would want if they were perfectly rational (so no cognitive biases or logical errors), otherwise basically omniscient, and had an unlimited amount of time to think about it. This is called reflective equilibrium. I think this approach would work for most people, even pretty terrible people. If you extrapolated a terrorist who commits acts of violence for some supposed greater good, for example, they'd realize that the reasoning they used to determine that said acts of violence were good was wrong.
Corrigibility, on the other hand, is more susceptible to this problem, so you'd want to get the AI to do a pivotal act, for example destroying every GPU to prevent other people from deploying harmful AI, or unaligned AI for that matter.
Realistically, I think that most entities who'd want to use a superintelligent AI like a nuke would probably be too short-sighted to care about alignment, but don't quote me on that.
To the first one, they aren't actually suffering that much or experiencing anything they'd rather not experience because they're continuous with you and you aren't suffering.
I don't actually think a simulated human would be continuous in spacetime with the AI because the computation wouldn't be happening inside of the qualia-having parts of the AI.
I think what defines a thing as a specific qualia-haver is not what information it actually holds but how continuous it is with other qualia-having instances in different positions of spacetime. I think that mental models are mostly continuous with the modeler so you can't actually kill them or anything. In general, I think you're discounting the importance that the substrate of a mental model/identity/whatever has. To make an analogy, you're saying the prompt is where the potential qualia-stuff is happening, and isn't merely a filter on the underlying language model.
My immediate thought is that the cat is already out of the bag and whatever risk there was of AI safety people accelerating capabilities is nowadays far outweighed by capabilities hype and in general, much larger incentives, and that the most we can do is to continue to build awareness of AI risk. Something about this line of reasoning strikes me as uncritical though.
I'm probably not the best person on this forum when it comes to either PR or alignment, but I'm interested enough, if only in knowing your plan, that I want to talk to you about it anyway.
Will the karma thing affect users who've joined before a certain period of time? Asking this because I joined quite a while ago but have only 4 karma right now.
That's not really specific enough. I would describe it as someone being really angry about something, contingent on a certain belief being true, but then when you ask them why they hold that belief, it turns out to rest on very weak evidence, or on something that is the opposite of an open-and-shut case, or on something that could vary depending on context, and so on and so forth.