Check out this page; it goes up to 2024.
Nobody would understand that.
This sort of saying-things-directly doesn't usually work unless the other person feels a social obligation to parse what you're saying, to the point that they can't run away from it.
Correct me if I'm wrong, but I think we could apply the concept of logical uncertainty to metaphysics and then use Bayes' theorem to update depending on where our metaphysical research takes us, the way we can use it to update the probability of logically necessarily true/false statements.
Bayes' theorem is about the truth of propositions. Why couldn't it be applied to propositions about ontology?
However, this image is obviously optimized to be scary and disgusting. It looks dangerous, with long rows of sharp teeth. It is an eldritch horror. It's at this point that I'd like to point out the simple, obvious fact that "we don't actually know how these models work, and we definitely don't know that they're creepy and dangerous on the inside."
It's optimized to illustrate the point that the neural network isn't trained to actually care about what the person training it thinks it came to care about, it's only optimized to act that way on the training distribution. Unless I'm missing something, arguing the image is wrong would be equivalent to arguing that maybe the model truly cares about what its human trainers want it to care about. (Which we know isn't actually the case.)
Well. Their actual performance is human-like, as long as they're using GPT-4 and have the right prompt. I've talked to such bots.
In any case, the topic is about what future AIs will do, so, by definition, we're speculating about the future.
They're accused, not whistleblowers. They can't retroactively gain the right to anonymity, since their identities have already been revealed.
They could argue that they became whistleblowers as well, and so they should be retroactively anonymized, but that would interfere with the first whistleblowing accusation (there is no point in whistleblowing against anonymous people), and also they're (I assume) in a position of comparative power here.
There could be a second whistleblowing accusation made by them (but this time anonymously) against the (this time) deanonymized accuser, but given their (I assume) higher social power, that doesn't seem appropriate.
I agree that it feels wrong to reveal the identities of Alice and/or Chloe without concrete evidence of major wrongdoing, but I don't think we have a good theoretical framework for why that is.
Ethically (and pragmatically), you want whistleblowers to have the right to anonymity, or else you'll learn of much less wrongdoing than you would otherwise. And because whistleblowers are usually in a position of lower social power, anonymity is meant to compensate for that, I suppose.
I will not "make friends" with an appliance.
That's really substratist of you.
But in any case, the toaster (working in tandem with the LLM "simulating" the toaster-AI-character) will predict that and persuade you some other way.
Of course, it would first make friends with you, so that you'd feel comfortable leaving up to it the preparation of your breakfast, and you'll even feel happy that you have such a thoughtful friend.
If you were to break the toaster for that, it would predict that and simply do it in a way that would actually work.
Unless you precommit to breaking all your AIs that will do anything differently from what you tell them to, no matter the circumstances and no matter how you feel about it in that moment.
Can you show what priors you used, how you calculated the posteriors, what numbers you got and where the input numbers came from? I highly doubt that hypothesis has a higher posterior probability.
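To illustrate the kind of calculation I'm asking for, here's a minimal sketch (all numbers are made up for illustration): a prior, the likelihoods of the evidence under the hypothesis and its negation, and the resulting posterior via Bayes' theorem.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """P(H|E) via Bayes' theorem, from P(H), P(E|H) and P(E|~H)."""
    p_evidence = prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    return prior * p_e_given_h / p_evidence

# Made-up numbers: a 10% prior, and evidence three times likelier under H.
print(posterior(0.10, 0.3, 0.1))  # ≈ 0.25
```

Without at least this much shown explicitly (where the prior comes from, why those likelihoods), a claim about posterior probabilities is hard to evaluate.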
The wifi hacking also immediately struck me as reminiscent of paranoid psychosis.
How hard is it to hack somebody's wifi?
Also, a traumatized person attributing a seemingly hacked wifi to their serious abuser doesn't need to mean any mental illness.
That's irrelevant. To see why one-boxing is important, we need to realize the general principle - that we can only impose a boundary condition on all computations-which-are-us (i.e. we can choose how both us and all perfect predictions of us choose, and both us and all the predictions have to choose the same). We can't impose a boundary condition only on our brain (i.e. we can't only choose how our brain decides while keeping everything else the same). This is necessarily true.
Without seeing this (and therefore knowing we should one-box), or even while being unaware of this principle altogether, there is no point in trying to have a "debate" about it.
What do I get out of any of this?
If Bob asked this question, it would show he's misunderstanding the point of Alice's critique - unless I'm missing something, she claims he should, morally speaking, act differently.
Responding "What do I get out of any of this?" to that kind of critique is either a misunderstanding, or a rejection of morality ("I don't care if I should be, morally speaking, doing something else, because I prefer to maximize my own utility.").
Edit: Or also, possibly, a rejection of Alice ("You are so annoying that I'll pretend this conversation is about something else to make you go away.").
The author shares how terrible it feels that X is true, without bringing arguments for X being true in the first place (based on me skimming the post). That can bypass the reader's fact-check (because why would he write about how bad it made him feel that X is true if it wasn't?).
It feels to me like he's trying to combine an emotional exposition (no facts, talking about his feelings) with an expository blogpost (explaining a topic), while trying to grab the best of both worlds (the persuasiveness and emotions of the former and the social status of the latter) without the substance to back it up.
omnizoid's post as an example of where not to take EY's side was poorly chosen. He two-boxes on Newcomb's problem and any confident statements he makes about rationality or decision theory should be, for that reason, ignored entirely.
Of course, you go meta, without claiming that he's object-level right, but I'm not sure using an obviously wrong post to take his side on the meta level is a good idea.
To understand why FDT is true, it's best to start with Newcomb's problem. Since you believe you should two-box, it might be best to debate Newcomb's problem with somebody first. Debating FDT at this stage seems like a waste of time for both parties.
(The visual fidelity is a very small fraction of what we actually think it is - the brain lies to us about how much we perceive.)
CharacterAI bots don't show as public until some condition is fulfilled, which I don't remember right now.
ChatGPT-3.5 (when prompted the perfect way) outperforms CharacterAI, I assume GPT-4 with the right prompt would as well, you might check those options out. (I haven't tried a therapist specifically, though.)
That depends on how we define "information" - for one definition of information, qualia are information (and also everything else is, since we can only recognize something by the pattern it presents to us).
But for another definition of information, there is a conceptual difference - for example, morphine users report knowing they are in pain, but not feeling the quale of pain.
The "purpose" of most martial arts is to defeat other martial artists of roughly the same skill level, within the rules of the given martial art.
This is false - the reason they were created was self-defense. That you can have people of similar weight and belt color spar/fight each other in contests is only a side effect of that.
"Beginner's luck" is a thing in almost all games. It's usually what happens when someone tries a strategy so weird that the better player doesn't immediately understand what's going on.
That doesn't work in chess if the difference in skill is large enough - if it did, anyone could simply make up strategies weird enough and, without any skill, win any title or even the World Chess Championship (which requires a whole series of victories).
If you're saying it works as a matter of random fluctuations - i.e. a player without skill could win, let's say, a few games against Magnus Carlsen, because these strategies supposedly almost never work but sometimes do - that wouldn't be useful against an AI, because it would still almost certainly win (or, more realistically, I think, simply model us well enough to know when we'd try the weird strategy).
Two points of order, without going into any specific accusations or their absence:
- The post is transphobic, which anticorrelates with being correct/truthful/objective.
- It seems optimized for smoothness/persuasion, which, based on my experience, also anticorrelates with both truth and objectivity.
- That's seemingly quite a convincing reason why you can't be born too early. But what occurs to me now is that the problem can be about where you are, temporally, in relation to other people. (So you were still born on the same day, but depending on the total size of the civilization, the probability of that many people preceding you changes.)
- Depending on how "anthropic problem" is defined, that could potentially be true either for all, or for some anthropic problems.
How does one tell if they "are trans"
If introspection doesn't help, maybe a specialized therapist would. I can't offer any rationalist-level advice on how to find out if you're transgender that you couldn't google yourself. Good luck finding out, however.
Edit: It seems that whoever downvoted this completely correct comment has been psychologically empowered by the barrage of the transphobic crackpot comments in this thread, but it seems such is the price for having an anonymous upvote/downvote system.
Autogynephilia and being transgender are two distinct phenomena, the latter being caused by having the brain of the opposite gender.
Experiencing the former doesn't mean the latter is also secretly the former.
Surely you're not saying that the point of arresting him is to prevent him from winning an election. Surely you're not saying that.
There are many points to arresting criminals. Making it harder for them to amass power is one of them, of which winning an election is a subset.
Do you think such a taboo is likely to increase or decrease the risk from dictators taking over?
Increase, for the reasons I enumerated.
Maybe you could claim that would-be dictators are more likely than good candidates to have committed crimes
That's one factor, yes.
On the other hand, if there is no such taboo, then a dictator who has already been elected is more likely to appoint cronies who will prosecute his political opponents for whatever might stick to them—even if they don't stick, the prosecution itself can be damaging and onerous.
That doesn't work for the reasons I gave (to very briefly repeat them - the dictator will not respect any informal taboos (or formal ones, for that matter)).
Aside from not respecting informal taboos, he will be helped by other Republicans in other branches of the government to get away with both overstepping his authority and committing outright crimes.
The idea you're describing is the exact opposite of how social interaction and the system of power work. It has been generated and released into the wild by bad-faith actors who are invested in people falsely believing it (another such idea is "we have to give the Babyeaters a platform and host their hate speech on our servers, so that people can see how terrible they are and stop supporting them," which also works, in reality, the other way around).
Do you simultaneously know what it's like when something looks red, and also believe that you don't have qualia?
Are you taking into account the simulated exams? It doesn't look like it mostly generates false facts?
It seemed to be key that I fight well by some metrics
That couldn't be the case - that would leave you, even after having a black belt, vulnerable towards people who can't fight, which would defeat the purpose of martial arts. Whichever technique you use, you use when responding to what the other person is currently doing. You don't simply execute a technique that depends on the person fighting well by some metrics, and then get defeated when it turns out that they are, in fact, only in the 0.001st percentile of fighting well by any metrics we can imagine.
(That said, I'm really happy for your victories - maybe they weren't quite as well-trained.)
This has me wonder whether an AI would have significant difficulties winning against humans who act inconsistently and suboptimally in some ways, without acting like utter idiots randomly all the time
I'm thinking the AI would predict the way in which the other person would act inconsistently and suboptimally.
If there were multiple paths to victory for the human and the AI could block only one (thereby seemingly giving the human the option to out-random the AI by picking one of the unguarded paths to victory), the AI would be better at predicting the human than the human would be at randomizing.
People are terrible at being unpredictable. I remember a 10+ year-old rock-paper-scissors predictor that predicted a human's "random" choices over a series of games. The humans had no chance.
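As a sketch of how even a crude predictor exploits biased "randomness" (the move weights and the predictor here are hypothetical illustrations, not the actual program I remember):

```python
from collections import Counter
import random

random.seed(0)  # for reproducibility
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

def predict_and_counter(history):
    """Play the move that beats the opponent's most frequent move so far."""
    if not history:
        return random.choice(list(BEATS))
    most_common = Counter(history).most_common(1)[0][0]
    return BEATS[most_common]

# A "random" human-like player who (hypothetically) overuses rock:
moves = random.choices(["rock", "paper", "scissors"], weights=[5, 3, 2], k=1000)
wins = sum(BEATS[m] == predict_and_counter(moves[:i]) for i, m in enumerate(moves))
print(wins / len(moves))  # well above the 1/3 a fair game would give
```

Real predictors use fancier models (e.g. conditioning on the opponent's recent moves), but even this frequency count is enough to beat a biased player reliably.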
I found two statements in the article that I think are well-defined enough and go into your argument:
1. "The birth rank discussion isn't about if I am born slightly earlier or later."
How do you know? I think it's exactly about that. I have a probability of x of being born within the first fraction x of all humans (assuming all humans are the correct reference class - if they're not, the problem isn't in considering ourselves a random person from a reference class, but in choosing the wrong reference class).
2. "Nobody can be born more than a few months away from their actual birthday."
When reasoning probabilistically, we can imagine other possible worlds. We're not talking about something being the case while at the same time not being the case. We imagine other possible worlds (created by the same sampling process that created our world) and compare them to ours. In some of those possible worlds, we were born sooner or later.
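The sampling-process picture above can be sketched as a simulation (N, x, and the trial count are arbitrary illustrative values):

```python
import random

random.seed(0)
N = 1_000_000   # hypothetical total number of humans who will ever live
x = 0.05        # fraction of interest
trials = 100_000

# Sample "your" birth rank uniformly, as the reference-class reasoning assumes.
hits = sum(random.randrange(N) < x * N for _ in range(trials))
print(hits / trials)  # close to x
```

Each trial is one "possible world" created by the same sampling process; the fraction of worlds in which you fall within the first fraction x of humans converges to x.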
That's true, but the definition of probability isn't inapplicable to everything. From that, in conjunction with the fact that we can make probabilistic predictions about ourselves, it follows that we are a random member of at least one reference class, which means that our soul has been selected at random from all possible souls from a specific reference class (if that's what you meant by that).
By definition of probability, we can consider ourselves a random member of some reference class. (Otherwise, we couldn't make probabilistic predictions about ourselves.) The question is picking the right reference class.
Back in the 2000s, the official version was that it's enough to ingest a pepper-grain-sized amount of infected tissue to contract BSE, so maybe the decomposition of the proteins isn't perfect (in the sense that some molecules might not be taken apart). Ingestion of infected tissue is still the official mode of transmission.
The separations here are not going to impress someone who thinks the Democrats are using the system to attack the enemy they hate.
We can't withdraw from arresting criminals and putting them on trial because conspiracy theorists will invent a different story in their minds. Unless...
I wouldn't extend this to all public figures, just those who are serious candidates for an election for leader of the country.
...Oh, so you meant like 5-10 people tops, in the country of 300M. I see.
On the surface, it's a very consequentialistic reasoning. Why arrest one criminal, if it can cause a revolution?
Of course, Trump has already been arrested, and the revolution hasn't happened, so this isn't the case here, apparently.
There are also three other problems:
- Local consequentialism - locally (both spatially and temporally) optimizing can have disastrous global effects. Today, we can't arrest an aspiring dictator. And so he, in 5 years, wins the election. Now he's in power, and we have the problem we hoped to avoid. If we arrested him, there would be localized violent disturbances, but his ascension to power would've been slower, if he ever got elected in the first place. (The general pattern is that the less you oppose evil to avoid undesirable externalities, the faster and the more power the evil gains.)
- Incentivizing self-modification against your values - if we reward people willing to invent conspiracy theories in their heads by not arresting their ideological leader, they are, both consciously and subconsciously, motivated to do just that, because they know you will back down.
- The peaceful transition of power enabled by democracy only covers the kind of power gain that is lawful. If I'm elected because you are too scared to arrest me, that's a (non-violent) revolution rather than a peaceful transition of power. Of course, you could say that doesn't matter because the goal here is to avoid violence. But that brings us back to problems (1) and (2). Also, the violence (a bloody revolution is very unlikely) will happen in both branches of the future. The difference is that in the no-arrest branch, you've signaled that you would rather let evil win through inaction (and then be victimized) than arrest the would-be dictator (and be victimized afterwards anyway). You won't end up actually better off. Rather, you'll end up worse off, since you've now set the precedent that you will ignore the law under the threat of violence. (Which is sometimes a good idea, but not in this case.)
there should be such a strong taboo against arresting political opponents that one just doesn't do it
Nobody arrested their political opponent (the politicians aren't the ones doing the arresting).
Also, why not? Shouldn't it be more important if a crime was committed, rather than if someone is someone else's political opponent? Why should public figures have immunity from being arrested unless >80% of the population agrees?
Yes, but the game is very easy, so a lot of different strategies get you close to the cap.
I've been thinking about it, and I'm not sure this is the case in the sense you mean it - expected money maximization doesn't reflect human values at all, while the Kelly criterion mostly does, so if we make our assumptions more realistic, it should move us away from expected money maximization and towards the Kelly criterion, as opposed to moving us the other way.
I'd consider that to be exploitation. In addition to that, too-easy-to-win bets make me wary of something unpredictable going wrong.
I'm sure that this time around, it's definitely real aliens. Or, barring that, magic or time travel.
nonhumans
(You might want to exclude advanced/experimental AI models from that, to capture the spirit of the bet better.)
People will judge this question, like many others, based on their feelings. The AI person, summoned into existence by the language model, will have to be sufficiently psychologically and emotionally similar to a human, while also having above-average-human-level intelligence (so that people can look up to the character instead of merely tolerating it).
Leaving aside the question whether the technology for creating such an AI character already exists or not, these, I think, will ultimately be the criteria that will be used by people of somewhat-above-average intelligence and zero technical and philosophical knowledge (i.e. our lawmakers) to grant AIs rights.
Haven't whistleblowers talking about how the government has alien spaceships always been a thing?
It would bring on an enormous amount of new evidence, since the position of the orthogonality thesis is so strong (rather than arguing from some vague and visibly false philosophical assumptions).
Oh, I see. Yes, I agree. The idea to maximize the expected money would never occur to me (since that's not how my utility function works), but I get it now.
So, by optimal, you mean "almost certainly bankrupt you." Then yes.
My definition of optimal is very different.
Obviously humans don't have linear utility functions
I don't think that's the only reason - if I value something linearly, I still don't want to play a game that almost certainly bankrupts me.
Obviously humans don't have linear utility functions, but my point is that the Kelly criterion still isn't the right answer when you make the assumptions more realistic.
I mean, that's not obvious - the Kelly criterion gives you, in the example with the game, E(money) = $240, compared to $246.61 with the optimal strategy. That's really close.
If instead the cap goes to infinity then the optimal strategy is to bet everything on every round.
This isn't right unless I'm missing something - Kelly provides the fastest growth, while betting everything on every round is almost certain to bankrupt you.
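A quick simulation, assuming an even-money coin with a 60% win chance and a $25 starting bankroll (illustrative numbers, not necessarily the exact game discussed), shows the difference: the Kelly bettor's median bankroll grows, while the all-in bettor almost surely goes bankrupt.

```python
import random

def simulate(fraction, p=0.6, rounds=100, bankroll=25.0):
    """Bet `fraction` of the bankroll each round on a p-biased even-money coin."""
    for _ in range(rounds):
        stake = bankroll * fraction
        bankroll += stake if random.random() < p else -stake
    return bankroll

random.seed(0)
kelly = 2 * 0.6 - 1  # Kelly fraction for an even-money bet: 2p - 1 = 0.2
kelly_runs = sorted(simulate(kelly) for _ in range(10_000))
all_in_survivors = sum(simulate(1.0) > 0 for _ in range(10_000))

print(kelly_runs[5_000])   # median bankroll: grows well beyond the $25 start
print(all_in_survivors)    # surviving all-in needs 100 straight wins (0.6**100)
```

Betting everything maximizes E(money) only because a vanishingly rare all-wins branch carries an astronomical payoff; in nearly every actual branch you end up with nothing.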
Not maximising expected utility means that you expect to get less utility.
This isn't actually right though - the concept of maximizing utility doesn't quite overlap with expecting to have more or less utility at the end.
There are many examples where maximizing your expected utility means expecting to go broke, and not maximizing it means expecting to end up with more money.
(Even though, in this particular one-turn example, Bob should, in fact, expect to end up with more money if he bets everything.)
Because surviving worlds don't look like someone cyberattacking AI labs until AI alignment has been solved, they look like someone solving AI alignment in time before the world has been destroyed.
I believe Kelly maximizes E(log(money)), no?
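As a sketch, one can check numerically that for an even-money bet on a p = 0.6 coin, the fraction maximizing E(log(money)) is indeed the Kelly fraction 2p - 1 = 0.2:

```python
import math

def expected_log_growth(f, p=0.6):
    """E[log(wealth multiplier)] when betting fraction f on an even-money coin."""
    return p * math.log(1 + f) + (1 - p) * math.log(1 - f)

# Scan betting fractions on a fine grid; the maximizer should be 2p - 1 = 0.2.
fractions = [i / 1000 for i in range(999)]
best = max(fractions, key=expected_log_growth)
print(best)  # 0.2
```

The same follows analytically by setting the derivative p/(1+f) - (1-p)/(1-f) to zero.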
I don't know about other types of data, but the human brain processes only a very small fraction of the visual data, and lies to us about how much we're perceiving.