Comments
Do you expect anyone to answer "agree" to the starting question?
Bywayeans are pretty censorious and scrupulous about violations of the NAP
Except against people who enjoy sunsets, apparently?
He’d walk on over to nearby industry labs with candy and a sales pitch for why they should use his services. He primarily targeted top, Nobel-prize-winning research groups
and
Plasmidsaurus has historically done very little ‘traditional’ marketing — no brochures, few cold reach-outs
seem to be a bit contradictory?
If people followed Brennan’s advice, those ignorant of their lack of knowledge would keep voting, while well-educated people might think they’re not competent enough and abstain.
I'd add that people ignorant enough not to know or not to understand Brennan's argument would also keep voting.
Was this post significantly edited? Because this seems to be exactly the take in the post from the start:
because he thought it wasn't bad enough to be considered torture. Then he had it tried on himself, and changed his mind, coming to believe it is torture and should not be performed.
to the end
This is supported by Malcom's claim that Hitchens was "a proponent of torture", which is clearly false going by Christopher's public articles on the subject. The question is only over whether Hitchens considered waterboarding to be a form of torture, and therefore permissible or not, which Malcolm seems to have not understood.
It’s absurd to end up with a framework that believes a life for a woman in Saudi Arabia is just as good as life for a woman in some other country with similarly high per capita income.
You could similarly argue a life for a woman in Saudi Arabia is worse than for a man, but it seems absurd to conclude from that that saving lives of SA men is better than saving lives of SA women.
Whether you save a life in Congo, Sri Lanka or Australia, I can’t think of strong reasons for why #2 would vary all that much.
It seems to me there are obvious differences: 1. family size (in the limit, the saved person may have no family at all); 2. how expected the person's death is otherwise.
But you aren't asked about (your current estimate of your prior). If you want to put it in this way, it would be , your current estimate of your previous estimate. And you do have exact knowledge what that estimate was.
Here is a counter-argument against Rovelli I found reasonable: Aristotle and Falling Objects | Diagonal Argument
so the maximum "downside" would be the sum of the differences between that reference populations lives and those without the variant for all variants you edit (plus any effects from off-targets)
I don't think that's true? It has to assume the variants don't interact with each other. Your reference population would only have 0.01% of people with (the rarest) 2 variants at once, 0.0001% with 3 variants, and so on.
Yes, but this exact case is when you say "This would be useful for trying out different variations on a phrase to see what those small variations change about the implied meaning" and when it can be particularly misleading because the LLM is contrasting with the previous version which the humans reading/hearing the final version don't know about.
So it would be more useful for that purpose to use a new chat.
But the screenshot says "if i instead say the words...". This seems like it has to be in the same chat with the "matters" version.
but speak only the truth to other Parselmouths and (by implication) speak only truth to Quakers.
I would merely like to note that the implication seems contrary to the source of the name: I expect Quirrell and most historical Parselmouths in HPMOR would very much lie to Quakers (Quirrell would maybe derive some entertainment from not saying factually false things while misleading them).
Or to put it another way: in the full post you say
There is some evidence he has higher-than-normal narcissistic traits, and there’s a positive correlation between narcissistic traits and DAE. I think there is more evidence of him having DAE than there is of him having narcissistic traits
but to me it looks like you could have equally replaced DAE with "narcissistic traits" in Theories B and C, and provided the same list of evidence.
(1) Convicted criminals are more likely to have narcissistic traits.
(2) "extreme disregard for protecting his customers" is also evidence for narcissistic traits.
Etc. And then you could repeat the exercise with "sociopathy" and so on.
So there are two possibilities, as far as I can see:
- One or more things on the list are in fact not evidence for narcissistic traits.
- They are stronger evidence for DAE than for narcissistic traits.
But it isn't clear which you believe and about what parts of the list in particular. (Of course, with the exception of (4) and (11), but they go in the opposite directions.)
Yes, it's evidence. My question is how strong or weak this evidence is (and my expectation is that it's weak). Your comparison relies on "wet grass is typically substantial evidence for rain".
Based on the full text:
Some readers may think that this sounds circular: if I’m trying to explain why someone would do what SBF did, how is it valid to use the fact that he did it as a piece of evidence for the explanation? But treating the convictions as evidence for SBF’s DAE is valid in the same way that, if you were trying to explain why the grass is wet, it would be valid to use the fact that the grass is wet as evidence for the hypothesis that it rained recently (since wet grass is typically substantial evidence for rain).
But a lot of your pro-DAE evidence seems to me to fail this test. E.g. ok, he lied to the customers and to the Congress; why is this substantial evidence of DAE in particular?
oh, FTX doesn’t have a bank account, I guess people can wire to Alameda’s to get money on FTX….3 years later…oh fuck it looks like people wired $8b to Alameda and oh god we basically forgot about the stub account that corresponded to that, and so it was never delivered to FTX.
This seems like evidence in favor of Theory A and against DAE if you look at those as competing explanations? That is, he (is claiming that in this particular case he) commingled funds for reasons unrelated to DAE.
In November 2022, he also tweeted these statements
It seems likely he believed at that point that if a run could be avoided, he would have enough assets; so making these statements could help most customers, and not making them could hurt most of them, even if it helped a few lucky and quick ones. Not evidence of decreased empathy at all (in my view).
(3) There are multiple sources suggesting that he has a tendency and willingness to lie and deceive others.
Everything under this seems to fail the rain test, at least; very many people have this willingness, most of them don't have DAE (simply based on the prevalence you mention). Is this particular "style" of dishonesty characteristic of DAE?
(4) is actual evidence for DAE, great.
(5) and (10) For the rain test you need to provide a reason to believe most manipulative people have DAE.
Etc.
For decreased affective guilt the situation seems to be worse: as far as I can see, no evidence for it is presented, just evidence there is some reported guilt and then
In the context of the large amounts of evidence for his lack of affective empathy, it seems more likely that the quote above is an example of cognitive guilt rather than affective guilt.
This seems to require a very large correlation between DAEmpathy and DAGuilt. Why couldn't he have one but not the other?
When I wrote the above, I was just going by your stated definition of DAE; after going to the page you linked, which I should have done earlier, a lot of your evidence seems to cover the facets of psychopathy other than DAE; you could argue they are correlated, but it seems replacing DAE with psychopathy (as defined there) in theories B and C would make the evidence fit strictly better.
I feel like people like Scott Aaronson who are demanding a specific scenario for how AI will actually kill us all... I hypothesize that most scenarios with vastly superhuman AI systems coexisting with humans end in the disempowerment of humans and either human extinction or some form of imprisonment or captivity akin to factory farming
Aaronson in that quote is "demanding a specific scenario" for how GPT-4.5 or GPT-5 in particular will kill us all. Do you believe they will be vastly superhuman?
The quoted section more seems like instrumental convergence than orthogonality to me?
The second part of the sentence, yes. The bolded one seems to acknowledge AIs can have different goals, and I assume that version of EY wouldn't count "God" as a good goal.
Another more relevant part:
Obviously, if the AI is going to be capable of making choices, you need to create an exception to the rules - create a Goal object whose desirability is not calculated by summing up the goals in the justification slot.
Presumably this goal object can be anything.
But in order to accept that, one needs to accept the orthogonality thesis.
I agree that EY rejected the argument because he accepted OT. I very much disagree that this is the only way to reject the argument. In fact, all four positions seem quite possible:
- Accept OT, accept the argument: sure, AIs can have different goals, but this (starting an AI without explicit goals) is how you get an AI which would figure out the meaning of life.
- Reject OT, reject the argument: you can think "figure out the meaning of life" is not a possible AI goal.
- The remaining two combinations (accept OT and reject the argument; reject OT and accept the argument): EY's positions at different times.
In addition, OT can itself be a reason to charge ahead with creating an AGI: since it says an AGI can have any goal, you "just" need to create an AGI which will improve the world. It says nothing about setting an AGI's goal being difficult.
In fact it seems that the linked argument relies on a version of the orthogonality thesis instead of being refuted by it:
For almost any ultimate goal - joy, truth, God, intelligence, freedom, law - it would be possible to do it better (or faster or more thoroughly or to a larger population) given superintelligence (or nanotechnology or galactic colonization or Apotheosis or surviving the next twenty years).
Nothing about the argument contradicts "the true meaning of life" -- which seems in that argument to be effectively defined as "whatever the AI ends up with as a goal if it starts out without a goal" -- being e.g. paperclips.
Is the story currently complete?
The issue with the first justification is that no one has actually claimed that the existence of such a rule is obvious or self-evident. Publicly holding a non-obvious belief does not obligate the holder to publicly justify that belief to the satisfaction of the author.
However, Yudkowsky also called the rule "straightforward" and said that
violating it this hugely and explicitly is sufficiently bad news that people should've been wary about this post and hesitated to upvote it for that reason alone
That is, he expected the majority of EA Forum members (at least) to also consider it a "basic rule".
That right there shows autogynephilia isn't a universal explanation.
Do any prominent pro-AGP people claim it is? Even when I see them described by their opponents, the claim is that there are two clusters of trans women and AGP people are one of them, so aroace trans women could belong to the other cluster without contradicting that theory.
There are similar claims in Russia as well, for what it's worth.
and author intentionally cropped
The author is visible in the next screenshot, unless you meant something else (also, even if he wasn't, the name is part of the URL).
If I were going to play chess against Magnus Carlsen I'd definitely study his games with a computer, and if that computer found a stunning refutation to an opening he liked I'd definitely play it.
Conditionally on him continuing to play the opening, I would expect he has a refutation to that refutation, but no reason to use the counter-refutation in public games against the computer. On the other hand, he may not want to burn it on you either.
is obviously different than what you said, though
To me it doesn't seem to be? "condoned by social consensus" == "isn't broadly condemned by their community" in the original comment. And
because the "social consensus" is something designed by people, in many cases with the explicit goal of including circles wider than "them and their friends"
doesn't seem to work unless you believe a majority of people are both actively designing the "social consensus" and have this goal; a majority of the people who design the consensus having this as a goal is not sufficient.
It's explicitly the second:
But if they can do that with an AGI capable of ending the acute risk period, then they've probably solved most of the alignment problem. Meaning that it should be easy to drive the probability of disaster dramatically lower.
You might have confused "singularity" and "a singleton" (that is, a single AI (or someone using AI) getting control of the world)?
Cairo is a problem too, then (it was founded after Arthur lived).
It's also interesting that apparently field experts only did about as well as the traditional students:
Differences between Fleet and ITTC participants were generally smaller and neither consistently positive nor negative.
Does experience not help at all?
I don't believe the original novels imply that humanity nearly went extinct and then banded together; that was only in "the junk Herbert's son wrote". Nor do they imply that Strong AI was developed only a short time before the Jihad started.
Neither of these is true in the Dune Encyclopedia version, which Frank Herbert at least didn't strongly disapprove of.
There is still some Goodhart's-Law-ing there, to quote https://dune.wikia.com/wiki/Butlerian_Jihad/DE:
After Jehanne's death, she became a martyr, but her generals continued exponentially with more zeal. Jehanne knew her weaknesses and fears, but her followers did not. The politics of Urania were favored. Around that time, the goals of the Jihad were the destruction of machine technology operating at the expense of human values; but by this point they would have be replaced by indiscriminate slaughter.
Whereas I can look at a regular triangle and see its ∆-ness from outside the simulation, I cannot do the same (let's suppose) for keys of the right shape to open lock L.
Why suppose this and not the opposite? If you understand L well enough to see if a key opens it immediately, does this make L-openingness intrinsic, so intrinsicness/extrinsicness is relative to the observer?
And on the other hand, someone else needs to simulate a ruler to check for ∆-ness, so it is an extrinsic property to him.
Namely, goodness of a state of affairs is something that I can assess myself from outside a simulation of that state.
I certainly would consider this much more difficult than merely checking whether a key opens a lock. I could after spending enough time understand the lock well enough for this, but even considering a complete state of affairs e.g. on Earth?
I've taken the survey.
Most leftists ... believe we can all agree on what crops to grow (what social values to have [2])
Whose slogan is "family values", again?
and pull out and burn the weeds of nostalgia, counter-revolution, and the bourgeoisie
Or the weeds of revolution, hippies, and trade unions...
Conservatives view their own society the way environmentalists view the environment: as a complex organism best not lightly tampered with. They're skeptical of the ability of new policies to do what they're supposed to do, especially a whole bunch of new policies all enacted at once.
Bunch of new policies like War on Drugs, for example?
I've taken the survey.
Second AI: If I just destroy all humans, I can be very confident any answers I receive will be from AIs!
The amount of line emission from a galaxy is thus a rough proxy for the rate of star formation – the greater the rate of star formation, the larger the number of large stars exciting interstellar gas into emission nebulae... Indeed, their preferred model to which they fit the trend converges towards a finite quantity of stars formed as you integrate total star formation into the future to infinity, with the total number of stars that will ever be born only being 5% larger than the number of stars that have been born at this time.
Is this a good proxy for total star formation, or only large star formation? Is it plausible that while no/few large stars are forming, many dwarfs are?
But my point is that at some point, a "static analysis" becomes functionally equivalent to running it. If I do a "static analysis" to find out what the state of the Turing machine will be at each step, I will get exactly the same result (a sequence of states) that I would have gotten if I had run it for "real", and I will have to engage in computation that is, in some sense, equivalent to the computation that the program asks for.
Crucial words here are "at some point". And Benja's original comment (as I understand it) says precisely that Omega doesn't need to get to that point in order to find out with high confidence what Eliezer's reaction to counterfactual mugging would be.
Suppose I've seen records of some inputs and outputs to a program: 1->2, 5->10, 100->200. In every case I am aware of where it was given a number as input, it output the doubled number. I don't have the program's source or the ability to access the computer it's actually running on. I form a hypothesis: if this program received input 10000, it would output 20000. Am I running the program?
In this case: doubling program<->Eliezer, inputs<->comments and threads he is answering, outputs<->his replies.
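The doubling example can be made concrete with a rough sketch. Everything named here (the observation struct, doubling_fits, predict) is my own illustration and not anything from the thread; the point is only that the prediction is computed from the recorded input/output pairs and the hypothesis, and the observed program itself is never called.

```c
#include <stddef.h>

/* Hypothetical record of one observed run: an input and the output seen. */
struct observation {
    long input;
    long output;
};

/* Returns 1 if every recorded pair is consistent with the hypothesis
   "output = 2 * input", and 0 otherwise. */
int doubling_fits(const struct observation *obs, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (obs[i].output != 2 * obs[i].input) {
            return 0;
        }
    }
    return 1;
}

/* Prediction under the doubling hypothesis. This is the observer's model,
   not the observed program, which we have no access to. */
long predict(long input) {
    return 2 * input;
}
```

With the records 1->2, 5->10 and 100->200 the hypothesis fits, and predict(10000) gives 20000, all without touching the machine the real program runs on.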
But I can still do static analysis of a Turing machine without running it. E.g. I can determine a T.M. would never terminate on given input in finite time.
If I'm figuring out what output a program "would" give "if" it were run, in what sense am I not running it?
In the sense of not producing the effects on the outside world that actually running it would produce. E.g. given this program
int goodbye_world() {
    launch_nuclear_missiles();
    return 0;
}
I can conclude running it would launch missiles (assuming suitable implementation of the launch_nuclear_missiles function) and output 0 without actually launching the missiles.
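As a minimal sketch of what "concluding without running" could look like, assume the program is handed to the analyzer as data. The two-instruction encoding, the summary struct and analyze are all hypothetical names made up for this illustration; the analyzer only reads the description of goodbye_world and never executes a launch.

```c
#include <stddef.h>

/* Hypothetical instruction set, just large enough to describe goodbye_world(). */
enum op { OP_LAUNCH_MISSILES, OP_RETURN };

struct instr {
    enum op op;
    int value;          /* used only by OP_RETURN */
};

/* What the analysis concludes the program would do if it were run. */
struct summary {
    int would_launch;   /* 1 if a launch is reached before returning */
    int returns;        /* 1 if a return instruction is reached */
    int return_value;   /* value the program would return */
};

/* Walks the description of the program; no instruction is ever executed. */
struct summary analyze(const struct instr *prog, size_t n) {
    struct summary s = {0, 0, 0};
    for (size_t i = 0; i < n; i++) {
        if (prog[i].op == OP_LAUNCH_MISSILES) {
            s.would_launch = 1;
        } else if (prog[i].op == OP_RETURN) {
            s.returns = 1;
            s.return_value = prog[i].value;
            break;      /* nothing after a return is reachable */
        }
    }
    return s;
}
```

Feeding it the two-step description (launch, then return 0) yields a summary saying a launch would happen and 0 would be returned, which is the same conclusion as above, reached with no missiles involved.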