Comments
Firstly, the important part of my modification to the indifference formalism is not the conditioning on the actual o; it's the fact that, in evaluating the expectation of UN, it takes the action in A2 (for a given pair (a1, o)) which maximizes UN instead of the action which maximizes U (note that U is equal to US in the case that o is in Press).
Secondly, an agent which chooses a1 by simply maximizing E[UN | NotPress; a1] + E[US | Press; a1] does exhibit pathological behaviors. In particular, there will still be incentives to manage the news, but from both sides now: there is an incentive to cause the button to be pressed in the event of information which is bad news from the point of view of UN, and an incentive to cause the button not to be pressed in the event of information which is bad news from the point of view of US.
In the indifference formalism, the agent, in selecting A1, acts like a UN agent that believes the shutdown button will not be pressed, which creates perverse incentives to "manage the news". This means that if the agent can cause its shutdown button to be pressed in the event of bad news, it will.
My formulation avoids this pathological behavior by instead making the agent select A1 as if it were a UN-agent which believed that it would continue to optimize according to UN even in the event of the button being pressed. This avoids the perverse incentives to "manage the news", while still not giving the agent any incentive to avoid the button being pressed, because the agent acts as if it believed that pressing the button will not cause it to initiate a shutdown.
Why not make it so that the agent, in selecting A1, acts as a UN-agent that believes it will continue to optimize according to UN even in the event of the button being pressed, rather than as a UN-agent that believes the button will never be pressed? That is, pick U such that
U(a1, o, a2) = UN(a1, o, a2) if o is not in Press
U(a1, o, a2) = US(a1, o, a2) + f(a1, o) - g(a1, o) if o is in Press
where f(a1,o) is the maximum value of UN(a1,o,b) for b in A2 and g(a1,o) is the maximum value of US(a1,o,b) for b in A2.
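To make this concrete, here is a minimal sketch (the action sets, observations, and utility tables are made up purely for illustration; only the structure of U follows the definition above):

```python
# A minimal sketch of the modified indifference utility defined above.
# The utility tables, actions, and observation names below are invented
# purely for illustration; only the structure of U follows the comment.

A1 = ["build_widget", "manage_news"]
A2 = ["keep_working", "shut_down"]
OBS = ["good_news", "bad_news_pressed"]      # the second observation is in Press
PRESS = {"bad_news_pressed"}

# made-up normal utility UN(a1, o, a2) and shutdown utility US(a1, o, a2)
UN = lambda a1, o, a2: (10 if a2 == "keep_working" else 0) + (2 if a1 == "build_widget" else 0)
US = lambda a1, o, a2: (10 if a2 == "shut_down" else 0)

def f(a1, o):  # maximum of UN over A2
    return max(UN(a1, o, b) for b in A2)

def g(a1, o):  # maximum of US over A2
    return max(US(a1, o, b) for b in A2)

def U(a1, o, a2):
    if o in PRESS:
        return US(a1, o, a2) + f(a1, o) - g(a1, o)
    return UN(a1, o, a2)

# After the button is pressed, the A2 action maximizing U is the one maximizing US:
o = "bad_news_pressed"
for a1 in A1:
    best_a2 = max(A2, key=lambda b: U(a1, o, b))
    print(a1, "->", best_a2)                 # shuts down in both cases

# In choosing a1, every (a1, o) branch is worth max over A2 of UN, i.e. the agent
# acts like a UN-agent that expects to keep optimizing UN even after a press:
for a1 in A1:
    for o in OBS:
        assert max(U(a1, o, b) for b in A2) == f(a1, o)
```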
This would avoid the perverse manipulation incentives problem detailed in section 4.2 of the paper.
Apart from the obvious problems with this approach (the AI can do a lot with the output channel other than what you wanted it to do, choosing an appropriate value for λ, etc.), I don't see why this approach would be any easier to implement than CEV.
Once you know what a bounded approximation of an ideal algorithm is supposed to look like, how the bounded version is supposed to reason about its idealised version, and how to refer to arbitrary physical data, as the algorithm defined in your post assumes, then implementing CEV really doesn't seem to be that hard of a problem.
So could you explain why you believe that implementing CEV would be so much harder than what you propose in your post?
By that term I simply mean Eliezer's idea that the correct decision theory ought to use a maximization vantage point with a no-blackmail equilibrium.
If the NBC (No Blackmail Conjecture) is correct, then that shouldn't be a problem.
So you wouldn't trade whatever amount of time Frank has left, which is at most measured in decades, against a literal eternity of Fun?
If I was Frank in this scenario, I would tell the other guy to accept the deal.
Hmm. I'll have to take a closer look at that. You mean that the uncertainties are correlated, right?
No. To quote your own post:
A similar process allows us to arbitrarily set exactly one of the km.
I meant that the utility function resulting from averaging over your uncertainty over the km's will depend on which km you choose to arbitrarily set in this way. I gave an example of this phenomenon in my original comment.
You can't simply average the km's. Suppose you estimate .5 probability that k2 should be twice k1 and .5 probability that k1 should be twice k2. Then if you normalize k1 to 1, k2 will average to 1.25, while similarly if you normalize k2 to 1, k1 will average to 1.25.
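A quick numerical check of this example (using only the probabilities and ratios already assumed in it):

```python
# Quick check of the normalization example above: 0.5 probability that k2 = 2*k1,
# 0.5 probability that k1 = 2*k2 (values taken from the example, nothing else assumed).
p = 0.5

# Normalize k1 = 1: k2 is either 2 or 0.5.
k2_avg = p * 2 + p * 0.5          # 1.25

# Normalize k2 = 1: k1 is either 0.5 or 2.
k1_avg = p * 0.5 + p * 2          # 1.25

# Both normalizations make the *other* coefficient average to 1.25, so the two
# averaged utility functions U1 + 1.25*U2 and 1.25*U1 + U2 are not rescalings
# of one another.
print(k2_avg, k1_avg)
```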
In general, to each choice of km's will correspond a utility function, and the utility function we should use will be a linear combination of those utility functions, with renormalization parameters k'm. If we accept the argument given in your post, those k'm ought to be just as dependent on your preferences, so you're probably also uncertain about the values that those parameters should take, which gives you k''m's, and so on ad infinitum. So you obtain an infinite tower of uncertain parameters, and it isn't obvious how to obtain a utility function out of this mess.
What do you even mean by "is a possible outcome" here? Do you mean that there is no proof in PA of the negation of the proposition?
The formula of a modal agent must be fully modalized, which means that all propositions containing references to actions of agents within the formula must be within the scope of a provability operator.
Proof without using Kripke semantics: Let X be a modal agent and Phi(...) its associated fully modalized formula. Then, if PA were inconsistent, Phi(...) would reduce to a truth value independent of X's opponent, and so X would play the same move against both FairBot and UnfairBot (and this is provable in PA). But PA cannot prove its own consistency, so PA cannot prove both X(FairBot) = C and X(UnfairBot) = D, and so we can't have both FairBot(X) = C and UnfairBot(X) = C. QED
Proof: Let X be a modal agent, Phi(...) its associated fully modalized formula, (K, R) a GL Kripke model, and w minimal in K. Then, for every statement of the form ◻(...), we have w |- ◻(...), so Phi(...) reduces in w to a truth value which is independent of X's opponent. As a result, we can't have both w |- X(FairBot) = C and w |- X(UnfairBot) = D, so we can't have both ◻(X(FairBot) = C) and ◻(X(UnfairBot) = D), and so we can't have both FairBot(X) = C and UnfairBot(X) = C. QED
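As a purely illustrative sketch of the key step of this proof: at a world with no R-successors, every boxed subformula holds vacuously, so a fully modalized formula collapses to a truth value that does not mention the opponent. The encoding of formulas below is invented for this sketch.

```python
# Illustrative sketch of the minimal-world step of the proof above.
# Formulas are nested tuples; 'box' is the provability operator, 'opp' stands for
# any proposition about the opponent's action (an encoding made up for this sketch).

def eval_at_minimal_world(formula):
    """Evaluate a fully modalized formula at a world with no R-successors."""
    kind = formula[0]
    if kind == 'box':
        return True                      # no successors: box(phi) holds vacuously
    if kind == 'not':
        return not eval_at_minimal_world(formula[1])
    if kind == 'and':
        return all(eval_at_minimal_world(f) for f in formula[1:])
    if kind == 'or':
        return any(eval_at_minimal_world(f) for f in formula[1:])
    if kind == 'opp':
        # a fully modalized formula never mentions the opponent outside a box
        raise ValueError("formula is not fully modalized")
    raise ValueError("unknown connective: %r" % (kind,))

# A FairBot-style formula box(opponent cooperates), and another fully modalized
# formula of the shape box(...) and not box(...), both reduce to constants here,
# whatever the opponent is:
fairbot_like = ('box', ('opp', 'cooperates'))
other_agent = ('and', ('box', ('opp', 'cooperates')), ('not', ('box', ('opp', 'defects'))))
print(eval_at_minimal_world(fairbot_like))   # True, independent of the opponent
print(eval_at_minimal_world(other_agent))    # False, independent of the opponent
```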
no modal agent can get both FairBot and UnfairBot to cooperate with it.
TrollDetector is not a modal agent.
UnfairBot defects against PrudentBot.
Proof: For UnfairBot to cooperate with PrudentBot, PA would have to prove that PrudentBot defects against UnfairBot, which would require PA to prove that "PA does not prove that UnfairBot cooperates with PrudentBot, or PA+1 does not prove that UnfairBot defects against DefectBot", but that would require PA to prove its own consistency, which it cannot do. QED
Here is another obstacle to an optimality result: define UnfairBot as the agent which cooperates with X if and only if PA proves that X defects against UnfairBot; then no modal agent can get both FairBot and UnfairBot to cooperate with it.
Both of those agents are modal agents of rank 0, and so the fact that they defect against CooperateBot implies that FairBot defects against them, by Theorem 4.1.
You're right, and wubbles's agent can easily be exploited by a modal agent A defined by A(X)=C <-> [] (X(PB)=C) (where PB is PrudentBot).
The agent defined by wubbles is actually the agent called JustBot in the robust cooperation paper, which is proven to be non-exploitable by modal agents.
Well, it depends how different the order is...
If there are theorems in PA of the form "If there are theorems of the form A()≠a and of the form A'()≠a', then the a and the a' such that the corresponding theorems come first in the appropriate ordering must be identical", then you should be okay in the prisoner's dilemma setting; but otherwise there will be a model of PA in which the two players end up playing different actions, and we end up in the same situation as in the post.
More generally, no matter how you try to cut it, there will always be a model of PA in which everything is provable, and your agent's behavior will always depend on what happens in that model, because PA will never be able to prove anything which is false in that model.
If I understand what you're proposing correctly then that doesn't work either.
Suppose it is a theorem of PA that this algorithm returns an error. Then all statements of the form A()=a ⇒ U()≥u are demonstrable in PA, so there is no action with the highest u, and the algorithm returns an error. Since that reasoning is itself demonstrable in PA, the algorithm must actually return an error, by Löb's theorem.
I didn't like the way the first paragraph ended. It seemed excessively confrontational and made me not want to read the rest of the post.
My intention was not to appear confrontational. It actually seemed obvious when I began thinking about this problem that the order in which we check actions in step 1 shouldn't matter but that ended up being completely wrong. That was what I was trying to convey though I admit I might have done so in a clumsy manner.
To quote step 2 of the original algorithm:
For every possible action a, find some utility value u such that S proves that A()=a ⇒ U()=u. If such a proof cannot be found for some a, break down and cry because the universe is unfair.
If something is not ontologically fundamental and doesn't reduce to anything which is, then that thing isn't real.
Given that we are able to come to agreement about certain moral matters, and given the existence of moral progress, I do think that the evidence favors the existence of a well-behaved idealized reasoning system that we are approximating when we do moral reasoning.
Can you give a detailed example of this?
This for a start.
Are you saying you think qualia is ontologically fundamental or that it isn't real or what?
there still isn't any Scientifically Accepted Unique Solution for the moral value of animals
There isn't any SAUS for the problem of free will either. Nonetheless, it is a solved problem. Scientists are not in the business of solving that kind of problem, those problems generally being considered philosophical in nature.
the question is whether the solution uniquely follows from your other preferences, or is somewhat arbitrary?
It certainly appears to uniquely follow.
see the post "The "Scary problem of Qualia".
That seems easy to answer. Modulo a reduction of computation of course but computation seems like a concept which ought to be canonically reducible.
Did the person come into existence:
Ve came into existence whenever a computation isomorphic to verself was performed.
For Harry had only loaned his Cloak, not given it
That seems like it answers your question: his invisible copies aren't borrowing the cloak from him because they are him.
You seem to be taking a position that's different from Eliezer's, since AFAIK he has consistently defended this approach that you call "wrong" (for example in the thread following my first linked comment).
Well, Eliezer_2009 does seem to underestimate the difficulty of the extrapolation problem.
If you have some idea of how to "derive the computation that we approximate when we talk about morality by examining the dynamic of how people reacts to moral arguments" that doesn't involve just "looking at where people moralities concentrate as you present moral arguments to them" then I'd be interested to know what it is.
Have I solved that problem? No. But humans do seem to be able to infer, from the way pebblesorters reason, that they are referring to primality, and it also seems that by looking at mathematicians reasoning about Goldbach's Conjecture we should be able to infer what they refer to when they speak about Goldbach's Conjecture, and we don't do that by looking at what position they end up concentrating at when we present them with all possible arguments. So the problem ought to be solvable.
Assuming "when we do moral reasoning we are approximating some computation", what reasons do we have for thinking that the "some computation" will allow us to fully reduce "pain" to a set of truth conditions? What are some properties of this computation that you can cite as being known, that leads you to think this?
Are you trying to imply that it could be the case that the computation that we approximate is only defined over a fixed ontology which doesn't correspond to the correct ontology, and simply returns a domain error when one tries to apply it to the real world? Well, that doesn't seem to be the case, because we are able to do moral reasoning at a more fine-grained level than the naive ontology, and a lot of the concepts that seem relevant to moral reasoning, like qualia for example, seem like they ought to have canonical reductions. I detail the way in which I see us doing that kind of elucidation in this comment.
Quidditch, the lack of adequate protection on Time-Turners before Harry gave them the idea to put protective shells on them... Seriously, just reread the fic.
Eliezer seems to think that moral arguments are meaningful, but their meanings are derived only from how humans happen to respond to them (or more specifically to whatever coherence humans may show in their responses to moral arguments).
No. What he actually says is that when we do moral reasoning we are approximating some computation, in the same way that the pebblesorters are approximating primality. What makes moral arguments valid or invalid is whether the arguments actually establish what they were trying to establish in the context of the actual concept of rightness which is being approximated, in the same way that an argument by a pebblesorter that 6 is an incorrect heap because 6=2*3 is judged valid because it is an argument establishing that 6 is not a prime number. Of course, to determine what computation we actually are trying to approximate, or to establish that we actually are approximating something, looking at the coherence humans show in their responses to moral arguments is an excellent idea, but it is not how you define morality.
Looking at the first link you provided, I think that looking at where people's moralities concentrate as you present moral arguments to them is totally the wrong way to look at this problem. Consider for example Goldbach's Conjecture. If you were to take people and present random arguments to them about the conjecture, it doesn't seem to necessarily be the case, depending on what distribution you use on possible arguments, that people's opinions will concentrate to the correct conclusion concerning the validity of the conjecture. That doesn't mean that people can't talk meaningfully about the validity of Goldbach's Conjecture. Should we be able to derive the computation that we approximate when we talk about morality by examining the dynamic of how people react to moral arguments? The answer is yes, but it isn't a trivial problem.
As for the second link you provided: you argue that the way we react to moral arguments could be totally random or depend on trivial details, which doesn't appear to be the case, and which seems in contradiction with the fact that we do manage to agree about a lot of things concerning morality and that moral progress does seem to be a thing.
huge, unsolved debates over morality
One shouldn't confuse there being a huge debate over something with the problem being unsolved, far less unsolvable (look at the debate over free will, or worse, p-zombies). I have actually solved the problem of the moral value of animals to my satisfaction (my solution could be wrong, of course). As for the problem of dealing with people having multiple copies, this really seems like the problem of reducing "magical reality fluid", which, while hard, seems like it should be possible.
also in math, it might be possible to generalize things, but not necessarily, and not always uniquely
Well, yes. But in general, if you're trying to elucidate some concept in your moral reasoning, you should ask yourself the specific reason why you care about that specific concept until you reach concepts that look like they should have canonical reductions, then you reduce them. If in doing so you end up with multiple possible reductions, that probably means you didn't go deep enough and should be asking why you care about that specific concept some more, so that you can pinpoint the reduction you are actually interested in. If after all that you're still left with multiple possible reductions for a certain concept, one that you appear to value terminally and not for any other reason, then you should still be able to judge between possible reductions using the other things you care about: elegance, tractability, etc. (though if you end up in this situation it probably means you made an error somewhere...)
generalize n-th differentials over real numbers
I'm not sure what you're referring to here...
Also, looking at the possibilities you enumerate again, 3 appears incoherent. Contradictions are for logical systems; if you have a component of your utility function which is monotone increasing in the quantity of blue in the universe and another component which is monotone decreasing in the quantity of blue in the universe, they partially or totally cancel one another, but that doesn't result in a contradiction.
Why exactly do you call 1 unlikely? The whole metaethics sequence argues in favor of 1 (if I understand what you mean by 1 correctly), so what part of that argument do you think is wrong specifically?
There is an entire sequence dedicated to how to define concepts, and the specific problem of categories as they matter for your utility function is studied in this post, where it is argued that those problems should be solved by moral arguments; and the whole metaethics sequence argues for the fact that moral arguments are meaningful.
Now, if you're asking me whether we have a complete reduction of some concept relevant to our utility function all the way down to fundamental physics, then the answer is no. That doesn't mean that partial reductions of some concepts potentially relevant to our utility function have never been accomplished, or that complete reduction is not possible.
So are you arguing that the metaethics sequence is wrong and that moral arguments are meaningless, or are you arguing that the GRT is wrong and that reduction of the concepts which appear in your utility function is impossible despite them being real, or what? I still have no idea what it is that you're arguing for exactly.
Actually, dealing with a component of your ontology not being real seems like a far harder problem than the problem of such a component not being fundamental.
According to the Great Reductionist Thesis everything real can be reduced to a mix of physical reference and logical reference. In which case, if every component of your ontology is real, you can obtain a formulation of your utility function in terms of fundamental things.
The case where some components of your ontology can't be reduced because they're not real, and where your utility function refers explicitly to such entities, seems considerably harder, but that is exactly the problem that someone who realizes God doesn't actually exist is confronted with, and we do manage that kind of ontology crisis.
So are you saying that the GRT is wrong or that none of the things that we value are actually real or that we can't program a computer to perform reduction (which seems absurd given that we have managed to perform some reductions already) or what? Because I don't see what you're trying to get at here.
Also I totally think there was a respectable hard problem
So you do have a solution to the problem?
Proposition p is meaningful relative to the collection of possible worlds W if and only if there exist w, w' in W such that p is true in the possible world w and false in the possible world w'.
Then the question becomes: to be able to reason in all generality, what collection of possible worlds should one use?
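As an aside, the definition above is simple enough to state as a one-line check; here is a minimal sketch (the worlds and the predicate are invented for illustration):

```python
# Minimal sketch of the meaningfulness criterion above: p is meaningful relative to
# a collection of possible worlds W iff it is true in some world and false in another.
# The example worlds and predicate below are made up purely for illustration.

def is_meaningful(p, worlds):
    truth_values = {p(w) for w in worlds}
    return True in truth_values and False in truth_values

worlds = [{"snow_is_white": True}, {"snow_is_white": False}]
p = lambda w: w["snow_is_white"]

print(is_meaningful(p, worlds))               # True: p varies across the worlds
print(is_meaningful(lambda w: True, worlds))  # False: a tautology is not meaningful here
```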
That's a very hard question.
It's explained in detail in chapter 25 that the genes that make a person a wizard do not do so by building some complex machinery which allows you to become a wizard; the genes that make you a wizard constitute a marker which indicates to the source of magic that you should be allowed to cast spells.
I don't think taking polyjuice modifies your genetic code. If that were the case, using polyjuice to take the form of a muggle or a squib would leave you without your magical powers.
I'm confused. What's the distinction between x and here?
On the other hand, as I have shown, if you choose t sufficiently large, the algorithm I recommend in my post will necessarily end up one-boxing if the formal system used is sound.
This is incorrect, as Zeno had shown more than 2000 years ago. It could be that your inference system generates an infinite sequence of statements of the form A()=1 ⇒ U()≥Si with sequence {Si} tending to, say, 100, but with all Si<100, so that A()=1 loses to A()=2 no matter how large the timeout is.
That's why you enumerate all proofs of statements of the form A()=a ⇒ U()≥u (where u is a rational number in canonical form). It's a well-known fact that it is possible to enumerate all the provable statements of a given formal system without skipping any.
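For concreteness, here is a rough sketch of the shape of that procedure, with the proof enumeration stubbed out; everything named here (the example bounds in particular) is invented for the sketch, and a real implementation would plug in an actual proof enumerator for the chosen formal system:

```python
from fractions import Fraction

# Rough sketch of the procedure described above: enumerate proofs of statements of
# the form "A()=a => U()>=u" (u a rational in canonical form) up to a time bound t,
# then take the action with the best proven lower bound.
# `enumerate_proved_bounds` is a stand-in: a real implementation would enumerate all
# proofs of the formal system; here we just yield a few made-up example bounds.

def enumerate_proved_bounds():
    """Yield (action, lower_bound) pairs in the order their proofs are enumerated."""
    yield (1, Fraction(0))          # A()=1 => U()>=0
    yield (2, Fraction(1000))       # A()=2 => U()>=1000
    yield (1, Fraction(1000000))    # A()=1 => U()>=1000000  (the one-boxing bound)

def decide(t):
    """Look at the first t enumerated proofs and pick the action with the best bound."""
    best = {}
    for step, (action, bound) in enumerate(enumerate_proved_bounds()):
        if step >= t:
            break
        best[action] = max(best.get(action, bound), bound)
    if not best:
        raise RuntimeError("no bound proven before the timeout")
    return max(best, key=best.get)

print(decide(t=3))   # with a large enough t, the better-bounded action (1) is chosen
```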
This is a possible behavior, even if the formal system used is sound, if one uses rational intervals as you recommend.
Not if we use a Goedel statement/chicken rule failsafe like the one discussed in Slepnev's article you linked to.
There are some subtleties about doing this in the interval setting which made me doubt that it could be done, but after thinking about it some more I must admit that it is possible.
But I think that my algorithm for the non-oracle setting is still valuable.
That doesn't actually work. Take Newcomb's problem. Suppose that your formal system quickly proves that A()≠1. Then it concludes that A()=1 ⇒ U()∈[0,0]; on the other hand, it ends up concluding correctly that A()=2 ⇒ U()∈[1000,1000], so it ends up two-boxing. This is a possible behavior, even if the formal system used is sound, if one uses rational intervals as you recommend. On the other hand, as I have shown, if you choose t sufficiently large, the algorithm I recommend in my post will necessarily end up one-boxing if the formal system used is sound. (Using intervals was actually the first idea I had when coming to terms with the problem detailed in the post, but it simply doesn't work.)
Rationals are dense in the reals, so there is always a rational value between any two real numbers. So for example, if it can be proven in your formal system that A()=1 ⇒ U()=π and this happens to be the maximal utility attainable, there will be a rational number x greater than the utility that can be achieved by any other action and such that x≤π, so you will be able to prove that A()=1 ⇒ U()≥x, and so the first action will end up being taken because, by definition of x, it is greater than the utility that can be obtained by taking any other action.
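For instance, with made-up numbers: if the best alternative action provably achieves at most utility 3, then any rational x with 3 < x ≤ π works, e.g. x = 3.1415, and the provable statement A()=1 ⇒ U()≥3.1415 beats every bound provable for the other actions.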
I don't think the part about summoning Death is a reference to anything. After all, we already know what the incarnations of Death are in MOR. And it looks like the counterspell to dismiss Death is lost no more, thanks to Harry...
What if that square is now occupied?
So your atheists are actually nay-theists... If that's the case, I have difficulty imagining how a group containing both atheists and theists could work at all...
Color me interested.
I think the character creation rules really should be collected together. As is, I can't figure out how you're supposed to determine many of your initial statistics (wealth level, hit points, speed...). Also, I don't like the fact that the number of points you have to distribute among the big five and among the skills is random. And of course a system where you simply divide X points among your stats however you wish is completely broken. You should really think about introducing some limit on the number of points you can put into any one stat, and some sort of diminishing returns.
But even if many parts of the design are open to criticism, what you've created is still very awesome.
ETA: There are a lot of references in the rules to a Faith stat, except that there is no such stat! Also, the righteousness stat is often called morality in the rules.
Here in Spain, France, the UK, the majority of people are Atheists.
I would be interested in knowing where you got your numbers, because the statistics I found definitely disagreed with this.
Congratulations on raising the expected utility of the future!