Posts

Making AIs less likely to be spiteful 2023-09-26T14:12:06.202Z
Open-minded updatelessness 2023-07-10T11:08:22.207Z
Nicolas Macé's Shortform 2023-06-21T12:37:28.375Z

Comments

Comment by Nicolas Macé (NicolasMace) on Implementing Decision Theory · 2023-11-20T13:31:05.589Z · LW · GW

[Apologies for the delay]

Is there necessarily a utility function for the predictor such that the predictor's behavior (which is arbitrary) corresponds to maximizing its own utility

You're right, the predictor's behavior might not be compatible with utility maximization against any beliefs. I guess we're often interested in cases where we can think of the predictor as an agent. The predictor's behavior might be irrational in the restrictive above sense,[1] but to the extent that we think of it as an agent, my guess is that we can still get away with using a game theoretic-flavored approach. 

  1. ^

    For instance, if the predictor is unaware of some crucial hypothesis, or applies mild optimization rather than expected value maximization

Comment by Nicolas Macé (NicolasMace) on Implementing Decision Theory · 2023-11-11T17:38:51.413Z · LW · GW

I'd say that the connection is: Single-agent problems with predictors can be interpreted as sequential two-player games where the (perfect) predictor is a player who observes the action of the decision-maker and best-responds to it. In game-theoretic jargon, the predictor is a Stackelberg follower, and the decision-maker is the Stackelberg leader. (Related: (Kovarik, Oesterheld & Conitzer 2023))

Comment by Nicolas Macé (NicolasMace) on Implementing Decision Theory · 2023-11-09T08:58:00.119Z · LW · GW

Looks very interesting, I'll make sure to check out the git repo! Thanks for developing that!

As you're perhaps already aware, (Everitt, Leike & Hutter 2015) comes with a jupyter notebook that implements EDT and CDT in sequential decision problems. Perhaps useful as a comparison or a source of inspiration.

My view of decision theory now is that it's all about fixpoints. You solve some big equation, and inside of it is the same equation, and there are multiple fixpoint solutions, and you pick the (hopefully unique) best one.

Would you say that this is similar to the connection that exists between fixed points and Nash equilibria?

Comment by Nicolas Macé (NicolasMace) on The Puritans would one-box: evidential decision theory in the 17th century · 2023-10-16T11:25:25.749Z · LW · GW

Note for completeness:  Arif Ahmed uses a similar connection between Calvinism and evidential decision theory to introduce Newcomb's problem in Evidence, Decision and Causality.

Comment by Nicolas Macé (NicolasMace) on Open-minded updatelessness · 2023-07-15T08:25:32.500Z · LW · GW

Seems like the payoffs of the two agents were swapped in the figure; this should be fixed now. Thanks for pointing it out!

Comment by Nicolas Macé (NicolasMace) on Why Not Subagents? · 2023-06-26T07:49:47.360Z · LW · GW

I think the idea of contracts is interesting. I’m probably less optimistic (or pessimistic?) than the authors that the sub-agents can always contract to complete their preferences. For one thing, contracts might be expensive to make. Second, even free contracts might not be incentivized, for the usual reasons rational agents cannot always avoid inefficiencies in trading (cf the Myerson-Satterthwaite theorem).

On might object that Myerson–Satterthwaite doesn't allow for smart contracts that can conditionally disclose private information. But then I'd argue that these types of smart contracts are probably expensive to make, and thus not always incentivized.

Comment by Nicolas Macé (NicolasMace) on Nicolas Macé's Shortform · 2023-06-21T16:41:01.245Z · LW · GW

One observation that dissolves a little bit of the mystery for me: We can devise, within ZF, a set of properties that points to a unique integer . We can write down this integer, of course, but ZF won't be able to notice that it's the one that we're after, because if it did it'd prove its own consistency.

Comment by Nicolas Macé (NicolasMace) on Nicolas Macé's Shortform · 2023-06-21T12:37:28.465Z · LW · GW

Here's one of the math facts I find the hardest to wrap my head around: There are some finite, well-defined integers that are impossible to compute within PA or ZF.

The argument goes roughly like this: We'll define a finite integer that is such that computing would be equivalent to proving ZF's consistency. Then we're done because by an incompleteness theorem, ZF cannot prove its own consistency. So how to construct such a number? First construct a Turing machine that halts (on a blank tape) iff ZF is inconsistent. Say it has states. The busy beaver number is defined as the maximal number of steps a Turing machine with states takes when ran (on a blank tape), assuming that it halts. If we knew we'd need to run our TM at most steps to prove ZF's consistency. So we've constructed an that's as claimed.

Comment by Nicolas Macé (NicolasMace) on The Lightcone Theorem: A Better Foundation For Natural Abstraction? · 2023-05-18T10:55:57.984Z · LW · GW

Perhaps of interest that people in quantum many-body physics have related results. One keyword is "scrambling". Like in your case, they have a network of interacting units, and since interactions are local they have a lightcone outside of which correlations are exactly zero. 

They can say more than that: Because excitations typically propagate slower than the theoretical max speed (the speed of light or whatever thing is analogous) there's a region near the edge of the lightcone where correlations are almost zero. And then there's the bulk of correlations. They can say all sorts of things in the large time limit. For instance the correlation front typically starts having a universal shape if one waits for long enough. See e.g. this or that.