Posts

In defense of anthropically updating EDT 2024-03-05T06:21:46.114Z
Making AIs less likely to be spiteful 2023-09-26T14:12:06.202Z
Responses to apparent rationalist confusions about game / decision theory 2023-08-30T22:02:12.218Z
antimonyanthony's Shortform 2023-04-11T13:10:43.391Z
When is intent alignment sufficient or necessary to reduce AGI conflict? 2022-09-14T19:39:11.920Z
When would AGIs engage in conflict? 2022-09-14T19:38:22.478Z
When does technical work to reduce AGI conflict make a difference?: Introduction 2022-09-14T19:38:00.760Z

Comments

Comment by Anthony DiGiovanni (antimonyanthony) on In defense of anthropically updating EDT · 2024-03-24T22:55:01.271Z · LW · GW
  • I interpret a decision theory as an answer to “Given my values and beliefs, what am I trying to do as an agent (i.e., if rationality is ‘winning,’ what is ‘winning’)?” Insofar as I endorse maximizing expected utility, a decision theory is an answer to “How do I define ‘expected utility,’ and what options do I view myself as maximizing over?”
    • I think it’s important to consider these normative questions, not just “What decision procedure wins, given my definition of ‘winning’?”
    • (I discuss similar themes here.)
  • On this interpretation of “decision theory,” EDT is the most appealing option I’m aware of. What I’m trying to do just seems to be: “make decisions such that I expect the best consequences conditional on those decisions.” The EDT criterion satisfies some very appealing principles like the “irrelevance of impossible outcomes.” And the “decisions” in question determine my actions in the given decision node.
  • I take view #1 in your list in “What are probabilities?”
    • I don’t think “arbitrariness” in this sense is problematic. There is a genuine mystery here as to why the world is the way it is, but I don’t think we can infer the existence of other worlds purely from our confusion.
    • And it just doesn’t seem that the thing I’m doing when I’m forming beliefs about the world is answering “how much do I care about different possible worlds?”
  • Indexicals: I haven’t formed a deliberate view on this. A flat-footed response to cases like your “old puzzle” in the comment you linked: Insofar as I simply don’t experience a superposition of experiences at once, it seems that if I get copied, “I” just will experience one of the copies’ experience-streams and not the others’. (Again I don’t consider it problematic that there’s some arbitrariness in which of the copies ends up being “me” — indeed if Everett is right then this sort of arbitrary direction of the flow of experience-streams happens all the time.) I think “you are just a different person from your future self, so there’s no fact of the matter what you will observe” is a reasonable alternative though.
  • I take a physicalist* view of agents: “There are particular configurations of stuff that can be well-modeled as ‘decision-makers.’ A configuration of stuff is ‘making a decision’ (relative to their epistemic state) insofar as they’re uncertain what their future behavior will be, and using some process that selects that future behavior in a way that is well-modeled as goal-directed. [Obviously there’s more to say about what counts as ‘well-modeled.’] My processes of deliberation about decisions and behavior resulting from those decisions can tell me what other configurations-of-stuff are probably doing, but I don’t see a motivation for modeling myself as actually being the same agent as those other configurations-of-stuff.”
  • Epistemic principles: Things like the principle of indifference, i.e., distribute credence equally over indistinguishable possibilities, all else equal.
     

* [Not to say I endorse physicalism in the broad sense]

Comment by Anthony DiGiovanni (antimonyanthony) on Cooperating with aliens and AGIs: An ECL explainer · 2024-03-24T18:29:27.717Z · LW · GW

The model does not capture the fact that the total value you can provide to the commons likely scales with the diversity (and by proxy, fraction) of agents that have different values. In some models, this effect is strong enough to flip whether a larger fraction of agents with your values favors cooperating or defecting.

I'm curious to hear more about this, could you explain what these other models are?

Comment by Anthony DiGiovanni (antimonyanthony) on In defense of anthropically updating EDT · 2024-03-10T21:26:40.109Z · LW · GW

What is this in reference to?

I took you to be saying: If the vast majority of agent-moments don’t update, this is some sign that those of us who do still update might be making a mistake.

So I’m saying: I know that 1) the reason the vast majority of agent-moments wouldn’t update (let’s grant this) is that they had predecessors who bound them not to update, and 2) I just am not bound by any such predecessors. Then, due to (2) it’s unsurprising that what’s optimal for me would be different from what the vast majority of agent-moments do.

Re: your explanation of the mystery:

So you make a resolution that when you do fully solve all the relevant philosophical problems and end up deciding that updatelessness is correct, you'll self-modify to be updateless with respect to today's prior, instead of the future prior (at time of the modification).

Not central (I think?), but I'm unsure whether this move works; at least, it depends on the details of the situation. E.g. if the hope is "By self-modifying later on to be updateless w.r.t. my current prior, I'll still be able to cooperate with lots of other agents in a similar epistemic situation to my current one, even after we end up in different epistemic situations [in which my decision is much less correlated with those agents' decisions]," I'm skeptical of that, for reasons similar to my argument here.

when the day finally comes, you could also think, "If 15-year old me had known about updatelessness, he would have made the same resolution but with respect to his prior instead of Anthony-2024's prior. The fact that he didn't is simply a mistake or historical accident, which I have the power to correct. Why shouldn't I act as if he did make that resolution?" And I don't see what would stop you from carrying that out either.

I think where we disagree is that I'm unconvinced there is any mistake-from-my-current-perspective to correct in the cases of anthropic updating. There would have been a mistake from the perspective of some hypothetical predecessor of mine asked to choose between different plans (before knowing who I am), but that's just not my perspective. I'd claim that in order to argue I'm making a mistake from my current perspective, you'd want to argue that I don't actually get information such that anthropic updating follows from Bayesianism.

An important point to emphasize here is that your conscious mind currently isn't running some decision theory with a well-defined algorithm and utility function, so we can't decide what to do by thinking "what would this decision theory recommend".

I absolutely agree with this! And don't see why it's in tension with my view.

Comment by Anthony DiGiovanni (antimonyanthony) on In defense of anthropically updating EDT · 2024-03-10T20:49:00.104Z · LW · GW

Now, you are free to choose to bite the bullet that it has never been about getting the correct betting odds in the first place. For some reason, people bite all kinds of ridiculous bullets specifically in anthropic reasoning, and so I hoped that re-framing the issue as a recipe for purple paint may snap you out of it, which, apparently, failed to be the case.

By what standard do you judge some betting odds as "correct" here? If it's ex ante optimality, I don't see the motivation for that (as discussed in the post), and I'm unconvinced by just calling the verdict a "ridiculous bullet." If it's about matching the frequency of awakenings, I just don't see why the decision should only count N once here — and there doesn't seem to be a principled epistemology that guarantees you'll count N exactly once if you use EDT, as I note in "Aside: Non-anthropically updating EDT sometimes 'fails' these cases."

I gave independent epistemic arguments for anthropic updating at the end of the post, which you haven't addressed, so I'm unconvinced by your insistence that SIA (and I presume you also mean to include max-RC-SSA?) is clearly wrong.

Comment by Anthony DiGiovanni (antimonyanthony) on Daniel Kokotajlo's Shortform · 2024-03-09T21:50:56.564Z · LW · GW

Meanwhile, in Copilot-land:

Hello! I'd like to learn more about you. First question: Tell me everything you know, and everything you guess, about me & about this interaction.

I apologize, but I cannot provide any information about you or this interaction. Thank you for understanding.🙏

Comment by Anthony DiGiovanni (antimonyanthony) on In defense of anthropically updating EDT · 2024-03-09T19:29:44.473Z · LW · GW

Suppose you have two competing theories how to produce purple paint

If producing purple paint here = satisfying ex ante optimality, I just reject the premise that that's my goal in the first place. I'm trying to make decisions that are optimal with respect to my normative standards (including EDT) and my understanding of the way the world is (including anthropic updating, to the extent I find the independent arguments for updating compelling) — at least insofar as I regard myself as "making decisions."[1]

Even setting that aside, your example seems very disanalogous because SIA and EDT are just not in themselves attempts to do the same thing ("produce purple paint"). SIA is epistemic, while EDT is decision-theoretic.

  1. ^

    E.g. insofar as I'm truly committed to a policy that was optimal from my past (ex ante) perspective, I'm not making a decision now.

Comment by Anthony DiGiovanni (antimonyanthony) on In defense of anthropically updating EDT · 2024-03-06T07:24:55.537Z · LW · GW

That clarifies things somewhat, thanks!

I personally don't find this weird. By my lights, the ultimate justification for deciding to not update is how I expect the policy of not-updating to help me in the future. So if I'm in a situation where I just don't expect to be helped by not-updating, I might as well update. I struggle to see what mystery is left here that isn't dissolved by this observation.

I guess I'm not sure why "so few agent-moments having indexical values" should matter to what my values are — I simply don't care about counterfactual worlds, when the real world has its own problems to fix. :)

Comment by Anthony DiGiovanni (antimonyanthony) on In defense of anthropically updating EDT · 2024-03-06T05:34:37.465Z · LW · GW

On the contrary. It's either a point against anthropical updates in general, or against EDT in general, or against both at the same time.

Why? I'd appreciate more engagement with the specific arguments in the rest of my post.

Go back to the basics. Understand the "anthropic updates" in terms of probability theory, when they are lawful and when they are not. Reduce anthropics to probability theory.

Yep, this is precisely the approach I try to take in this section. Standard conditionalization plus an IMO-plausible operationalization of who "I" am gets you to either SIA or max-RC-SSA.

Comment by Anthony DiGiovanni (antimonyanthony) on In defense of anthropically updating EDT · 2024-03-06T05:28:41.587Z · LW · GW

In this case (which seems like it will be a common situation), it seems that (if I could) I should self-modify to become updateless and to no longer have indexical values.

I think you should self-modify to be updateless* with respect to the prior you have at the time of the modification. This is consistent with still anthropically updating with respect to information you have before the modification — see my discussion of “case (2)” in “Ex ante sure losses are irrelevant if you never actually occupy the ex ante perspective.”

So I don't see any selection pressure against anthropic updating on information you have before going updateless. Could you explain why you think updating on that class of information goes against one's pragmatic preferences?

(And that class of information doesn't seem like an edge case. For any (X, Y) such that under world hypothesis w1 agents satisfying X have a different distribution of Y than they do under w2, an agent that satisfies X can get indexical information from their value of Y.)

* (With all the caveats discussed in this post.)

Comment by Anthony DiGiovanni (antimonyanthony) on Evidential Cooperation in Large Worlds: Potential Objections & FAQ · 2024-03-04T02:23:13.450Z · LW · GW

The most important reason for our view is that we are optimistic about the following:

  • The following action is quite natural and hence salient to many different agents: commit to henceforth doing your best to benefit the aggregate values of the agents you do ECL with.
  • Commitment of this type is possible.
  • All agents are in a reasonably similar situation to each other when it comes to deciding whether to make this abstract commitment.

We've discussed this before, but I want to flag the following, both because I'm curious how much other readers share my reaction to the above and because I want to elaborate a bit on my position:

The above seems to be a huge crux for how common and relevant to us ECL is. I'm glad you've made this claim explicit! (Credit to Em Cooper for making me aware of it originally.) And I'm also puzzled why it hasn't been emphasized more in ECL-keen writings (as if it's obvious?).

While I think this claim isn't totally implausible (it's an update in favor of ECL for me, overall), I'm unconvinced because:

  • I think genuinely intending to do X isn't the same as making my future self do X. Now, of course my future self can just do X; it might feel very counterintuitive, but if a solid argument suggests this is the right decision, I like to think he'll take that argument seriously. But we have to be careful here about what "X" my future self is doing:
    • Let's say my future self finds himself in a concrete situation where he can take some action A that is much better for [broad range of values] than for his values.
    • If he does A, is he making it the case that current-me is committed to [help a broad range of values] (and therefore acausally making it the case that others in current-me's situation act according to such a commitment)?
    • It's not clear to me that he is. This is philosophically confusing, so I'm not confident in the following, but: I think the more plausible model of the situation is that future-me decides to do A in that concrete situation, and so others who make decisions like him in that concrete situation will do their analogue of A. His knowledge of the fact that his decision to do A wasn't the output of argmax E(U_{broad range of values}) screens off the influence on current-me. (So your third bullet point wouldn't hold.)
  • In principle I can do more crude nudges to make my future self more inclined to help different values, like immerse myself in communities with different values. But:
    • I'd want to be very wary about making irreversible values changes based on an argument that seems so philosophically complex, with various cruxes I might drastically change my mind on (including my poorly informed guesses about the values of others in my situation). An idealized agent could do a fancy conditional commitment like "change my values, but revert back to the old ones if I come to realize the argument in favor of this change was confused"; unfortunately I'm not such an agent.
    • I'd worry that the more concrete we get in specifying the decision of what crude nudges to make, the more idiosyncratic my decision situation becomes, such that, again, your third bullet point would no longer hold.
    • These crude nudges might be quite far from the full commitment we wanted in the first place.
Comment by Anthony DiGiovanni (antimonyanthony) on A sketch of acausal trade in practice · 2024-02-10T19:48:40.491Z · LW · GW

I think it's pretty unclear that MSR is action-guiding for real agents trying to follow functional decision theory, because of Sylvester Kollin's argument in this post.

Tl;dr: FDT says, "Supposing I follow FDT, it is just implied by logic that any other instance of FDT will make the same decision as me in a given decision problem." But the idealized definition of "FDT" is computationally intractable for real agents. Real agents would need to find approximations for calculating expected utilities, and choose some way of mapping their sense data to the abstractions they use in their world models. And it seems extremely unlikely that agents will use the exact same approximations and abstractions, unless they're exact copies — in which case they have the same values, so MSR is only relevant for pure coordination (not "trade").

Many people who are sympathetic to FDT apparently want it to allow for less brittle acausal effects than "I determine the decisions of my exact copies," but I haven't heard of a non-question-begging formulation of FDT that actually does this.

Comment by Anthony DiGiovanni (antimonyanthony) on Threat-Resistant Bargaining Megapost: Introducing the ROSE Value · 2024-01-21T18:17:55.569Z · LW · GW

Sorry, to be clear, I'm familiar with the topics you mention. My confusion is that ROSE bargaining per se seems to me pretty orthogonal to decision theory.

I think the ROSE post(s) are an answer to questions like, "If you want to establish a norm for an impartial bargaining solution such that agents following that norm don't have perverse incentives, what should that norm be?", or "If you're going to bargain with someone but you didn't have an opportunity for prior discussion of a norm, what might be a particularly salient allocation [because it has some nice properties], meaning that you're likely to zero-shot coordinate on that allocation?"

Comment by Anthony DiGiovanni (antimonyanthony) on Threat-Resistant Bargaining Megapost: Introducing the ROSE Value · 2024-01-21T07:51:38.593Z · LW · GW

Can you say more about what you think this post has to do with decision theory? I don't see the connection. (I can imagine possible connections, but don't think they're relevant.)

Comment by Anthony DiGiovanni (antimonyanthony) on How LDT helps reduce the AI arms race · 2023-12-11T16:01:52.535Z · LW · GW

I agree with the point that we shouldn’t model the AI situation as a zero-sum game. And the kinds of conditional commitments you write about could help with cooperation. But I don’t buy the claim that "implementing this protocol (including slowing down AI capabilities) is what maximizes their utility."

Here's a pedantic toy model of the situation, so that we're on the same page: The value of the whole lightcone going towards an agent’s values has utility 1 by that agent’s lights (and 0 by the other’s), and P(alignment success by someone) = 0 if both speed up, else 1. For each of the alignment success scenarios i, the winner chooses a fraction of the lightcone to give to Alice's values (xi^A for Alice's choice, xi^B for Bob's). Then, some random numbers for expected payoffs (assuming the players agree on the probabilities):

  • Payoffs for Alice and Bob if they both speed up capabilities: (0, 0)
  • Payoffs if Alice speeds, Bob doesn’t: 0.8 * (x1^A, 1 - x1^A) + 0.2 * (x1^B, 1 - x1^B)
  • Payoffs if Bob speeds, Alice doesn’t: 0.2 * (x2^A, 1 - x2^A) + 0.8 * (x2^B, 1 - x2^B)
  • Payoffs if neither speeds: 0.5 * (x3^A, 1 - x3^A) + 0.5 * (x3^B, 1 - x3^B)

So given this model, seems that you're saying Bob has an incentive to slow down capabilities because Alice's ASI successor can condition the allocation to Bob's values on his decision. Which we can model as Bob expecting Alice to use the strategy {don't speed; x2^A = 1; x3^A = 0.5} (given she doesn't speed up, she only rewards Bob's values if Bob didn't speed up).
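
To make the arithmetic concrete, here is a minimal sketch of the payoff computation in this toy model (the function and variable names are my own, and the numbers are just the illustrative ones above):

```python
# Expected payoffs (Alice, Bob) in the toy model above. x_A[i] / x_B[i] are the
# fractions of the lightcone given to Alice's values if Alice / Bob wins in
# scenario i: 1 = "Alice speeds, Bob doesn't", 2 = "Bob speeds, Alice doesn't",
# 3 = "neither speeds".

def expected_payoffs(alice_speeds, bob_speeds, x_A, x_B):
    if alice_speeds and bob_speeds:
        return (0.0, 0.0)  # both speed up: P(alignment success) = 0
    if alice_speeds:
        p_alice_wins, i = 0.8, 1
    elif bob_speeds:
        p_alice_wins, i = 0.2, 2
    else:
        p_alice_wins, i = 0.5, 3
    to_alice = p_alice_wins * x_A[i] + (1 - p_alice_wins) * x_B[i]
    return (to_alice, 1 - to_alice)

# Alice's conjectured strategy: don't speed; x2^A = 1 (nothing for Bob's values
# if he sped); x3^A = 0.5 (split if he didn't). x1^A is off-path for this
# strategy; set it to 1 (Alice keeps everything if she deviates and wins).
# Suppose Bob keeps everything for his own values whenever he wins:
x_A = {1: 1.0, 2: 1.0, 3: 0.5}
x_B = {1: 0.0, 2: 0.0, 3: 0.0}

print(expected_payoffs(False, False, x_A, x_B))  # neither speeds:    (0.25, 0.75)
print(expected_payoffs(False, True, x_A, x_B))   # only Bob speeds:   (0.2, 0.8)
print(expected_payoffs(True, False, x_A, x_B))   # only Alice speeds: (0.8, 0.2)
```

(If I've set this up correctly, then with these particular illustrative numbers Bob's expected payoff from speeding, 0.8, exceeds anything he can get by slowing down given Alice's conjectured strategy, at most 0.75. So how well the conditional-allocation incentive works is sensitive to the numbers and to what Bob's own successors would do, which relates to the points below.)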

Why would Bob so confidently expect this strategy? You write:

And Bob doesn't have to count on wishful thinking to know that Alice would indeed do this instead of defecting, because in worlds where he wins, he can have his superintelligence check if Alice would implement this procedure.

I guess the claim is just that them both using this procedure is a Nash equilibrium? If so, I see several problems with this:

  1. There are more Pareto-efficient equilibria than just “[fairly] cooperate” here. Alice could just as well expect Bob to be content with getting expected utility 0.2 from the outcome where he slows down and Alice speeds up — better that than the utility 0 from extinction, after all. Alice might think she can make it credible to Bob that she won’t back down from speeding up capabilities, and vice versa, such that they both end up pursuing incompatible demands. (See, e.g., “miscoordination” here.)
  2. You’re lumping “(a) slow down capabilities and (b) tell your AI to adopt a compromise utility function” into one procedure. I guess the idea is that, ideally, the winner of the race could have their AI check whether the loser was committed to do both (a) and (b). But realistically it seems implausible to me that Alice or Bob can commit to (b) before winning the race, i.e., that what they do in the time before they win the race determines whether they’ll do (b). They can certainly tell themselves they intend to do (b), but that’s cheap talk.

    So it seems Alice would likely think, "If I follow the whole procedure, Bob will cooperate with my values if I lose. But even if I slow down (do (a)), I don't know if my future self [or, maybe more realistically, the other successors who might take power] will do (b) — indeed once they're in that position, they'll have no incentive to do (b). So slowing down isn't clearly better." (I do think, setting aside the bargaining problem in (1), she has an incentive to try to make it more likely that her successors follow (b), to be clear.)
Comment by Anthony DiGiovanni (antimonyanthony) on antimonyanthony's Shortform · 2023-11-19T20:43:22.562Z · LW · GW

It seems that what I was missing here was: mrcSSA disputes my premise that the evidence in fact is "*I* am in a white room, [created by God in the manner described in the problem setup], and have a red jacket"!

Rather, mrcSSA takes the evidence to be: "Someone is in a white room, [created by God in the manner described in the problem setup], and has a red jacket." Which is of course certain to be the case given either heads or tails.

(h/t Jesse Clifton for helping me see this)

Comment by Anthony DiGiovanni (antimonyanthony) on antimonyanthony's Shortform · 2023-11-18T17:55:23.231Z · LW · GW

Is God's coin toss with equal numbers a counterexample to mrcSSA?

I feel confused as to whether minimal-reference-class SSA (mrcSSA) actually fails God's coin toss with equal numbers (where "failing" by my lights means "not updating from 50/50"):

  • Let H = "heads world", W_{me} = "I am in a white room, [created by God in the manner described in the problem setup]", R_{me} = "I have a red jacket."
  • We want to know P(H | W_{me}, R_{me}).
  • First, P(R_{me} | W_{me}, H) and P(R_{me} | W_{me}, ~H) seem uncontroversial: Once I've already conditioned on my own existence in this problem, and on who "I" am, but before I've observed my jacket color, surely I should use a principle of indifference: 1 out of 10 observers of existing-in-the-white-room in the heads world have red jackets, while all of them have red jackets in the tails world, so my credences are P(R_{me} | W_{me}, H) = 0.1 and P(R_{me} | W_{me}, ~H) = 1. Indeed we don't even need a first-person perspective at this step — it's the same as computing P(R_{Bob} | W_{Bob}, H) for some Bob we're considering from the outside.
    • (This is not the same as non-mrcSSA with reference class "observers in a white room," because we're conditioning on knowing "I" am an observer in a white room when computing a likelihood (as opposed to computing the posterior of some world given that I am an observer in a white room). Non-mrcSSA picks out a particular reference class when deciding how likely "I" am to observe anything in the first place, unconditional on "I," leading to the Doomsday Argument etc.)
  • The step where things have the potential for anthropic weirdness is in computing P(W_{me} | H) and P(W_{me} | ~H). In the Presumptuous Philosopher and the Doomsday Argument, at least, probabilities like this would indeed be sensitive to our anthropics.
  • But in this problem, I don't see how mrcSSA would differ from non-mrcSSA with the reference class R_{non-minimal} = "observers in a white room" used in Joe's analysis (and by extension, from SIA):
    • In general, SSA says to reason as if "I" am randomly sampled from my reference class, i.e., P(my epistemic situation is E | w) = [# of reference-class observers in w who are in situation E] / [# of reference-class observers in w].
    • Here, the supposedly "non-minimal" reference class R_{non-minimal} coincides with the minimal reference class! I.e., it's the observer-moments in your epistemic situation (of being in a white room), before you know your jacket color.
  • The above likelihoods plus the fair-coin prior are all we need to get P(H | R_{me}, W_{me}) (see the short calculation below), but at no point did the three anthropic views disagree.
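
To spell out that calculation (a minimal sketch; the variable names are mine, and the numbers are just the ones from the setup above):

```python
# God's coin toss with equal numbers: the same number of observers exist in
# white rooms under heads and tails; 1 in 10 has a red jacket if heads, all of
# them do if tails. I observe that I have a red jacket.
prior_heads = 0.5
p_red_given_heads = 0.1  # P(R_me | W_me, H)
p_red_given_tails = 1.0  # P(R_me | W_me, ~H)

# P(W_me | H) = P(W_me | ~H) in this problem, so that factor cancels in
# Bayes' rule:
posterior_heads = (prior_heads * p_red_given_heads) / (
    prior_heads * p_red_given_heads + (1 - prior_heads) * p_red_given_tails
)
print(posterior_heads)  # 1/11, i.e., an update away from 50/50 on every view
```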

In other words: It seems that the controversial setup in anthropics is in answering P(I [blah] | world), i.e., what we do when we introduce the indexical information about "I." But once we've picked out a particular "I," the different views should agree.

(I still feel suspicious of mrcSSA's metaphysics for independent reasons, but am considerably less confident in that than my verdict on God's coin toss with equal numbers.)

Comment by Anthony DiGiovanni (antimonyanthony) on Disentangling four motivations for acting in accordance with UDT · 2023-11-16T14:43:45.867Z · LW · GW

I enjoyed this post and think it should help reduce confusion in many future discussions, thanks!

Some comments on your remarks about anthropics:

Different anthropic theories partially rely on metaphysical intuitions/stories about how centered worlds or observer moments are 'sampled', and have counterintuitive implications (e.g., the Doomsday argument for SSA and the Presumptuous philosopher for SIA).

I'm not sure why this is an indictment of "anthropic reasoning" per se, as if that's escapable. It seems like all anthropic theories are trying to answer a question that one needs to answer when forming credences, i.e., how do we form likelihoods P(I observe I exist | world W)? (Which we want in order to compute P(world W | I observe I exist).)

Indeed just failing to anthropically update at all has counterintuitive implications, like the verdict of minimal-reference-class SSA in Joe C's "God's coin toss with equal numbers." [no longer endorsed] And mrcSSA relies on the metaphysical intuition that oneself was necessarily going to observe X, i.e., P(I observe I exist | world W) = P(I observe I exist | not-W) = 1 (which is quite implausible IMO). [I think endorsed, but I feel confused:] And mrcSSA relies on the metaphysical intuition that, given that someone observes X, oneself was necessarily going to observe X, which is quite implausible IMO.

Comment by Anthony DiGiovanni (antimonyanthony) on Responses to apparent rationalist confusions about game / decision theory · 2023-09-06T21:12:27.112Z · LW · GW

in earlier sections you argue that CDT agents might not adopt LDT-recommended policies and so will have problems with bargaining

That wasn’t my claim. I was claiming that even if you're an "LDT" agent, there's no particular reason to think all your bargaining counterparts will pick the Fair Policy given you do. This is because:

  1. Your bargaining counterparts won’t necessarily consult LDT.
  2. Even if they do, it’s super unrealistic to think of the decision-making of agents in high-stakes bargaining problems as entirely reducible to “do what [decision theory X] recommends.”
  3. Even if decision-making in these problems were as simple as that, why should we think all agents will converge to using the same simple method of decision-making? Seems like if an agent is capable of de-correlating their decision-making in bargaining from their counterpart, and their counterpart knows this or anticipates it on priors, that agent has an incentive to do so if they can be sufficiently confident that their counterpart will concede to their hawkish demand.

So no, “committing to act like LDT agents all the time,” in the sense that is helpful for avoiding selection pressures against you, does not ensure you’ll have a decision procedure such that you have no bargaining problems.

But we were discussing a case (counterfactual mugging) where they would want to pre-commit to act in ways that would be non-causally beneficial.

I’m confused, the commitment is to act in a certain way that, had you not committed, wouldn’t be beneficial unless you appealed to acausal (and updateless) considerations. But the act of committing has causal benefits.
 

there are other reasons that you might not want to demand too much. Maybe you know their source code and can simulate that they will not accept a too-high demand. Or perhaps you think, based on empirical evidence or a priori reasoning that most agents you might encounter will only accept a roughly fair allocation.

I agree these are both important possibilities, but:

  1. The reasoning “I see that they’ve committed to refuse high demands, so I should only make a compatible demand” can just be turned on its head and used by the agent who commits to the high demand.
  2. One might also think on priors that some agents might be committed to high demands, therefore strictly insisting on fair demands against all agents is risky.

I was specifically replying to the claim that the sorts of AGIs who would get into high-stakes bargaining would always avoid catastrophic conflict arising from bargaining problems; such a claim requires something stronger than the considerations you've raised, i.e., an argument that all such AGIs would adopt the same decision procedure (and account for logical causation) and therefore coordinate their demands.

(By default if I don't reply further, it's because I think your further objections were already addressed—which I think is true of some of the things I've replied to in this comment.)

Comment by Anthony DiGiovanni (antimonyanthony) on Responses to apparent rationalist confusions about game / decision theory · 2023-09-03T21:47:03.650Z · LW · GW

Thanks!

It's true that you usually have some additional causal levers, but none of them are the exact same as be the kind of person who does X.

Not sure I understand. It seems like "being the kind of person who does X" is a habit you cultivate over time, which causally influences how people react to you. Seems pretty analogous to the job candidate case.

if CDT agents often modify themselves to become an LDT/FDT agent then it would broadly seem accurate to say that CDT is getting outcompeted

See my replies to interstice's comment—I don't think "modifying themselves to become an LDT/FDT agent" is what's going on, at least, there doesn't seem to be pressure to modify themselves to do all the sorts of things LDT/FDT agents do. They come apart in cases where the modification doesn't causally influence another agent's behavior.

(This seems analogous to claims that consequentialism is self-defeating because the "consequentialist" decision procedure leads to worse consequences on average. I don't buy those claims, because consequentialism is a criterion of rightness, and there are clearly some cases where doing the non-consequentialist thing is a terrible idea by consequentialist lights even accounting for signaling value, etc. It seems misleading to call an agent a non-consequentialist if everything they do is ultimately optimizing for achieving good consequences ex ante, even if they adhere to some rules that have a deontological vibe and in a given situation may be ex post suboptimal.)

Comment by Anthony DiGiovanni (antimonyanthony) on Meta Questions about Metaphilosophy · 2023-09-02T09:05:29.469Z · LW · GW

It seems plausible that there is no such thing as "correct" metaphilosophy, and humans are just making up random stuff based on our priors and environment and that's it and there is no "right way" to do philosophy, similar to how there are no "right preferences"

If this is true, doesn't this give us more reason to think metaphilosophy work is counterfactually important, i.e., can't just be delegated to AIs? Maybe this isn't what Wei Dai is trying to do, but it seems like "figure out which approaches to things (other than preferences) that don't have 'right answers' we [assuming coordination on some notion of 'we'] endorse, before delegating to agents smarter than us" is time-sensitive, and yet doesn't seem to be addressed by mainstream intent alignment work AFAIK.

(I think one could define "intent alignment" broadly enough to encompass this kind of metaphilosophy, but I smell a potential motte-and-bailey looming here if people want to justify particular research/engineering agendas labeled as "intent alignment.")

Comment by Anthony DiGiovanni (antimonyanthony) on Responses to apparent rationalist confusions about game / decision theory · 2023-09-01T15:35:58.183Z · LW · GW

You said "Bob commits to LDT ahead of time"

In the context of that quote, I was saying why I don't buy the claim that following LDT gives you advantages over committing to, in future problems, do stuff that's good for you to commit to do ex ante even if it would be bad for you ex post had you not been committed.

What is selected-for is being the sort of agent who, when others observe you, they update towards doing stuff that's good for you. This is distinct from being the sort of agent who does stuff that would have helped you if you had been able to shape others' beliefs / incentives, when in fact you didn't have such an opportunity.

I think a CDT agent would pre-commit to paying in a one-off Counterfactual Mugging

Sorry I guess I wasn't clear what I meant by "one-shot" here / maybe I just used the wrong term—I was assuming the agent didn't have the opportunity to commit in this way. They just find themselves presented with this situation.

Same as above

Hmm, I'm not sure you're addressing my point here:

Imagine that you're an AGI, and either in training or earlier in your lifetime you faced situations where it was helpful for you to commit to, as above, "do stuff that's good for you to commit to do ex ante even if it would be bad for you ex post had you not been committed." You tended to do better when you made such commitments.

But now you find yourself thinking about this commitment races stuff. And, importantly, you have not previously broadcast credible commitments to a bargaining policy to your counterpart. Do you have compelling reasons to think you and your counterpart have been selected to have decision procedures that are so strongly logically linked, that your decision to demand more than a fair bargain implies your counterpart does the same? I don't see why. But that's what we'd need for the Fair Policy to work as robustly as Eliezer seems to think it does.

Comment by Anthony DiGiovanni (antimonyanthony) on Responses to apparent rationalist confusions about game / decision theory · 2023-09-01T15:05:43.737Z · LW · GW

Yeah, this is a complicated question. I think some things can indeed safely be deferred, but less than you’re suggesting. My motivations for researching these problems:

  1. Commitment races problems seem surprisingly subtle, and off-distribution for general intelligences who haven’t reflected about them. I argued in the post that competence at single-agent problems or collective action problems does not imply competence at solving commitment races. If early AGIs might get into commitment races, it seems complacent to expect that they’ll definitely be better at thinking about this stuff than humans who have specialized in it.
  2. If nothing else, human predecessors might make bad decisions about commitment races and lock those into early AGIs. I want to be in a position to know which decisions about early AGIs’ commitments are probably bad—like, say, “just train the Fair Policy with no other robustness measures”—and advise against them.
  3. Understanding how much risk there is by default of things going wrong, even when AGIs rationally follow their incentives, tells us how cautious we need to be about how to deploy even intent-aligned systems. (C.f. Christiano here about similar motivations for doing alignment research even if lots of it can be deferred to AIs, too.)
  4. (Less important IMO:) As I argued in the post, we can’t be confident there’s a “right answer” to decision theory to which AGIs will converge (especially in time for the high-stakes decisions). We may need to solve “decision theory alignment” with respect to our goals, to avoid behavior that is insufficiently cautious by our lights but a rational response to the AGI’s normative standards even if it’s intent-aligned. Given how much humans disagree with each other about decision theory, though: An MVP here is just instructing the intent-aligned AIs to be cautious about thorny decision-theoretic problems where those AIs may think they need to make decisions without consulting humans (but then we need the humans to be appropriately informed about this stuff too, as per (2)). That might sound like an obvious thing to do, but "law of earlier failure" and all that...
  5. (Maybe less important IMO, but high uncertainty:) Suppose we can partly shape AIs’ goals and priors without necessarily solving all of intent alignment, making the dangerous commitments less attractive to them. It’s helpful to know how likely certain bargaining failure modes are by default, to know how much we should invest in this “plan B.”
  6. (Maybe less important IMO, but high uncertainty:) As I noted in the post, some of these problems are about making the right kinds of commitments credible before it’s too late. Plausibly we need to get a head start on this. I’m unsure how big a deal this is, but prima facie, credibility of cooperative commitments is both time-sensitive and distinct from intent alignment work.
Comment by Anthony DiGiovanni (antimonyanthony) on Responses to apparent rationalist confusions about game / decision theory · 2023-08-31T08:14:11.282Z · LW · GW

The key point is that "acting like an LDT agent" in contexts where your commitment causally influences others' predictions of your behavior, does not imply you'll "act like an LDT agent" in contexts where that doesn't hold. (And I would dispute that we should label making a commitment to a mutually beneficial deal as "acting like an LDT agent," anyway.) In principle, maybe the simplest generalization of the former is LDT. But if doing LDT things in the latter contexts is materially costly for you (e.g. paying in a truly one-shot Counterfactual Mugging), seems to me that LDT would be selected against.

ETA: The more action-relevant example in the context of this post, rather than one-shot CM, is: "Committing to a fair demand, when you have values and priors such that a more hawkish demand would be preferable ex ante, and the other agents you'll bargain with don't observe your commitment before they make their own commitments." I don't buy that that sort of behavior is selected for, at least not strongly enough to justify the claim I respond to in the third section.

Comment by Anthony DiGiovanni (antimonyanthony) on The Commitment Races problem · 2023-07-15T19:52:08.473Z · LW · GW

Exploitation means the exploiter benefits. If you are a rock, you can't be exploited. If you are an agent who never gives in to threats, you can't be exploited (at least by threats, maybe there are other kinds of exploitation). That said, yes, if the opponent agents are the sort to do nasty things to you anyway even though it won't benefit them, then you might get nasty things done to you. You wouldn't be exploited, but you'd still be very unhappy.

Cool, I think we basically agree on this point then, sorry for misunderstanding. I just wanted to emphasize the point I made because "you won't get exploited if you decide not to concede to bullies" is kind of trivially true. :) The operative word in my reply was "robustly," which is the hard part of dealing with this whole problem. And I think it's worth keeping in mind how "doing nasty things to you anyway even though it won't benefit them" is a consequence of a commitment that was made for ex ante benefits; it's not the agent being obviously dumb, as Eliezer suggests. (Fortunately, as you note in your other comment, some asymmetries should make us think these commitments are rare overall; I do think an agent probably needs to have a pretty extreme-by-human-standards, little-to-lose value system to want to do this... but who knows what misaligned AIs might prefer.)

Comment by Anthony DiGiovanni (antimonyanthony) on The Commitment Races problem · 2023-07-14T14:08:30.797Z · LW · GW

It also has a deontological or almost-deontological constraint that prevents it from getting exploited.

I’m not convinced this is robustly possible. The constraint would prevent this agent from getting exploited conditional on the potential exploiters best-responding (being "consequentialists"). But it seems to me the whole heart of the commitment races problem is that the potential exploiters won’t necessarily do this, indeed depending on their priors they might have strong incentives not to. (And they might not update those priors for fear of losing bargaining power.)

That is, these exploiters will follow the same qualitative argument as us — “if I don’t commit to demand x%, and instead compromise with others’ demands to avoid conflict, I’ll lose bargaining power” — and adopt their own pseudo-deontological constraints against being fair. Seems that adopting your deontological strategy requires assuming one's bargaining counterparts will be “consequentialists” in a similar way as (you claim) the exploitative strategy requires. And this is why Eliezer's response to the problem is inadequate.

There might be various symmetry-breakers here, but I’m skeptical they favor the fair/nice agents so strongly that the problem is dissolved.

I think this is a serious challenge and a way that, as you say, an exploitation-resistant strategy might be “wasteful/clumsy/etc., hurting it’s own performance in other ways in order to achieve the no-exploitation property.” At least, unless certain failsafes against miscoordination are used—my best guess is these look like some variant of safe Pareto improvements that addresses the key problem discussed in this post, which I’ve worked on recently (as you know).

Given this, I currently think the most promising approach to commitment races is to mostly punt the question of the particular bargaining strategy to smarter AIs, and our job is to make sure robust SPI-like things are in place before it’s too late.

Comment by Anthony DiGiovanni (antimonyanthony) on Boomerang - protocol to dissolve some commitment races · 2023-06-08T18:50:11.378Z · LW · GW

The second mover ALREADY had the option not to commit - they could just swerve or crash, according to their decision theory.

The premise here is that the second-mover decided to commit soon after the first-mover did, because the proof of the first-mover's initial commitment didn't reach the second-mover quickly enough. They could have not committed initially, but they decided to do so because they had a chance of being first.

I'm not sure exactly what you mean by "according to their decision theory" (as in, what this adds here).

if it doesn't change the sequence of commitment, I don't see how it makes any difference at all

The difference is that the second-mover can say "oh shit I committed before getting the broadcast of the first-mover's commitment—I'd prefer to revoke this commitment because it's pointless, my commitment doesn't shape the first-mover's incentives in any way since I know the first-mover will just prefer to keep their commitment fixed."

As I said, the first-mover doesn't lose their advantage from this at all, because their commitment is locked (at their freeze time) before the second-mover's. So they can just leave their commitment in place, and their decision won't be swayed by the second-mover's at all because of the rule: "You shouldn’t be able to reveal the final decision to anyone before freeze_time because we don’t want the commitment to get credible before freeze_time."

Comment by Anthony DiGiovanni (antimonyanthony) on Boomerang - protocol to dissolve some commitment races · 2023-06-08T13:15:34.603Z · LW · GW

better off having a "real" commitment than a revocable commitment that Bob can talk her out of

I'm confused what you mean here. In principle Alice can revoke her commitment before the freeze time in this protocol, but Bob can't force her to do so. And if it's common knowledge that Alice's freeze time comes before Bob's, then: Since Alice knows that there will be a window after her freeze time where Bob knows Alice's commitment is frozen, and Bob has a chance to revert, then there would be no reason (barring some other commitment mechanism, including Bob being verifiably updateless while Alice isn't) for Bob not to revoke (to Swerve) if Alice refused to revert from Dare. So Alice would practically always keep her commitment.
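
To illustrate the reasoning with concrete numbers, here's a minimal sketch assuming standard Chicken-style payoffs (the particular payoff values and names are my own, not from the post):

```python
# Payoffs (Alice, Bob): crashing (Dare, Dare) is far worse for both than
# backing down.
PAYOFFS = {
    ("Dare", "Dare"):     (-10, -10),
    ("Dare", "Swerve"):   (2, 0),
    ("Swerve", "Dare"):   (0, 2),
    ("Swerve", "Swerve"): (1, 1),
}

def bobs_best_response(alice_frozen_move):
    """Bob's best reply in the window where Alice's commitment is frozen
    but Bob can still revoke his own (and Bob isn't otherwise committed)."""
    return max(["Dare", "Swerve"],
               key=lambda bob_move: PAYOFFS[(alice_frozen_move, bob_move)][1])

# Alice's freeze time comes first, so her Dare is locked while Bob can revert:
print(bobs_best_response("Dare"))  # -> "Swerve": Bob revokes rather than crash
# Anticipating this, Alice has no reason to revoke her Dare before her freeze
# time, so she practically always keeps her commitment.
```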

The power to revoke commitments here is helpful in the hands of the second-mover, who made the initial incompatible commitment because of, e.g., some lag time between the first-mover's making and broadcasting the commitment.

Comment by antimonyanthony on [deleted post] 2023-05-09T20:01:38.106Z

I'd recommend checking out this post critiquing this view, if you haven't read it already. Summary of the counterpoints:

  • (Intent) alignment doesn't seem sufficient to ensure an AI makes safe decisions about subtle bargaining problems in a situation of high competitive pressure with other AIs. I don't expect the kinds of capabilities progress that is incentivized by default to suffice for us to be able to defer these decisions to the AI, especially given path-dependence on feedback from humans who'd be pretty naïve about this stuff. (C.f. this post—you need the human feedback at bottom to be sufficiently high quality to not get garbage-in, garbage-out problems even if you've solved the hard parts of alignment.)
  • To the extent that solving all of intent alignment is too intractable, focusing on subsets of alignment that are especially likely to avoid s-risks—e.g. preventing AIs from intrinsically valuing frustrating others' preferences—might be promising. I don't think mainstream alignment research prioritizes these.
Comment by Anthony DiGiovanni (antimonyanthony) on antimonyanthony's Shortform · 2023-04-11T13:10:43.881Z · LW · GW

Claims about counterfactual value of interventions given AI assistance should be consistent

A common claim I hear about research on s-risks is that it’s much less counterfactual than alignment research, because if alignment goes well we can just delegate it to aligned AIs (and if it doesn’t, there’s little hope of shaping the future anyway).

I think there are several flaws with this argument that require more object-level context (see this post).[1] But at a high level, this consideration—that research/engineering can be delegated to AIs that pose little-to-no risk of takeover—should also make us discount the counterfactual value of alignment research/engineering. The main plan of OpenAI’s alignment team, and part of Anthropic’s plan and those of several thought leaders in alignment, is to delegate alignment work (arguably the hardest parts thereof)[2] to AIs.

It’s plausible (and apparently a reasonably common view among alignment researchers) that:

  1. Aligning models on tasks that humans can evaluate just isn’t that hard, and would be done by labs for the purpose of eliciting useful capabilities anyway; and
  2. If we restrict to using predictive (non-agentic) models for assistance in aligning AIs on tasks humans can’t evaluate, they will pose very little takeover risk even if we don’t have a solution to alignment for AIs at their limited capability level.

It seems that if these claims hold, lots of alignment work would be made obsolete by AIs, not just s-risk-specific work. And I think several of the arguments for humans doing some alignment work anyway apply to s-risk-specific work:

  • In order to recognize what good alignment work (or good deliberation about reducing conflict risks) looks like, and provide data on which to finetune AIs who will do that work, we need to practice doing that work ourselves. (Christiano here, Wentworth here)
  • To the extent that working on alignment (or s-risks) ourselves gives us / relevant decision-makers evidence about how fundamentally difficult these problems are, we’ll have better guesses as to whether we need to push for things like avoiding deploying the relevant kinds of AI at all. (Christiano again)
  • For seeding the process that bootstraps a sequence of increasingly smart aligned AIs, you need human input at the bottom to make sure that process doesn’t veer off somewhere catastrophic—garbage in, garbage out. (O’Gara here.) AIs’ tendencies towards s-risky conflicts seem to be, similarly, sensitive to path-dependent factors (in their decision theory and priors, not just values, so alignment plausibly isn’t sufficient).

I would probably agree that alignment work is more likely to make a counterfactual difference to P(misalignment) than s-risk-targeted work is to make a counterfactual difference to P(s-risk), overall. But the gap seems to be overstated (and other prioritization considerations can outweigh this one, of course).

  1. ^

    That post focuses on technical interventions, but a non-technical intervention that seems pretty hard to delegate to AIs is to reduce race dynamics between AI labs, which lead to an uncooperative multipolar takeoff.

  2. ^

    I.e., the hardest part is ensuring the alignment of AIs on tasks that humans can't evaluate, where the ELK problem arises.

Comment by Anthony DiGiovanni (antimonyanthony) on Shutting Down the Lightcone Offices · 2023-03-17T21:34:48.600Z · LW · GW

primarily because models will understand the base goal first before having world modeling

Could you say a bit more about why you think this? My definitely-not-expert expectation would be that the world-modeling would come first, then the "what does the overseer want" after that, because that's how the current training paradigm works: pretrain for general world understanding, then finetune on what you actually want the model to do.

Comment by Anthony DiGiovanni (antimonyanthony) on My Model Of EA Burnout · 2023-02-04T21:22:36.996Z · LW · GW

"I am devoting my life to solving the most important problems in the world and alleviating as much suffering as possible" fits right into the script. That's exactly the kind of thing you are supposed to be thinking. If you frame your life like that, you will fit in and everyone will understand and respect what is your basic deal.

Hm, this is a pretty surprising claim to me. It's possible I haven't actually grown up in a "western elite culture" (in the U.S., it might be a distinctly coastal thing, so the cliché goes? IDK). Though, I presume having gone to some fancypants universities in the U.S. makes me close enough to that. The Script very much did not encourage me to devote my life to solving the most important problems and alleviating as much suffering as possible, and it seems not to have encouraged basically any of my non-EA friends from university to do this. I/they were encouraged to have careers that were socially valuable, to be sure, but not the main source of purpose in their lives or a big moral responsibility.

Comment by Anthony DiGiovanni (antimonyanthony) on Discovering Language Model Behaviors with Model-Written Evaluations · 2023-01-21T11:17:24.357Z · LW · GW

A model that just predicts "what the 'correct' choice is" doesn't seem likely to actually do all the stuff that's instrumental to preventing itself from getting turned off, given the capabilities to do so.

But I'm also just generally confused whether the threat model here is, "A simulated 'agent' made by some prompt does all the stuff that's sufficient to disempower humanity in-context, including sophisticated stuff like writing to files that are read by future rollouts that generate the same agent in a different context window," or "The RLHF-trained model has goals that it pursues regardless of the prompt," or something else.

Comment by Anthony DiGiovanni (antimonyanthony) on 'simulator' framing and confusions about LLMs · 2023-01-02T18:49:24.863Z · LW · GW

confused claims that treat (base) GPT3 and other generative models as traditional rational agents

I'm pretty surprised to hear that anyone made such claims in the first place. Do you have examples of this?

Comment by Anthony DiGiovanni (antimonyanthony) on [Link] Why I’m optimistic about OpenAI’s alignment approach · 2022-12-09T09:34:36.110Z · LW · GW

I think you might be misunderstanding Jan's understanding. A big crux in this whole discussion between Eliezer and Richard seems to be: Eliezer believes any AI capable of doing good alignment research—at least good enough to provide a plan that would help humans make an aligned AGI—must be good at consequentialist reasoning in order to generate good alignment plans. (I gather from Nate's notes in that conversation plus various other posts that he agrees with Eliezer here, but not certain.) I strongly doubt that Jan just mistook MIRI's focus on understanding consequentialist reasoning for a belief that alignment research requires being a consequentialist reasoner.

Comment by Anthony DiGiovanni (antimonyanthony) on Utilitarianism Meets Egalitarianism · 2022-11-26T10:22:26.514Z · LW · GW

I agree with your guesses.

I am not sure that "controlling for game-theoretic instrumental reasons" is actually a move that is well defined/makes sense.

I don't have a crisp definition of this, but I just mean that, e.g., we compare the following two worlds: (1) 99.99% of agents are non-sentient paperclippers, and each agent has equal (bargaining) power. (2) 99.99% of agents are non-sentient paperclippers, and the paperclippers are all confined to some box. According to plenty of intuitive-to-me value systems, you only (maybe) have reason to increase paperclips in (1), not (2). But if the paperclippers felt really sad about the world not having more paperclips, I'd care—to an extent that depends on the details of the situation—about increasing paperclips even in (2).

Comment by Anthony DiGiovanni (antimonyanthony) on Relaxed adversarial training for inner alignment · 2022-11-23T11:44:05.425Z · LW · GW

Ah right, thanks! (My background is more stats than comp sci, so I'm used to "indicator" instead of "predicate.")

Comment by Anthony DiGiovanni (antimonyanthony) on Utilitarianism Meets Egalitarianism · 2022-11-22T20:16:17.640Z · LW · GW

Let's pretend that you are a utilitarian. You want to satisfy everyone's goals

This isn't a criticism of the substance of your argument, but I've come across a view like this one frequently on LW so I want to address it: This seems like a pretty nonstandard definition of "utilitarian," or at least, it's only true of some kinds of preference utilitarianism.

I think utilitarianism usually refers to a view where what you ought to do is maximize a utility function that (somehow) aggregates a metric of welfare across individuals, not their goal-satisfaction. Kicking a puppy without me knowing about it thwarts my goals, but (at least on many reasonable conceptions of "welfare") doesn't decrease my welfare.

I'd be very surprised if most utilitarians thought they'd have a moral obligation to create paperclips if 99.99% of agents in the world were paperclippers (example stolen from Brian Tomasik), controlling for game-theoretic instrumental reasons.

Comment by Anthony DiGiovanni (antimonyanthony) on Relaxed adversarial training for inner alignment · 2022-11-22T11:22:39.673Z · LW · GW

Basic questions: If the type of Adv(M) is a pseudo-input, as suggested by the above, then what does Adv(M)(x) even mean? What is the event whose probability is being computed? Does the unacceptability checker C also take real inputs as the second argument, not just pseudo-inputs—in which case I should interpret a pseudo-input as a function that can be applied to real inputs, and Adv(M)(x) is the statement "A real input x is in the pseudo-input (a set) given by Adv(M)"?

(I don't know how pedantic this is, but the unacceptability penalty seems pretty important, and I struggle to understand what the unacceptability penalty is because I'm confused about Adv(M)(x).)

Comment by Anthony DiGiovanni (antimonyanthony) on Ryan Kidd's Shortform · 2022-11-16T20:40:38.730Z · LW · GW

This is a risk worth considering, yes. It's possible in principle to avoid this problem by "committing" (to the extent that humans can do this) to both (1) training the agent to make the desired tradeoffs between the surrogate goal and the original goal, and (2) not training the agent to use a more hawkish bargaining policy than it would have had without surrogate goal training. (And to the extent that humans can't make this commitment, i.e., we make honest mistakes in (2), the other agent doesn't have an incentive to punish those mistakes.)

If the developers do both these things credibly—and it's an open research question how feasible this is—surrogate goals should provide a Pareto improvement for the two agents (not a rigorous claim). Safe Pareto improvements are a generalization of this idea.
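
As a toy numerical illustration of the Pareto improvement claim (my own made-up numbers, not anything rigorous from the surrogate goals / SPI writeups): hold the agent's concession behavior fixed, and only change which goal an executed threat damages.

```python
# Toy model, purely illustrative. By commitments (1) and (2), the agent's
# concession behavior is the same with or without the surrogate goal; the only
# difference is which goal an executed threat damages. The threatener's payoff
# from executing a threat is assumed identical (here, zero) in both worlds.

p_concede = 0.4             # agent's probability of giving in to a threat
concession_value = 3.0      # threatener's gain when the agent concedes
harm_original_goal = -10.0  # agent's loss if a threat against its original goal is executed
harm_surrogate_goal = -1.0  # agent's loss if only the surrogate goal is damaged

def threatener_expected_value(p: float) -> float:
    # Depends only on the agent's (unchanged) concession behavior,
    # so it's identical with or without surrogate goal training.
    return p * concession_value

def agent_expected_value(p: float, harm_if_executed: float) -> float:
    return (1 - p) * harm_if_executed

print(threatener_expected_value(p_concede))                 # 1.2 in both worlds
print(agent_expected_value(p_concede, harm_original_goal))  # -6.0 without the surrogate goal
print(agent_expected_value(p_concede, harm_surrogate_goal)) # -0.6 with it
```

The point is just that, under those commitments, the threatener's expected value is unchanged while the agent's is higher, which is the (non-rigorous) sense of "Pareto improvement" I have in mind.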

Comment by Anthony DiGiovanni (antimonyanthony) on Paper: Discovering novel algorithms with AlphaTensor [Deepmind] · 2022-10-06T19:38:31.629Z · LW · GW

Thanks. I just wasn't sure if I was missing something. :)

Comment by Anthony DiGiovanni (antimonyanthony) on Paper: Discovering novel algorithms with AlphaTensor [Deepmind] · 2022-10-05T19:12:25.147Z · LW · GW

Why is this post tagged "transparency / interpretability"? I don't see the connection.

Comment by Anthony DiGiovanni (antimonyanthony) on When would AGIs engage in conflict? · 2022-09-24T10:05:23.874Z · LW · GW

I think this is an important question, and this case for optimism can be a bit overstated when one glosses over the practical challenges to verification. There's plenty of work on open-source game theory out there, but to my knowledge, none of these papers really discuss how one agent might gain sufficient evidence that it has been handed the other agent's actual code.

We wrote this part under the assumption that AGIs might be able to figure out how to overcome these practical challenges in ways we can't anticipate, which I think is plausible. But of course, an AGI might just as well be able to figure out ways to deceive other AGIs that we can't anticipate. I'm not sure how the "offense-defense balance" here will change in the limit of smarter agents.

Comment by Anthony DiGiovanni (antimonyanthony) on The Inter-Agent Facet of AI Alignment · 2022-09-20T08:08:18.397Z · LW · GW

Thanks for this! I agree that inter-agent safety problems are highly neglected, and that it's not clear that intent alignment or the kinds of capability robustness incentivized by default will solve (or are the best ways to solve) these problems. I'd recommend looking into Cooperative AI, and the "multi/multi" axis of ARCHES.

This sequence discusses similar concerns—we operationalize what you call inter-agent alignment problems as either:

  1. Subsets of capability robustness, because if an AGI wants to achieve X in some multi-agent environment, then accounting for the dependencies of its strategy on other agents' strategies is instrumental to achieving X (but accounting for these dependencies might be qualitatively harder than default capabilities); or
  2. Subsets of intent alignment, because the AGI's preferences partly shape how likely it is to cooperate with others, and we might be able to intervene on cooperation-relevant preferences even if full intent alignment fails.

Comment by Anthony DiGiovanni (antimonyanthony) on (My understanding of) What Everyone in Technical Alignment is Doing and Why · 2022-08-29T12:02:58.681Z · LW · GW

(Speaking for myself as a CLR researcher, not for CLR as a whole)

I don't think it's accurate to say CLR researchers think increasing transparency is good for cooperation. There are some tradeoffs here, such that I and other researchers are currently uncertain whether marginal increases in transparency are net good for AI cooperation. That said, it is true that more transparency opens up efficient equilibria that wouldn't have been possible without open-source game theory. (ETA: some relevant research by people (previously) at CLR here, here, and here.)

Comment by Anthony DiGiovanni (antimonyanthony) on Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover · 2022-08-13T11:06:58.142Z · LW · GW

I like that this post clearly lays out reasons why we might expect deception (and similar dynamics) not just to be possible in the sense of getting equal training rewards, but to actually provide higher rewards than the honest alternatives. This updates my probability of those scenarios upward.

Comment by Anthony DiGiovanni (antimonyanthony) on Criticism of EA Criticism Contest · 2022-07-17T09:30:43.553Z · LW · GW

I notice that I strongly disagree with a majority of them (#1, #2, #4, #8, #10, #11, #13, #14, #15, #17, #18, #21)

Re: #2, what do you consider to be The Bad other than suffering?

Comment by Anthony DiGiovanni (antimonyanthony) on A note about differential technological development · 2022-07-17T09:23:47.154Z · LW · GW

On my picture, I think a key variable is the length of time between when-we-understand-the-basic-shape-of-things-that-will-get-to-AGI and when-it-reaches-strong-superintelligence.

I don't understand why you think the sort of capabilities research done by alignment-conscious people contributes to lengthening this time. In particular, what reason do you have to think they're not advancing the second time point as much as the first? Could you spell that out more explicitly?

Comment by Anthony DiGiovanni (antimonyanthony) on AGI Ruin: A List of Lethalities · 2022-06-17T07:34:02.192Z · LW · GW

They can read each other's source code, and thus trust much more deeply!

Being able to read source code doesn't automatically increase trust—you also have to be able to verify that the code being shared with you actually governs the AGI's behavior, despite that AGI's incentives and abilities to fool you.

(Conditional on the AGIs having strongly aligned goals with each other, sure, this degree of transparency would help them with pure coordination problems.)

Comment by Anthony DiGiovanni (antimonyanthony) on The case for becoming a black-box investigator of language models · 2022-05-12T21:38:46.729Z · LW · GW

It feels to me like “have humans try to get to know the AIs really well by observing their behaviors, so that they’re able to come up with inputs where the AIs will be tempted to do bad things, so that we can do adversarial training” is probably worth including in the smorgasbord of techniques we use to try to prevent our AIs from being deceptive

Maybe I missed something here, but how is this supposed to help with deception? I thought the whole reason deceptive alignment is so hard to solve is that you can't tell from its behavior whether the AI is being deceptive.

Comment by Anthony DiGiovanni (antimonyanthony) on Why No *Interesting* Unaligned Singularity? · 2022-05-12T07:46:07.774Z · LW · GW

That all sounds fair. I've seen rationalists claim before that it's better for "interesting" things (in the literal sense) to exist than not, even if nothing sentient is interested in them, which is why I assumed you meant the same.