Gradations of Inner Alignment Obstacles 2021-04-20T22:18:18.394Z
Superrational Agents Kelly Bet Influence! 2021-04-16T22:08:18.201Z
A New Center? [Politics] [Wishful Thinking] 2021-04-12T15:19:35.430Z
My Current Take on Counterfactuals 2021-04-09T17:51:06.528Z
Reflective Bayesianism 2021-04-06T19:48:43.917Z
Affordances 2021-04-02T20:53:35.639Z
Voting-like mechanisms which address size of preferences? 2021-03-18T23:23:55.393Z
MetaPrompt: a tool for telling yourself what to do. 2021-03-16T20:49:19.693Z
Rigorous political science? 2021-03-12T15:30:53.837Z
Four Motivations for Learning Normativity 2021-03-11T20:13:40.175Z
Kelly *is* (just) about logarithmic utility 2021-03-01T20:02:08.300Z
"If You're Not a Holy Madman, You're Not Trying" 2021-02-28T18:56:19.560Z
Support vs Advice & Holding Off Solutions 2021-02-23T01:12:33.156Z
Calculating Kelly 2021-02-22T17:32:38.601Z
Mathematical Models of Progress? 2021-02-16T00:21:44.298Z
The Pointers Problem: Clarifications/Variations 2021-01-05T17:29:45.698Z
Debate Minus Factored Cognition 2020-12-29T22:59:19.641Z
Babble Challenge: Not-So-Future Coordination Tech 2020-12-21T16:48:20.515Z
Fusion and Equivocation in Korzybski's General Semantics 2020-12-21T05:44:41.064Z
Writing tools for tabooing? 2020-12-13T19:50:37.301Z
Mental Blinders from Working Within Systems 2020-12-10T19:09:50.720Z
Quick Thoughts on Immoral Mazes 2020-12-09T01:21:40.210Z
Number-guessing protocol? 2020-12-07T15:07:48.019Z
Recursive Quantilizers II 2020-12-02T15:26:30.138Z
Nash Score for Voting Techniques 2020-11-26T19:29:31.187Z
Deconstructing 321 Voting 2020-11-26T03:35:40.863Z
Normativity 2020-11-18T16:52:00.371Z
Thoughts on Voting Methods 2020-11-17T20:23:07.255Z
Signalling & Simulacra Level 3 2020-11-14T19:24:50.191Z
Learning Normativity: A Research Agenda 2020-11-11T21:59:41.053Z
Probability vs Likelihood 2020-11-10T21:28:03.934Z
Time Travel Markets for Intellectual Accounting 2020-11-09T16:58:44.276Z
Kelly Bet or Update? 2020-11-02T20:26:01.185Z
Generalize Kelly to Account for # Iterations? 2020-11-02T16:36:25.699Z
Dutch-Booking CDT: Revised Argument 2020-10-27T04:31:15.683Z
Top Time Travel Interventions? 2020-10-26T23:25:07.973Z
Babble & Prune Thoughts 2020-10-15T13:46:36.116Z
One hub, or many? 2020-10-04T16:58:40.800Z
Weird Things About Money 2020-10-03T17:13:48.772Z
"Zero Sum" is a misnomer. 2020-09-30T18:25:30.603Z
What Does "Signalling" Mean? 2020-09-16T21:19:00.968Z
Most Prisoner's Dilemmas are Stag Hunts; Most Stag Hunts are Schelling Problems 2020-09-14T22:13:01.236Z
Comparing Utilities 2020-09-14T20:56:15.088Z
[Link] Five Years and One Week of Less Wrong 2020-09-14T16:49:35.082Z
Social Capital Paradoxes 2020-09-10T18:44:18.291Z
abramdemski's Shortform 2020-09-10T17:55:38.663Z
Capturing Ideas 2020-09-09T21:20:23.049Z
Updates and additions to "Embedded Agency" 2020-08-29T04:22:25.556Z
The Bayesian Tyrant 2020-08-20T00:08:55.738Z
Radical Probabilism 2020-08-18T21:14:19.946Z


Comment by abramdemski on Updating the Lottery Ticket Hypothesis · 2021-04-20T21:28:30.177Z · LW · GW

Wait... so:

  1. The tangent-space hypothesis implies something close to "gd finds a solution if and only if there's already a dog detecting neuron" (for large networks, that is) -- specifically it seems to imply something pretty close to "there's already a feature", where "feature" means a linear combination of existing neurons within a single layer
  2. gd in fact trains NNs to recognize dogs
  3. Therefore, we're still in the territory of "there's already a dog detector"


Comment by abramdemski on The Zettelkasten Method · 2021-04-20T18:27:14.271Z · LW · GW

To respond to the thrust of the comment:

The euphoria drops off after some initial excitement.

I do have the sense that some ideas "naturally come out" better with zettelkasten-like notes, while others "naturally come out" better in other ways (eg more linear notes, or just scribbling math on scrap paper, or writing essays for public consumption). This may be something intrinsic to the ideas, or it may be based on my mood/etc. 

Comment by abramdemski on The Zettelkasten Method · 2021-04-20T18:19:40.863Z · LW · GW

Yeah, sorry for focusing on that rather than the thrust of your comment.

Comment by abramdemski on Updating the Lottery Ticket Hypothesis · 2021-04-20T17:58:42.503Z · LW · GW

Ah, I should have read comments more carefully.

I agree with your comment that the claim "I doubt that today’s neural networks already contain dog-recognizing subcircuits at initialization" is ambiguous -- "contains" can mean "under pruning".

Obviously, this is an important distinction: if the would-be dog-recognizing circuit behaves extremely differently due to intersection with a lot of other circuits, it could be much harder to find. But why is "a single neuron lighting up" where you draw the line?

It seems clear that at least some relaxation of that requirement is tenable. For example, if no one neuron lights up in the correct pattern, but there's a linear combination of neurons (before the output layer) which does, then it seems we're good to go: GD could find that pretty easily.

I guess this is where the tangent space model comes in; if in practice (for large networks) we stay in the tangent space, then a linear combination of neurons is basically exactly as much as we can relax your requirement.

But without the tangent-space hypothesis, it's unclear where to draw the line, and your claim that an existing neuron already behaving in the desired way is "what would be necessary for the lottery ticket intuition" isn't clear to me. (Is there a more obvious argument for this, according to you?)

Comment by abramdemski on Updating the Lottery Ticket Hypothesis · 2021-04-20T17:17:43.878Z · LW · GW

I was in a discussion yesterday that made it seem pretty plausible that you're wrong -- this paper suggests that the over-parameterization needed to ensure that some circuit is (approximately) present at the beginning is not that large.

function space is superexponentially large, circuit space is smaller but still superexponential, so no neural network is ever going to be large enough to have neurons which light up to match most functions/circuits.

I haven't actually read the paper I'm referencing, but my understanding is that this argument doesn't work out because the number of possible circuits of size N is balanced by the high number of subgraphs in a graph of size M (where M is only logarithmically larger than N).

That being said, I don't know whether "present at the beginning" is the same as "easily found by gradient descent".

Comment by abramdemski on Identifiability Problem for Superrational Decision Theories · 2021-04-20T16:50:20.688Z · LW · GW

Now I think the reasoning presented is correct in both cases, and the lesson here is for our expectations of rationality.

I agree that the reasoning is correct in both cases (or rather: could be correct, assuming some details), but the lesson I derive is that we have to be really careful about our assumptions here.

Normally, in game theory, we're comfortable asserting that re-labeling the options doesn't matter (and re-numbering the players also doesn't matter). But normally we aren't worried about anthropic uncertainty in a game.

If we suppose that players can see their numbers, as well, this can be used as a signal to break symmetry for anti-matching. Player 1 can choose option 1, and player 2 can choose option 2. (Or whatever -- they just have to agree on an anti-matching policy acausally.)

Thinking physically, the question is: are the two players physically precisely the same (including environment), at least insofar as the players can tell? Then anti-matching is hard. Usually we don't need to think about such things for game theory (since a game is a highly abstracted representation of the physical situation).

But this is one reason why correlated equilibria are, usually, a better abstraction than Nash equilibria. For example, a game of chicken is similar to anti-matching. In correlated equilibria, there is a "fair" solution to chicken: each player goes straight with 50% probability (and the other player swerves). This corresponds to the idea of a traffic light. If traffic lights were not invented, some other correlating signal from the environment might be used (particularly as we assume increasingly intelligent agents). This is a possible game-theoretic explanation for divination practices such as reading entrails.
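
To make the traffic-light point concrete, here is a minimal sketch that checks the obedience constraints of a correlated equilibrium for chicken. The payoff numbers and all names (`PAYOFF`, `TRAFFIC_LIGHT`, `is_correlated_equilibrium`) are assumptions chosen for illustration; only the ordering of the payoffs matters.

```python
STRAIGHT, SWERVE = "straight", "swerve"
ACTIONS = (STRAIGHT, SWERVE)

# Hypothetical payoffs (player 0, player 1); a crash is worst for both.
PAYOFF = {
    (STRAIGHT, STRAIGHT): (-10, -10),
    (STRAIGHT, SWERVE): (1, -1),
    (SWERVE, STRAIGHT): (-1, 1),
    (SWERVE, SWERVE): (0, 0),
}

# The "traffic light": a fair coin decides who goes straight.
TRAFFIC_LIGHT = {(STRAIGHT, SWERVE): 0.5, (SWERVE, STRAIGHT): 0.5}


def is_correlated_equilibrium(dist, payoff, tol=1e-12):
    """Return True if no player gains by disobeying any recommendation."""
    for player in (0, 1):
        for told in ACTIONS:
            for deviation in ACTIONS:
                gain = 0.0
                for profile, prob in dist.items():
                    if profile[player] != told:
                        continue
                    deviated = list(profile)
                    deviated[player] = deviation
                    gain += prob * (
                        payoff[tuple(deviated)][player] - payoff[profile][player]
                    )
                if gain > tol:
                    return False
    return True


def expected_payoffs(dist, payoff):
    """Expected payoff of each player under the signal distribution."""
    return tuple(
        sum(prob * payoff[profile][i] for profile, prob in dist.items())
        for i in (0, 1)
    )


print(is_correlated_equilibrium(TRAFFIC_LIGHT, PAYOFF))
print(expected_payoffs(TRAFFIC_LIGHT, PAYOFF))
```

Under these assumed payoffs, obeying the light is a best response to the other player obeying it, whereas a signal that always tells both players to go straight fails the check.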

Nash equilibria, otoh, are a better abstraction for the case where there truly is no "environment" to take complicated signals from (besides what you explicitly represent in the game). It better fits a way of thinking where models are supposed to be complete.

Comment by abramdemski on My Current Take on Counterfactuals · 2021-04-20T16:32:19.401Z · LW · GW

Actually I am rather skeptical/agnostic on this. For me it's fairly easy to picture that I have a "platonic" utility function, except that the time discount is dynamically inconsistent (not exponential).

I am in favor of exploring models of preferences which admit all sorts of uncertainty and/or dynamic inconsistency, but (i) it's up for debate how many degrees of freedom we need to allow there and (ii) I feel that the case that logical induction is the right framework for this is kinda weak (but maybe I'm missing something).


It's clear that you understand logical induction pretty well, so while I feel like you're missing something, I'm not clear on what that could be.

I think maybe the more fruitful branch of this conversation (as opposed to me trying to provide an instrumental justification for radical probabilism, though I'm still interested in that) is the question of describing the human utility function.

The logical induction picture isn't strictly at odds with a platonic utility function, I think, since we can consider the limit. (I only claim that this isn't the best way to think about it in general, since Nature didn't decide a platonic utility function for us and then design us such that our reasoning has the appropriate limit.)

For example, one case which to my mind argues in favor of the logical induction approach to preferences: the procrastination paradox. All you want to do is ensure that the button is pressed at some point. This isn't a particularly complex or unrealistic preference for an agent to have. Yet, it's unclear how to make computable beliefs think about this appropriately. Logical induction provides a theory about how to think about this kind of goal. (I haven't thought much about how TRL would handle it.)

Agree or disagree: agents can sensibly pursue  objectives? And, do you think that question is cruxy for you?

Comment by abramdemski on My Current Take on Counterfactuals · 2021-04-20T16:20:55.756Z · LW · GW

To further elaborate, this post discusses ways a Bayesian might pragmatically prefer non-Bayesian updates. Some of them don't carry over, for sure, but I expect the general idea to translate: InfraBayesians need some unrealistic assumptions to reflectively justify the InfraBayesian update in contrast to other updates. (But I am not sure which assumptions to point out, atm.)

Comment by abramdemski on My Current Take on Counterfactuals · 2021-04-19T17:18:26.154Z · LW · GW

I'm not convinced this is the right desideratum for that purpose. Why should we care about exploitability by traders if making such trades is not actually possible given the environment and the utility function? IMO epistemic rationality is subservient to instrumental rationality, so our desiderata should be derived from the latter.

So, one point is that the InfraBayes picture still gives epistemics an important role: the kind of guarantee arrived at is a guarantee that you won't do too much worse than the most useful partial model expects. So, we can think about generalized partial models which update by thinking longer in addition to taking in sense-data. 

I suppose TRL can model this by observing what those computations would say, in a given situation, and using partial models which only "trust computation X" rather than having any content of their own. Is this "complete" in an appropriate sense? Can we always model a would-be radical-infrabayesian as a TRL agent observing what that radical-infrabayesian would think?

Even if true, there may be a significant computational complexity gap between just doing the thing vs modeling it in this way.

Comment by abramdemski on Fun with +12 OOMs of Compute · 2021-04-18T13:30:18.836Z · LW · GW

So, how does the update to the AI and compute trend factor in?

Comment by abramdemski on Fun with +12 OOMs of Compute · 2021-04-18T13:22:16.492Z · LW · GW

Arguably this is what happened with LSTMs?

Is there a reference for this?

Comment by abramdemski on Superrational Agents Kelly Bet Influence! · 2021-04-18T13:08:54.315Z · LW · GW

I also looked into this after that discussion. At the time I thought that this might have been something special about Kelly, but when I did some calculations afterwards I found that I couldn't get this to work in the other direction.

I'm not sure what you mean here. What is "this" in "looked into this" -- Critch's theorem? What is "the other direction"?

Everything you've written (as I currently understand it) also applies for many other betting strategies. eg if everyone was betting (the same constant) fractional Kelly.

Specifically the market will clear at the same price (weighted average probability) and "everyone who put money on the winning side picks up a fraction of money proportional to the fraction they originally contributed to that side". 

It seems obvious to me that the market will clear at the same price if everyone is using the same fractional Kelly, but if people are using different Kelly fractions, the weighted sum would be correspondingly skewed, right? Anyway, that's not really important here...

The important thing for the connection to Critch's theorem is: the total wealth gets adjusted like Bayes' Law. Other betting strategies may not have this property; for example, fractional Kelly means losers lose less, and winners win less. This doesn't limit us to exactly Kelly (for example, the bet-against-yourself strategy in the post also has the desired property); however, all such strategies must be equivalent to Kelly in terms of the payoffs (otherwise, they wouldn't be equivalent to Bayes in terms of the updates!).

For example, if everyone uses fractional Kelly with the same fraction, then on the first round of betting, the market clears with all the right prices, since everyone is just scaling down how much they bet. However, the subsequent decisions will then get messed up, because everyone has the wrong weights (the weights changed less than they should have).
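
A minimal sketch of this, assuming a single binary event and modeling a Kelly bettor with belief p at price q as multiplying their wealth by p/q if the event happens and by (1-p)/(1-q) otherwise. The function names and the example numbers are hypothetical; fractional Kelly is modeled as staking only a fraction of wealth in the same way and keeping the rest as cash.

```python
def market_price(wealth, beliefs):
    """Clearing price: the wealth-weighted average probability."""
    return sum(w * p for w, p in zip(wealth, beliefs)) / sum(wealth)


def settle(wealth, beliefs, outcome, fraction=1.0):
    """Wealth after the bet resolves; fraction=1.0 is full Kelly."""
    q = market_price(wealth, beliefs)
    settled = []
    for w, p in zip(wealth, beliefs):
        ratio = p / q if outcome else (1 - p) / (1 - q)
        # The staked part is rescaled by the ratio; the rest is untouched.
        settled.append(w * ((1 - fraction) + fraction * ratio))
    return settled


def bayes_posterior(weights, beliefs, outcome):
    """Bayes' Law over hypotheses, with weights as the prior."""
    likelihoods = [p if outcome else 1 - p for p in beliefs]
    z = sum(w * l for w, l in zip(weights, likelihoods))
    return [w * l / z for w, l in zip(weights, likelihoods)]


wealth = [0.5, 0.3, 0.2]    # assumed initial wealth shares (sum to 1)
beliefs = [0.9, 0.5, 0.2]   # assumed probabilities for the event

full_kelly = settle(wealth, beliefs, outcome=True)
half_kelly = settle(wealth, beliefs, outcome=True, fraction=0.5)
posterior = bayes_posterior(wealth, beliefs, outcome=True)

print(full_kelly)
print(posterior)
print(half_kelly)
```

With full Kelly the settled wealth matches the Bayesian posterior over the bettors. With half Kelly the total is still conserved, but each weight moves only part of the way from the prior toward the posterior.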

Comment by abramdemski on Superrational Agents Kelly Bet Influence! · 2021-04-18T12:56:26.083Z · LW · GW

Can you justify Kelly "directly" in terms of Pareto-improvement trades rather than "indirectly" through Pareto-optimality? I feel this gets at the distinction between the selfish vs altruistic view.

Comment by abramdemski on My Current Take on Counterfactuals · 2021-04-16T18:14:57.309Z · LW · GW

I'm not convinced this is the right desideratum for that purpose. Why should we care about exploitability by traders if making such trades is not actually possible given the environment and the utility function? IMO epistemic rationality is subservient to instrumental rationality, so our desiderata should be derived from the latter.

This does make sense to me, and I view it as a weakness of the idea. However, the productivity of dutch-book type thinking in terms of implying properties which seem appealing for other reasons speaks heavily in favor of it, in my mind. A formal connection to more pragmatic criteria would be great.

But also, maybe I can articulate a radical-probabilist position without any recourse to dutch books... I'll have to think more about that. 

Actually I am rather skeptical/agnostic on this. For me it's fairly easy to picture that I have a "platonic" utility function, except that the time discount is dynamically inconsistent (not exponential).

I'm not sure how to double crux with this intuition, unfortunately. When I imagine the perspective you describe, I feel like it's rolling all dynamic inconsistency into time-preference and ignoring the role of deliberation. 

My claim is that there is a type of change-over-time which is due to boundedness, and which looks like "dynamic inconsistency" from a classical bayesian perspective, but which isn't inherently dynamically inconsistent. EG, if you "sleep on it" and wake up with a different, firmer-feeling perspective, without any articulable thing you updated on. (My point isn't to dogmatically insist that you haven't updated on anything, but rather, to point out that it's useful to have the perspective where we don't need to suppose there was evidence which justifies the update as Bayesian, in order for it to be rational.)

Comment by abramdemski on "Taking your environment as object" vs "Being subject to your environment" · 2021-04-16T17:50:55.652Z · LW · GW

I think this post mostly intends to focus on the "imagine you're a space alien; see the world with fresh eyes" type thing, which Feynman talks about. I see this as a tool for transitioning from cook mindset to chef mindset.

But I already do that all the time. The valuable thing I got out of the post was the more local version, where you think about your city/job/diet/etc.

Comment by abramdemski on Hell is wasted on the evil · 2021-04-16T17:38:24.377Z · LW · GW

Opportunities to do the most good are.

I guess it's necessary to get a bit technical to explain what I mean by that. I do not mean that the number of maximally-good things is small; that is true, but will be true in most environments.

What I mean is that the distribution has a crazy variance (possibly no finite variance); take two "opportunities to do good" and compare them to each other, and an orders-of-magnitude difference is not rare.

The water-in-the-desert analogy really falls apart at that point. It's more like an investor looking for a good startup to invest in; successful startups aren't that rare, but the quality varies immensely; you'd much much much prefer to invest in "the next Google/Uber/etc" rather than the next [insert some company from 2010 which made a good profit but which you and I have never heard of].

Comment by abramdemski on "Taking your environment as object" vs "Being subject to your environment" · 2021-04-16T17:28:35.595Z · LW · GW

Excellent post.

I could easily spin either a narrative where I'm totally missing this or totally good at it. I'm a bit confused at that.

I have a very strong identity and continuity over time. I recently re-read the SSC review of House of God -- that's the post where Scott talks about how he recalls high school fondly despite knowing he felt differently about it at the time (and mentions that as a teen he was surrounded by adults who recalled their own high school experiences fondly). I don't seem to have any delusions about high school: I remember that at the time, I wanted to write a manifesto about why it sucks so much (modeled after the communist manifesto). My recollection remains consistent with that.

My opinion about living in cities remains the same over time; no difference of opinion upon leaving.

One possibility is that I never break out of my narrative. This is consistent with how consistent my identity tends to be over time. 

The other possibility is that I "pre-break-out" of my narrative. This is consistent with the feeling of unease I have about following some social scripts, especially ones which "balance cons with pros" like you mention. It's also more consistent with the fact that I occasionally do explicit thought experiments about leaving places.

But I think my thought experiments are "abstract" rather than "concrete" in some sense, like taking the outside view, what a person hearing the story might say. Reading your post made me try the inside view version, which felt much different.  

It's plausible that a lot of my thinking about cities, high school, etc, was and remains "outside-view" in this way, and my resistance to participate in narratives such as high-school-was-great is actually based on outside-view thinking ("that sounds like something someone might say regardless of whether it was true").

So, yeah, my new narrative is that I have a really hardcore outside-view version of this thing, which I execute kind of constantly, but was lacking the inside-view version of the thing, which your post somehow prompted me to try.

Comment by abramdemski on The Zettelkasten Method · 2021-04-16T15:40:16.324Z · LW · GW

Not sure exactly what you mean about growth mindset, but I would remind you that even in SSC's original article where he made a comment about it maybe-possibly being up next for the replication-crisis chopping-block, Scott Alexander pointed out how well-supported the theory seemed in experimental results so far. The proofoflogic post on growth mindset reviews some of this, including the causal model of how growth mindset works, which Carol Dweck validated piece-by-piece and which is a priori extremely plausible based on other known psychological phenomena including self-handicapping and learned helplessness.

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-15T16:07:41.836Z · LW · GW

I put it to you that the most natural fit for what you are proposing is a new political party which chooses not to put candidates on the ballot.

It does seem necessary to settle the terminology better; I agree that the terms I've been inconsistently using so far seem inadequate (voting bloc, platform, movement, group, ...?). I'm still not convinced "party" is the best term. But I have some sympathy for your points.

I would much prefer that people call the group "the new center" or "neocentrists" or whatever, as opposed to "the new center party", "the moderate party", etc.

Alas, running (or even starting) a party/whatever sounds incredibly time consuming. :(

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-14T15:31:03.688Z · LW · GW

Here is another possible solution (which might be bad in other respects):

Maybe a formal declaration of membership only serves to increase the visibility of the group (by boosting numbers on their website). The actual position on issues cannot be "influenced". Instead, the New Center platform performs empirical surveys of the general population to find issues on which there is broad agreement.

Or: official bloc membership might get you a voice in determining which issues get put on the surveys. But ultimately the surveys determine the New Center position.

This would make it difficult to take over the New Center and make it a mouthpiece for non-moderates (albeit not impossible).

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-14T01:47:50.820Z · LW · GW

That's an empirical question!

See my refined proposal.

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-14T01:41:54.347Z · LW · GW

Some of the other comments have reminded me of your linkpost about digital democracy. Specifically, the idea of seeking surprising agreement which was mentioned.

In the OP, I posited that "the new center" should have a strong, simple set of issues, pre-selected to cater to people who are sick of both sides. But I think Stuart Anderson is right: it shouldn't focus so much on the battle between the two sides; it should focus on the surprising commonality between people.

As Steven Byrnes mentioned, swing voters aren't exactly moderate; rather, they tend to have extreme views which don't fit within existing party lines. The article Byrnes linked to also points out that the consensus within party elites of both parties is very different from the consensus within the party base.

I find myself forming the hypothesis that politicians have a tendency to over-focus on divisive issues, and miss some issues on which there is broad agreement. (This would be an interesting question to investigate, if someone really did a feasibility study on the whole idea.)

My new suggestion for the new-center platform would be, rather than distilling complaints about both sides, seek surprising agreement in the way mentioned in that podcast you linked.

The proposal would be something like this:

  • You register with The New Center platform. This involves "signing" a non-binding agreement to vote according to the New Center recommendations.
    • I'm imagining that you're never asked to promise to vote a specific way, but rather, you are asked to affirm that you agree with the argument that making such a commitment would increase your voting power overall. (Mostly because something feels shady to me about actually making people promise to vote a specific way.)
  • The platform crowdsources issues, and aggregates New Center opinions on those issues, looking for issues where there is broad agreement.
    • This might be done by something like quadratic voting, letting people spend points to indicate how much they care about an issue, so that you get information on the strength of preferences rather than only their existence.
  • The platform publicizes the issues on which there is broad agreement. The main purpose of this is so that politicians know the issues on which they will be judged. A secondary purpose is to attract new people to the New Center platform, if the current New Center consensus resonates with them.
  • Finally, the platform rates political candidates on the consensus criteria, and makes recommendations on that basis. (This is probably also done in a democratized way.)
    • It could also be interesting to keep track of New Center's preferences wrt bills being voted on in legislative bodies, and keep track of politicians' records in terms of voting with New Center. Politicians with a record of voting according to New Center recommendations should be rewarded by the system, even if their voting record goes against what's now the consensus of New Center, because (in the long term) the hope is that some politicians end up deferring to New Center's opinions (at least some of the time). So you want to avoid punishing that behavior just because New Center flip-flops on an issue. However, that's a corner case which may not be that important (because hopefully, New Center finds issues with broad appeal on which there isn't so much flip-flop in public opinion over time).
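
The "something like quadratic voting" step in the list above could be sketched roughly as follows. This is only an illustration under assumed rules (each member gets a fixed budget of points, and casting v votes on an issue costs v squared points); the function name, issue names, and budget are hypothetical, not part of the proposal.

```python
def tally_quadratic_votes(ballots, budget):
    """Sum signed votes per issue, charging votes**2 points per issue.

    ballots maps each member to a dict of {issue: signed vote count}.
    """
    totals = {}
    for member, ballot in ballots.items():
        spent = sum(votes ** 2 for votes in ballot.values())
        if spent > budget:
            raise ValueError(f"{member} spent {spent} points, budget is {budget}")
        for issue, votes in ballot.items():
            totals[issue] = totals.get(issue, 0) + votes
    return totals


# Hypothetical ballots: positive votes support an issue, negative oppose it.
ballots = {
    "ann": {"issue_a": 3, "issue_b": -1},   # costs 9 + 1 = 10 points
    "bob": {"issue_a": 1, "issue_b": 2},    # costs 1 + 4 = 5 points
}

print(tally_quadratic_votes(ballots, budget=10))
```

Because each extra vote on the same issue costs more than the last, the totals carry some information about how strongly members care, rather than only which side they are on.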

Comment by abramdemski on Dutch-Booking CDT: Revised Argument · 2021-04-13T18:35:35.897Z · LW · GW

Hmm, on further reflection, I had an effect in mind which doesn't necessarily break your argument, but which increases the degree to which other counterarguments such as AlexMennen's break your argument. This effect isn't necessarily solved by multiplying the contract payoff (since decisions aren't necessarily continuous as a function of utilities), but it may under many circumstances be approximately solved by it. So maybe it doesn't matter so much, at least until AlexMennen's points are addressed so I can see where it fits in with that.


Comment by abramdemski on Dutch-Booking CDT: Revised Argument · 2021-04-13T18:34:31.360Z · LW · GW

OK, here's my position.

As I said in the post, the real answer is that this argument simply does not apply if the agent knows its action. More generally: the argument applies precisely to those actions to which the agent ascribes positive probability (directly before deciding). So, it is possible for agents to maintain a difference between counterfactual and evidential expectations. However, I think it's rarely normatively correct for an agent to be in such a position.

Even though the decision procedure of CDT is deterministic, this does not mean that agents described by CDT know what they will do in the future. We can think of this in terms of logical induction: the market is not 100% certain of its own beliefs, and in particular, doesn't typically know precisely what the maximum-expectation-action is.

One way of seeing the importance of this is to point out that CDT is a normative theory, not a descriptive one. CDT is supposed to tell you what arbitrary agents should do. The recommendations are supposed to apply even to, say, epsilon-exploring agents (who are not described by CDT, strictly speaking). But here we see that CDT recommends being dutch-booked! Therefore, CDT is not a very good normative theory, at least for epsilon-explorers. (So I'm addressing your epsilon-exploration example by differentiating between the agent's algorithm and the CDT decision theory. The agent isn't dutch-booked, but CDT recommends a dutch book.)

Granted, we could argue via dutch book that agents should know their own actions, if those actions are deterministic consequences of a known agent-architecture. However, theories of logical uncertainty tell us that this is not (always) realistic. In particular, we can adapt the bounded-resource-dutch-book idea from logical induction. According to this idea, some dutch-book-ability is OK, but agents should not be boundlessly exploitable by resource-bounded bookies.

This idea leads me to think that efficiently computable sequences of actions, which continue to have probability bounded away from zero (just before the decision), should have CDT expectations which converge to EDT expectations.

(Probably there's a stronger version, based on density-zero exploration type intuitions, where we can reach this conclusion even if the probability is not bounded away from zero, because the total probability is still unbounded.)

One conjecture which was supposed to be communicated by my more recent post was: in learnable environments, this will amount to: all counterfactual expectations converge to evidential expectations (provided the agent is sufficiently farsighted). For example, if the agent knows the environment is trap-free, then when counterfactual and evidential hypotheses continue to severely differ for some (efficiently enumerable) sequence of actions, then there will be a hypothesis which says "the evidential expectations are actually correct". The agent will want to check that hypothesis, because the VOI of significantly updating its counterfactual expectations is high. Therefore, these actions will not become sufficiently rare (unless the evidential and counterfactual expectations do indeed converge).

In other words, the divergence between evidential and counterfactual expectations is itself a reason why the action probability should be high, provided that the agent is not shortsighted and doesn't expect the action to be a trap.

If the agent is shortsighted and/or expects traps, then it normatively should not learn anyway (at least, not by deliberate exploration steps). In that case, counterfactual and evidential expectations may forever differ. OTOH, in that case, there's no reason to expect evidential expectations to be well-informed, so it kind of makes sense that the agent has little motive to adjust its counterfactual expectations towards them.

(But I'll still give the agent a skeptical look when it asserts that the two differ, since I know that highly informed positions never look like this. The belief that the two differ seems "potentially rational but never defensible", if that makes sense. I'm tempted to bake the counterfactual/evidential equivalence into the prior, on the general principle that priors should not contain possibilities which we know will be eliminated if sufficient evidence comes in. Yet, doing so might make us vulnerable to Troll Bridge.)

Comment by abramdemski on Dutch-Booking CDT: Revised Argument · 2021-04-13T17:25:24.593Z · LW · GW

I disagree, I don't think it's a simple binary thing. I don't think Dutch book arguments in general never apply to recursive things, but it's more just that the recursion needs to be modelled in some way, and since your OP didn't do that, I ended up finding the argument confusing.

But what does that look like? How should it make a difference? (This isn't a rhetorical question; I would be interested in a positive position. My lack of interest is, significantly, due to a lack of positive positions in this direction.)

I don't think your argument goes through for the imp, since it never needs to decide its action, and therefore the second part of selling the contract back never comes up?

Ah, true, but the imp will necessarily just make EDT-type predictions anyway. So the imp argument reaches a similar conclusion.

But I'm not claiming the imp argument is very strong in any case, it's just an intuition pump.

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-13T17:15:59.677Z · LW · GW

Totally agree.

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-13T17:15:15.658Z · LW · GW

Select whoever defected least.

An important mechanism for avoiding this failure mode would be to encourage new-centrists to be involved in political primaries.

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-13T17:11:37.110Z · LW · GW

This comment makes me want to reiterate that I am not proposing a new party. A new party needs more than 1/3rd of voters, at least regionally, in order to be viable (that is, in order to avoid shooting itself in the foot by causing its base to waste votes). I agree that splitting an existing party is mostly the only way a new centrist party could happen.

Instead, the proposal is to organize a legible voting bloc. More like "environmentalists" than "the green party".

The fact that new parties empirically can pop up in the middle is, however, encouraging.

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-13T17:03:34.535Z · LW · GW

I think voting reform is highly implausible, because "voting reform" has come to mean instant runoff voting, which is barely better (and probably much worse for political polarization in particular, due to the center-squeeze problem).

Not to say that "a new center" is really plausible, though ;p

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-13T17:00:23.275Z · LW · GW

Interesting model, thanks!

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-13T16:57:55.452Z · LW · GW

I don't have time for this. Do you? Is/should it be a priority? I have other ideas which may or may not make it more probable (which I excluded from the post out of an abundance of caution).

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-13T16:43:36.714Z · LW · GW

Interesting reference!

Yeah, I agree. The fantasy is to create a "swing voter bloc", with a formalized (fairly objective) declaration about how to pander. People who don't feel represented by either party, or by other blocs, can increase their voice by joining the bloc (provided they feel it could represent them, of course).

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-13T16:39:58.676Z · LW · GW

Hopefully a positive one.

Comment by abramdemski on Ranked Choice Voting is Arbitrarily Bad · 2021-04-13T16:39:19.794Z · LW · GW

I would personally greatly prefer you use the name "instant runoff" rather than "ranked choice".

  • "instant runoff" is descriptive of what it's actually doing, if the listener is familiar with runoff elections.
  • There are many other voting methods which rank choices. Calling it "ranked choice" seems to marginalize this entire category of voting methods. For example, Borda count is a much older voting method which ranks choices.

This isn't just my opinion; I was convinced by Jameson Quinn:

IRV (Instant runoff voting), aka Alternative Vote or RCV (Ranked Choice Voting... I hate that name, which deliberately appropriates the entire "ranked" category for this one specific method)

Would you be willing to edit the article to change the term?

Alice is the clear intuitive choice, but according to plurality rules, Dave ends up winning despite being despised by 70% of the electorate.

Sorry for focusing on nitpicks, but: I generally prefer writers taboo phrases like "clear intuitive choice" for voting theory. I have few intuitions when I look at a set of ordinal preferences like this! It seems like voting theorists do have such intuitions, and tend to heavily assume their readers do, too. My working hypothesis is that any time voting theorists invoke the concept of "intuitive winner" in an election, they actually could (with sufficient introspection) name a specific property which this "intuitive winner" has, and could argue for the appeal of this property. This would better communicate the intuition to the rest of us, and potentially, clarify arguments and positions for the voting theorist as well.

Each cohort knows that Carol is not a realistic threat to their preferred candidate, and will thus rank her second, while ranking their true second choice last. For any individual, this is a good strategy to maximizing the odds of their preferred candidate, but in aggregate, it leads to [...] a victory for Carol, even though she was universally despised.

As others have pointed out, this does not seem true. I believed you for a few days but then noticed my confusion when I tried to explain it to someone else. I have downvoted the post as a result of this severe error, but would upvote it again if corrected.

Comment by abramdemski on My Current Take on Counterfactuals · 2021-04-13T16:09:45.584Z · LW · GW

Ah, but there is a sense in which it doesn't. The radical update rule is equivalent to updating on "secret evidence". And in TRL we have such secret evidence. Namely, if we only look at the agent's beliefs about "physics" (the environment), then they would be updated radically, because of secret evidence from "mathematics" (computations).

I agree that radical probabilism can be thought of as bayesian-with-a-side-channel, but it's nice to have a more general characterization where the side channel is black-box, rather than an explicit side-channel which we explicitly update on. This gives us a picture of the space of rational updates. EG, the logical induction criterion allows for a large space of things to count as rational. We get to argue for constraints on rational behavior by pointing to the existence of traders which enforce those constraints, while being agnostic about what's going on inside a logical inductor. So we have this nice picture, where rationality is characterized by non-exploitability wrt a specific class of potential exploiters.

Here's an argument for why this is an important dimension to consider: 

  1. Human value-uncertainty is not particularly well-captured by Bayesian uncertainty, as I imagine you'll agree. One particular complaint is realizability: we have no particular reason to assume that human preferences are within any particular space of hypotheses we can write down.
  2. One aspect of this can be captured by InfraBayes: it allows us to eliminate the realizability assumption, instead only assuming that human preferences fall within some set of constraints which we can describe.
  3. However, there is another aspect to human preference-uncertainty: human preferences change over time. Some of this is irrational, but some of it is legitimate philosophical deliberation.
  4. And, somewhat in the spirit of logical induction, humans do tend to eventually address the most egregious irrationalities.
  5. Therefore, I tend to think that toy models of alignment (such as CIRL, DRL, DIRL) should model the human as a radical probabilist; not because it's a perfect model, but because it constitutes a major incremental improvement wrt modeling what kind of uncertainty humans have over our own preferences.

Recognizing preferences as a thing which naturally changes over time seems, to me, to take a lot of the mystery out of human preference uncertainty. It's hard to picture that I have some true platonic utility function. It's much easier to interpret myself as having some preferences right now (which I still have uncertainty about, but which I have some introspective access to), while also being the kind of entity who shifts preferences over time, and mostly in a way which I myself endorse. In some sense you can see me as converging to a true utility function; however, this "true utility function" is a (non-constructive) consequence of my process of deliberation, and the process of deliberation takes a primary role.

I recognize that this isn't exactly the same perspective captured by my first reply.

Comment by abramdemski on Dutch-Booking CDT: Revised Argument · 2021-04-12T20:28:13.735Z · LW · GW

Isn't your Dutch-book argument more recursive than standard ones? Your contract only pays out if you act, so the value of the dutch book causally depends on the action you choose.

Sure, do you think that's a concern? I was noting the similarity in this particular respect (pretending that bets are independent of everything), not in all respects.

Note, in particular, that traditional dutch book arguments make no explicit assumption one way or the other about whether the propositions have to do with actions under the agent's control. So I see two possible interpretations of traditional Dutch books:

  1. They apply to "recursive" stuff, such as things you have some influence over. For example, I can bet on a presidential election, even though I can also vote in a presidential election. In this case, what we have here is not weirder. This is the position I prefer.
  2. They can't apply to "recursive" stuff. In this case, presumably we don't think standard probability theory applies to stuff we have influence over. This could be a respectable position, and I've seen it discussed. However, I don't buy it. I've seen philosophers answer this kind of thing with the following argument: what if you had a little imp on your shoulder, who didn't influence you in any way but who watched you and formed predictions? The imp could have probabilistic beliefs about your actions. The standard dutch book arguments would apply to the imp. Why should you be in such a different position from the imp? 

How do you make the payoff small?

For example, multiply the contract payoff by 0.001. 

Think of it this way. Making bets about your actions (or things influenced by your actions) can change your behavior. But if you keep the bets small enough, then you shouldn't change your behavior; the bets are less important than other issues. (Unless two actions are exactly tied, in terms of other issues.)
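
To make the scaling point concrete, here is a minimal sketch. The utilities, the payoff, and the helper name `best_action` are all hypothetical, chosen only for illustration; nothing here is taken from the original post.

```python
def best_action(base_utility, bet_payoff, scale):
    """Return the action maximizing base utility plus a scaled side-bet payoff."""
    totals = {
        action: base_utility[action] + scale * bet_payoff.get(action, 0.0)
        for action in base_utility
    }
    return max(totals, key=totals.get)

# Hypothetical numbers: action "a" is slightly better on the merits,
# and the side bet pays out only if the agent takes action "b".
base_utility = {"a": 1.0, "b": 0.9}
bet_payoff = {"b": 5.0}

full_size = best_action(base_utility, bet_payoff, scale=1.0)     # bet dominates
small_size = best_action(base_utility, bet_payoff, scale=0.001)  # bet is negligible

# The exact-tie caveat: if the other issues are perfectly balanced,
# even a tiny bet decides the action.
tied = best_action({"a": 1.0, "b": 1.0}, bet_payoff, scale=0.001)
```

With these numbers the full-size bet flips the choice to "b", the scaled-down bet leaves the choice at "a", and the tied case goes to "b" no matter how small the bet is.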

I will concede that this isn't 100% convincing. Perhaps different laws of probability should apply to actions we can influence. OTOH, I'm not sure what laws those would be.

Comment by abramdemski on Dutch-Booking CDT: Revised Argument · 2021-04-12T16:58:35.494Z · LW · GW

I thought about these things in writing this, but I'll have to think about them again before making a full reply.

We could modify the epsilon exploration assumption so that the agent also chooses between  and  even while its top choice is . That is, there's a lower bound on the probability with which the agent takes an action in , but even if that bound is achieved, the agent still has some flexibility in distributing probability between  and .

Another similar scenario would be: we assume the probability of an action is small if it's sub-optimal, but smaller the worse it is.
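
One standard rule with this shape is Boltzmann (softmax) exploration. The sketch below is only an assumption about what such a scenario could look like, not the rule used in the post; the function name, the temperature, and the action values are hypothetical.

```python
import math

def exploration_probs(values, temperature=0.1):
    """Softmax over estimated action values: worse actions get smaller probability."""
    best = max(values.values())
    # Subtracting the best value keeps the exponentials numerically stable.
    weights = {a: math.exp((v - best) / temperature) for a, v in values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

# Hypothetical estimated values for three actions.
probs = exploration_probs({"a": 1.0, "b": 0.8, "c": 0.2})
```

Every action keeps a nonzero probability, but unlike plain epsilon exploration there is no fixed lower bound: the probability shrinks exponentially with how far an action falls behind the best one.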

Comment by abramdemski on A New Center? [Politics] [Wishful Thinking] · 2021-04-12T16:53:36.230Z · LW · GW

1a. The proposal here is not to get rid of the two-party system, but rather, to reduce polarization. My view here is that polarization is harmful.

1b. The proposal attempts to work within the two-party system, rather than create a true third party.

1c. Why do you think a two-party system has to do with a strong executive? Mathematical arguments suggest that plurality voting eventually results in a two-party system, because you're usually wasting your vote if you vote for anyone other than the two candidates with the highest probability of winning. Similarly, mathematical arguments suggest that instant runoff voting will eventually result in a two-party system, because out of the top three candidates, the most moderate will often be "squeezed out" (instant runoff voting isn't very kind to compromise candidates). Other voting methods are much more mathematically favorable to multi-party systems. Therefore I tend to assume that the voting method is the culprit. However, abstract arguments like this don't necessarily reflect reality, so I'm open to the idea that a strong executive is the real culprit. But why do you think this?
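
The "squeezed out" effect can be seen in a small made-up electorate. The ballots below are hypothetical (a left candidate L, a center candidate C, and a right candidate R), and the functions are a rough sketch which ignores tie-breaking and assumes every ballot ranks every candidate.

```python
from collections import Counter

# Each entry is (ranking from most to least preferred, number of voters).
ballots = [
    (("L", "C", "R"), 40),
    (("R", "C", "L"), 35),
    (("C", "L", "R"), 25),
]

def plurality_winner(ballots):
    """Count first choices only."""
    tally = Counter()
    for ranking, count in ballots:
        tally[ranking[0]] += count
    return tally.most_common(1)[0][0]

def irv_winner(ballots):
    """Instant runoff: repeatedly eliminate the candidate with the fewest top votes."""
    remaining = {c for ranking, _ in ballots for c in ranking}
    while True:
        tally = Counter({c: 0 for c in remaining})
        for ranking, count in ballots:
            top = next(c for c in ranking if c in remaining)
            tally[top] += count
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > sum(tally.values()) or len(remaining) == 1:
            return leader
        remaining.remove(min(tally, key=tally.get))

def beats_head_to_head(ballots, a, b):
    """True if more voters rank a above b than b above a."""
    a_votes = sum(n for r, n in ballots if r.index(a) < r.index(b))
    b_votes = sum(n for r, n in ballots if r.index(b) < r.index(a))
    return a_votes > b_votes

plurality = plurality_winner(ballots)
irv = irv_winner(ballots)
center_beats_left = beats_head_to_head(ballots, "C", "L")
center_beats_right = beats_head_to_head(ballots, "C", "R")
```

In this electorate C would beat either rival one-on-one, yet C has the fewest first-choice votes, so C is eliminated in the first instant-runoff round and L wins under both plurality and instant runoff.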

1d. What happened in Germany?

2a. Gun control and immigration preferences differ a lot between the two parties. Recently, preferences about police funding are very different. I think budgetary differences are large. I believe there are many other issues. I have seen graphs illustrating that the increasing political polarization can be seen rather vividly by only looking at how politicians vote (IE it's gotten much easier to predict party affiliation from what legislation a politician supports). Also, similar graphs for voters (IE it's gotten much easier to separate Republicans and Democrats based on survey questions). 

2b. But you're right, policy questions are not really the main driver of polarization or of my personal perception of polarization, or even of my wish to reduce polarization. Rather, identity politics (the pressure to identify with one side or the other) is the main driver of all three. My wish for a "new center" is a wish for a (widely recognized) tribal affiliation which offers an alternative, and a "return to sanity" in the media resulting from this. (The point of the "kingmaker" mechanism is to incentivize rhetoric from both sides to be less extreme.)

Comment by abramdemski on Dutch-Booking CDT: Revised Argument · 2021-04-12T16:14:47.717Z · LW · GW

I agree with this, but I was assuming the CDT agent doesn't think buying B will influence the later decision. This, again, seems plausible if the payoff is made sufficiently small. I believe that there are some other points in my proof which make similar assumptions, which would ideally be made clearer in a more formal write-up.

However, I think CDT advocates will not generally take this to be a sticking point. The structure of my argument is to take a pre-existing scenario, and then add bets. For my argument to work, the bets need to be "independent" of critical things (causally and/or evidentially independent) -- in the example you point out, the action taken later needs to be causally independent of the bet made earlier (more specifically, causal-conditioning on the bet should not change beliefs about what action will be taken).

This is actually very similar to traditional Dutch-book arguments, which treat the bets as totally independent of everything. I could argue that it's just part of the thought experiment; if you concede that there could be a scenario like that, then you concede that CDT gets dutch-booked.
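
For readers unfamiliar with the traditional setup, here is a minimal sketch of a textbook Dutch book against incoherent prices. It illustrates the standard argument only, not the CDT construction from the post; the prices and the function name are hypothetical.

```python
def agent_net_payoff(prices, outcome):
    """Agent buys a $1 contract on each event at its own stated price."""
    cost = sum(prices.values())
    payout = sum(1.0 for event in prices if event == outcome)
    return payout - cost

# Hypothetical incoherent beliefs: the prices for A and not-A sum to 1.2.
prices = {"A": 0.6, "not A": 0.6}

# Exactly one of the two events happens, so exactly one contract pays out.
net_by_outcome = {outcome: agent_net_payoff(prices, outcome) for outcome in prices}
```

The agent loses about 0.2 whichever way A turns out. The argument treats the bets as separate from everything else: nothing about buying the contracts changes whether A happens.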

If you don't buy that, but you do buy Dutch Books as a methodology more generally, then I think you have to claim there's some rule which forbids "situations like this" (so CDT has to think the bets are not independent of everything else, in such a way as to spoil my argument). I would be very interested if you could propose a sensible view like this. However, I think not: there doesn't seem to be anything about the scenario which violates some principle of causality or rationality. If you forbid scenarios like this, you seem to be forbidding a very reasonable scenario, for no good reason (other than to save CDT).

Comment by abramdemski on My Current Take on Counterfactuals · 2021-04-12T15:23:17.262Z · LW · GW

Now I feel like I should have phrased it more modestly, since it's really "settled modulo math working out", even though I feel fairly confident some version of the math should work out.

Comment by abramdemski on My Current Take on Counterfactuals · 2021-04-10T15:46:30.288Z · LW · GW

I'm not really sure what you're getting at.

Causal interventions are supposed to be interventions that "affect nothing but what's explicitly said to be affected".

This seems like a really bad description to me. For example, suppose we have the causal graph x → y → z. We intervene on y. We don't want to "affect nothing but y" -- we affect z, too. But we don't get to pick and choose; we couldn't choose to affect x and y without affecting z.

So I'd rather say that we "affect nothing but what we intervene on and what's downstream of what we intervened on".
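
A toy structural model makes the "downstream" reading concrete. The chain x → y → z and the particular equations are assumptions made for this sketch; `run_chain` and `do_y` are hypothetical names.

```python
def run_chain(x, do_y=None):
    """Toy structural model for an assumed chain x -> y -> z.

    If do_y is given, y is set by intervention instead of by its usual equation.
    """
    y = 2 * x if do_y is None else do_y
    z = y + 1  # z depends only on y, so it is downstream of the intervention
    return {"x": x, "y": y, "z": z}

observed = run_chain(x=3)
intervened = run_chain(x=3, do_y=10)

# Which variables changed under the intervention?
changed = {name for name in observed if observed[name] != intervened[name]}
```

Here `changed` contains y and z but not x: the intervention leaves the upstream variable alone and necessarily moves the downstream one.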

Not sure whether this has anything to do with your point, though.

Comment by abramdemski on My Current Take on Counterfactuals · 2021-04-09T21:36:13.896Z · LW · GW

Is there a way to operationalize "respecting logic"? For example, a specific toy scenario where an infra-Bayesian agent would fail due to not respecting logic?

"Respect logic" means either (a) assigning probability one to tautologies (at least, to those which can be proved in some bounded proof-length, or something along those lines), or, (b) assigning probability zero to contradictions (again, modulo boundedness). These two properties should be basically equivalent (ie, imply each other) provided the proof system is consistent. If it's inconsistent, they imply different failure modes.

My contention isn't that infra-bayes could fail due to not respecting logic. Rather, it's simply not obvious whether/how it's possible to make an interesting troll bridge problem for something which doesn't respect logic. EG, the example I mentioned of a typical RL agent -- the obvious way to "translate" Troll Bridge to typical RL is for the troll to blow up the bridge if and only if the agent takes an exploration step. But, this isn't sufficiently like the original Troll Bridge problem to be very interesting.

By no means do I mean to indicate that there's an argument that agents have to "respect logic" buried somewhere in this write-up (or the original troll-bridge writeup, or my more recent explanation of troll bridge, or any other posts which I linked).

If I want to argue such a thing, I'd have to do so separately.

And, in fact, I don't think I want to argue that an agent is defective if it doesn't "respect logic". I don't think I can pull out a decision problem it'll do poorly on, or such.

I a little bit want to argue that a decision theory is less revealing if it doesn't represent an agent as respecting logic, because I tend to think logical reasoning is an important part of an agent's rationality. EG, a highly capable general-purpose RL agent should be interpretable as using logical reasoning internally, even if we can't see that in the RL algorithm which gave rise to it. (In which case you might want to ask how the RL agent avoids the troll-bridge problem, even though the RL algorithm itself doesn't seem to give rise to any interesting problem there.)

As such, I find it quite plausible that InfraBayes and other RL algorithms end up handling stuff like Troll Bridge just fine without giving us insight into the correct reasoning, because they eventually kick out any models/hypotheses which fail Troll Bridge.

Whether it's necessary to "gain insight" into how to solve Troll Bridge (as an agent which respects some logic internally), rather than merely solve it (by providing learning algorithms which have good guarantees), is a separate question. I won't claim this has a high probability of being a necessary kind of insight (for alignment). I will claim it seems like a pretty important question to answer for someone interested in counterfactual reasoning.

True, but IMO the way to incorporate "radical probabilism" is via what I called Turing RL.

I don't think Turing RL addresses radical probabilism at all, although it plausibly addresses a major motivating force for being interested in radical probabilism, namely logical uncertainty.

From a radical-probabilist perspective, the complaint would be that Turing RL still uses the InfraBayesian update rule, which might not always be necessary to be rational (the same way Bayesian updates aren't always necessary).

Naively, it seems very possible to combine infraBayes with radical probabilism: 

  • Starting from radical probabilism, which is basically "a dynamic market for beliefs", infra seems close to the insight that prices can have a "spread". (In the same way that interval probability is close to InfraBayes, but not all the way).
  • Starting from Infra, the question is how to add in the market aspect.

However, I'm not sure what formalism could unify these.

Comment by abramdemski on Reflective Bayesianism · 2021-04-09T15:09:05.409Z · LW · GW

This post seemed to be praising the virtue of returning to the lower-assumption state. So I argued that in the example given, it took more than knocking out assumptions to get the benefit.

Agreed. Simple Bayes is the hero of the story in this post, but that's more because the simple bayesian can recognize that there's something beyond.

Comment by abramdemski on Phylactery Decision Theory · 2021-04-09T15:03:19.256Z · LW · GW

I'm using talk about control sometimes to describe what the agent is doing from the outside, but the hypothesis it believes all have a form like "The variables such and such will be as if they were set by BDT given such and such inputs".

Right, but then, are all other variables unchanged? Or are they influenced somehow? The obvious proposal is EDT -- assume influence goes with correlation. Another possible answer is "try all hypotheses about how things are influenced."

Comment by abramdemski on Phylactery Decision Theory · 2021-04-08T16:32:41.554Z · LW · GW

One problem with this is that it doesn't actually rank hypotheses by which is best (in expected utility terms), just how much control is implied. So it won't actually converge to the best self-fulfilling prophecy (which might involve less control).

Another problem with this is that it isn't clear how to form the hypothesis "I have control over X".

Comment by abramdemski on Reflective Bayesianism · 2021-04-07T20:52:14.809Z · LW · GW

I wanted to separate what work is done by radicalizing probabilism in general, vs logical induction specifically. 

From my perspective, Radical Probabilism is a gateway drug. Explaining logical induction intuitively is hard. Radical Probabilism is easier to explain and motivate. It gives reason to believe that there's something interesting in this direction. But, as I've stated before, I have trouble comprehending how Jeffrey correctly predicted that there's something interesting here, without logical uncertainty as a motivation. In hindsight, I feel his arguments make a great deal of sense; but without the reward of logical induction waiting at the end of the path, to me this seems like a weird path to decide to go down.

That said, we can try and figure out Jeffrey's perspective, or, possible perspectives Jeffrey could have had. One point is that he probably thought virtual evidence was extremely useful, and needed to get people to open up to the idea of non-bayesian updates for that reason. I think it's very possible that he understood his Radical Probabilism purely as a generalization of regular Bayesianism; he may not have recognized the arguments for convergence and other properties. Or, seeing those arguments, he may have replied "those arguments have a similar force for a dogmatic probabilist, too; they're just harder to satisfy in that case."

That said, I'm not sure logical inductors properly have beliefs about their own (in the de dicto sense) future beliefs. It doesn't know "its" source code (though it knows that such code is a possible program) or even that it is being run with the full intuitive meaning of that, so it has no way of doing that.

I totally agree that there's a philosophical problem here. I've put some thought into it. However, I don't see that it's a real obstacle to ... provisionally ... moving forward. Generally I think of the logical inductor as the well-defined mathematical entity and the self-referential beliefs are the logical statements which refer back to that mathematical entity (with all the pros and cons which come from logic -- ie, yes, I'm aware that even if we think of the logical inductor as the mathematical entity, rather than the physical implementation, there are formal-semantics questions of whether it's "really referring to itself"; but it seems quite fine to provisionally set those questions aside).

So, while I agree, I really don't think it's cruxy. 

Comment by abramdemski on [deleted post] 2021-04-07T18:02:00.958Z

Fixed, sorta, but now this tag needs to be merged with "humility". (I've named it "epistemic humility" in the meantime, but I think it should just be called "humility" -- no one says "epistemic humility" I think.)

Comment by abramdemski on Reflective Bayesianism · 2021-04-07T17:29:14.973Z · LW · GW

So, let's suppose for a moment that ZFC set theory is the one true foundation of mathematics, and it has a "standard model" that we can meaningfully point at, and the question is whether our universe is somewhere in the standard model (or, rather, "perfectly described" by some element of the standard model, whatever that means).

In this case it's easy to imagine that the universe is actually some structure not in the standard model (such as the standard model itself, or the truth predicate for ZFC; something along those lines).

Now, granted, the whole point of moving from some particular system like that to the more general hypothesis "the universe is mathematical" is to capture such cases. However, the notion of "mathematics in general" or "described by some formal system" or whatever is sufficiently murky that there could still be an analogous problem -- EG, suppose there's a formal system which describes the entire activity of human mathematics. Then "the real universe" could be some object outside the domain of that formal system, EG, the truth predicate for that formal system, the intended 'standard model' of that system, etc.

I'm not confident that we should think that way, but it's a salient possibility.

Comment by abramdemski on Reflective Bayesianism · 2021-04-07T17:20:17.273Z · LW · GW

What is actually left of Bayesianism after Radical Probabilism? Your original post on it was partially explaining logical induction, and introduced assumptions from that in much the same way as you describe here. But without that, there doesn't seem to be a whole lot there. The idea is that all that matters is resistance to dutch books, and for a dutch book to be fair the bookie must not have an epistemic advantage over the agent. Said that way, it depends on some notion of "what the agent could have known at the time", and giving a coherent account of this would require solving epistemology in general. So we avoid this problem by instead taking "what the agent actually knew (believed) at the time", which is a subset and so also fair. But this doesn't do any work, it just offloads it to agent design. 

Part of the problem is that I avoided getting too technical in Radical Probabilism, so I bounced back and forth between different possible versions of Radical Probabilism without too much signposting.

I can distinguish at least five versions:

  1. Jeffrey's version. I don't have a good source for his full picture. I get the sense that the answer to "what is left?" is "very little!" -- EG, he didn't think agents have to be able to articulate probabilities. But I am not sure of the details.
  2. The simplification of Jeffrey's version, where I keep the Kolmogorov axioms (or the Jeffrey-Bolker axioms) but reject Bayesian updates.
  3. Skyrms' deliberation dynamics. This is a pretty cool framework and I recommend checking it out (perhaps via his book The Dynamics of Rational Deliberation). The basic idea of its non-bayesian updates is, it's fine so long as you're "improving" (moving towards something good).
  4. The version represented by logical induction.
  5. The Shafer & Vovk version. I'm not really familiar with this version, but I hear it's pretty good.

(I can think of more, but I cut myself off.)

Said that way, it depends on some notion of "what the agent could have known at the time", and giving a coherent account of this would require solving epistemology in general. 

Making a broad generalization, I'm going to stick things into camp #2 above or camp #4. Theories in camp #2 have the feature that they simply assume a solid notion of "what the agent could have known at the time". This allows for a nice simple picture in which we can check Dutch Book arguments. However, it does lend itself more easily to logical omniscience, since it doesn't allow a nuanced picture of how much logical information the agent can generate. Camp #4 means we do give such a nuanced picture, such as the poly-time assumption.

Either way, we've made assumptions which tell us which Dutch Books are valid. We can then check what follows.

For example with logical induction, we know that it can't be dutch booked by any polynomial-time trader. Why do we think that criterion is important? Because we think its realistic for an agent to in the limit know anything you can figure out in polynomial time. And we think that because we have an algorithm that does it. Ok, but what intellectual progress does the dutch book argument make here? We had to first find out what one can realistically know, and got logical induction, from which we could make the poly-time criterion. So now we know its fair to judge agents by that criterion, so we should find one, which fortunately we already have. But we could also just not have thought about dutch books at all, and just tried to figure out what one could realistically know, and what would we have lost? Making the dutch book here seems like a spandrel in thinking style.

I think this understates the importance of the Dutch-book idea to the actual construction of the logical induction algorithm. The criterion came first, and the construction was finished soon after. So the hard part was the criterion (which is conceived in dutch-book terms). And then the construction follows nicely from the idea of avoiding these dutch-books.

Plus, logical induction without the criterion would be much less interesting. The criterion implies all sorts of nice properties. Without the criterion, we could point to all the nice properties the logical induction algorithm has, but it would just be a disorganized mess of properties. Someone would be right to ask if there's an underlying reason for all these nice properties -- an organizing principle, rather than just a list of seemingly nice properties. The answer to that question would be "dutch books".

BTW, I believe philosophers currently look down on dutch books for being too pragmatic/adversarial a justification, and favor newer approaches which justify epistemics from a plain desire to be correct rather than a desire to not be exploitable. So by no means should we assume that Dutch Books are the only way. However, I personally feel that logical induction is strong evidence that Dutch Books are an important organizing principle.

As a side note, I reread Radical Probabilism for this, and everything in the "Other Rationality Properties" section seems pretty shaky to me. Both the proofs of both convergence and calibration as written depend on logical induction - or else, the assumption that the agent would know if its not convergent/calibrated, in which case could orthodoxy not achieve the same? You acknowledge this for convergence in a comment but also hint at another proof. But if radical probabilism is a generalization of orthodox bayesianism, then how can it have guarantees that the latter doesn't?

You're right to call out the contradiction between calling radical probabilism a generalization, vs claiming that it implies new restrictions. I should have been more consistent about that. Radical Probabilism is merely "mostly a generalization". 

I still haven't learned about how #2-style settings deal with calibration and convergence, so I can't really comment on the other proofs I implied the existence of. But, yeah, it means there are extra rationality conditions beyond just the Kolmogorov axioms.

For the conservation of expected evidence, note that the proof here involves a bet on what the agent's future beliefs will be. This is a fragile construction: you need to make sure the agent can't troll the bookie, without assuming the accessibility of the structures you want to establish. It also assumes the agent has models of itself in its hypothesis space. And even in the weaker forms, the result seems unrealistic. There is the problem with psychedelics that the "virtuous epistemic process" is supposed to address, but this is something that the formalism allows for with a free parameter, not something it solves. The radical probabilist trusts the sequence of its future beliefs, but it doesn't say anything about where they come from. You can now assert that it can't be identified with particular physical processes, but that just leaves a big question mark for bridging laws. If you want to check if there are Dutch books against your virtuous epistemic process, you have to be able to identify its future members. Now I can't exclude that some process could avoid all Dutch books against it without knowing where they are (and without being some trivial stupidity), but it seems like a pretty heavy demand.

This part seems entirely addressed by logical induction, to me.

  1. A "virtuous epistemic process" is a logical inductor. We know logical inductors come to trust their future opinions (without knowing specifically what they will be). 
  2. The logical induction algorithm tells us where the future beliefs come from.
  3. The logical induction algorithm shows how to have models of yourself.
  4. The logical induction algorithm shows how to avoid all Dutch books "without knowing where they are" (actually I don't know what you meant by this).
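
For readers who want the conservation-of-expected-evidence bet spelled out, here is a toy sketch. It is a simplification of my own and not how logical induction handles it: the agent's price for A today should equal its expected price for A tomorrow, otherwise a bookie who buys a ticket today and sells it back tomorrow wins in expectation by the agent's own current lights. The function name `expected_loss_from_resale` and the numbers are hypothetical.

```python
def expected_loss_from_resale(price_today, future_prices):
    """Agent's expected loss, by its own current lights, when a bookie buys a
    $1-if-A ticket from it today and sells the ticket back tomorrow.

    future_prices maps each possible price tomorrow to the probability the
    agent currently assigns to ending up at that price.
    Hypothetical helper for illustration only.
    """
    expected_future = sum(price * prob for price, prob in future_prices.items())
    # A positive value means the agent expects to buy back at a higher price
    # than it sold for; a negative value means the reverse trade is the losing one.
    return expected_future - price_today


# Violates conservation: today's price is 0.5 but the expected future price is 0.7.
violating = expected_loss_from_resale(0.5, {0.6: 0.5, 0.8: 0.5})

# Respects conservation: the possible future prices average back to today's price.
conserving = expected_loss_from_resale(0.5, {0.3: 0.5, 0.7: 0.5})
```

Note that the sketch only needs the agent's distribution over its future prices, not knowledge of which price will actually occur.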
Comment by abramdemski on Predictive Coding has been Unified with Backpropagation · 2021-04-07T16:37:33.211Z · LW · GW

--I wouldn't characterize my own position as "we know a lot about the brain." I think we should taboo "a lot."

To give my position somewhat more detail:

  • I think the methods of neuroscience are mostly not up to the task. This is based on the paper which applied neuroscience methods to try to reverse-engineer a CPU.
  • I think what we have are essentially a bunch of guesses about functionality based on correlations and fairly blunt interventional methods (lesioning), combined with the ideas we've come up with about what kinds of algorithms the brain might be running (largely pulling from artificial intelligence for ideas).

I'm guessing you just are significantly more skeptical of both predictive coding and the predictive coding --> backprop link than I am... perhaps because the other hypotheses on my list are less plausible to you?

It makes a lot of sense to me that the brain does something resembling belief propagation on Bayes nets. (I take this to be the core idea of predictive coding.) However:

  1. There are a lot of different algorithms resembling belief prop. Sticking within the big tent of "variational methods", there are a lot of different variational objectives, which result in different algorithms. The brain could be using a variation which we're unfamiliar with. This could result in significant differences from backprop. (I'm still fond of Hinton's analogy between contrastive divergence and dreaming, for example. It's a bit like saying that dreams are GAN-generated adversarial examples, and the brain trains to anti-learn these examples during the night, which results in improved memory consolidation and conceptual clarity during the day. Isn't that a nice story?)
  2. There are a lot of graphical models besides Bayesian networks. Many of them are "basically the same", but for example SPNs (sum-product networks) are very different. There's a sense in which Bayesian networks assume everything is neatly organized into variables already, while SPNs don't. Also, SPNs are fundamentally faster, so the convergence step in the paper (the step which makes predictive coding 100x slower than belief prop) becomes fast. So SPNs could be a very reasonable alternative, which might not amount to backprop as we know it.
  3. I think it could easily be that the neocortex is explained by some version of predictive coding, but other important elements of the brain are not. In particular, I think the numerical logic of reinforcement learning isn't easily and efficiently captured via graphical models. I could be ignorant here, but what I know of attempts to fit RL into a predictive-processing paradigm ended up using multiplicative rewards rather than additive (so, you multiply in the new reward rather than adding), simply because adding up a bunch of stuff isn't natural in graphical models. I think that's a sign that it's not the right paradigm.
  4. Radical Probabilism / Logical Uncertainty / Logical Induction makes it generally seem pretty probable, almost necessary, that there's also some "non-Bayesian" stuff going on in the brain (ie generalized-Bayesian, ie non-Bayesian updates). This doesn't seem well-described by predictive coding. This could easily be enough to ruin the analogy between the brain and backprop.
  5. And finally, reiterating the earlier point: there are other algorithms which are more data-efficient than backprop. If humans appear to be more efficient than backprop, then it seems plausible that humans are using a more data-efficient algorithm.
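
To illustrate why the additive-versus-multiplicative distinction in point 3 matters, here is a small sketch with made-up numbers (not taken from any of the predictive-processing work alluded to above). Summing rewards and multiplying them can rank the same two trajectories in opposite orders, so swapping one for the other changes what counts as good behavior. The function names and reward values are assumptions for illustration.

```python
import math


def additive_return(rewards):
    """Standard RL-style return: add up the rewards along a trajectory."""
    return sum(rewards)


def multiplicative_return(rewards):
    """Graphical-model-style score: multiply the rewards in as factors."""
    return math.prod(rewards)


# Two hypothetical trajectories with positive per-step rewards.
steady = [0.9, 0.9, 0.9]  # consistently decent
spiky = [2.0, 2.0, 0.1]   # two great steps, one terrible step

# Additive scoring prefers the spiky trajectory (about 4.1 vs about 2.7) ...
additive_prefers_spiky = additive_return(spiky) > additive_return(steady)

# ... while multiplicative scoring prefers the steady one (about 0.729 vs about 0.4).
multiplicative_prefers_steady = (
    multiplicative_return(steady) > multiplicative_return(spiky)
)
```

Under multiplication a single near-zero factor wipes out the whole trajectory, which is a natural thing for a product of probabilities to do but not what additive reward is meant to express.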

As for the [predictive coding -> backprop] link, well, that's not a crux for me right now, because I was mainly curious why you think such a link, if true, would be evidence against "the brain uses something other than backprop". I think I understand why you would think that, now, sans what the mounting evidence is.

I think my main crux is the question: (for some appropriate architecture, ie, not necessarily transformers) do human-brain-sized networks, with human-like opportunities for transfer learning, achieve human-level data-efficiency? If so, I have no objection to the hypothesis that the brain uses something more-or-less equivalent to gradient descent.