## Posts

## Comments

**JGWeissman**on A simple game that has no solution · 2014-07-22T20:50:43.838Z · LW · GW

"Rationality" seems to give different answer to the same problem posed with different affine transformations of the players' utility functions.

**JGWeissman**on A simple game that has no solution · 2014-07-22T19:21:32.281Z · LW · GW

Error: Adding values from different utility functions.

See this comment.

**JGWeissman**on A simple game that has no solution · 2014-07-21T19:10:24.368Z · LW · GW

Eliezer's "arbitrary" strategy has the nice property that it gives both players more expected utility than the Nash equilibrium. Of course there are other strategies with this property, and indeed multiple strategies that are not themselves dominated in this way. It isn't clear how ideally rational players would select one of these strategies or which one they would choose, but they should choose one of them.

**JGWeissman**on A simple game that has no solution · 2014-07-21T17:28:00.516Z · LW · GW

Why not "P1: C, P2: Y", which maximizes the sum of the two utilities, and is the optimal precommitment under the Rawlian veil-of-ignorance prior?

If we multiply player 2's utility function by 100, that shouldn't change anything because it is an affine transformation to a utility function. But then "P1: B, P2: Y" would maximize the sum. Adding values from different utility functions is a meaningless operation.

**JGWeissman**on A simple game that has no solution · 2014-07-21T01:00:42.211Z · LW · GW

The reason player 1 would choose B is not because it directly has a higher payout but because including B in a mixed strategy gives player 2 an incentive to include Y in its own mixed strategy, increasing the expected payoff of C for player 1. The fact that A dominates B is irrelevant. The fact that A has better expected utility than the subgame with B and C indicates that player 1 not choosing A is somehow irrational, but that doesn't give a useful way for player 2 to exploit this irrationality. (And in order for this to make sense for player 1, player 1 would need a way to counter exploit player 2's exploit, and for player 2 to try its exploit despite this possibility.)

**JGWeissman**on A simple game that has no solution · 2014-07-20T23:10:17.369Z · LW · GW

The definition you linked to doesn't say anything about entering subgame not giving the players information, so no, I would not agree with that.

I would agree that if it gave player 2 useful information, that should influence the analysis of the subgame.

(I also don't care very much whether we call this object within the game of how the strategies play out given that player 1 doesn't choose A a "subgame". I did not intend that technical definition when I used the term, but it did seem to match when I checked carefully when you objected, thinking that maybe there was a good motivation for the definition so it could indicated a problem with my argument if it didn't fit.)

I also disagree that player 1 not picking A provides useful information to player 2.

**JGWeissman**on A simple game that has no solution · 2014-07-20T22:43:56.143Z · LW · GW

I'm sorry but "subgame" has a very specific definition in game theory which you are not being consistent with.

I just explained in detail how the subgame I described meets the definition you linked to. If you are going to disagree, you should be pointing to some aspect of the definition I am not meeting.

Also, intuitively when you are in a subgame you can ignore everything outside of the subgame, playing as if it didn't exist. But when Player 2 moves he can't ignore A because the fact that Player 1 could have picked A but did not provides insight into whether Player 1 picked B or C.

If it is somehow the case that giving player 2 info about player 1 is advantageous for player 1, then player 2 should just ignore the info, and everything still plays out as in my analysis. If it is advantageous for player 2, then it just strengthens the case that player 1 should choose A.

I am a game theorist.

I still think you are making a mistake, and should pay more attention to the object level discussion.

**JGWeissman**on A simple game that has no solution · 2014-07-20T21:24:59.845Z · LW · GW

To see that it is indeed a subgame:

Represent the whole game with a tree whose root node represents player 1 choosing whether to play A (leads to leaf node), or to enter the subgame at node S. Node S is the root of the subgame, representing player 1's choices to play B or C leading to nodes representing player 2 choice to play X or Y in those respective cases, each leading to leaf nodes.

Node S is the only node in its information set. The subgame contains all the descendants of S. The subgame contains all nodes in the same information set as any node in the subgame. It meets the criteria.

There is no uncertainty that screws up my argument. The whole point of talking about the subgame was to stop thinking about the possibility that player 1 chose A, because that had been observed not to happen. (Of course, I also argue that player 2 should be interested in logically causing player 1 not to have chosen A, but that gets beyond classical game theory.)

**JGWeissman**on A simple game that has no solution · 2014-07-20T20:20:39.987Z · LW · GW

Classical game theory says that player 1 should chose A for expected utility 3, as this is better than than the sub game of choosing between B and C where the best player 1 can do against a classically rational player 2 is to play B with probability 1/3 and C with probability 2/3 (and player 2 plays X with probability 2/3 and Y and with probability 1/3), for an expected value of 2.

But, there are pareto improvements available. Player 1's classically optimal strategy gives player 1 expected utility 3 and player 2 expected utility 0. But suppose instead Player 1 plays C, and player 2 plays X with probability 1/3 and Y with probability 2/3. Then the expected utility for player 1 is 4 and for player 2 it is 1/3. Of course, a classically rational player 2 would want to play X with greater probability, to increase its own expected utility at the expense of player 1. It would want to increase the probability beyond 1/2 which is the break even point for player 1, but then player 1 would rather just play A.

So, what would 2 TDT/UDT players do in this game? Would they manage to find a point on the pareto frontier, and if so, which point?

**JGWeissman**on Downvote stalkers: Driving members away from the LessWrong community? · 2014-07-02T16:41:55.620Z · LW · GW

If you have trouble confronting people, you make a poor admin.

Can we please act like we actually know stuff about practical instrumental rationality given how human brains work, and not punish people for openly noticing their weaknesses.

You could have more constructively said something like "Thank you for taking on these responsibilities even though it sometimes makes you uncomfortable. I wonder if anyone else who is more comfortable with that would be willing to help out."

**JGWeissman**on Life insurance for Cryonics, how many years? · 2014-05-23T17:37:44.901Z · LW · GW

I use whole life insurance. If you use term insurance, you should have a solid plan for an alternate funding source to replace your insurance at the end of the term.

I believe the Efficient Market Hypothesis is correct enough that reliably getting good results from buying term insurance and investing the premium difference would be a lot of work if possible at all.

**JGWeissman**on The Problem of "Win-More" · 2014-03-26T20:14:42.676Z · LW · GW

See also The Valley of Bad Rationality.

**JGWeissman**on Open thread, 18-24 March 2014 · 2014-03-19T18:37:17.116Z · LW · GW

Mere survival doesn't sound all that great. Surviving in a way that is comforting is a very small target in the general space of survival.

**JGWeissman**on Polling Thread · 2014-03-03T19:51:03.467Z · LW · GW

By saying "clubs", I communicate the message that my friend would be better off betting $1 on a random club than $2 on the seven of diamonds, (or betting $1 on a random heart or spade), which is true, so I don't really consider that lying.

If, less conveniently, my friend takes what I see to literally mean the suit of the top card, but I still can get them to not bet $2 on the wrong card, then I bite the bullet and lie.

**JGWeissman**on White Lies · 2014-02-21T17:57:59.347Z · LW · GW

And there's a very real danger of this being a fully general counterargument against any sufficiently simple moral theory.

Establishing a lower bound on the complexity of a moral theory that has all the features we want seems like a reasonable thing to do. I don't think the connotations of "fully general counterargument" are appropriate here. "Fully general" means you can apply it against a theory without really looking at the details of the theory. If you have to establish that the theory is sufficiently simple before applying the counterargument, you are referencing the details of the theory in a way that differentiates from other theories, and the counterargument is not "fully general".

**JGWeissman**on A Fervent Defense of Frequentist Statistics · 2014-02-12T20:27:49.455Z · LW · GW

and the number of possible models for T rounds is exponential in T

??? Here n is the number of other people betting. It's a constant.

Within a single application of online learning, n is a constant, but that doesn't mean we can't look at the consequences of it having particular values, even values that vary with other parameters. But, you seem to be agreeing with the main points that if you use all possible models (or "super-people") the regret bound is meaningless, and that in order to reduce the number of models so it is not meaningless, while also keeping a good model that is worth performing almost as well as, you need structural assumptions.

even if the "true hypothesis" isn't in the family of models we consider

I agree you don't need the model that is right every round, but you do need the model to be right in a lot of rounds. You don't need a perfect model, but you need a model that is as correct as you want your end results to be.

maybe even adversarial data

I think truly adversarial data gives a result that is within the regret bounds, as guaranteed, but still uselessly inaccurate because the data is adversarial against the collection of models (unless the collection is so large you aren't really bounding regret).

**JGWeissman**on Useful Personality Tests · 2014-02-11T20:19:06.923Z · LW · GW

Everyone with half a brain could game them either to shorten their stay or to get picked as a leader candidate.

Maybe that's the test.

**JGWeissman**on A Fervent Defense of Frequentist Statistics · 2014-02-11T19:49:07.802Z · LW · GW

Regarding myth 5 and the online learning, I don't think the average regret bound is as awesome as you claim. The bound is square root( (log n) / T). But if there are really no structural assumptions, then you should be considering all possible models, and the number of possible models for T rounds is exponential in T, so the bound ends up being 1, which is the worst possible average regret using any strategy. With no assumptions of structure, there is no meaningful guarantee on the real accuracy of the method.

The thing that is awesome about the bounds guarantee is that if you assume some structure, and choose a subset of possible models based on that structure, you know you get increased accuracy if your structural assumptions hold.

So this method doesn't really avoid relying on structural assumptions, it just punts the question of which structural assumption to make to the choice of models to run the method over. This is pretty much the same as Bayesian methods putting the structural assumptions in the prior, and it seems that choosing a collection of models is an approximation of choosing a prior, though less powerful because instead of assigning models probabilities in a continuous range, it just either includes the model or doesn't.

**JGWeissman**on Skepticism about Probability · 2014-01-27T17:19:27.844Z · LW · GW

The obvious steelman of dialogue participant A would keep the coin hidden but ready to inspect, so that A can offer bets having credible ignorance of the outcomes and B isn't justified in updating on A offering the bet.

**JGWeissman**on Dangers of steelmanning / principle of charity · 2014-01-18T15:47:00.166Z · LW · GW

Yvain says that people claim to be using one simple deontological rule "Don't violate consent" when in fact they are using a complicated collection of rules of the form "Don't violate consent in this specific domain" while not following other rules of that form.

And yet, you accuse him of strawmanning their argument to be *simple*.

**JGWeissman**on Dangers of steelmanning / principle of charity · 2014-01-17T04:36:54.746Z · LW · GW

Arguing that the consequentialist approach is better than the deontological approach is different than skipping that step and going straight to refuting your own consequentialist argument for the position others were arguing on deontological grounds. Saying they should do some expected utility calculations is different than saying the expected utility calculations the haven't done are wrong.

**JGWeissman**on Dangers of steelmanning / principle of charity · 2014-01-17T02:33:26.982Z · LW · GW

I thought he was ramming home his point that his opponents are secretly deontologists there

I think the point was that his opponents are openly deontologists, making openly deontological arguments for their openly deontological position, and therefor they are rightly confused and not moved by Yvain's refutation of a shoehorning of their position into a consequentialist argument when they never made any such argument, which Yvain now understands and therefor he doesn't do that anymore.

**JGWeissman**on Functional Side Effects · 2014-01-14T21:11:23.130Z · LW · GW

This seems to be overloading the term "side effects". The functional programming concept of side effects (which it says its functions shouldn't have) is changing the global state of the program that invokes them other than by returning the value. It makes no claims of these other concepts of a program being affected by analyzing the source code of the function independent of invoking it or of the the function running on morally relevant causal structure.

**JGWeissman**on Dangers of steelmanning / principle of charity · 2014-01-13T14:50:05.354Z · LW · GW

It seems to me that in the quote Yvain is admitting an error, not celebrating victory. Try taking his use of the word "reasonably" at face value.

**JGWeissman**on Why CFAR? · 2014-01-01T02:17:11.819Z · LW · GW

CFAR can achieve its goal of creating effective, rational do-gooders by taking existing do-gooders and making them more effective and rational. This is why they offer scholarships to existing do-gooders. Their goal is not to create effective, rational do-gooders out of blank slates but make valuable marginal increases in this combination of traits, often by making people who already rank highly in these areas even better.

They also use the same workshops to make people in general more effective and rational, which they can charge money for to fund the workshops, and gives them more data to test their training methods on. That they don't turn people in general into do-gooders does not constitute a failure of the whole mission. These activities support the mission without directly fulfilling it.

Fourth, they become a sort of fraternal organization, i.e. membership does bring benefits mainly from being able to network with other members.

CFAR is creating an alumni network to create benefits on top of increased effectiveness and rationality.

**JGWeissman**on Why CFAR? · 2013-12-31T14:28:52.922Z · LW · GW

CFAR does offer to refund the workshop fee if after the fact participants evaluate that it wasn't worth it. They also solicit donations from alumni. So they are kind of telling participants to evaluate the value provided by CFAR and pay what they think is appropriate, while providing an anchor point and default which covers the cost of providing the workshop. That anchor point and default are especially important for the many workshop participants who are not selected for altruism, who probably will learn a lot of competence and epistemic rationality but not much altruism, and whose workshop fees subsidize CFAR's other activities.

**JGWeissman**on Donating to MIRI vs. FHI vs. CEA vs. CFAR · 2013-12-27T13:42:13.971Z · LW · GW

my feeling about CFAR is that they are providing a service to individuals for money and it's probably not a terrible idea to let the market determine if their services are worth the amount they charge.

I think that CFAR's workshops are self funding and contribute to paying for organizational overhead. Donated funds allow them to offer scholarships to their workshops to budding Effective Altruists (like college students) and run the SPARC program (targeting mathematically gifted children who may be future AI researchers). So, while CFAR does provide a service to individuals for money, donated money buys more services targeted at making altruistic people more effective and getting qualified people working on important hard problems.

**JGWeissman**on Donating to MIRI vs. FHI vs. CEA vs. CFAR · 2013-12-27T06:08:14.543Z · LW · GW

I'm convinced AGI is much more likely to be built by a government or major corporation, which makes me more inclined to think movement-building activities are likely to be valuable, to increase the odds of the people at that government or corporation being conscious of AI safety issues, which MIRI isn't doing.

MIRI's AI workshops get outside mathematicians and AI researchers involved in FAI research, which is good for movement building within the population of people likely to be involved in creating an AGI.

**JGWeissman**on Open thread for December 17-23, 2013 · 2013-12-19T20:22:09.363Z · LW · GW

Qiaochu's answer seems off. The argument that the parent AI can already prove what it wants the successor AI to prove and therefore isn't building a more powerful successor, isn't very compelling because being able to prove things is a different problem than searching for useful things to prove. It also doesn't encompass what I understand to be the Lobian obstacle, that being able to prove that if your own mathematical system proves something that thing is true implies that your system is inconsistent.

Is there more context on this?

**JGWeissman**on Harry Potter and the Methods of Rationality discussion thread, part 28, chapter 99-101 · 2013-12-15T06:43:03.901Z · LW · GW

Then, back in Azkaban, facing that auror, when Quirrel used Avadra Kevadra in an attempt to force the auror to dodge

What do you think you know about which spell Quirrell used, and how do you think you know it?

**JGWeissman**on Open thread for December 9 - 16, 2013 · 2013-12-13T20:37:32.396Z · LW · GW

It sounds to me like somebody is purchasing utilons, using themselves as an example to get other people to also purchase utilons, and incidentally deriving a small amount of well deserved status from the process.

**JGWeissman**on Open thread for December 9 - 16, 2013 · 2013-12-10T15:52:37.425Z · LW · GW

Can you give an example of a property a star might have because having that property made its ancestor stars better at producing descendant stars with that property?

**JGWeissman**on Open Thread, December 2-8, 2013 · 2013-12-03T21:50:15.591Z · LW · GW

Is Jaynes' PT:LOS able to be read by moi, given that I know basic set theory?

A good way to find out would be to try reading it.

**JGWeissman**on Another problem with quantum measure · 2013-11-22T14:37:08.974Z · LW · GW

However, it's actually determined what your decision is - any Laplacian demon could deduce it from looking at your brain. It's all pretty clear, and quantum events are not enough to derail it (barring very very low measure stochastic events). So from the universe's perspective, you're not choosing anything, not shifting measure from anything to anything.

The logical structure of my decision still controls what world gets the measure. From Timeless Control:

Surely, if you can determine the Future just by looking at the Past, there's no need to look at the Present?

The problem with the right-side graph is twofold: First, it violates the beautiful locality of reality; we're supposing causal relations that go outside the immediate neighborhoods of space/time/configuration. And second, you can't compute the Future from the Past, except by also computing something that looks exactly like the Present; which computation just creates another copy of the Block Universe (if that statement even makes any sense), it does not affect any of the causal relations within it.

This is basically the same point as the one I keep making and you keep missing: The universe/Laplacian demon/whatever is adding quantum measure, in order to select the same world to add measure to that was selected by your decision, it has to duplicate the causal structure of your decision and the resulting world. (And since within this computation the same things happen for the same reasons as in the selected world, by the generalized anti zombie principle, the computation is adding measure to that world even at times before your model says it adds quantum measure.)

**JGWeissman**on Group Rationality Diary, November 16-30 · 2013-11-21T18:27:53.622Z · LW · GW

I don't think that would prevent you from falling asleep, but you would get more benefit from the nap if it is earlier, so your waking periods are more even. You should also be able to sleep less at night, typical biphasic schedules have a 6.5 hour core.

**JGWeissman**on Another problem with quantum measure · 2013-11-21T16:22:08.595Z · LW · GW

Are you claiming that a) this model is incoherent, or b) that this model does not entail what I'm claiming (that you should save for the future)?

The basic model you described, even as alternative physics, is underspecified, and depending on how I try to steelman it so it is coherent, it doesn't entail what you claim, and if I try to steelman it so it entails what you say, it isn't coherent.

The big question is what worlds get to accumulate measure and why those particular worlds. If the answer is that all worlds accumulate measure, then the accumulation happens independently of your decision, so the effect should not impact your decision. If the answer is that the measure accumulation process looks somehow depends on what world your decision leads to, then the measure accumulation process in locating that world duplicates its causal structure, and by the globalized anti zombie principle, contains all the same conscious people as that world, so it adds to the worlds decision theoretical measure even before your model says it officially adds to its quantum measure (this is basically parallel to the argument for Many Worlds). What I think is incoherent is the idea that you can add measure to world state without adding measure to the process that selected that world state, which you try to do by supposing that your decision (and its intermediate effects) don't cause the later accumulation of measure, yet the measure magically accumulates in the world that results from your decision. (To account for this, you would have to follow the probability to outside the thought experiment.)

It feels like this should all be obvious if you understand why p-zombies are incoherent, why Many Worlds is obviously correct, and how these are related.

**JGWeissman**on Another problem with quantum measure · 2013-11-21T14:16:37.806Z · LW · GW

My model was of gradual proportional increase in utility

Yes, my example shows a proportional increase in measure between two times, and is indifferent to the gradual increase between these times. If you think the gradual increase is important, please provide an example that illustrates this.

not absolute addition to every branch.

I have already explained why adding the measure to a single branch is incoherent in both the cases where the decision causes or does not cause selection of the branch that receives the measure.

**JGWeissman**on Another problem with quantum measure · 2013-11-20T18:33:17.321Z · LW · GW

Yes, that is how your decision gives your measure M to world WA or to world WB, but that shouldn't affect accumulation of measure into later states of these worlds by quantum fluctuation, so both worlds still get measure 10M from that.

Unless you mean that quantum fluctuations into later states of the world are directed by the normal evolution of the earlier states, including your decision, in which case, this process would be adding measure (perhaps not quantum measure, but counting as decision theoretic measure in the same way) to the initial state of the world in which you make the decision (because it is another instance of the same causal chain, that is, it produces the same result for the same reasons), so you get all 10M of the quantum fluctuation measure right away, and choice A gives 44M utilons while B still gives 33M utilons.

**JGWeissman**on Another problem with quantum measure · 2013-11-20T18:04:44.861Z · LW · GW

I am not sure what you mean by "logically incompatible worlds", but if worlds WA and WB are the results of different available decisions of an agent embedded in a common precursor world, then they both follow the same laws of physics and just have their particles or whatever in different places, and in a quantum universe they just have different quantum states.

**JGWeissman**on Another problem with quantum measure · 2013-11-20T16:49:39.873Z · LW · GW

I was envisaging utilons being "consumed" at the time they were added (say people eating chocolate bars).

My example is entirely compatible with this.

So choosing A would add 4M utilons, and choosing B would add 33M utilons.

So the problem here is that you are not accounting for the fact that choosing A in the measure M world does not prevent the accumulation of measure 10M to world WB from quantum fluctuation. You get those 30M utilons whether you choose A or B, choosing A gets you an immediate 4M additional utilons, while choosing B gets you a deferred 3M utilons.

**JGWeissman**on Another problem with quantum measure · 2013-11-20T14:11:08.224Z · LW · GW

Suppose you are in a world with measure M and are choosing between A and B, where A results in world WA which includes an immediate effect worth 4 utilons per measure, and B results in world WB which includes a later effect at time T worth 3 utililons per measure. Suppose further that under your not-serious theory, at time T, random quantum fluctuations have added measure 10M to the worlds WA and WB. So your choice between A and B is a choice to either add measure M to world WA or world WB, so that choice A results in WA immediately having measure M worth 4M utililons and later at time T, WA having measure 11M (0 utilons) while WB has measure 10M (worth 30M utilons) for a total of 34M utilons, while choice B results in WB immediately having measure M, (worth 0 utilons), and at time T WA having measure 10M (worth 0 utilons) and WB having measure 11M (worth 33M utilons), so you choose A for 34M instead of B for 33M utilons, for the same reasons that without the non-serious theory, you would choose A for 4M utilons instead of B for 3M utilons. Your non-serious theory should not impact your decisions because your decisions do not control which worlds it adds measure to.

**JGWeissman**on Another problem with quantum measure · 2013-11-20T12:28:58.109Z · LW · GW

My point was that under your assumptions, the amount you affect does not increase in time at all, only the amount you do not affect increases.

**JGWeissman**on Skirting the mere addition paradox · 2013-11-18T18:18:37.995Z · LW · GW

For any population of people of happiness h, you can add more people of happiness less than h, and still improve things.

I think that this property, at least the way you are interpreting it, does not fully represent the intuition that leads to the repugnant conclusion. A stronger version would be: For any population of people, you want add more people with positive happiness (keeping the happiness of the already existing people constant), and still improve things.

I don't think your unintuitive aggregation formula would be compatible with that.

**JGWeissman**on Another problem with quantum measure · 2013-11-18T17:54:46.592Z · LW · GW

My critique of the physics was more of an aside. The main point was the critique of the decision theory, that under the assumptions of this non-serious theory of physics, most of the measure of the various outcomes are independent of your decisions, and you should only base you decisions on the small amount of measure you actually affect.

**JGWeissman**on Another problem with quantum measure · 2013-11-18T16:45:08.147Z · LW · GW

Occasionally, one of them will partially tunnel, by chance, into the same state our universe is in - and then will evolve forwards in time exactly as our universe is.

So, pretending that this sort of thing has any significance, you would also expect some worlds to tunnel, but chance, into neighboring states, as might result from making different decisions. So, the argument for always sacrificing in favor of future gains falls down, most of the measure for world in which you get the future benefits of the sacrifice comes from quantum fluctuations, not the sacrifice itself, as both available worlds, where you make or don't make the sacrifice, accumulate measure from random tunneling, regardless of your choice. You should make your decision based on the amount of measure you actually affect, not the amount that happens to merge into the same state you might cause. (And the ever growing number of branches this theory says would be accumulating measure just shows further how ridiculous it is.)

**JGWeissman**on The Inefficiency of Theoretical Discovery · 2013-11-18T15:36:25.355Z · LW · GW

There are sequences of .Net instructions that result in the runtime throwing type exceptions, because it tries to read a value of a certain type of the stack, and it gets an incompatible value. This is the situation that my verifier guards against.

The standard .Net runtime also includes a verifier that checks the same thing, and it will not run code that fails this validation unless it is explicitly trusted. So a verifiable .Net assembly will not throw type exceptions without an explicit cast or throw, but an arbitrary assembly may do so. The compilers for the standard languages such as C# will generally only produce verifiable assemblies unless you explicitly mark parts of the source code as "unsafe", and unsafe, or unverifiable assemblies need special permissions to run at all.

(There are other properties that both my verifier and the standard verifier check for. The reason I wrote my own is that it produces much more informative descriptions of problems it finds, and it is integrated into my assembly emitting libraries, so it detects problems as the assembly to be emitted is defined, and when run in the debugger, will easily show the compiler code and execution state that caused the problem.)

**JGWeissman**on The Inefficiency of Theoretical Discovery · 2013-11-17T23:28:12.293Z · LW · GW

I am not trying to write a classifier that tells you whether or not an arbitrary program throws a type exception. I wrote a verifier that tells you whether or not an arbitrary program can be proven not to throw type exceptions (except possibly at an explicit cast statement, or a throw exception statement) with a particular proof strategy that covers a huge space of useful, nicely structured programs.

See also jsteinhardt's comment I was responding to, which discussed getting around the halting problem by allowing the checker to say "I don't know".

**JGWeissman**on The Inefficiency of Theoretical Discovery · 2013-11-06T05:35:51.495Z · LW · GW

Ah, so by "Godel's theorem presents a strictly stronger obstacle than Lob's theorem" you mean if you overcome Godelian obstacles you also overcome Lobian obstacles? I think I agree, but I am not sure that it is relevant, because the program analyzer examples don't overcome Godelian obstacles, they just cope with the Godelian obstacles, which does not similarly imply coping with or overcoming Lobian obstacles.

**JGWeissman**on The Inefficiency of Theoretical Discovery · 2013-11-06T02:50:46.994Z · LW · GW

Godel's theorem presents a strictly stronger obstacle than Lob's theorem.

Why do you say that? My understanding is that Godel's theorem says that a (sufficiently powerful) logical system has true statements it can't prove, but these statements are excessively complicated and probably not important. Is there some way you envision an AGI being limited in its capacity to achieve its goals by Godel's theorem, as we envision Lob's theorem blocking an AGI from trusting its future self to make effective decisions? (Besides where the goals are tailored to be blocked by the theorem: "Prove all true statements in my formal system")

Finally, I think it's worth pointing out an actual example of a program analysis tool since I think they are more powerful than you have in mind. The following slides are a good example of such a tool.

As near as I can tell, this is more powerful than other static analysis tools that I have seen (though maybe not, the Google Web Toolkit optimizer is pretty impressive), but it is well within what I expect to be possible, and doesn't get around Lob's theorem. (I can't be too confident in this assessment of its power because I don't see a clear claim of what sort of statements it can prove or how it works except that it seems to detect when inputs to a statement may have invalid values and that it uses brute force techniques to analyze functions and then associates a summary of the analysis with that function (constraints on valid inputs and guarantees on outputs?) so that its analysis of call sites can use the summary.) (This sort of program analyzer is interesting in its own right, and I would like to see a more complete explanation of it, but I don't think it's existence says anything about the problems posed by Lob's theorem.)

**JGWeissman**on The Inefficiency of Theoretical Discovery · 2013-11-05T16:22:50.685Z · LW · GW

This is a good approach for dealing with the halting problem, but I think that Lob's theorem is not so closely related that getting around the halting problem means you get around Lob's theorem.

The theoretical AI that would run into Lob's theorem would need more general proof producing capability than these relatively simple program analyzers.

It seems like these program analyzers are built around the specification S they check for, with the human programmer doing the work of constructing a structure of a proof which can be filled in to a complete proof by supplying basic facts that the program can check. For example, I have a library that produces .Net assemblies, with byte code that targets a stack based virtual machine, and I want to verify that instructions to read elements off the stack get elements of the type is expecting. So I wrote my library to keep track of types that could be written to the stack at any execution point (represented by a stack of types). It is straightforward to compute the possible stack states after each instruction, given that instruction and the previous possible stack trace, and determine if the instruction is legal given the previous state (well, branching makes it more complicated, but not too much). So, in this case, I came up with the structure of tracking possible states after each instruction, and then it was straightforward to write my program to fill in that structure, but I did not, and don't know how to, write my program to come up with the proof structure. It is therefor easy to be confident that proof will have nice properties, because the space of possible proofs with this structure is much smaller than the space of all possible proofs.

The theoretical AI that would run into Lob's theorem would be able to come up with proof structures, as an AI that could only use proof structures prepackaged by human programmers would have huge gaps in its capabilities. This may introduce difficulties not seen in simple program analyzers.

In the example I asked about earlier, the optimizer that needs to prove things about its own future decisions, it runs into an issue that Lob's theorem applies to. In order prove that its own future decisions are good, it would rely on the fact that it's future decision are based on its own sound proof system, so it needs to assert that its own proof system is sound, that if its proof system says "X", then X. Lob's theorem says that, for all statements X, if a system P (at least as powerful as Peano arithmetic) says "P says 'X' implies X" then P says X. So, if the system asserts its own soundness, it asserts anything.

So, in summary:

Lob's theorem is a problem for generally powerful formal logic based proof systems that assert their own soundness.

Program analyzers are formal logic based proof systems that do not run into Lob's theorem, because they are not generally powerful, and do not assert their own soundness.

Humans are generally powerful proof generators than can have confidence in their own soundness, but are not purely based on formal logic, (we might intuitively dismiss a formal logical proof as "spurious") and so can get around Lob's theorem, but we don't understand how humans do this well enough to write a program to do it.

Paul's idea of a probabilistic proof system could lead to a generally powerful proof generator with confidence in its own soundness that is not based on formal logic, so that it gets around Lob's theorem, that we can understand well enough to write a program for.