Comments
Thanks for pointing that out. My feeling is still "well yes, that's technically true, but it still seems unnatural, and explosion is still the odd axiom out".
Coq, for example, allows empty case expressions (for empty types), and I expect that other languages which double as proof assistants would support them as well... for the very purpose of satisfying explosion. General purpose languages like Haskell (and I just checked OCaml too) can seemingly overlook explosion/empty cases with few if any practical problems.
Thanks for the suggestion, Patrick. I've now adapted Tsvi's formal argument to the reframed "5-equals-10 problem" and added it into the last section of my writeup.
Because the length of Scott's Moloch post greatly exceeds my working memory (to the extent that I had trouble remembering what the point was by the end) I made these notes. I hope this is the right place to share them.
Notes on Moloch (ancient god of child sacrifice)
http://slatestarcodex.com/2014/07/30/meditations-on-moloch/
Intro - no real content.
Moloch as coordination failure: everyone makes a sacrifice to optimize for a zero-sum competition, ends up with the same relative status, but worse absolute status.
- 10 examples: Prisoner's Dilemma, dollar auctions, fish-farming story (tragedy of the commons), Malthusian trap, ruthless/exploitative Capitalist markets, the two-income trap, agriculture, arms races, cancer, political race to the bottom (lowering taxes to attract business)
- 4 partial examples: inefficient education, inefficient science, government corruption (corporate welfare), Congress (representatives voting against good of nation for good of constituency)
Existing systems are created by incentive structures, not agents, e.g. Las Vegas caused by a known bias in human reward circuitry, not optimization for human values.
But sometimes we move uphill anyway. Possible explanations:
- Excess resources / we are in the dream time and can afford non-competitive behavior.
- Physical limitations to what can be sacrificed
- Economic competition actually producing positive utility for consumers (but this is fragile)
- Coordination, e.g. via governments, guilds, friendships, etc.
Technology/ingenuity creates new opportunities to fall into such traps. Technology overcomes physical limitations, consumes excess resources. Automation further decouples economic activity from human values. Technology can improve coordination, but can also exacerbate existing conflicts by giving all sides more power.
AGI opens up whole new worlds of traps: Yudkowsky's paperclipper, Hanson's subsistence-level ems, Bostrom's Disneyland with no children.
6 & 7. Gnon - basically the god of the conservative scarcity mindset. Nick Land advocates compliance; Nyan wants to capture Gnon and build a walled garden. Scott warns that Moloch is far more terrifying than Gnon and will kill both of them anyway.
8 & 9. So we have to kill this Moloch guy, by lifting a better God to Heaven (Elua).
Cool.
For those who don't want to wait until April 7th, Udacity is scheduled to launch their own Intro to Data Science on Feb 5th (this Wednesday). I expect it to be easier/shallower than the Hopkins/Coursera sequence, but it wins on actionability.
For a countable language L and theory T (say, with no finite models), I believe the standard interpretation of "space of all models" is "space of all models with the natural numbers as the underlying set". The latter is a set with cardinality continuum (it clearly can't be larger, and it can't be smaller either: applying the continuum-many permutations of the naturals to a single model yields that many distinct, pairwise isomorphic models).
Moreover this space of models has a natural topology, with basic open sets {M: M models phi} for L-sentences phi, so it makes sense to talk about (Borel) probability measures on this space, and the measures of such sets. (I believe this topology is Polish, actually making the space Borel isomorphic to the real numbers.)
Note that by Lowenheim-Skolem, any model of T admits a countable elementary substructure, so to the extent that we only care about models up to some reasonable equivalence, countable models (hence ones isomorphic to models on the naturals) are enough to capture the relevant behavior. (In particular, as pengvado points out, the Christiano et al paper only really cares about the complete theories realized by models, so models on the naturals suffice.)
For the sake of the absent gods do not start with an IDE or anything that hides the compilation step.
Could you say more about why this is important for beginning programmers?
While you're certainly technically correct, it's an easy/common mistake for people to focus on the "save all you can" part, overlooking "gain all you can" opportunities. The EA movement is notable for proactively trying to counter this mistake, and apparently so is John Wesley.
For the same reasons you outline above, I'm okay with fighting this hypothetical target.
If I must dignify the hypothesis with a strategy: my "buy" and "sell" prices for such a bet correspond to the inner and outer measures of the target, respectively.
In other words, "what is the measure of an unmeasurable set?". The question is wrong.
Several popular comments say something to the effect of "I was too arrogant to just get with the program and cooperate with the other humans".
The biggest of my own arrogant mistakes was not taking CS/programming very seriously while in college because I was dead set on becoming a mathematician, and writing code was "boring" compared to math. Further arrogance: I wasn't fazed by the disparity between the number of graduating Ph. D.'s and the number of academic jobs.
I found out in grad school that my level of talent in mathematics, while somewhat rare, was certainly not so exceedingly rare that real world considerations would not apply to me.
I've since changed my attitude, and I'm working on fixing this mistake.
GEB will take you from superficial knowledge to full grok.
A word of caution: there is a risk when reading popular science/math books like GEB of coming away feeling like one understands something at a higher level than one actually does, particularly if one hasn't already studied the subject formally.
If one has formally studied incompleteness before, it's easy to wave away standard primitive recursive derivations (e.g. the proof predicate) as tedious and trivial and beside the main point, but having this attitude the first time around could be dangerous.
I read GEB years ago, and recall liking it quite a bit, though I disagree with Louie Helm's endorsement of GEB as a course reference, at least without supplementation from a "standard" source like these notes (or any formal logic textbook).
Thanks!
Martin Gardner's Mathematical Games column from Scientific American, Volume 242, Number 6, June 1980 (paywalled).
EDIT: escape characters
Robin Hanson has advocated this point of view.
I find the argument quite unconvincing; Hanson seems to be making the mistake of conflating "life worth living" with "not committing suicide" that is well addressed in MTGandP's reply (and grandchildren).
The 52-karma top comment of the Virtual Employment thread has been deleted. I gather that it said something about copywriting, with online skill tests for prospective applicants.
Can anyone provide a bit more information about this apparently quite valuable comment?
Awesome, thanks!
Thanks for the nice comment. I listed the PD post first, as it is probably the most readable of the three, written more like an article than like notes.
Thanks for looking! I'll try to get my hands on a physical copy, as the Google Books version has highly distracting page omissions.
Unless I'm going insane, Luke_A_Somers' comment below is incorrect. If you defect and your opponent times out, you still get 1 point and they get 0, which is marginally better than both getting 1 point in the case of mutual defection.
That was the purpose of my (sleep 9). I figured anyone who was going to eval me twice against anything other than CooperateBot is going to figure out that I'm a jerk and defect against me, so I'm only getting one point from them anyway. Might as well lower their score by 1.
My assumption might not have been correct though. In the original scoreboard, T (who times out against me) actually does cooperate with K, who defected against everybody!
The bizarre-looking always-false conditional was a long shot at detecting simulation. I heard an idea at a LW meetup (maybe from So8res?) that players might remove sleeps from each other's code before calling eval. I figured I might as well fake such players out if there were any (there were not).
Yes, the "here's why Quinn is wrong about CooperateBot being a troll submission" comments were valuable, so I don't regret provoking them. Presumably if my comment had said "players" instead of "trolls" from the outset, it would have been ignored for being inoffensive and content-free.
But provoking a few good comments was a happy accident from my perspective. I will avoid casual use of the word "troll" on this site, unless I have a specific purpose in mind.
Yes, and I agree with this. I'm familiar with Straw Vulcanism and accept that guessing incorrectly is my mistake, not others'.
It seems anger and frustration were read into my comment, when in fact I was merely surprised, so I've edited out the offending t-word.
Ken Binmore & Hyun Song Shin. Algorithmic knowledge and game theory. (Chapter 9 of Knowledge, Belief, and Strategic Interaction by Cristina Bicchieri.)
EDIT: Actually, I'd be pretty happy to see any paper containing both the phrases "common knowledge" and "Löb's theorem". This particular paper is probably not the only one.
Thank you.
Can you elaborate on this?
You are right that I used the inflammatory t-word because CooperateBot submitters are probably not trying to win. I certainly expected to see DefectBots (or CliqueBots from optimists), and agree that the competition should have been seeded with both CooperateBots and DefectBots.
But I don't understand this at all:
But I find this a poor excuse. (It's why I always give the maximum possible number whenever I play "guess 2/3rds of the average.")
To me, this looks like t-wording the people who play 0.
Are we thinking of the same game, where the payout is constant?
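For reference, the reason 0 is the benchmark play in the usual (non-constant-payout) guess-2/3 game: if everyone best-responds to the previous round's guesses, the guesses contract geometrically toward the unique equilibrium at 0. A minimal sketch:

```python
# Iterated best response in the standard "guess 2/3 of the average" game:
# starting from any common guess, repeatedly best-responding to the
# previous round drives guesses to the unique equilibrium at 0.
guess = 100.0
history = [guess]
for _ in range(50):
    guess = (2 / 3) * guess   # best response if everyone else guessed `guess`
    history.append(guess)

assert history[-1] < 1e-6     # effectively 0 after 50 rounds
```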
I was a bit irritated that no fewer than three players submitted CooperateBot, as I cooperated with it in hopes of tricking bots like submission B.
EDIT: More offense was taken at my use of the word "trolls" than was intended.
Of course, when you ignore the payoff matrix, the argument is no longer really about the Prisoner's Dilemma (and so I don't know that it's fair to accuse the barber paradox of having a lot of extra cruft not present in the PD formulation).
But I agree that this is a cute presentation of diagonal arguments in culturally relevant terms. A fun read.
I'm glad you bring this up. I have two comments:
(1) The absoluteness argument shows that intermediate uses of Choice don't matter. They happen (definably) in L, and the conclusion reflects back up to V.
(2) As you suspect, you don't really need Choice for compactness of the Hilbert cube. Given a sequence in it, I can define a convergent subsequence by repeatedly pigeonholing infinitely many points into finitely many holes. I believe sequential compactness is what is needed for the Kakutani theorem.
Variants of open cover compactness are derivable from sequential compactness without Choice in Polish spaces, in case something like that is needed. Or if you somehow really need full open cover compactness... well, return to point (1), I guess.
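The pigeonholing in point (2) can be sketched in one dimension: given a sequence in [0, 1], halve the interval, keep a half containing at least half of the surviving terms, and repeat; the survivors get trapped in arbitrarily small nested intervals, tracing out a convergent subsequence. (The actual Hilbert-cube proof does this coordinate-by-coordinate with a diagonal pass, which this finite, illustrative sketch omits.)

```python
import random

def trapped_subsequence(xs, rounds=20):
    """Indices of terms of xs confined to an interval of width 2**-rounds."""
    lo, hi = 0.0, 1.0
    indices = list(range(len(xs)))
    for _ in range(rounds):
        mid = (lo + hi) / 2
        left = [i for i in indices if xs[i] <= mid]
        right = [i for i in indices if xs[i] > mid]
        # Pigeonhole: one half must contain at least half the survivors.
        if len(left) >= len(right):
            indices, hi = left, mid
        else:
            indices, lo = right, mid
    return indices

random.seed(0)
xs = [random.random() for _ in range(100_000)]
sub = trapped_subsequence(xs)
assert sub  # keeping the larger half each round means it never empties
assert max(xs[i] for i in sub) - min(xs[i] for i in sub) <= 2 ** -20
```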
I gave this paper a pretty thorough read-through, and decided that the best way to really see what was going on would be to write it all down, expanding the most interesting arguments and collapsing others. This has been my main strategy for learning math for a while now, but I just decided recently that I ought to start a blog for this kind of thing. Although I wrote this post mainly for myself, it might be helpful for anyone else who's interested in reading the paper.
Towards the end, I note that the result can be made "constructive", in the sense of avoiding the Axiom of Choice by absoluteness arguments. Basically by Shoenfield's absoluteness theorem, but I went through the argument by hand because I was not aware of that theorem until I went to Wikipedia to find something to link for absoluteness.
The computability problem is due to the unbounded ability of P to reason about theorems in the base theory (so that would be a necessary point of relaxation). A computable total function can't even assign values > 1/2 to all the theorems of PA and values < 1/2 to all the sentences refuted by PA.
The set A is convex because the convex combination (t times one plus (1-t) times the other) of two coherent probability distributions remains a coherent probability distribution. This in turn is because the convex combination of two probability measures over a space of models (cf. definition 1) remains a probability measure over the space of models.
I think, but am not sure, that your issue is looking at arbitrary points of [0,1]^{L'}, rather than the ones which correspond to probability measures.
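The convexity claim is easy to sanity-check numerically. Everything below is a toy stand-in (a three-element "space of models" and arbitrary distributions, not anything from the paper), just to see the mixture property in action:

```python
def mix(p, q, t):
    """Convex combination t*p + (1-t)*q of two distributions given as dicts."""
    return {m: t * p[m] + (1 - t) * q[m] for m in p}

# Hypothetical stand-ins: three "models" and two distributions over them.
p = {"M1": 0.5, "M2": 0.3, "M3": 0.2}
q = {"M1": 0.1, "M2": 0.1, "M3": 0.8}
t = 0.25
r = mix(p, q, t)

# Still a probability distribution:
assert abs(sum(r.values()) - 1.0) < 1e-12

# The probability assigned to a sentence phi (identified with the set of
# models where it holds) mixes the same way:
phi = {"M1", "M3"}
prob = lambda d: sum(v for m, v in d.items() if m in phi)
assert abs(prob(r) - (t * prob(p) + (1 - t) * prob(q))) < 1e-12
```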
Not only is PA not finitely axiomatizable, but no consistent extension of PA is. (I think this is true; the same proof that works for PA should go through here, but I haven't checked the details.)
Well if we had this, we would know immediately that Q + Con(PA) is not an extension of PA (which is what we originally wanted), because it certainly is finitely axiomatizable. I know there are several proofs that PA is not finitely axiomatizable, but I have not read any of them, so can't comment on the strengthened statement, though it sounds true.
Ah, so my question was more along the line: does finite axiomatizability of a stronger (consistent) theory imply finite axiomatizability of the weaker theory? (This would of course imply Q + Con(PA) is not stronger than PA, Q being the usual symbol for Robinson arithmetic.)
On the model theoretic side, I think I can make something work, but it depends on distorting the specific definition of Con(PA) in a way that I'm not really happy about. In any case, I agree that your example is trivial to state and trivial to believe correct, but maybe it's less trivial to prove correct.
Here's what I was thinking:
Consider the predicate P(x) which says "if x != Sx, then x does not encode a PA-proof of 0=1", and let ConMinus(PA) say for all x, P(x). Now, I think one could argue that ConMinus is a fair definition of (or substitute for?) Con, in that qualifying a formula with "if x != Sx" does not change its meaning in the standard model. Alternately, you could push this "if x != Sx" clause deeper, into basically every formula you would use to define the primitive recursive functions needed to talk about consistency in the first place, and you would not change the meanings of these formulas in the standard model. (I guess what I'm saying is that "the" formula asserting the consistency of PA is poorly specified.)
Also, PA is smart enough to prove that numbers are not their own successors, so PA believes in the equivalence of Con and ConMinus. In particular, PA does not prove ConMinus(PA), so PA is not stronger than Q + ConMinus(PA).
On the other hand, let M be the non-negative integers, together with one additional point omega. Put S(omega) = omega, put omega + anything = omega = anything + omega, and similarly for multiplication (except 0 * omega = omega * 0 = 0). I am pretty sure this is a model of Q.
Q is smart enough about its standard integers that it knows none of them encode PA-proofs of 0=1 (the "proves" predicate being Delta_0). Thus the model M satisfies Q + ConMinus(PA). But now we can see that Q + ConMinus(PA) is not stronger than PA, because PA proves "for all x, x is not equal to Sx", yet this statement fails in a model of Q + ConMinus(PA).
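The model M above is concrete enough to spot-check by machine. Here is a sketch: the non-negative integers plus an absorbing point omega, with a handful of Q's axioms verified on a finite fragment of the domain (finitely many cases obviously aren't a proof, just a sanity check), and the failure of PA's "no number is its own successor" exhibited at omega:

```python
OMEGA = "omega"  # the extra point adjoined to the non-negative integers

def S(x):
    return OMEGA if x == OMEGA else x + 1

def add(x, y):
    # omega + anything = anything + omega = omega
    return OMEGA if OMEGA in (x, y) else x + y

def mul(x, y):
    # as above for multiplication, except 0 * omega = omega * 0 = 0
    if x == 0 or y == 0:
        return 0
    return OMEGA if OMEGA in (x, y) else x * y

elements = list(range(20)) + [OMEGA]

# Spot-check some Q axioms on this fragment:
for x in elements:
    assert S(x) != 0                               # Sx != 0
    assert add(x, 0) == x                          # x + 0 = x
    assert mul(x, 0) == 0                          # x * 0 = 0
    for y in elements:
        assert add(x, S(y)) == S(add(x, y))        # x + Sy = S(x + y)
        assert mul(x, S(y)) == add(mul(x, y), x)   # x * Sy = (x * y) + x

# PA's theorem "for all x, x != Sx" fails in M:
assert S(OMEGA) == OMEGA
```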
EDIT: escape characters for *.
Your post confused me for a moment, because Robinson + Con(PA) is of course not weaker than PA. It proves Con(PA), and PA doesn't.
I see now that your point is that Robinson arithmetic is sufficiently weak compared to PA that PA should not be weaker than Robinson + Con(PA). Is there an obvious proof of this?
(For example, if Robinson + Con(PA) proved all theorems of PA, would this contradict the fact that PA is not finitely axiomatizable?)
Since December, I've been pursuing a "remedial computer science education", for the sake of both well-roundedness and employability. My background is in the purest of pure math (Ph. Dropout from a well-ranked program), so I feel I can move fairly quickly here, though the territory is new.
My biggest milestone to date has been solving the first 100 Project Euler problems in Python (no omissions!). I had had a bit of Python experience before, and I picked 100 as the smallest number that sounded impressive (to me).
Second biggest milestone: following a course outline, I wrote an interpreter for a very limited subset of Scheme/Racket. This really helped de-mystify programming languages for me. (Although rather than learn OCaml like the course wanted, I just hacked it together in Python so that I could move on to a new project sooner.)
In the same vein, I'm currently reading and working through SICP, still using Racket. I'm in Chapter 3 of 5, though I'm often peeking ahead to Chapter 4 because it looks pretty exciting.
Of course, I won't be a true LISP wizard without understanding macros, so the next (or concurrent) project is to go through the relevant Racket Docs tutorial.
I have some other likely future projects in mind, though I'm actually trying not to plan too far ahead lest it all appear more daunting.
- Forcing myself through some C, to build character. This was explicitly recommended by a software engineer friend as a more "useful" way to spend my time than learning to LISP.
- An algorithms course, possibly using this book
Registered.
From Kahneman's Thinking, Fast and Slow (p 325):
The probability of a rare event is most likely to be overestimated when the alternative is not fully specified... [Researcher Craig Fox] asked [participants] to estimate the probability that each of the eight participating teams would win the playoff; the victory of each team in turn was the focal event.
... The result: the probability judgments generated successively for the eight teams added up to 240%!
Do you (r_claypool) have reason to suspect that Christianity is much more likely to be true than other, (almost-) mutually exclusive supernatural worldviews like, say, Old Norse Paganism? If not, then 5% for Christianity is absurdly high.
But orthonormal, your example displays hindsight bias rather than confirmation bias!
I interpret billswift's comment to mean:
GlaDOS, you should not just seek confirmation of the legitimacy of the text; you should also seek refutation.
(Or possibly it was meant the other way around?)
In any case, I agree that billswift's comment is off-base, because GLaDOS' comment does not actually show confirmation bias.
I am currently reading Kahneman's book, and about 100 pages in I realized I was going to cache a lot more of the information if I started mapping out some of the dependencies between ideas in a directed graph. Example: I've got an edge from {Substitution} to {Affect heuristic}, labeled with the reminder "How do I feel about it? vs. What do I think about it?". My goal is not to write down everything I want to remember, rather to (1) provide just enough to jog my memory when I consult this graph in the future, and (2) force me to think critically about what I'm reading when deciding whether or not to add more nodes and edges.
Grognor, I don't think it's fair to insinuate that you may have learned a wrong lesson here. If it's wrong (I actually doubt that it is), then it's up to you to try to resist learning it.
As regards walking readers into a trap to teach them lessons, one of my all-time favorite LW posts does exactly this, but is very forthcoming about it. By contrast, I think thomblake overestimates the absurdity of the examples here: I thought they seemed plausible, and that "Frodo Baggins" was just poor reasoning. The comments show I'm not alone here. This level of subtlety may be appropriate on April 1st, but by April 3rd, it's dated. I would recommend editing in a final line after the conclusion but before the references indicating that this post was an April Fool's joke.
I really, really dislike April Fool's jokes like this. Somebody will stumble onto this post at a later date, read it quickly, and come away misinformed.
I'll grant that the obviously horrible "Frodo Baggins" example should leave a bad taste in rationalists' mouths, but a glance at the comments shows that several readers initially took the post seriously, even on April 1st.
I suspect it has to do with some LW users taking FAI seriously and dropping everything to join the cause, as suggested in this comment by cousin_it. In the following discussion, RichardKennaway specifically links to "Taking ideas seriously".
Oh! Well I feel stupid indeed. I thought that all the text after the sidenote was a quotation from Luke (which I would find at the link in said sidenote), rather than a continuation of Mike Darwin's statement. I don't know why I didn't even consider the latter.
Additionally, the link in the OP is wrong. I followed it in hopes that Luke would provide a citation where I could see these estimates.
Well, models can have the same reals by fiat. If I cut off an existing model below an inaccessible, I certainly haven't changed the reals. Alternately I could restrict to the constructible closure of the reals L(R), which satisfies ZF but generally fails Choice (you don't expect to have a well-ordering of the reals in this model).
I think, though, that Stuart_Armstrong's statement
Often, different models of set theory will have the same model of the reals inside them
is mistaken, or at least misguided. Models of set theory and their corresponding sets of reals are extremely pliable, especially by the method of forcing (Cohen proved CH can consistently fail by just cramming tons of reals into an existing model without changing the ordinal values of that model's Alephs), and I think it's naive to hope for anything like One True Real Line.
The predicate "is a real number" is absolute for transitive models of ZFC in the sense that if M and N are such models with M contained in N, then for every element x of M, the two models agree on whether x is a real number. But it can certainly happen than N has more real numbers than M; they just have to lie completely outside of M.
Example 1: If M is countable with respect to N, then obviously M doesn't contain all of N's reals.
Example 2 (perhaps more relevant to what you asked): Under mild large cardinal assumptions (existence of a measurable cardinal is sufficient), there exists a real number 0# (zero-sharp) which encodes the shortcomings of Gödel's Constructible Universe L. In particular 0# lies outside of L, so L does not contain all the reals.
Thus if you started with L and insisted on adding a measurable cardinal on top, you would have to also add more reals as well.
Cached wisdom?
Anyway, I'd be more interested in hearing the regrets of those people who lived true to themselves, didn't work too hard, let themselves be happier, etc. Do they wish they'd worked harder and "made something of themselves"? Been better at cooperating with the rest of society?
Signed up. Upon reflection, I believe the deadline is what let me get away with doing this right now at the expense of putting off studying for yet another hour. But it's hard to say, because I decided pretty quickly that I was going to do it, and I only came up with that explanation after the fact.
Actually my revised opinion, as expressed in my reply to Tyrell_McAllister, is that the authors' analysis is correct given the highly unlikely set-up. In a more realistic scenario, I accept the equivalences A~B and C~D, but not B~C.
I claim that the answers to E, F, and G should indeed be the same, but H is not equivalent to them. This should be intuitive. Their line of argument does not claim H is equivalent to E/F/G - do the math out and you'll see.
I really don't know what you have in mind here. Do you also claim that cases A, B, C are equivalent to each other but not to D?
After further reflection, I want to say that the problem is wrong (and several other commenters have said something similar): the premise that your money buys you no expected utility post mortem is generally incompatible with your survival having finite positive utility.
Your calculation is of course correct insofar as it stays within the scope of the problem. But note that it goes through exactly the same for my cases F and G. There you'll end up paying iff X ≤ L, and thus you'll pay the same amount to remove just 1 bullet from a full 100-shooter as to remove all 100 of them.
I also reject the claim that C and B are equivalent (unless the utility of survival is 0, +infinity, or -infinity). If I accepted their line of argument, then I would also have to answer the following set of questions with a single answer.
Question E: Given that you're playing Russian Roulette with a full 100-shooter, how much would you pay to remove all 100 of the bullets?
Question F: Given that you're playing Russian Roulette with a full 1-shooter, how much would you pay to remove the bullet?
Question G: With 99% certainty, you will be executed. With 1% certainty you will be forced to play Russian Roulette with a full 1-shooter. How much would you pay to remove the bullet?
Question H: Given that you're playing Russian Roulette with a full 100-shooter, how much would you pay to remove one of the bullets?
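The disagreement over these questions can be made concrete with a quick numeric check. Everything here is a hypothetical sketch: wealth W, a log utility for being alive, and G modeled as the decision you face once you're actually at the table (so it reduces to F). Under the thread's premise that money is worth nothing to you once dead, the maximum you'd pay comes out identical in all four questions (you'd hand over everything), which is exactly the incompatibility flagged above; give post-mortem money any positive value, and H separates from E/F/G.

```python
import math

# Hypothetical numbers for a sketch, not anyone's actual utilities:
W = 100.0                                  # total wealth

def u_alive(w):
    return math.log(1 + w)                 # nonnegative, increasing

def wtp(p_survive_pay, p_survive_decline, u_dead=lambda w: 0.0):
    """Largest payment X (on a 0.5 grid) whose expected utility beats declining."""
    best = 0.0
    for i in range(int(W / 0.5) + 1):
        X = i * 0.5
        eu_pay = (p_survive_pay * u_alive(W - X)
                  + (1 - p_survive_pay) * u_dead(W - X))
        eu_decline = (p_survive_decline * u_alive(W)
                      + (1 - p_survive_decline) * u_dead(W))
        if eu_pay >= eu_decline:
            best = X
    return best

# The thread's premise: money is worth nothing to you once dead (u_dead = 0).
E = wtp(1.0, 0.0)     # full 100-shooter, remove all 100 bullets
F = wtp(1.0, 0.0)     # full 1-shooter, remove the bullet
G = wtp(1.0, 0.0)     # once you're actually at the table, same decision as F
H = wtp(0.01, 0.0)    # full 100-shooter, remove 1 bullet: 1% survival vs 0%
assert E == F == G == H == W   # you'd hand over everything in all four cases

# Give post-mortem money any positive value (e.g. a bequest motive) and
# H comes apart from E/F/G:
bequest = lambda w: 0.02 * w
E2 = wtp(1.0, 0.0, u_dead=bequest)
H2 = wtp(0.01, 0.0, u_dead=bequest)
assert H2 < E2
```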