nick_hay feed - LessWrong 2.0 Readernick_hay’s posts and comments on the Effective Altruism Forumen-usComment by Nick_Hay on Standard and Nonstandard Numbers
https://lw2.issarice.com/posts/i7oNcHR3ZSnEAM29X/standard-and-nonstandard-numbers#qxzxBs7q9ghS6hJRz
<p>Very nice. These notes say that every countable nonstandard model of Peano arithmetic is isomorphic, as an ordered set, to the natural numbers followed by lexicographically ordered pairs (r, z) for r a positive rational and z an integer. If I remember rightly, the ordering can be defined in terms of addition: x <= y iff exists z. x+z = y. So if we want a countable nonstandard model of Peano arithmetic with both the successor function and addition, we need all of these nonstandard numbers.</p>
<p>It seems that if we only care about Peano arithmetic with the successor function, then the naturals plus a single copy of the integers is a model. If I was trying to prove this, I'd think that just looking at the successor function, to any first-order predicate an element of the copy of the integers would be indistinguishable from a very large standard natural number, by standard FO locality results.</p>
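<p>As a rough illustration of the order type described above (the naturals followed by lexicographically ordered pairs (r, z)), here is a minimal sketch. The helper names <code>standard</code> and <code>nonstandard</code> and the tuple encoding are assumptions made for illustration; the sketch captures only the ordering, not addition or multiplication.</p>

```python
from fractions import Fraction

def standard(n):
    """Sort key for the standard natural number n (hypothetical encoding)."""
    return (0, Fraction(0), n)

def nonstandard(r, z):
    """Sort key for the nonstandard element at integer offset z in block r.

    r is a positive rational labelling a copy of the integers; every such
    element sorts after every standard natural number.
    """
    return (1, Fraction(r), z)

# Python compares tuples lexicographically, which matches the ordering:
# standard numbers first, then blocks ordered by r, then by z within a block.
elements = [
    nonstandard(Fraction(1, 2), -3),
    standard(10**9),
    nonstandard(2, 0),
    standard(0),
    nonstandard(Fraction(1, 2), 4),
]
ordered = sorted(elements)
```

<p>Sorting places even a very large standard number such as 10**9 before every nonstandard element, and within the block r = 1/2 the offset -3 comes before 4.</p>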
nick_hayqxzxBs7q9ghS6hJRz2012-12-20T12:02:42.585ZComment by Nick_Hay on Standard and Nonstandard Numbers
https://lw2.issarice.com/posts/i7oNcHR3ZSnEAM29X/standard-and-nonstandard-numbers#DbnCha7eytPPNKaeP
<p>Fascinating. I thought Tennenbaum's theorem implied non-standard models were rather impossible to visualize. The non-standard model of Peano arithmetic illustrated in the diagram only gives the successor relation; there's no definition of addition and multiplication. Tennenbaum's theorem implies there's no computable way to do this, but is there a proof that they can be defined at all for this particular model?</p>
nick_hayDbnCha7eytPPNKaeP2012-12-20T07:00:03.336ZComment by Nick_Hay on Review of Lakoff & Johnson, 'Philosophy in the Flesh'
https://lw2.issarice.com/posts/iwEDctfACQ6sQQk9j/review-of-lakoff-and-johnson-philosophy-in-the-flesh#CAXmJmuKjZoRSF6sm
<p>The chapter on Chomsky is contrasting the generative grammar approach, which Lakoff used to work within, to the cognitive science inspired cognitive linguistics approach, which Lakoff has been working in for the last few decades. Cognitive linguistics includes cognitive semantics which is rather different to generative semantics.</p>
nick_hayCAXmJmuKjZoRSF6sm2011-11-07T00:21:24.562ZComment by Nick_Hay on Review of Lakoff & Johnson, 'Philosophy in the Flesh'
https://lw2.issarice.com/posts/iwEDctfACQ6sQQk9j/review-of-lakoff-and-johnson-philosophy-in-the-flesh#EZojkdxRMYWj9QFYa
<p>I largely agree with your critique, but more as a description of a different book that could have been written in this book's place. For example, a book on philosophy applying the results of this book's methodology, of which chapter 25 is a poor substitute. Or books drilling into one particular area in more detail with careful connections to the literature. This book serves better as an inspiring manifesto.</p>
<blockquote>
<p>While these chapters are enlightening, they depend too heavily on the earlier account of metaphor, rarely draw upon other findings in cognitive science that are likely relevant, are sparse in scientific citations, and (as I've said) rarely cite actual philosophers claiming the things they say that philosophers claim.</p>
</blockquote>
<p>Why is the dependence on the earlier theory of metaphor a problem?</p>
<p>Do you think the authors misrepresent what philosophers claim, in those chapters addressing philosophy (15-24) rather than (informal) philosophical ideas (9-14)?</p>
nick_hayEZojkdxRMYWj9QFYa2011-11-06T23:17:38.306ZComment by Nick_Hay on Procedural Knowledge Gaps
https://lw2.issarice.com/posts/ka8eveZpT7hXLhRTM/procedural-knowledge-gaps#naFqwfFmQbcJKAPRp
<p>If your goal in exercising is to lose weight, have you tried replacing carbohydrates with fat in your diet? Forcing yourself to exercise will serve to work up an appetite and make you hungry, but not to lose weight. There is a correlation between exercising and being thin, but the causality is generally perceived the wrong way around. There is also a correlation between exercising and (temporarily) losing weight, but that is confounded by diet changes, which typically involve reducing carbohydrate intake.</p>
<p>I've heard you mention Gary Taubes's work, but not that you've read it. If you haven't read his book, he has a new, shorter one which is well worth reading, linked here: <a href="http://www.garytaubes.com/2010/12/inanity-of-overeating/">http://www.garytaubes.com/2010/12/inanity-of-overeating/</a> The appendix has specific diet recommendations. Also good are these notes: <a href="http://higher-thought.net/complete-notes-to-good-calories-bad-calories/">http://higher-thought.net/complete-notes-to-good-calories-bad-calories/</a></p>
nick_haynaFqwfFmQbcJKAPRp2011-02-08T07:38:10.518ZComment by Nick_Hay on Berkeley LW Meet-up Saturday November 6
https://lw2.issarice.com/posts/faYaa4ry7M7buSP9L/berkeley-lw-meet-up-saturday-november-6#Krk9hNsrQmgcxwMxG
<p>The T-rex is in the Valley Life Sciences Building. There are a few other fossils there too.</p>
nick_hayKrk9hNsrQmgcxwMxG2010-11-06T02:55:54.795ZComment by Nick_Hay on Fundamentally Flawed, or Fast and Frugal?
https://lw2.issarice.com/posts/psQYbMLWzS9sTsT2M/fundamentally-flawed-or-fast-and-frugal#ZfiZJDew4hQvnm9eh
<p>Idealized Bayesians don't have to be logically omniscient -- they can have a prior which assigns probability to logically impossible worlds.</p>
nick_hayZfiZJDew4hQvnm9eh2009-12-21T22:52:01.392ZComment by Nick_Hay on Auckland meet up Saturday Nov 28th
https://lw2.issarice.com/posts/AnGR4v3oDRAkyhdqu/auckland-meet-up-saturday-nov-28th#W7YTuFYYaKCynSeya
<p>I would be there, but I'm not back in NZ until 16th December! Everyone else should definitely go.</p>
nick_hayW7YTuFYYaKCynSeya2009-11-15T07:26:07.738ZComment by Nick_Hay on Expected utility without the independence axiom
https://lw2.issarice.com/posts/tGhz4aKyNzXjvnWhX/expected-utility-without-the-independence-axiom#QfwFmiBPiGCX6AbYW
<p>The von Neumann-Morgenstern axioms talk just about preferences over lotteries, which are simply probability distributions over outcomes. That is, you have an unstructured set O of outcomes, and you have a total preordering over Dist(O), the set of probability distributions over O. They do not talk about a utility function. This is quite elegant, because to make decisions you must have preferences over distributions over outcomes, but you don't need to assume that O has a certain structure, e.g. that of the reals.</p>
<p>The expected utility theorem says that preferences which satisfy the first four axioms are exactly those which can be represented by:</p>
<p> A <= B iff E[U;A] <= E[U;B]</p>
<p>for some utility function U: O -> R, where</p>
<p> E[U;A] = \sum_{o} A(o) U(o)</p>
<p>However, U is only defined up to positive affine transformation i.e. aU+b will work equally well for any a>0. In particular, you can amplify the standard deviation as much as you like by redefining U.</p>
<p>Your axioms require you to pick a particular representation of U for them to make sense. How do you choose this U? Even with a mechanism for choosing U, e.g. assume bounded nontrivial preferences and pick the unique U such that \sup_{x} U(x) = 1 and \inf_{x} U(x) = 0, this is still less elegant than talking directly about lotteries.</p>
<p>Can you redefine your axioms to talk only about lotteries over outcomes?</p>
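<p>To make the affine-invariance point concrete, here is a minimal sketch. The outcome names, the lotteries A and B, and the constants a and b are all made up for illustration. It checks that replacing U with aU+b (for a>0) leaves the preference between two lotteries unchanged while scaling the standard deviation of utility by a.</p>

```python
import math

def expected_utility(lottery, utility):
    """E[U;A] = sum over outcomes o of A(o) * U(o)."""
    return sum(prob * utility[o] for o, prob in lottery.items())

def utility_std(lottery, utility):
    """Standard deviation of utility under the lottery."""
    mean = expected_utility(lottery, utility)
    var = sum(prob * (utility[o] - mean) ** 2 for o, prob in lottery.items())
    return math.sqrt(var)

# Hypothetical outcomes and utilities, chosen only for illustration.
U = {"apple": 0.0, "banana": 1.0, "cherry": 3.0}
A = {"apple": 0.5, "cherry": 0.5}   # a 50/50 gamble
B = {"banana": 1.0}                 # a sure thing

# A positive affine transformation V = a*U + b with a > 0.
a, b = 10.0, -4.0
V = {o: a * u + b for o, u in U.items()}

a_weakly_below_b_under_U = expected_utility(A, U) <= expected_utility(B, U)
a_weakly_below_b_under_V = expected_utility(A, V) <= expected_utility(B, V)
std_under_U = utility_std(A, U)
std_under_V = utility_std(A, V)
```

<p>Both representations rank A and B the same way, yet the standard deviation of the gamble A is ten times larger under V, so any axiom stated in terms of that standard deviation depends on which representative of U was picked.</p>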
nick_hayQfwFmiBPiGCX6AbYW2009-10-29T02:08:58.398ZComment by Nick_Hay on Extreme risks: when not to use expected utility
https://lw2.issarice.com/posts/kmjCaq66MDkfvZpFX/extreme-risks-when-not-to-use-expected-utility#JtGgvHpo336voLjg2
<p>To be concrete, suppose you want to maximise the average utility people have, but you also care about fairness so, all things equal, you prefer the utility to be clustered about its average. Then maybe your real utility function is not</p>
<p>U = (U[1] + .... + U[n])/n</p>
<p>but</p>
<p>U' = U - ((U[1]-U)^2 + .... + (U[n]-U)^2)/n</p>
<p>which is in some sense a mean minus a variance.</p>
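<p>A minimal sketch of the "mean minus a variance" idea, using made-up utility profiles. The function names and the equal weighting of the mean and variance terms are assumptions for illustration only.</p>

```python
def mean_utility(utils):
    """Plain average utility U = (U[1] + ... + U[n]) / n."""
    return sum(utils) / len(utils)

def fairness_adjusted_utility(utils):
    """Average utility minus the population variance of individual utilities."""
    mean = mean_utility(utils)
    variance = sum((u - mean) ** 2 for u in utils) / len(utils)
    return mean - variance

# Two hypothetical populations with the same average utility.
equal = [2.0, 2.0, 2.0, 2.0]
unequal = [0.0, 0.0, 4.0, 4.0]

equal_score = fairness_adjusted_utility(equal)      # 2.0 - 0.0
unequal_score = fairness_adjusted_utility(unequal)  # 2.0 - 4.0
```

<p>The plain average cannot tell the two populations apart, while the adjusted utility prefers the one whose utilities are clustered about the average.</p>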
nick_hayJtGgvHpo336voLjg22009-10-23T22:35:07.495ZComment by Nick_Hay on Extreme risks: when not to use expected utility
https://lw2.issarice.com/posts/kmjCaq66MDkfvZpFX/extreme-risks-when-not-to-use-expected-utility#AQrtQwgQSWcxyeqAa
<p>Can you translate your complaint into a problem with the independence axiom in particular?</p>
<p>Your second example is not a problem of variance in final utility, but aggregation of utility. Utility theory doesn't force "Giving 1 util to N people" to be equivalent to "Giving N util to 1 person". That is, it doesn't force your utility U to be equal to U1 + U2 + ... + UN where Ui is the "utility for person i".</p>
nick_hayAQrtQwgQSWcxyeqAa2009-10-23T22:24:33.927ZComment by Nick_Hay on Nonparametric Ethics
https://lw2.issarice.com/posts/eMSoo6izTTrL9j6iZ/nonparametric-ethics#kRNdcy8LnW5CiRHdm
<p>Your use of the terms parametric vs. nonparametric doesn't seem to match that of people working in nonparametric Bayesian statistics, where the distinction is more like whether your statistical model has a fixed finite number of parameters or has no such bound. Methods such as Dirichlet processes, and their many variants (Hierarchical DP, HDP-HMM, etc), go beyond simple modeling of surface similarities using similarity of neighbours.</p>
<p>See, for example, this list of publications coauthored by Michael Jordan:</p>
<ul>
<li>Bayesian Nonparametrics <a href="http://www.cs.berkeley.edu/~jordan/bnp.html">http://www.cs.berkeley.edu/~jordan/bnp.html</a></li>
</ul>
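<p>As a rough illustration of "no fixed bound on the number of parameters", here is a minimal sketch of the Chinese restaurant process, a standard sampling view of the Dirichlet process. The function name, the concentration value alpha, and the seed are assumptions for illustration; this is not taken from the linked publications.</p>

```python
import random

def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Sample cluster sizes for n_customers under a CRP with concentration alpha."""
    rng = random.Random(seed)
    table_sizes = []
    for i in range(n_customers):
        # With i customers already seated, open a new table with
        # probability alpha / (i + alpha); the first customer always does.
        if rng.random() < alpha / (i + alpha):
            table_sizes.append(1)
        else:
            # Otherwise join an existing table with probability
            # proportional to its current size.
            table = rng.choices(range(len(table_sizes)), weights=table_sizes)[0]
            table_sizes[table] += 1
    return table_sizes

small = chinese_restaurant_process(100, alpha=2.0)
large = chinese_restaurant_process(1000, alpha=2.0)
```

<p>The number of clusters (tables) is not fixed in advance; it can keep growing as more data arrives, which is the sense of "nonparametric" used in that literature.</p>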
nick_haykRNdcy8LnW5CiRHdm2009-06-21T22:23:36.050ZComment by Nick_Hay on That You'd Tell All Your Friends
https://lw2.issarice.com/posts/ajYePZpAM4FYMwrqT/that-you-d-tell-all-your-friends#vryKmycDBN84ahqDf
<p>Thou Art Godshatter: gives an intuitive grasp of why and how human morality is complex, and of why not just any complex thing will do.</p>
nick_hayvryKmycDBN84ahqDf2009-03-02T02:28:44.393ZComment by Nick_Hay on Issues, Bugs, and Requested Features
https://lw2.issarice.com/posts/qoFhFhGuFDRkb4GTq/issues-bugs-and-requested-features#P2fDzfWMHiCxJvZTx
<p>How about buttons "High quality", "Low quality", "Accurate", "Inaccurate". We're increasing options here, but there's probably a nice way to design the interface to reduce the cognitive load.</p>
<p>Using the word "vote" seems broken here more generally -- we aren't implementing some democratic process, we're aggregating judgments (read: collecting evidence) across a population.</p>
nick_hayP2fDzfWMHiCxJvZTx2009-02-28T08:49:09.959ZComment by Nick_Hay on Issues, Bugs, and Requested Features
https://lw2.issarice.com/posts/qoFhFhGuFDRkb4GTq/issues-bugs-and-requested-features#BcDF43gCsjcLqALMu
<p>Because quality and truth are separate judgments in practice, and forcing them to be conflated into a single scale is losing information. To the extent that truth is positively correlated with quality this will fall out automatically: highly truthy posts will tend to have high quality. Low quality and high truth are not opposites.</p>
nick_hayBcDF43gCsjcLqALMu2009-02-28T08:44:52.888ZComment by Nick_Hay on The Thing That I Protect
https://lw2.issarice.com/posts/3wyMbgeFfQWntFxTf/the-thing-that-i-protect#MTcwdaso9cfAhe6hv
<p>Z. M. Davis: Good point, I was brushing that distinction under the rug. From this perspective all people arguing about values are trying to change someone's value computation, to a greater or lesser degree i.e. this is not the place to look if you want to discriminate between "liberal" and "conservative".</p>
<p>With the obvious way to implement a CEV, you start by modeling a population of actual humans (e.g. Earth's), then consider extrapolations of these models (know more, thought faster, etc). No "wipe culturally-defined values" step, however that would be defined.</p>
<p>Where was it suggested otherwise?</p>
nick_hayMTcwdaso9cfAhe6hv2009-02-08T05:03:26.000ZComment by Nick_Hay on The Thing That I Protect
https://lw2.issarice.com/posts/3wyMbgeFfQWntFxTf/the-thing-that-i-protect#8Kc39MxJnTDfLwsAz
<p>Ian C: neither group is changing human values as it is referred to here: everyone is still human, no one is suggesting neurosurgery to change how brains compute value. See the post <a href="/lw/y3/value_is_fragile/">value is fragile</a>.</p>
nick_hay8Kc39MxJnTDfLwsAz2009-02-08T03:53:07.000ZComment by Nick_Hay on Continuous Improvement
https://lw2.issarice.com/posts/QfpHRAMRM2HjteKFK/continuous-improvement#HGcBnzLknhQYBxeLz
<p>Interestingly, you can have unboundedly many children with only quadratic population growth, so long as they are exponentially spaced. For example, give each newborn sentient a resource token, which can be used after the age of maturity (say, 100 years or so) to fund a child. Additionally, in the years 2^i every living sentient is given an extra resource token. One can show there is at most quadratic growth in the number of resource tokens. By adjusting the exponent in 2^i we can get growth O(n^{1+p}) for any nonnegative real p.</p>
nick_hayHGcBnzLknhQYBxeLz2009-01-11T23:56:30.000ZComment by Nick_Hay on What I Think, If Not Why
https://lw2.issarice.com/posts/z3kYdw54htktqt9Jb/what-i-think-if-not-why#6r6s8pKnv7xBdyTXx
<p>Phil: Yes. CEV completely replaces and overwrites itself, by design. Before this point it does not interact with the external world to change it in a significant sense (it cannot avoid all change; e.g. its computer will add tiny vibrations to the Earth, as all computers do). It executes for a while then overwrites itself with a computer program (skipping every intermediate step here). By default, and if anything goes wrong, this program is "shutdown silently, wiping the AI system clean."</p>
<p>(When I say "CEV" I really mean a FAI which satisfies the spirit behind the extremely partial specification given in the CEV document. The CEV document says essentially nothing of how to implement this specification.)</p>nick_hay6r6s8pKnv7xBdyTXx2008-12-12T02:57:00.000ZComment by Nick_Hay on The Nature of Logic
https://lw2.issarice.com/posts/c93eRh3mPaN62qrD2/the-nature-of-logic#4hZ5F4hPSmJPAsBTT
<p>Personally, I prefer the longer posts.</p>
nick_hay4hZ5F4hPSmJPAsBTT2008-11-18T04:56:55.000ZComment by Nick_Hay on Expected Creative Surprises
https://lw2.issarice.com/posts/rEDpaTTEzhPLz4fHh/expected-creative-surprises#aaTTCzwuFuXkuHezY
<p>guest: right, so with those definitions you are overconfident if you are surprised more than you expected, underconfident if you are surprised less, calibration being how close your surprisal is to your expectation of it.</p>
nick_hayaaTTCzwuFuXkuHezY2008-10-25T08:53:59.000ZComment by Nick_Hay on Expected Creative Surprises
https://lw2.issarice.com/posts/rEDpaTTEzhPLz4fHh/expected-creative-surprises#eMcAXNw3zAJXeRpDt
<p>I think there's a sign error in my post -- C(x0) = \log p(x0) + H(p) it should be.</p>
nick_hayeMcAXNw3zAJXeRpDt2008-10-25T08:03:47.000ZComment by Nick_Hay on Expected Creative Surprises
https://lw2.issarice.com/posts/rEDpaTTEzhPLz4fHh/expected-creative-surprises#zJzADwrKtQ9HjWF7C
<p>Anon: no, I mean the log probability. In your example, the calibratedness will generally be high: - \log 0.499 - H(p) ~= 0.00289 each time you see tails, and - \log 0.501 - H(p) ~= - 0.00289 each time you come up tails. It's continuous.</p>
<p>Let's be specific. We have H(p) = - \sum_x p(x) \log p(x), where p is some probability distribution over a finite set. If we observe x0, then we say the predictor's calibration is</p>
<p>C(x0)
= \sum_x p(x) \log p(x) - \log p(x0)
= - \log p(x0) - H(p)</p>
<p>so the expected calibration is 0 by the definition of H(p). The calibration is continuous in p. If \log p(x0) is higher than the expected value of \log p(x) then we are underconfident and C(x0) < 0; if \log p(x0) is lower than expected we are overconfident, and C(x0) > 0.</p>
<p>With q = p(x) d(x,x0) the non-normalised probability distribution that assigns value only to x0, we have</p>
<p>C = D(p||q)</p>
<p>so this is a relative entropy of sorts.</p>
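<p>A minimal numerical sketch of the definition C(x0) = - \log p(x0) - H(p) given above. Base-2 logarithms are an assumption, chosen because they reproduce the 0.00289 figure quoted for the 0.499/0.501 coin; the function names and the labelling of the coin's outcomes are made up for illustration.</p>

```python
import math

def entropy_bits(p):
    """H(p) = -sum_x p(x) log2 p(x), for a finite distribution given as a dict."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)

def calibration(p, x0):
    """C(x0) = -log2 p(x0) - H(p): surprisal of the observed outcome minus entropy."""
    return -math.log2(p[x0]) - entropy_bits(p)

def expected_calibration(p):
    """Expectation of C(x) under p itself; zero by the definition of H(p)."""
    return sum(px * calibration(p, x) for x, px in p.items() if px > 0)

# Hypothetical labelling of the nearly fair coin from the discussion.
coin = {"heads": 0.501, "tails": 0.499}
c_tails = calibration(coin, "tails")   # slightly positive: the less likely outcome
c_heads = calibration(coin, "heads")   # slightly negative: the more likely outcome

# A uniform distribution gives C(x0) = 0 for every outcome.
uniform = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
c_uniform = calibration(uniform, "a")
```

<p>For the coin, the two values are small, of opposite sign, and average to zero under p, which is the sense in which the measure is continuous in p rather than jumping between "calibrated" and "miscalibrated".</p>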
nick_hayzJzADwrKtQ9HjWF7C2008-10-25T08:00:23.000ZComment by Nick_Hay on Expected Creative Surprises
https://lw2.issarice.com/posts/rEDpaTTEzhPLz4fHh/expected-creative-surprises#qShSMaxSzjToJWSQn
<p>Anon: well-calibrated means roughly that in the class of all events you think have probability p of being true, the proportion of them that turn out to be true is p.</p>
<p>More formally, suppose you have a probability distribution over something you are going to observe. If the negative log probability of the event which actually occurs is equal to the entropy of your distribution, you are well calibrated. If it is above you are overconfident, if it is below you are underconfident. By this measure, assigning every possibility equal probability will always be calibrated.</p>
<p>This is related to relative entropy.</p>
nick_hayqShSMaxSzjToJWSQn2008-10-25T03:37:23.000ZComment by Nick_Hay on How to Seem (and Be) Deep
https://lw2.issarice.com/posts/aSQy7yHj6nPD44RNo/how-to-seem-and-be-deep#vZsJvCFfPG77f8xmq
<p>Tiiba:</p>
<p>The hypothesis is actual immortality, to which nonzero probability is being assigned. For example, suppose that under some scenario your probability of dying at each step is half what it was at the previous step. Then your total probability of dying is at most 2 times the probability of dying at the very first step, which we can assume is far less than 1/2.</p>nick_hayvZsJvCFfPG77f8xmq2007-10-16T22:43:00.000Z
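<p>A minimal sketch of the halving-risk argument above, assuming independent steps and a made-up first-step risk of 0.1. The function name and the 200-step cutoff are illustrative choices; the per-step risks sum to at most twice the first one, so the total probability of ever dying stays below that bound.</p>

```python
def total_death_probability(first_step_risk, steps=200):
    """Probability of dying at some step when the per-step risk halves each step."""
    survival = 1.0
    risk = first_step_risk
    for _ in range(steps):
        survival *= 1.0 - risk   # survive this step
        risk /= 2.0              # next step is half as dangerous
    return 1.0 - survival

p = 0.1  # hypothetical probability of dying at the very first step
total = total_death_probability(p)
survival_forever = 1.0 - total
```

<p>With these numbers the chance of never dying stays strictly positive, above 1 - 2p, which is all the argument needs.</p>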