jacobjacob feed - LessWrong 2.0 Reader jacobjacob’s posts and comments on the Effective Altruism Forum en-us Comment by jacobjacob on What are CAIS' boldest near/medium-term predictions? https://www.lesswrong.com/posts/3aRwhGWFkWPJiwnCq/what-are-cais-boldest-near-medium-term-predictions#L8X2ChbwRZ7j4Gra4 <blockquote> This is a prediction I make, with &quot;general-seeming&quot; replaced by &quot;more general&quot;, and I think of this as a prediction inspired much more by CAIS than by EY/Bostrom. </blockquote><p>I notice I&#x27;m confused. My model of CAIS predicts that there would be poor returns to building general services compared to specialised ones (though this might be more of a claim about economics than a claim about the nature of intelligence).</p> jacobjacob L8X2ChbwRZ7j4Gra4 2019-04-04T10:13:45.744Z Comment by jacobjacob on What are CAIS' boldest near/medium-term predictions? https://www.lesswrong.com/posts/3aRwhGWFkWPJiwnCq/what-are-cais-boldest-near-medium-term-predictions#P7bdWvujfFe8aPsqB <p><a href="https://www.lesswrong.com/posts/D3NspiH2nhKA6B2PE/what-evidence-is-alphago-zero-re-agi-complexity#Qkn5aBW8hcvTEbBFr">The following exchange</a> is also relevant:</p><p> [-] <strong><a href="https://www.lesswrong.com/users/raiden">Raiden</a> </strong><a href="https://www.lesswrong.com/posts/D3NspiH2nhKA6B2PE/what-evidence-is-alphago-zero-re-agi-complexity#Qkn5aBW8hcvTEbBFr">1y</a> <a href="https://www.lesswrong.com/posts/D3NspiH2nhKA6B2PE/what-evidence-is-alphago-zero-re-agi-complexity#Qkn5aBW8hcvTEbBFr">link</a> 30</p><p>Robin, or anyone who agrees with Robin:</p><p>What evidence can you imagine would convince you that AGI would go FOOM?</p><p>Reply[-] <strong><a href="https://www.lesswrong.com/users/jprwg">jprwg</a> </strong><a href="https://www.lesswrong.com/posts/D3NspiH2nhKA6B2PE/what-evidence-is-alphago-zero-re-agi-complexity#Thc9tg4qPEFzMyHjs">1y</a> <a href="https://www.lesswrong.com/posts/D3NspiH2nhKA6B2PE/what-evidence-is-alphago-zero-re-agi-complexity#Thc9tg4qPEFzMyHjs">link</a> 22</p><p>While I find Robin&#x27;s model more convincing than Eliezer&#x27;s, I&#x27;m still pretty uncertain.</p><p>That said, two pieces of evidence that would push me somewhat strongly towards the Yudkowskian view:</p><ul><li>A fairly confident scientific consensus that the human brain is actually simple and homogeneous after all. This could perhaps be the full blank-slate version of Predictive Processing as Scott Alexander <a href="http://slatestarcodex.com/2017/09/05/book-review-surfing-uncertainty/">discussed</a>recently, or something along similar lines.</li><li>Long-run data showing AI systems gradually increasing in capability without any increase in complexity. The AGZ example here might be part of an overall trend in that direction, but as a single data point it really doesn&#x27;t say much.</li></ul><p>Reply[-] <strong><a href="https://www.lesswrong.com/users/robinhanson">RobinHanson</a> </strong><a href="https://www.lesswrong.com/posts/D3NspiH2nhKA6B2PE/what-evidence-is-alphago-zero-re-agi-complexity#Gsgecn55fjFibbaSB">1y</a> <a href="https://www.lesswrong.com/posts/D3NspiH2nhKA6B2PE/what-evidence-is-alphago-zero-re-agi-complexity#Gsgecn55fjFibbaSB">link</a> 23</p><p>This seems to me a reasonable statement of the kind of evidence that would be most relevant.</p><p></p> jacobjacob P7bdWvujfFe8aPsqB 2019-03-29T14:01:16.820Z Comment by jacobjacob on What are CAIS' boldest near/medium-term predictions? 
https://www.lesswrong.com/posts/3aRwhGWFkWPJiwnCq/what-are-cais-boldest-near-medium-term-predictions#k384rrkgeJCq5CzP6 <p>EY <a href="https://www.lesswrong.com/posts/shnSyzv4Jq3bhMNw5/alphago-zero-and-the-foom-debate">seems to have interpreted</a> AlphaGo Zero as strong evidence for his view in the AI-foom debate, though <a href="https://www.facebook.com/yudkowsky/posts/10155848910529228?pnref=story">Hanson disagrees</a>. </p><p>EY: </p><blockquote>Showing excellent narrow performance *using components that look general* is extremely suggestive [of a future system that can develop lots and lots of different &quot;narrow&quot; expertises, using general components].</blockquote><p> Hanson:</p><blockquote>It is only broad sets of skills that are suggestive. Being very good at specific tasks is great, but doesn&#x27;t suggest much about what it will take to be good at a wide range of tasks. [...] The components look MORE general than the specific problem on which they are applied, but the question is: HOW general overall, relative to the standard of achieving human level abilities across a wide scope of tasks. </blockquote><p>It&#x27;s somewhat hard to hash this out as an absolute rather than conditional prediction (e.g. <em>conditional on </em>there being breakthroughs involving some domain-specific hacks, and major labs keep working on them, they will somewhat quickly superseded by breakthroughs with general-seming architectures). </p><p><em>Maybe</em> EY would be more bullish on Starcraft without imitation learning, or AlphaFold with only 1 or 2 modules (rather than <a href="https://docs.google.com/spreadsheets/d/1M217tZ2ZQpqsJ8yc43qmc3F54yJFOA9E8OXp06Z_y28/edit?usp=sharing">4/5 or 8/9 depending on how you count</a>).</p> jacobjacob k384rrkgeJCq5CzP6 2019-03-29T13:11:33.030Z Comment by jacobjacob on What would you need to be motivated to answer "hard" LW questions? https://www.lesswrong.com/posts/zEMzFGhRt4jZwyJqt/what-would-you-need-to-be-motivated-to-answer-hard-lw#gKwxwbzn9KWhwgjbg <p>If people provided this as a service, they might be risk-averse (it might make sense for people to be risk-averse with their runway), which means you&#x27;d have to pay more than hourly rate/chance of winning.</p><p>This might not be a problem, as long as the market does the cool thing markets do: allowing you to find someone with a lower opportunity cost than you for doing something.</p> jacobjacob gKwxwbzn9KWhwgjbg 2019-03-28T23:27:58.061Z Comment by jacobjacob on What would you need to be motivated to answer "hard" LW questions? https://www.lesswrong.com/posts/zEMzFGhRt4jZwyJqt/what-would-you-need-to-be-motivated-to-answer-hard-lw#JvWTJacWMKPaXRNhN <p>I think the question, narrowly interpreted as &quot;what would cause <em>me</em> to spend more time on the object-level answering questions on LW&quot; doesn&#x27;t capture most of the exciting things that happen when you build an economy around something. In particular, that suddenly makes various auxiliary work valuable. 
Examples: </p><ul><li>Someone spending a year living off of one’s savings, learning how to summarise comment threads, with the expectation that people will pay well for this ability in the following years</li><li>A competent literature-reviewer gathering 5 friends to teach them the skill, in order to scale their reviewing capacity to earn more prize money</li><li>A college student building up a strong forecasting track-record and then being paid enough to do forecasting for a few hours each week that they can pursue their own projects instead of having to work full-time over the summer</li><li>A college student dropping out to work full-time on answering questions on LessWrong, expecting this to provide a stable funding stream for 2+ years</li><li>A professional with a stable job and family and a hard time making changes to their life-situation, taking 2 hours/week off from work to do skilled cost-effectiveness analyses, while being fairly compensated</li><li>Some people starting a “Prize VC” or “Prize market maker”, which attempts to find potential prize winners and connect them with prizes (or vice versa), while taking a cut somehow</li></ul><p>I have an upcoming post where I describe in more detail what I think is required to make this work.</p> jacobjacob JvWTJacWMKPaXRNhN 2019-03-28T23:24:18.055Z Comment by jacobjacob on Unconscious Economies https://www.lesswrong.com/posts/PrCmeuBPC4XLDQz8C/unconscious-economies#cupvnfJPhgtkHxdpT <p>Thanks for pointing that out, the mention of YouTube might be misleading. Overall this should be read as a first-principles argument, rather than an empirical claim about YouTube in particular. </p> jacobjacob cupvnfJPhgtkHxdpT 2019-03-28T16:45:21.285Z Comment by jacobjacob on What are CAIS' boldest near/medium-term predictions? https://www.lesswrong.com/posts/3aRwhGWFkWPJiwnCq/what-are-cais-boldest-near-medium-term-predictions#JX8cmipTjTEikWMJA <p>Why are you measuring it in proportion of time-until-agent-AGI and not years? If it takes 2 years from comprehensive services to agent, and most jobs are automatable within 1.5 years, that seems a lot less striking and important than the claim pre-operationalisation. </p> jacobjacob JX8cmipTjTEikWMJA 2019-03-28T16:11:50.551Z Comment by jacobjacob on What are CAIS' boldest near/medium-term predictions? https://www.lesswrong.com/posts/3aRwhGWFkWPJiwnCq/what-are-cais-boldest-near-medium-term-predictions#MuYRoPF9goiGsib4k <p><u><a href="https://www.lesswrong.com/posts/HvNAmkXPTSoA4dvzv/comments-on-cais#o2syW8jqcQFWcyLQu">Wei_Dai writes</a></u>:</p><blockquote>A major problem in predicting CAIS safety is to understand the order in which various services are likely to arise, in particular whether risk-reducing services are likely to come before risk-increasing services. This seems to require a lot of work in delineating various kinds of services and how they depend on each other as well as on algorithmic advancements, conceptual insights, computing power, etc. (instead of treating them as largely interchangeable or thinking that safety-relevant services will be there when we need them). Since this analysis seems very hard to do much ahead of time, I think we&#x27;ll have to put very wide error bars on any predictions of whether CAIS would be safe or unsafe, until very late in the game.</blockquote> jacobjacob MuYRoPF9goiGsib4k 2019-03-28T13:17:32.439Z Comment by jacobjacob on What are CAIS' boldest near/medium-term predictions? 
https://www.lesswrong.com/posts/3aRwhGWFkWPJiwnCq/what-are-cais-boldest-near-medium-term-predictions#cviEz93vy6YbA2325 <p> <u><a href="https://www.lesswrong.com/posts/HvNAmkXPTSoA4dvzv/comments-on-cais">Ricraz writes</a></u>:</p><blockquote>I&#x27;m broadly sympathetic to the empirical claim that we&#x27;ll develop AI services which can replace humans at most cognitively difficult jobs significantly before we develop any single superhuman AGI (one unified system that can do nearly all cognitive tasks as well as or better than any human). </blockquote><p>I’d be interested in operationalising this further, and hearing takes on how many years “significantly before” entails. </p><p>He also adds:</p><blockquote>One plausible mechanism is that deep learning continues to succeed on tasks where there&#x27;s lots of training data, but doesn&#x27;t learn how to reason in general ways - e.g. it could learn from court documents how to imitate lawyers well enough to replace them in most cases, without being able to understand law in the way humans do. Self-driving cars are another pertinent example. If that pattern repeats across most human professions, we might see massive societal shifts well before AI becomes dangerous in the adversarial way that’s usually discussed in the context of AI safety.</blockquote> jacobjacob cviEz93vy6YbA2325 2019-03-28T13:16:50.692Z What are CAIS' boldest near/medium-term predictions? https://www.lesswrong.com/posts/3aRwhGWFkWPJiwnCq/what-are-cais-boldest-near-medium-term-predictions <p><strong>Background and questions</strong></p><p>Since Eric Drexler publicly released his “<a href="https://www.fhi.ox.ac.uk/wp-content/uploads/Reframing_Superintelligence_FHI-TR-2019-1.1-1.pdf">Comprehensive AI services model</a>” (CAIS) there has been a series of analyses on LW, from <a href="https://www.alignmentforum.org/posts/x3fNwSe5aWZb5yXEG/reframing-superintelligence-comprehensive-ai-services-as#sXHXAfSKWPyEMhtcu">rohinmshah</a>, <a href="https://www.lesswrong.com/posts/HvNAmkXPTSoA4dvzv/comments-on-cais">ricraz</a>, <a href="https://www.lesswrong.com/posts/bXYtDfMTNbjCXFQPh/drexler-on-ai-risk">PeterMcCluskey</a>, and others.</p><p>Much of this discussion focuses on the implications of this model for safety strategy and resource allocation. In this question I want to focus on the empirical part of the model.</p><ul><li>What are the boldest predictions the CAIS model makes about what the world will look in &lt;=10 years?</li></ul><p>“Boldest” might be interpreted as those predictions which CAIS gives a decent chance, but which have the lowest probability under other “worldviews” such as the Bostrom/Yudkowsky paradigm. </p><p>A prediction which <em>all</em> these worldviews agree on, but which is nonetheless quite bold, is less interesting for present purposes (possibly something like that we will see faster progress than places like mainstream academia expect).</p><p>Some other related questions:</p><ul><li>If you disagree with Drexler, but expect there to be empirical evidence within the next 1-10 years that would change your mind, what is it? </li><li>If you expect there to be events in that timeframe causing you to go “I told you so, the world sure doesn’t look like CAIS”, what are they?</li></ul><p></p><p><strong>Clarifications and suggestions</strong></p><p>I should clarify that answers can be about things that would change your mind about whether CAIS is safer than other approaches (see e.g. the Wei_Dai comment linked below). 
</p><p>But I suggest avoiding discussion of cruxes which are more theoretical than empirical (e.g. how decomposable high-level tasks are) unless you have a neat operationalisation for making them empirical (e.g. whether there will be evidence of large economies-of-scope of the most profitable automation services).</p><p>Also, it might be <em>really</em> hard to get this down to a single prediction, so it might be useful to pose a cluster of predictions and different operationalisations, and/or using conditional predictions. </p> jacobjacob 3aRwhGWFkWPJiwnCq 2019-03-28T13:14:32.800Z Comment by jacobjacob on Do you like bullet points? https://www.lesswrong.com/posts/dEcHid7tZPDNvhL4k/do-you-like-bullet-points#e5BLRoLXq6ikPRDSR <p>Another data-point: I love bullet points and have been sad and confused about how little they're used in writing generally. In fact, when reading dense text, I often invest a few initial minutes in converting it to bullet points just in order to be able to read and understand it better.</p> <p>Here's PG on a related topic, sharing some of his skepticism for when bullet points are not appropriate: <a href="http://paulgraham.com/nthings.html">http://paulgraham.com/nthings.html</a></p> jacobjacob e5BLRoLXq6ikPRDSR 2019-03-26T23:42:18.258Z Comment by jacobjacob on Understanding information cascades https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-information-cascades#hsontxwhWpFJSGMC6 <p>One should be able to think quantitatively about that, eg how many questions do you need to ask until you find out whether your extremization hurt you. I'm surprised by the suggestion that GJP didn't do enough, unless their extremizations were frequently in the &gt;90% range.</p> jacobjacob hsontxwhWpFJSGMC6 2019-03-25T22:03:23.949Z Comment by jacobjacob on Understanding information cascades https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-information-cascades#vrhWavmqeigyjW8EB <p>I did, he said a researcher mentioned it in conversation.</p> jacobjacob vrhWavmqeigyjW8EB 2019-03-25T22:00:06.295Z Comment by jacobjacob on Unconscious Economies https://www.lesswrong.com/posts/PrCmeuBPC4XLDQz8C/unconscious-economies#nwp8KcG9sEoFnBzLz <p>Good point, there's selection pressure for things which happen to try harder to be selected for ("click me! I'm a link!"), regardless of whether they are profitable. But this is not the <em>only</em> pressure, and depending on what happens to a thing when it is "selected" (viewed, interviewed, etc.) this pressure can be amplified (as in OP) or countered (as in Vaniver's comment).</p> jacobjacob nwp8KcG9sEoFnBzLz 2019-03-25T21:58:37.606Z Comment by jacobjacob on Understanding information cascades https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-information-cascades#oGMaYntqzwbhPkDkp <blockquote>more recent data suggests that the successes of the extremizing algorithm during the forecasting tournament were a fluke. </blockquote><p>Do you have a link to this data?</p> jacobjacob oGMaYntqzwbhPkDkp 2019-03-22T15:10:05.785Z Comment by jacobjacob on Understanding information cascades https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-information-cascades#fifnP9nTL6KwXC9iS <p>I haven&#x27;t looked through your links in much detail, but wanted to reply to this: </p><blockquote>Overall I would suggest to approach this with some intellectual humility and study existing research more, rather then try to reinvent large part of network science on LessWrong. 
(My guess is something like &gt;2000 research years were spent on the topic often by quite good people.) </blockquote><p>I either disagree or am confused. It seems good to use resources to outsource your ability to do literature reviews, distillation or extrapolation, to someone with higher comparative advantage. If the LW question feature can enable that, it will make the market for intellectual progress more efficient; and I wanted to test whether this was so. </p><p>I am not trying to <em>reinvent </em>network science, and I&#x27;m not that interested in the large amount of theoretical work that has been done. I am trying to 1) <em>apply </em>these insights to very particular problems I face (relating to forecasting and more); and 2) think about this from a cost-effectiveness perspective. </p><p>I am very happy to trade money for my time in answering these questions. </p><p>(Neither 1) nor 2) seems like something I expect the existing literature to have been very interested in. I believe this for similar reasons to those Holden Karnofsky express <a href="https://www.lesswrong.com/posts/nXZi8efFArfk3u568/extended-quote-on-the-institution-of-academia">here</a>.)</p> jacobjacob fifnP9nTL6KwXC9iS 2019-03-14T15:45:50.640Z Comment by jacobjacob on Understanding information cascades https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-information-cascades#7H2wEu3SSoPnojDjY <p>Seems like a sensible worry, and we did consider some version of it. My reasoning was roughly: </p><p>1) The questions feature is quite new, and if it will be very valuable, most use-cases and the proper UI haven&#x27;t been discovered yet (these can be hard to predict in advance without getting users to play around with different things and then talking to them). </p><p>No one has yet attempted to use multiple questions. So it would be valuable for the LW team and the community to experiment with that, despite possible countervailing considerations (<a href="https://www.businessinsider.com/jeff-bezos-explains-the-perfect-way-to-make-risky-business-decisions-2017-4?r=US&IR=T&IR=T">any good experiment will have sufficient uncertainty that such considerations will always exist</a>).</p><p>2) Questions 1/2, 3 and 4 are quite different, and it seems good to be able to do research on one sub-problem without taking mindshare from <em>everyone</em> working on <em>any </em>subproblem. </p><p></p> jacobjacob 7H2wEu3SSoPnojDjY 2019-03-14T15:20:16.372Z Formalising continuous info cascades? [Info-cascade series] https://www.lesswrong.com/posts/MSJZwxKN3f4AfHB8m/formalising-continuous-info-cascades-info-cascade-series <p><em>This is a question in</em> <em><a href="https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-info-cascades">the info-cascade question series</a>. There is a prize pool of up to $800 for answers to these questions. See the link above for full background on the problem (including a bibliography) as well as examples of responses we’d be especially excited to see.</em></p><hr class="dividerBlock"/><p>Mathematically formalising info-cascades would be great. </p><p>Fortunately, it&#x27;s already been done in the simple case. See this excellent <a href="https://www.lesswrong.com/posts/DNQw596nPCX4x7xT9/information-cascades">LW post</a> by Johnicholas, where he uses upvotes/downvotes as an example, and shows that after the second person has voted, all future voters are adding zero new information to the system. 
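As a minimal illustrative sketch (my own, with made-up parameters, assuming the textbook setup where each voter gets an independent private signal that is correct with probability p &gt; 0.5 and votes on the Bayesian posterior given the public vote history):

```python
import random
from math import log

def simulate_cascade(n_voters=20, p=0.7, truth=1, seed=0):
    """Toy binary info-cascade (parameters are illustrative, not from the post).

    Each voter privately sees `truth` with probability p (p > 0.5, so signals are
    informative), then up/down-votes whatever the public history plus their own
    signal makes most likely. Returns the votes and the index where herding starts.
    """
    rng = random.Random(seed)
    vote_weight = log(p / (1 - p))   # log-likelihood ratio carried by one informative vote
    public = 0.0                     # accumulated public log-likelihood ratio
    votes, herd_start = [], None
    for i in range(n_voters):
        signal = truth if rng.random() < p else -truth
        if abs(public) > vote_weight:
            # Public evidence already outweighs any single private signal:
            # the voter copies the crowd, so the vote carries no new information.
            vote = 1 if public > 0 else -1
            herd_start = i if herd_start is None else herd_start
        else:
            posterior = public + signal * vote_weight
            vote = 1 if posterior > 0 else (-1 if posterior < 0 else signal)
        votes.append(vote)
        public += vote * vote_weight  # naive observers keep counting every vote as evidence
    return votes, herd_start

print(simulate_cascade())  # e.g. ([1, 1, 1, ...], 2): herding once the first two votes agree
```

Once two consecutive votes agree, every later vote is determined by the history alone, which is the sense in which it adds zero new information.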
His explanation using likelihood ratios is the most intuitive I&#x27;ve found.</p><p>The <a href="https://en.wikipedia.org/wiki/Information_cascade">Wikipedia entry</a> on the subject is also quite good.</p><p>However, these two entries primarily explain how information cascades when people have to make a binary choice - good or bad, left or right, etc. The question I want to understand is how to think of the problem in a continuous case - do the problems go away? Or more likely, what variables determine the speed at which people update to one extreme? And how far toward that extreme do people go before they realise their error?</p><p>Examples of continuous variables include things like project time estimates, stocks, and probabilistic forecasts. I imagine it&#x27;s very likely that significant quantitative work has been done on the case of market bubbles, and anyone can write an answer summarising that work and explaining how to apply it to other domains like forecasting, that would be excellent.</p> jacobjacob MSJZwxKN3f4AfHB8m 2019-03-13T10:55:46.133Z How large is the harm from info-cascades? [Info-cascade series] https://www.lesswrong.com/posts/xmoQSAx8X5xJTxLd6/how-large-is-the-harm-from-info-cascades-info-cascade-series <p><em>This is a question in</em> <em><a href="https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-info-cascades">the info-cascade question series</a>. There is a prize pool of up to $800 for answers to these questions. See the link above for full background on the problem (including a bibliography) as well as examples of responses we’d be especially excited to see.</em></p><p>___</p><p>How can we quantify the impact (harm) of info-cascades?</p><p>There are many ways in which info-cascades are harmful. Insofar as people base their decisions on the cascaded info, this can result in bad career choices, mistaken research directions, misallocation of grants, a culture that is easier to hijack by cleverly signalling outsiders (by simply “joining the bubble”), and more. </p><p>But in order to properly allocate resources to work on info-cascades we need a better model of how large the effects are, and how they compare with other problems. How can we think about info-cascades from a cost-effectiveness perspective?</p><p>We are especially interested in answers to this question that ultimately bear on the effective altruism/rationality communities, or analyses of other institutions with insights that transfer to these communities.</p><p>As an example step in this direction, we built <a href="https://www.getguesstimate.com/models/12929">a Guesstimate model</a>, which is described in an answer below. </p> jacobjacob xmoQSAx8X5xJTxLd6 2019-03-13T10:55:38.872Z How can we respond to info-cascades? [Info-cascade series] https://www.lesswrong.com/posts/NzQNKTXK2gtDW6KCk/how-can-we-respond-to-info-cascades-info-cascade-series <p><em>This is a question in</em> <em><a href="https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-info-cascades">the info-cascade question series</a>. There is a prize pool of up to $800 for answers to these questions. See the link above for full background on the problem (including a bibliography) as well as examples of responses we’d be especially excited to see.</em></p> <p>___</p> <p>In my (Jacob's) work at <a href="https://goo.gl/forms/gMGGGtGYHztYJJ2s2">Metaculus AI</a>, I'm trying to build a centralised space for both finding forecasts as well as the reasoning underlying those forecasts. 
Having such a space might serve as a simple way for the AI community to avoid runaway info-cascades.</p> <p>However, we are also concerned with situations where new forecasters overweight the current crowd opinion in their forecasts, compared to the underlying evidence, and see this as a major risk for the trustworthiness of forecasts to those working in AI safety and policy.</p> <p>With this question, I am interested in previous attempts to tackle this problem, and how successful they have been. In particular:</p> <ul> <li> <p>What existing infrastructure has been historically effective for avoiding info-cascades in communities? (Examples could include short-selling to prevent bubbles in asset markets, or norms to share the causes rather than outputs of one’s beliefs)</p> </li> <li> <p>What problems are not adequately addressed by such infrastructure?</p> </li> </ul> jacobjacob NzQNKTXK2gtDW6KCk 2019-03-13T10:55:25.685Z Distribution of info-cascades across fields? [Info-cascade series] https://www.lesswrong.com/posts/FuqxhhhNAPNz5tGA4/distribution-of-info-cascades-across-fields-info-cascade <p><em>This is a question in</em> <em><a href="https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-info-cascades">the info-cascade question series</a>. There is a prize pool of up to $800 for answers to these questions. See the link above for full background on the problem (including a bibliography) as well as examples of responses we’d be especially excited to see. </em></p><p>___</p><p>How common, and how large, are info-cascades in communities that seek to make intellectual progress, such as academia? This distribution is presumably very heavy-tailed as we are dealing with network phenomena. But what are its actual values? How can we estimate this number?</p><p>A good starting point for thinking about this might be the paper “How citation distortions create unfounded authority: analysis of a citation network” (<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2714656/">Greenberg, 2009</a>), which uses social network theory and graph theory to trace how an at best very uncertain claim in biomedicine cascaded into established knowledge. We won’t attempt to summarise the paper here. </p> jacobjacob FuqxhhhNAPNz5tGA4 2019-03-13T10:55:17.194Z Understanding information cascades https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-information-cascades <p><em>Meta: Because we think understanding info cascades is important, we recently spent ~10 hours trying to figure out how to quantitatively model them, and have contributed our thinking as answers below. While we don't currently have the time to continue exploring, we wanted to experiment with seeing how much the LW community could together build on top of our preliminary search, so we’ve put up a basic prize for more work and tried to structure the work around a couple of open questions. This is an experiment! We’re looking forward to reading any of your contributions to the topic, including things like summaries of existing literature and building out new models of the domain.</em></p> <h2>Background</h2> <p>Consider the following situation:</p> <blockquote> <p>Bob is wondering whether a certain protein injures the skeletal muscle of patients with a rare disease. He finds a handful of papers with some evidence for the claim (and some with evidence against it), so he simply states the claim in his paper, with some caution, and adds that as a citation.
Later, Alice comes across Bob’s paper and sees the cited claim, and she proceeds to cite Bob, but without tracing the citation trail back to the original evidence. This keeps happening, in various shapes and forms, and after a while a literature of hundreds of papers builds up where it’s common knowledge that β amyloid injures the skeletal muscle of patients with inclusion body myositis -- without the claim having accumulated any more evidence. (This real example was taken from <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2714656/">Greenberg, 2009</a>, which is a case study of this event.)</p> </blockquote> <p>An information-cascade occurs when people update on each other’s beliefs, rather than sharing the causes of those beliefs, and those beliefs end up with a vestige of support that far outstrips the evidence for them. Satvik Beri might describe this as the problem of only sharing the outputs of your thinking process, not your inputs.</p> <p>The dynamics here are perhaps reminiscent of those underlying various failures of collective rationality such as asset bubbles, bystander effects and stampedes.</p> <p>Note that this effect is different from other problems of collective rationality like the replication crisis, which involve <em>low standards</em> for evidence (such as unreasonably lax p-value thresholds or coordination problems preventing publishing of failed experiments), or the degeneracy of much online discussion, which involves tribal signalling and UI encouraging <a href="https://slatestarcodex.com/2014/12/17/the-toxoplasma-of-rage/">problematic selection effects</a>. Rather, information cascades involve people <em>rationally updating</em> without <em>any</em> object-level evidence at all, and would persist even if the replication crisis and online outrage culture disappeared. If nobody lies or tells untruths, you can still be subject to an information cascade.</p> <h2>Questions</h2> <p>Ben and I are confused about how to think about the negative effects of this problem. We understand the basic idea, but aren't sure how to reason quantitatively about the impacts, and how to trade off solving these problems in a community versus doing other improvements to overall efficacy and efficiency of a community.
We currently know only how to think about these qualitatively.</p> <p>We’re posting a couple of related questions that we have some initial thoughts on, that might help clarify the problem.</p> <ul> <li> <p><a href="https://www.lesswrong.com/posts/FuqxhhhNAPNz5tGA4/distribution-of-info-cascades-across-fields-info-cascade">How common, and how large, are info-cascades in communities that seek to make intellectual progress, such as academia?</a></p> </li> <li> <p><a href="https://www.lesswrong.com/posts/xmoQSAx8X5xJTxLd6/how-large-is-the-harm-from-info-cascades-info-cascade-series">How can we quantify the impact (harm) of info-cascades, and think about them in cost-effectiveness terms?</a></p> </li> <li> <p><a href="https://www.lesswrong.com/posts/NzQNKTXK2gtDW6KCk/how-can-we-respond-to-info-cascades-info-cascade-series">What have been some historically effective ways of responding to cascades, and where have those approaches failed?</a></p> </li> <li> <p><a href="https://www.lesswrong.com/posts/MSJZwxKN3f4AfHB8m/formalising-continuous-info-cascades-info-cascade-series">How do you mathematically formalise information cascades around continuous variables?</a></p> </li> </ul> <p>If you have something you’d like to contribute, but that doesn’t seem to fit into the related questions above, leave it as an answer to this question.</p> <h2>Bounties</h2> <p>We are committing to pay at least <strong>either $800 or (No. of answers and comments * $25), whichever is smaller,</strong> for work on this problem recorded on LW, done before May 13th. The prize pool will be split across comments in accordance with how valuable we find them, and we might make awards earlier than the deadline (though if you know you’ll put in work in x weeks, it would be good to mention that to one of us via PM).</p> <p>Ben and Jacob are each responsible for half of the prize money.</p> <p>Jacob is funding this through Metaculus AI, a new forecasting platform tracking and improving the state-of-the-art in AI forecasting, partly to help avoid info-cascades in the AI safety and policy communities (we’re currently live and inviting beta-users, you can sign-up <a href="https://docs.google.com/forms/d/e/1FAIpQLSduBjn3W_MpHHjsKUEhzV6Krkup78ujE5-8bpNJ5HDE7GGnmA/viewform?usp=sf_link">here</a>).</p> <p>Examples of work each of us are especially excited about:</p> <p><em>Jacob</em></p> <ul> <li> <p>Contributions to our Guesstimate model (linked <a href="https://www.lesswrong.com/posts/xmoQSAx8X5xJTxLd6/how-large-is-the-harm-from-info-cascades-info-cascade-series">here</a>), such as reducing uncertainty on the inputs or using better models.</p> </li> <li> <p>Extensions of the Guesstimate model beyond biomedicine, especially in ways that make it more directly applicable to the rationality/effective altruism communities</p> </li> <li> <p>Examples and analysis of existing interventions that deal with this and what makes them work, possibly suggestions for novel ones (though avoiding the trap of <a href="https://www.wamda.com/2013/10/why-good-startup-ideas-look-like-bad-ideas">optimising for good-seeming ideas</a>)</p> </li> <li> <p>Discussion of how the problem of info-cascades relates to forecasting</p> </li> </ul> <p><em>Ben</em></p> <ul> <li> <p>Concise summaries of relevant papers and their key contributions</p> </li> <li> <p>Clear and concise explanations of what other LWers have found (e.g. turning 5 long answers into 1 medium sized answer that links back to the others while still conveying the key info. 
Here’s <a href="https://www.lesswrong.com/posts/XYYyzgyuRH5rFN64K/what-makes-people-intellectually-active#pMvL4xCcntDk8oZbb">a good example</a> of someone distilling an answer section).</p> </li> </ul> jacobjacob 2uDBJWCksvzhDzHGf 2019-03-13T10:55:05.932Z Comment by jacobjacob on Understanding information cascades https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-information-cascades#Mw9KRzuwjJbk4s5dF <p>See <a href="https://www.lesswrong.com/posts/DNQw596nPCX4x7xT9/information-cascades">this post</a> for a good, simple mathematical description of the discrete version of the phenomenon. </p> jacobjacob Mw9KRzuwjJbk4s5dF 2019-03-13T10:46:06.236Z Comment by jacobjacob on How large is the harm from info-cascades? [Info-cascade series] https://www.lesswrong.com/posts/xmoQSAx8X5xJTxLd6/how-large-is-the-harm-from-info-cascades-info-cascade-series#TWHzTcfLZDzeXJwDk <p>Me and Ben Pace (with some help from Niki Shams) made <a href="https://www.getguesstimate.com/models/12929">a Guesstimate model</a> of how much information cascades is costing science in terms of wasted grant money. The model is largely based on the excellent paper “How citation distortions create unfounded authority: analysis of a citation network” (Greenberg, 2009), which traces how an uncertain claim in biomedicine is inflated to established knowledge over a period of 15 years, and used to justify ~$10 million in grant money from the NIH (we calculated the number ourselves <a href="https://docs.google.com/spreadsheets/d/1tRTw2P_GYIV1E1_RsS4L-eFSE_1hdArq7_P6MXox06w/edit?usp=sharing">here</a>).</p><p>There are many open questions about some of the inputs to our model as well as how this generalises outside of academia (or even outside of biomedicine). However, we see this as a “Jellybaby” in Douglas Hubbard’s sense -- it’s a first data-point and stab at the problem which brings us from “no idea idea how big or small the costs of info-cascades are”, to at least “it is plausible though very uncertain that the costs can be on the order of magnitude of billions of dollars, yearly, in academic grant money”.</p> jacobjacob TWHzTcfLZDzeXJwDk 2019-03-13T10:26:49.855Z Comment by jacobjacob on How large is the harm from info-cascades? [Info-cascade series] https://www.lesswrong.com/posts/xmoQSAx8X5xJTxLd6/how-large-is-the-harm-from-info-cascades-info-cascade-series#LhxSK92Kzxcm3BgEs <p>This might be an interesting pointer. </p><p>In <u><a href="https://www.bmj.com/content/bmj/suppl/2009/07/21/bmj.b2680.DC1/gres611285.ww1.pdf">Note-8 in the supplementary materials</a></u>, Greenberg begins to quantify the problem. He defines an amplification measure for paper P as the number of citation-paths originating at P and terminating at all other papers, except for paths of length 1 flowing directly to primary data papers. The amplification density of a network is the mean amplification across its papers.</p><p>Greenberg then finds that, in the particular network analysed, you can achieve amplification density of about 1000 over a 15 year time-frame. 
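As a rough sketch of how that count works on a toy graph (my own illustration; the papers, edges, and the edge direction from citing paper to cited paper are assumptions, not data from the paper):

```python
from functools import lru_cache

# Toy citation graph (illustrative names): each paper maps to the papers it cites.
CITES = {
    "review_2009": ["model_2004", "claim_2001"],
    "model_2004": ["claim_2001", "primary_1998"],
    "claim_2001": ["primary_1998"],
    "primary_1998": [],
}
PRIMARY_DATA = {"primary_1998"}  # papers containing the primary data

@lru_cache(maxsize=None)
def paths_from(paper):
    """Number of distinct citation paths (length >= 1) starting at `paper`."""
    return sum(1 + paths_from(cited) for cited in CITES[paper])

def amplification(paper):
    """All citation paths from `paper`, minus length-1 paths straight to primary-data papers."""
    direct_to_primary = sum(1 for cited in CITES[paper] if cited in PRIMARY_DATA)
    return paths_from(paper) - direct_to_primary

density = sum(amplification(p) for p in CITES) / len(CITES)  # mean amplification over the network
print({p: amplification(p) for p in CITES}, "density:", density)
```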
This density grows exponentially with a doubling time of very roughly 2 years.</p> jacobjacob LhxSK92Kzxcm3BgEs 2019-03-13T10:26:32.892Z Comment by jacobjacob on Understanding information cascades https://www.lesswrong.com/posts/2uDBJWCksvzhDzHGf/understanding-information-cascades#9EdwZs27gj5dtQbxi <p>Here's a quick <strong>bibliography</strong> we threw together.</p> <p><em>Background:</em></p> <ul> <li> <p><a href="https://web.archive.org/web/20080215131441/http://www.info-cascades.info/">Information Cascades and Rational Herding: An Annotated Bibliography and Resource Reference</a> (Bikchandani et al. 2004). The best resource on the topic, see in particular the initial papers on the subject.</p> </li> <li> <p><a href="http://people.virginia.edu/~cah2k/cascy2k.htm">Y2K Bibliography of Experimental Economics and Social Science Information Cascades and Herd Effects</a> (Holt, 1999. Less thorough, but catches some papers the first one misses.</p> </li> <li> <p>“<a href="https://www.wikiwand.com/en/Information_cascade">Information cascade</a>” from Wikipedia. An excellent introduction.</p> </li> <li> <p>“<a href="https://www.investopedia.com/articles/investing/052715/guide-understanding-information-cascades.asp">Understanding Information Cascades</a>” from Investopedia.</p> </li> </ul> <p><em>Previous LessWrong posts referring to info cascades:</em></p> <ul> <li> <p><a href="https://www.lesswrong.com/posts/DNQw596nPCX4x7xT9/information-cascades">Information cascades</a>, by Johnicholas, 2009</p> </li> <li> <p><a href="https://www.lesswrong.com/posts/QtG2iDnYGZEumXzsb/information-cascades-in-scientific-practice">Information cascades in scientific practice</a>, by RichardKennaway, 2009</p> </li> <li> <p><a href="https://wiki.lesswrong.com/wiki/Information_cascade">Information cascades</a>, LW Wiki</p> </li> </ul> <p>And then here are all the LW posts we could find that used the concept (<a href="https://www.lesswrong.com/posts/qmufiasd6cevHRcr3/adversarial-system-hats">1</a>, <a href="https://www.lesswrong.com/posts/P3uavjFmZD5RopJKk/simultaneously-right-and-wrong#Rjo3HEvejnwv4EK6P">2</a>, <a href="https://www.lesswrong.com/posts/M2LWXsJxKS626QNEA/the-trouble-with-good">3</a>, <a href="https://www.lesswrong.com/posts/5XMrWNGQySFdcuMsA/how-to-use-philosophical-majoritarianism">4</a>, <a href="https://www.lesswrong.com/posts/ZP2om2oWHPhvWP2Q3/the-ethic-of-hand-washing-and-community-epistemic-practice">5</a>, <a href="https://www.lesswrong.com/posts/3o5hLZ479zNaJr9r7/rationality-tip-predict-your-comment-karma">6</a>, <a href="https://www.lesswrong.com/posts/8arFF9SdstBqz7c8K/which-cognitive-biases-should-we-trust-in">7</a>, <a href="https://www.lesswrong.com/posts/o5bpfnT3KkiSKtjB4/seeking-reliable-evidence-claim-that-closing-sweatshops">8</a>, <a href="https://www.lesswrong.com/posts/HnC29723hm6kJT7KP/taking-ai-risk-seriously-thoughts-by-critch">9</a>, <a href="https://www.lesswrong.com/posts/i5nv3ZnqPjiNudLWK/the-first-circle">10</a>, <a href="https://www.lesswrong.com/posts/EBAccQwDWMiRCWnyk/when-should-we-expect-the-education-bubble-to-pop-how-can-we#e6ewqnZmoBdvkBAYT">11</a>) . Not sure how relevant they are, but might be useful in orienting around the concept.</p> jacobjacob 9EdwZs27gj5dtQbxi 2019-03-13T10:17:00.402Z Comment by jacobjacob on [deleted post] https://www.lesswrong.com/posts/HieEyaMXqsFijyxCo/a-list-of-clues#Lm4ntmorggDo7vDNp <p>SPOILER WARNING</p><p></p><p></p><p></p><p></p><p></p><p>Schelling cafe? 
What Schelling cafe?</p><p>One you went to before, maybe?</p><p>And who on earth do you talk to? Maybe the guy who sat with us in a cab after EA Global Lonon last fall...</p> jacobjacob Lm4ntmorggDo7vDNp 2019-03-10T11:24:14.274Z Comment by jacobjacob on Unconscious Economies https://www.lesswrong.com/posts/PrCmeuBPC4XLDQz8C/unconscious-economies#k66seuevCcTTqpbJT <p>I found myself in a situation like: "if this is common knowledge within econ, writing an explanation would signal I'm not part of econ and hence my econ opinions are low status", but decided to go ahead anyway.</p> <p>It's good you found it helpful. I'm wondering if equilibria like the above is a mechanism preventing important stuff from being distilled.</p> jacobjacob k66seuevCcTTqpbJT 2019-02-27T17:38:20.638Z Comment by jacobjacob on Unconscious Economies https://www.lesswrong.com/posts/PrCmeuBPC4XLDQz8C/unconscious-economies#HYFA6jyMuCbcJRiFc <p>I really appreciate you citing that.</p> <p>I should have made it clearer, but for reference, the works I've been exposed to:</p> <ul> <li> <p>Hal Varian's undergrad textbook</p> </li> <li> <p>Marginal Revolution University</p> </li> <li> <p>Some amount of listening to Econ Talk, reading Investopedia and Wikipedia articles</p> </li> <li> <p>MSc degree at LSE</p> </li> </ul> jacobjacob HYFA6jyMuCbcJRiFc 2019-02-27T17:34:05.450Z Unconscious Economies https://www.lesswrong.com/posts/PrCmeuBPC4XLDQz8C/unconscious-economies <p>Here’s an insight I had about how incentives work in practice, that I’ve not seen explained in an econ textbook/course. </p><p>There are at least three ways in which incentives affect behaviour: 1) via consciously motivating agents, 2) via unconsciously reinforcing certain behaviour, and 3) via selection effects. I think perhaps 2) and probably 3) are more important, but much less talked about. </p><p>Examples of 1) are the following: </p><ul><li>When content creators get paid for the number of views their videos have... they will deliberately try to maximise view-count, for example by crafting vague, clickbaity titles that many people will click on. </li><li>When salespeople get paid a commision based on how many sales they do, but do not lose any salary due to poor customer reviews... they will selectively boast and exaggerate the good aspects of a product and downplay or sneakily circumvent discussion of the downsides. </li><li>When college admissions are partly based on grades, students will work really hard to find the teacher’s password and get good grades, instead of doing things like being independently curious, exploratory and trying to deeply understand the subject</li></ul><p>One objection you might have to this is something like:</p><blockquote>Look at those people without integrity, just trying so hard to optimise whatever their incentives tell them to! <em>I myself</em>, and indeed <em>most people</em>, wouldn’t behave that way. </blockquote><blockquote>On the one hand, I would make videos I think are good, and honestly sell products the way I would sell something to a friend, and make sure I understand my textbook instead of just memorising things. I’m not some kind of microeconomic robot! </blockquote><blockquote>And on the other hand, even if things were not like this… it’s just really hard to creatively find ways of maximising a target. I don’t know what appeals to ‘the kids’ on YouTube, and I don’t know how to find out except by paying for some huge survey or something... human brains aren’t really designed for doing maximising like that. 
I couldn’t optimise in all these clever ways even if I wanted to.</blockquote><p>One response to this is:</p><blockquote>Without engaging with your particular arguments, we know empirically that the conclusion is false. There’s a wealth of econometrics and micro papers showing how demand shifts in response to price changes. I could dig out plenty of references for you… but heck, just look around. </blockquote><blockquote>There’s a $10.000/year daycare close to where I live, and when the moms there take their kids to the cinema, they’ll tell them to pretend they’re 6 and not 7 years old just to get a $3 discount on the tickets. </blockquote><blockquote>And I’m pretty confident you’ve had persuasive salespeople peddle you something, and then went home with a lingering sense of regret in your belly… </blockquote><blockquote>Or have you ever seen your friend in a queue somewhere and casually slid in right behind them, just to get into the venue 5 minutes earlier? </blockquote><blockquote>All in all, if you give people an opportunity to earn some money or time… they’ll tend to take it!</blockquote><p>This might or might not be a good reply.</p><p>However, by appealing to 2) and 3), we don’t have to make this response <em>at all</em>. The effects of incentives on behaviour don’t <em>have to</em> be consciously mediated. Rather...</p><ul><li>When content creators get paid for the number of views their videos have, those whose natural way of writing titles is a bit more clickbait-y will tend to get more views, and so over time accumulate more influence and social capital in the YouTube community, which makes it harder for less clickbait-y content producers to compete. No one has to change their behaviour/or their strategies that much -- rather, when changing incentives you’re changing the rules of game, and so the winners will be different. Even for those less fortunate producers, those of their videos which are on the clickbait end of things will tend to give them more views and money, and insofar as they just “try to make videos they like, seeing what happens, and then doing more of what worked”, they will be pushed in this direction</li><li>When salespeople get paid a commission based on how many sales they do, but do not lose any salary due to poor customer reviews… employees of a more Machiavellian character will <em>tend to</em> perform better, which will give them more money and social capital at work… and this will give Machiavellian characteristics more influence over that workplace (before even taking into account returns to scale of capital). They will then be in positions of power to decide on which new policies get implemented, and might choose those that they genuinely think sound most reasonable and well-evidenced. They certainly <em>don’t</em> have to mercilessly optimise for a Machiavellian culture, yet because they have all been pre-selected for such personality traits, they’ll <em>tend to</em> be biased in the direction of choosing such policies. As for their more “noble” colleagues, they’ll find that out of all the tactics they’re comfortable with/able to execute, the more sales-y ones will lead them to get more hi-fives from the high-status people in the office, more room in the budget at the end of the month, and so forth</li><li>When college admissions are partly based on grades… the case is left as an exercise for the reader. 
</li></ul><p><strong>If this is true and important, why doesn’t standard econ textbooks/courses explain this?</strong></p><p>I have some hypotheses which seem plausible, but I don’t think they are exhaustive.</p><p><em>1. Selection pressure for explanations requiring the fewest inferential steps</em></p><p>Microeconomics is pretty counterintuitive (for more on the importance of this, see e.g. <a href="https://www.econlib.org/lets-not-emphasize-behavioral-economics/">this post</a> by Scott Sumner). Writing textbooks that explain it to hundreds of thousands of undergrads, <em>even just </em>using consciously scheming agents, is hard. Now both “selection effects” and “reinforcement learning” are independently difficult concepts, which the majority of students will not have been exposed to, and which aren’t the explanatory path of least resistance (even if they might be really important to a small subset of people who want to use econ insights to build new organisations that, for example, do better than the dire state of the attention economy. Such as LessWrong).</p><p><em>2. Focus on mathematical modelling</em></p><p>I did half an MSc degree in economics. The focus was not on intuition, but rather on something like “acquire mathematical tools enabling you to do a PhD”. There was <em>a lot</em> of focus on not messing up the multivariable calculus when solving strange optimisation problems with solutions at the boundary or involving utility functions with awkward kinks. </p><p>The extent of this mathematisation was sometimes scary. In a finance class I asked the tutor what practical uses there were of some obscure derivative, which we had spend 45 mins and several pages of stochastic calculus proving theorems about. “Oh” he said, “I guess a few years ago it was used to scheme Italian grandmas out of their pensions”. </p><p>In classes when I didn’t bother asking, I mostly didn’t find out what things were used for.</p><p><em>3. Focus on the properties of equilibria, rather than the processes whereby systems move to equilibria</em></p><p>Classic econ joke: </p><blockquote>There is a story that has been going around about a physicist, a chemist, and an economist who were stranded on a desert island with no implements and a can of food. The physicist and the chemist each devised an ingenious mechanism for getting the can open; the economist merely said, &quot;Assume we have a can opener&quot;!</blockquote><p>Standard micro deals with unbounded rational agents, and its arsenal of fixed point theorems and what-not reveals the state of affairs after all maximally rational actions have already been taken. When asked how equilibria manifest themselves, and emerge, in practice, one of my tutors helplessly threw her hands in the air and laughed “that’s for the macroeconomists to work out!”</p><p>There seems to be little attempts to teach students how the solutions to the unbounded theorems are approximated in practice, whether via conscious decision-making, selection effects, reinforcement learning, memetics, or some other mechanism. </p><p><em>Thanks to Niki Shams and Ben Pace for reading drafts of this. </em></p> jacobjacob PrCmeuBPC4XLDQz8C 2019-02-27T12:58:50.320Z Comment by jacobjacob on Less Competition, More Meritocracy? https://www.lesswrong.com/posts/idFsH3kWbkKFkawAy/less-competition-more-meritocracy#Tn429Reb7KPs3ZCHR <p>For section III. it would be really helpful to concretely work through what happens in the examples of divorce, nuclear war, government default, etc. 
<strong>What&#x27;s a plausible thought process of the agents involved? </strong></p><p>My current model is something like &quot;my marriage is worse than I find tolerable, so I have nothing to lose. Now that divorce is legal, I might as well gamble my savings in the casino. If I win we could move to a better home and maybe save the relationship, if I lose we&#x27;ll get divorced.&quot;</p><p>People who have nothing to lose start taking risks which fill up the merely possibly bad outcomes until they start mattering. </p> jacobjacob Tn429Reb7KPs3ZCHR 2019-02-27T11:50:32.523Z Comment by jacobjacob on How important is it that LW has an unlimited supply of karma? https://www.lesswrong.com/posts/wgvwhokzYArtBKxtW/how-important-is-it-that-lw-has-an-unlimited-supply-of-karma#fgTS6p8jYPhbdcC9g <p>In the broader economy, it&#x27;s not the case that &quot;If buying things reduced your income, people stop buying things, and eventually money stops flowing altogether&quot;.</p><p>So the only way that makes sense to me is if you model content as a public good which no user is incentivised to contribute to maintaining. </p><p>Speculatively, this might be avoided if votes were public: because then voting would be a costly signal of one&#x27;s epistemic values or other things.</p> jacobjacob fgTS6p8jYPhbdcC9g 2019-02-12T10:37:09.695Z Comment by jacobjacob on How important is it that LW has an unlimited supply of karma? https://www.lesswrong.com/posts/wgvwhokzYArtBKxtW/how-important-is-it-that-lw-has-an-unlimited-supply-of-karma#buZFnB7S83eKAqC4E <blockquote> though I&#x27;m not sure how that is calculateed from one&#x27;s karma </blockquote><p>I believe it&#x27;s proportional to the log of your user karma. But I&#x27;m not sure. </p><blockquote>One can get high karma from a small amount of content that a small number of sufficiently high karma users that double up vote it. </blockquote><p>There is still an incentive gradient towards &quot;least publishable units&quot;. </p><p>Suppose you have a piece of work worth 18 karma to high-karma user U. However, U&#x27;s strong upvote is only worth 8 karma. </p><p>If you just post one piece of work, you get 8 karma. If you split your work into three pieces, each of which U values at 6 karma, you&#x27;re better off. U might strong-upvote all of them (they&#x27;d rather allocate a little too much karma than way too little), and you get 24 karma. </p><p>To extend the metaphor in the original question: maybe if the world economy ran on the equivalent of strong upvotes there would still be cars around, yet no one could buy airplanes. </p> jacobjacob buZFnB7S83eKAqC4E 2019-02-12T10:32:22.489Z Comment by jacobjacob on How important is it that LW has an unlimited supply of karma? https://www.lesswrong.com/posts/wgvwhokzYArtBKxtW/how-important-is-it-that-lw-has-an-unlimited-supply-of-karma#vFfG5TrZXmWTvxqao <p>Do you have details on when and why that was removed? Or past posts discussing that system?</p> jacobjacob vFfG5TrZXmWTvxqao 2019-02-12T10:17:48.260Z How important is it that LW has an unlimited supply of karma? https://www.lesswrong.com/posts/wgvwhokzYArtBKxtW/how-important-is-it-that-lw-has-an-unlimited-supply-of-karma <h2>Question</h2><p>LessWrong users can up/downvote posts and comments, which then receive a karma boost (capped by the voter’s own karma). There is no limit to how many <em>different</em> posts and comments one can do this to. In this sense there is an unlimited supply of karma to be handed out.
(This is also the case for Facebook, YouTube, Instagram, HackerNews(?), Medium, ...)</p><p><em>Is this important? That is, does it have non-trivial medium or long-term effects on the “LessWrong economy&quot; -- the kind and amount of content that gets produced?</em></p><h2>Rough Thoughts</h2><p>Here are some quick thoughts I wrote down. I publish this question despite them being unfinished, instead of letting them wither deep in my Google Drive. </p><p>Under the current system…</p><ul><li>Over time, it’s not clear whether karma is inflationary or deflationary. It depends at least on whether the rate of growth of content is slower or faster than the rate of increase of karma production.</li><li>The only way to get a large amount of karma is to produce content that appeals to <em>many</em> users, or produce a large amount of content that appeals to at least some users. One cannot get high karma by producing a small amount of content that a small number of users likes <em>a lot</em>. </li><ul><li>If the real economy was like this, there wouldn’t exist businesses like SpaceX, Palantir or Boeing. </li><ul><li>Something seems very broken about LW, if, were the big world to run on LW principles, people wouldn’t be able to fly as a means of travel. <em>Lots</em> of people want to fly. But <em>very</em> <em>few</em> are able to pay for the construction of a 747. So we only have airtravel because there can exist intermediaries who <em>can</em> make that payment, and in turn get rewarded by collecting all the little flight desires of very many people kind-of-keen to fly. </li><li>Currently, there cannot be any such intermediaries on LessWrong. A concrete example of a LessWrong Boeing might be something like: CFAR really wants someone to write a 40-page literature review of X. No one else really cares, apart from the fact that were CFAR to get that review, their workshops would improve pretty significantly for most attendees.</li></ul></ul><li>There are fewer free-rider problems. D<em>espite</em> content being non-excludable and publicly available, users have an incentive to upvote things, because Alice doing so instead of Bob does not cost Alice anything (we’re assuming they both end up consuming the content, so attention and time costs are the same).</li><ul><li>This seems very important, and like something that could offset the &quot;Boeing problem&quot; mentioned above. </li></ul></ul><p>If instead of the current system each karma point given was taken from your own score, then…</p><ul><li>One could not indefinitely keep up/down-voting content without producing new content oneself. In practice, one could do this if one had created one beacon of amazing work in the past.</li><li>Over time, as the same amount of karma gets spread across more and more content, the value of a karma point increases (because the opportunity cost of what else that karma point could have been used for increases). </li><li>There might be even more deflationary pressure on karma if users produce great content but then leave the site. </li><li>There is a disconnect between content karma and user karma. A user who has produced much high-quality content might not have a corresponding amount of karma, having given it away.</li></ul><p>Some uncertainties</p><ul><li>A salient implementation is that an upvote costs exactly the amount of karma that’s being awarded to the content. 
But how much karma should downvotes cost?</li><li>It is unclear how a limited karma supply interfaces with a limited maximal upvote size.</li><li>There might be lessons from macroeconomics and monetary policy relevant to this. I don’t know, because I know something-that-rounds-to-nothing about those fields. </li></ul> jacobjacob wgvwhokzYArtBKxtW 2019-02-11T01:41:51.797Z Comment by jacobjacob on The Case for a Bigger Audience https://www.lesswrong.com/posts/2E3fpnikKu6237AF6/the-case-for-a-bigger-audience#u9gB5iQqx2Q5P2Awd <p>I was going to say 2000 times sounded like <em>way</em> too much, but making the guesstimates, that means on average using &quot;common knowledge&quot; once every other day since it was published, and &quot;out to get you&quot; once every third day, and that does seem consistent with my experience hanging out with you (though of course with a fat tail of the distribution, using some concepts like 10 times in a single long hangout). </p> jacobjacob u9gB5iQqx2Q5P2Awd 2019-02-10T00:21:08.240Z Comment by jacobjacob on When should we expect the education bubble to pop? How can we short it? https://www.lesswrong.com/posts/EBAccQwDWMiRCWnyk/when-should-we-expect-the-education-bubble-to-pop-how-can-we#e6ewqnZmoBdvkBAYT <p>Asset bubbles can be Nash equilibria for a while. This is a really important point. If surrounded by irrational agents, it might be rational to play along with the bubble instead of shorting and waiting. &quot;The market can stay irrational longer than you can stay solvent.&quot; </p><p>For most of 2017, you shouldn&#x27;t have shorted crypto, even if you knew it would eventually go down. The rising markets and the interest on your short would kill you. It might take big hedge funds with really deep liquidity to ride out the bubble, and even they might not be able to make it if they get in too early. In 2008 none of the investment banks could short things early enough because no one else was doing it. </p><p>The difference between genius (shorting at the peak) and really smart (shorting pre-peak) matters a lot in markets. (There&#x27;s this scene in the Big Short where some guy covers the cost of his BBB shorts by buying a ton of AAA-rated stuff, assuming that <em>at least those </em>will keep rising.)</p><p>So shorting and buying are not symmetric (as you might treat them in a mathematical model, only differing by the sign on the quantity of assets bought). Shorting is much harder and much more dangerous. </p><p>In fact, my current model [1] is that this is the very reason financial markets <em>can</em> exhibit bubbles of &quot;irrationality&quot; despite all their beautiful properties of self-correction and efficiency.</p><p>[1] For transparency, I basically downloaded this model from <a href="https://www.lesswrong.com/users/davidmanheim">davidmanheim</a>. </p> jacobjacob e6ewqnZmoBdvkBAYT 2019-02-10T00:06:57.591Z When should we expect the education bubble to pop? How can we short it? https://www.lesswrong.com/posts/EBAccQwDWMiRCWnyk/when-should-we-expect-the-education-bubble-to-pop-how-can-we <p>I won&#x27;t attempt to summarise the case for there being an education bubble here (see links below for some pointers). Rather, my questions are: </p><p>1) <em>assuming</em> there is an education bubble, when will it -- as bubbles tend to do -- pop?
</p><p>(This plausibly entails some disjunction of *hundreds of thousands to millions of students defaulting on their debt, *lower number of college applicants, *non-top-tier colleges laying off faculty, *substantial reductions in the signalling value of obtaining a diploma, *substantial reductions in tuition fees, *reduction in the level of education required by various employers, and more)</p><p>2) Which assets will be more scarce/in demand as that happens? Are there currently available opportunities for &quot;shorting&quot; the education bubble and investing in ways which will yield a profit when it pops?</p><p>(I hereby preface the comments by noting that nothing discussed there is investment advice and no users can be held liable for investment decisions based on it.)</p><hr class="dividerBlock"/><p><a href="https://www.youtube.com/watch?v=crAHDXdBCXg">Peter Thiel summarises the inside view of there being an &quot;education bubble&quot; well</a>.</p><p>And here are some interesting numbers: </p><ul><li>In the last 35 years, median household income has grown by about 20% (<u><a href="https://fred.stlouisfed.org/series/MEHOINUSA672N">FRED</a></u>). In roughly the same time, <a href="https://www.lesswrong.com/posts/Fafzj3wMvoCW4WjeF/against-tulip-subsidies">the price of college has grown by 300% when adjusting for inflation</a> </li><li><u><a href="http://fortune.com/2017/07/10/higher-education-student-loans-economic-bubble-federal/">College spending is one sixth of US economy</a> </u>UPDATE: this is probably false/misleading, see comment from paulfchristiano below.</li><li><u><a href="https://www.forbes.com/sites/zackfriedman/2017/02/21/student-loan-debt-statistics-2017/#2ca68b5d5dab">Student debt is at &gt;$1 trillion</a></u> (for comparison US GDP is around $20 trillion, US federal budget is around $4 trillion) </li></ul> jacobjacob EBAccQwDWMiRCWnyk 2019-02-09T21:39:10.918Z Comment by jacobjacob on X-risks are a tragedies of the commons https://www.lesswrong.com/posts/P4sn9MNrFv6RR3moc/x-risks-are-a-tragedies-of-the-commons#9sBR3ScTXoeDtB7q8 <p>In case others haven&#x27;t seen it, <a href="http://livingeconomics.org/article.asp?docId=239">here</a>&#x27;s a great little matrix summarising the classification of goods on &quot;rivalry&quot; and &quot;excludability&quot; axes. </p> jacobjacob 9sBR3ScTXoeDtB7q8 2019-02-09T21:00:03.333Z Comment by jacobjacob on (notes on) Policy Desiderata for Superintelligent AI: A Vector Field Approach https://www.lesswrong.com/posts/hyfedqhgCQriBB9wT/notes-on-policy-desiderata-for-superintelligent-ai-a-vector#bRipaSGLdcW2zTPCQ <p>Hanson&#x27;s speed-weighted voting reminds me a bit of <a href="http://ericposner.com/quadratic-voting/">quadratic voting</a>.</p> jacobjacob bRipaSGLdcW2zTPCQ 2019-02-05T10:07:42.749Z Comment by jacobjacob on What are some of bizarre theories based on anthropic reasoning? https://www.lesswrong.com/posts/8uTfyAZ6sMamSn6Z3/what-are-some-of-bizarre-theories-based-on-anthropic#zPyfy6gT5fZMdG8gA <p>I presume that, unlike X-risk, s-risks don't remove the vast majority of observer moments.</p> jacobjacob zPyfy6gT5fZMdG8gA 2019-02-05T09:40:54.184Z Comment by jacobjacob on Announcement: AI alignment prize round 4 winners https://www.lesswrong.com/posts/nDHbgjdddG5EN6ocg/announcement-ai-alignment-prize-round-4-winners#xZTLu3JrEFLcB7mJM <p>I disagree with the view that it&#x27;s bad to spend the first few months prizing top researchers who would have done the work anyway.
This <em>in and of itself</em> is clearly burning cash, yet the point is to change incentives over a longer time-frame. </p><p>If you think research output is heavy-tailed, what you should expect to observe is something like this happening for a while, until promising tail-end researchers realise there&#x27;s a stable stream of value to be had here, and put in the effort required to level up and contribute themselves. It&#x27;s not implausible to me that this would take &gt;1 year of prizes. </p><p>Expecting lots of important counterfactual work, that beats the current best work, to come out of the woodwork within ~6 months seems to assume that A) making progress on alignment is quite tractable, and B) the ability to do so is fairly widely distributed across people; both to a seemingly unjustified extent.</p><p>I personally think prizes should be announced together with precommitments to keep delivering them for a non-trivial amount of time. I believe this because I think changing incentives involves changing expectations, in a way that changes medium-term planning. I expect people to have qualitatively different thoughts if their S1 reliably believes that fleshing out the-kinds-of-thoughts-that-take-6-months-to-flesh-out will be rewarded after those 6 months. </p><p>That&#x27;s expensive, in terms of both money and trust.</p> jacobjacob xZTLu3JrEFLcB7mJM 2019-01-22T16:56:42.368Z Comment by jacobjacob on What are good ML/AI related prediction / calibration questions for 2019? https://www.lesswrong.com/posts/pQfL25ZHE2HvQrWhi/what-are-good-ml-ai-related-prediction-calibration-questions#kmgmE3kLE8bvXwD6t <p>elityre has done work on this for BERI, suggesting <a href="http://existence.org/prediction-market-questions/">&gt;30 questions</a>. </p><p>Regarding the question metatype, Allan Dafoe has offered a set of desiderata in the appendix to his <a href="https://www.fhi.ox.ac.uk/wp-content/uploads/GovAIAgenda.pdf">AI governance research agenda</a>. </p> jacobjacob kmgmE3kLE8bvXwD6t 2019-01-16T17:56:20.563Z Comment by jacobjacob on Why is so much discussion happening in private Google Docs? https://www.lesswrong.com/posts/hnvPCZ4Cx35miHkw3/why-is-so-much-discussion-happening-in-private-google-docs#c3qi2ZKPcScp4iaaz <p>If true, sounds like a bug and not a feature of LW. </p> jacobjacob c3qi2ZKPcScp4iaaz 2019-01-12T11:17:04.531Z Comment by jacobjacob on What is a reasonable outside view for the fate of social movements? https://www.lesswrong.com/posts/RCQ3vintjuGWiMbsa/what-is-a-reasonable-outside-view-for-the-fate-of-social#6GfaNYnP94Hjhqxbj <p>So: habryka did say &quot;anyone&quot; in the original description, and so he will pay both respondents who completed the bounty according to original specifications (which thereby excludes gjm). I will only pay Radamantis as I interpreted him as &quot;claiming&quot; the task with his original comment. </p><p>I suggest you PM with payment details.</p> jacobjacob 6GfaNYnP94Hjhqxbj 2019-01-10T23:36:20.535Z Comment by jacobjacob on What is a reasonable outside view for the fate of social movements? https://www.lesswrong.com/posts/RCQ3vintjuGWiMbsa/what-is-a-reasonable-outside-view-for-the-fate-of-social#8CC6e5tFcDh5ujfSq <p>I&#x27;ll PM habryka about what to do with the bounty given that there were two respondents. </p><p>Overall I&#x27;m excited this data and analysis was generated and will sit down to take a look and update this weekend.
:)</p> jacobjacob 8CC6e5tFcDh5ujfSq 2019-01-10T23:10:03.202Z Comment by jacobjacob on What is a reasonable outside view for the fate of social movements? https://www.lesswrong.com/posts/RCQ3vintjuGWiMbsa/what-is-a-reasonable-outside-view-for-the-fate-of-social#fsjzeKtLPiZnNJ6AQ <p>What's your "reasonable sounding metric" of success?</p> jacobjacob fsjzeKtLPiZnNJ6AQ 2019-01-07T19:37:29.714Z Comment by jacobjacob on What is a reasonable outside view for the fate of social movements? https://www.lesswrong.com/posts/RCQ3vintjuGWiMbsa/what-is-a-reasonable-outside-view-for-the-fate-of-social#M9CyzmZCNf5WXwpmc <p><strong>I add $30 to the bounty. </strong></p><p>There are 110 items in the list. So 25% is ~28. </p><p>I hereby set the random seed as whatever will be the last digit and first two decimals (3 digits total) of the S&amp;P 500 Index price on January 7, 10am GMT-5, as found in the interactive chart by Googling &quot;s&amp;p500&quot;. </p><p>For example, the value of the seed on 10am January 4 was &quot;797&quot;. </p><p>[I would have used <a href="https://beacon.nist.gov/home">the NIST public randomness beacon (v2.0)</a> but it appears to be down due to government shutdown :( ].</p><p><strong>Instructions for choosing the movements</strong></p><p>Let the above-generated seed be <em>n</em>.</p><p>Using Python3.0:</p><blockquote>import random</blockquote><blockquote>random.seed(<em>n</em>)</blockquote><blockquote>indices = sorted(random.sample([i for i in range(1,111)], 28))</blockquote> jacobjacob M9CyzmZCNf5WXwpmc 2019-01-07T02:44:56.319Z Comment by jacobjacob on Does anti-malaria charity destroy the local anti-malaria industry? https://www.lesswrong.com/posts/HkpYyr93P2j5R5Go7/does-anti-malaria-charity-destroy-the-local-anti-malaria#ofPmznvyY9rDKCj9k <p>I&#x27;m confused: why doesn&#x27;t variability cause any trouble in the standard models? It seems that if producers are risk-averse, it results in less production than otherwise. </p> jacobjacob ofPmznvyY9rDKCj9k 2019-01-06T00:41:19.707Z What is a reasonable outside view for the fate of social movements? https://www.lesswrong.com/posts/RCQ3vintjuGWiMbsa/what-is-a-reasonable-outside-view-for-the-fate-of-social <p><em>Epistemic status: very hand-wavy and vague, but confident there is a substantial and well-understood core. Hoping for an answer that elucidates that core more clearly.</em> </p><p>It is something of a rationalist folk theorem that social movements face the risk of an &quot;Eternal September&quot;, or of scaling into oblivion. (See e.g. <a href="http://leverageresearch.org/blog/ea-failure-scenarios">this blog post by Leverage research</a>, <a href="http://globalprioritiesproject.org/wp-content/uploads/2015/05/MovementGrowth.pdf">this paper by Owen Cotton-Barratt</a>, <a href="https://meaningness.com/geeks-mops-sociopaths">David Chapman</a> or <a href="http://benjaminrosshoffman.com/construction-beacons/">Benjamin Hoffman</a> on &quot;Geeks, MOPs and sociopaths&quot;, and Scott Alexander on <a href="https://slatestarcodex.com/2014/12/17/the-toxoplasma-of-rage/">&quot;the toxoplasma of rage&quot;</a>). </p><p>I&#x27;ve had the sense that some cocktail of Hansonian/Dunbarian evolutionary psychology, basic game theory/microeconomics, memetic theory and <a href="https://en.wikipedia.org/wiki/Sturgeon%27s_law">Sturgeon&#x27;s law</a>, should predict this. 
That is, that some reasonably operationalised version of the claim &quot;most social movements fail&quot; is true.</p><p>Yet I am not able to point to &gt;=5 historical examples of social movements that suffered this fate, along with some gears for what went wrong. </p><p>Hence I&#x27;m looking for links, historical examples, more fleshed-out gears, ... anything that might form a more rigorous reference point for <em>an outside view on the fate of social movements</em>.</p> jacobjacob RCQ3vintjuGWiMbsa 2019-01-04T00:21:20.603Z Comment by jacobjacob on Reinterpreting "AI and Compute" https://www.lesswrong.com/posts/EjssJnp9fNhvdDEdK/reinterpreting-ai-and-compute#rGXpzKtfLhszFJ3vj <p>I&#x27;m confused. Do you mean &quot;worlds&quot; as in &quot;future trajectories of the world&quot; or as in &quot;subcommunities of AI researchers&quot;? And what&#x27;s a concrete example of gains from trade between worlds?</p> jacobjacob rGXpzKtfLhszFJ3vj 2018-12-27T16:17:15.242Z Comment by jacobjacob on Against Tulip Subsidies https://www.lesswrong.com/posts/Fafzj3wMvoCW4WjeF/against-tulip-subsidies#NhKCuXzSKWJPy63Sw <p>The link to &quot;<a href="http://www.waldenu.edu/~/media/Files/WAL/outcomes-research-broch-faqs-web-final.pdf">many rigorous well-controlled studies</a>&quot; is broken.</p> jacobjacob NhKCuXzSKWJPy63Sw 2018-12-20T05:10:52.381Z List of previous prediction market projects https://www.lesswrong.com/posts/5aacTtn8fWavbBeEe/list-of-previous-prediction-market-projects <p><a href="https://docs.google.com/spreadsheets/d/1XB1GHfizNtVYTOAD_uOyBLEyl_EV7hVtDYDXLQwgT7k/edit?usp=sharing">Here</a> I try to curate what is, to my knowledge, the most extensive list of previous prediction market projects.</p><p>This is useful because it provides a reference class to answer the question: “Why is our society inadequate at building functional futures markets for basically anything beyond equities, currencies and commodities, despite the massive benefits this would bring (see e.g. <a href="http://science.sciencemag.org/content/320/5878/877">Arrow et al., 2008</a>)?”</p><p>In this vein, I aim to primarily focus on real-money projects that might have scaled to something like an institutionalised exchange (and thus focus less on play-money projects, consulting companies offering a more nebulous “collective intelligence”, or forecasting platforms which are not actually <em>markets</em>).</p><p>I post this spreadsheet publicly despite its incompleteness, as doing so is likely the best way to complete it.</p><p><strong>Feedback is very welcome.</strong></p><p>If you -- yes you! -- are keen to help, the single most useful 15-30 min task would be to further research a cell in the column “Shut-down date and reason”, and comment with links/findings.</p> jacobjacob 5aacTtn8fWavbBeEe 2018-10-22T00:45:01.425Z Comment by jacobjacob on Some cruxes on impactful alternatives to AI policy work https://www.lesswrong.com/posts/DJB82jKwgJE5NsWgT/some-cruxes-on-impactful-alternatives-to-ai-policy-work#qGnNbzjYbreTsBcTF <p>Suppose your goal is not to maximise an objective, but just to cross some threshold. This is plausibly the situation with existential risk (e.g. &quot;maximise probability of okay outcome&quot;). Then, if you&#x27;re above the threshold, you want to minimise variance, whereas if you&#x27;re below it, you want to maximise variance. (See <a href="https://www.mathsjam.com/assets/talks/2017/AlexanderBolton-Chalkdustcoingame.pdf">this</a> for a simple example of this strategy applied to a game.)
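As a minimal simulation sketch of this point (the threshold, starting values and variable names below are illustrative choices, not taken from the linked example):</p><blockquote>import random</blockquote><blockquote>threshold, start, spread, trials = 10, 8, 5, 100000</blockquote><blockquote># a single +/- spread gamble has the same expected value as standing still</blockquote><blockquote>gamble = sum(start + random.choice([-spread, spread]) &gt;= threshold for _ in range(trials)) / trials</blockquote><blockquote>safe = float(start &gt;= threshold)</blockquote><blockquote>print(safe, gamble)  # below the threshold (start=8): 0.0 vs ~0.5, so maximise variance</blockquote><blockquote># with start = 12 instead, the order flips (1.0 vs ~0.5), so minimise variance</blockquote><p>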
If Richard believes we are currently above the x-risk threshold and Ben believes we are below it, this might be a simple crux. </p> jacobjacob qGnNbzjYbreTsBcTF 2018-10-13T21:49:30.228Z Comment by jacobjacob on Some cruxes on impactful alternatives to AI policy work https://www.lesswrong.com/posts/DJB82jKwgJE5NsWgT/some-cruxes-on-impactful-alternatives-to-ai-policy-work#foRuzrDRmJjNrRTxt <blockquote>However, I think the distribution of success is often very different from the distribution of impact, because of replacement effects. If Facebook hadn&#x27;t become the leading social network, then MySpace would have. If not Google, then Yahoo. If not Newton, then Leibniz (and if Newton, then Leibniz anyway). </blockquote><p>I think this is less true for startups than for scientific discoveries, because of bad Nash equilibria stemming from founder effects. The objective which Google is maximising might not be concave. It might have many peaks, and which you reach might be quite arbitrarily determined. Yet the peaks might have very different consequences when you have a billion users.</p><p>For lack of a concrete example... suppose a webapp W uses feature x, and this influences which audience uses the app. Then, once W has scaled and depends on that audience for substantial profit, they can&#x27;t easily change x. (It might be that changing x to y wouldn&#x27;t decrease profit, but just not increase it.) Yet, had they initially used y instead of x, they could have grown just as big, but they would have had a different audience. Moreover, because of network effects and returns to scale, it might not be possible for a rivalling company to build their own webapp which is basically the same thing but with y instead. </p><p></p> jacobjacob foRuzrDRmJjNrRTxt 2018-10-13T18:24:10.107Z Comment by jacobjacob on Four kinds of problems https://www.lesswrong.com/posts/NgSJwki4dxJTZAWqH/four-kinds-of-problems#Wec6ZrPiS5w3CcXTR <p>I don&#x27;t want to base my argument on that video. It&#x27;s based on the intuitions for philosophy I developed doing my BA in it at Oxford. I expect to be able to find better examples, but don&#x27;t have the energy to do that now. This should be read more as &quot;I&#x27;m pointing at something that others who have done philosophy might also have experienced&quot;, rather than &quot;I&#x27;m giving a rigorous defense of the claim that even people outside philosophy might appreciate&quot;. </p> jacobjacob Wec6ZrPiS5w3CcXTR 2018-09-20T18:30:10.125Z Comment by jacobjacob on An Ontology of Systemic Failures: Dragons, Bullshit Mountain, and the Cloud of Doom https://www.lesswrong.com/posts/2rgDr5WC82P7XQo9z/an-ontology-of-systemic-failures-dragons-bullshit-mountain#aLT3qpv3e4A8gX9Ez <p>That still seems too vague to be useful. I don&#x27;t have the slack to do the work of generating good examples myself at the moment. </p> jacobjacob aLT3qpv3e4A8gX9Ez 2018-09-10T20:50:33.109Z Comment by jacobjacob on An Ontology of Systemic Failures: Dragons, Bullshit Mountain, and the Cloud of Doom https://www.lesswrong.com/posts/2rgDr5WC82P7XQo9z/an-ontology-of-systemic-failures-dragons-bullshit-mountain#ZrrPc7r3SNT5ArAh3 <p>This would have been more readable if you gave concrete examples of each kind of problem. It seems like your claim might be a useful dichotomy, but in its current state it&#x27;s likely not going to cause me to analyse problems differently or take different actions.
</p> jacobjacob ZrrPc7r3SNT5ArAh3 2018-09-08T23:20:35.268Z Comment by jacobjacob on Quick Thoughts on Generation and Evaluation of Hypotheses in a Community https://www.lesswrong.com/posts/amJcwgY9YucygWP8g/quick-thoughts-on-generation-and-evaluation-of-hypotheses-in#FaHYJezte2YmZRZ9z <p>Somewhat tangential, but...</p><p>You point to the following process:</p><p>Generation --&gt; Evaluation --&gt; Acceptance/rejection.</p><p>However, generation is often risky, and not everyone has the capacity to absorb that risk. For example, one might not have the exploratory space needed to pursue spontaneous 5-hour reading sprints while working full-time.</p><p>Hence, I think much of society looks like this: </p><p>Justification --&gt; Evaluation --&gt; Generation --&gt; Evaluation --&gt; Acceptance/rejection.</p><p>I think some very important projects never happen because the people who have taken all the inferential steps necessary to understand them are not the same as those evaluating them, and so there&#x27;s an information-asymmetry.</p><p>Here&#x27;s <a href="http://www.paulgraham.com/marginal.html">PG</a>: </p><blockquote>That&#x27;s one of California&#x27;s hidden advantages: the mild climate means there&#x27;s lots of marginal space. In cold places that margin gets trimmed off. There&#x27;s a sharper line between outside and inside, and only projects that are officially sanctioned — by organizations, or parents, or wives, or at least by oneself — get proper indoor space. That raises the activation energy for new ideas. You can&#x27;t just tinker. You have to justify.</blockquote><p>(This is one of the problems I model impact certificates as trying to solve.)</p><p></p> jacobjacob FaHYJezte2YmZRZ9z 2018-09-07T12:25:50.197Z Comment by jacobjacob on Musings on LessWrong Peer Review https://www.lesswrong.com/posts/rCzXcyM9Gt2pxmhcE/musings-on-lesswrong-peer-review#EPBrCZJqHuveeQwt6 <p>Have you written any of the upcoming posts yet? If so, can you link them?</p> jacobjacob EPBrCZJqHuveeQwt6 2018-09-06T22:45:15.597Z Comment by jacobjacob on History of the Development of Logical Induction https://www.lesswrong.com/posts/iBBK4j6RWC7znEiDv/history-of-the-development-of-logical-induction#MzaLznPavrgpc9oZE <p>How causally important were Dutch-book theorems in suggesting to you that market behaviour could be used for logical induction? This seems like the most &quot;non sequitur&quot; part of the story. Suddenly, what seems like a surprising insight was just there. </p><p>I predict somewhere between &quot;very&quot; and &quot;crucially&quot; important. </p> jacobjacob MzaLznPavrgpc9oZE 2018-09-02T00:07:24.914Z Four kinds of problems https://www.lesswrong.com/posts/NgSJwki4dxJTZAWqH/four-kinds-of-problems <p> </p><p>I think there are four natural kinds of problems, and learning to identify them helped me see clearly what’s bad with philosophy, good with start-ups, and many things in-between. </p><p>Consider these examples:</p><ol><li>Make it so that bank transfers to Africa do <em>not</em> take weeks and require visiting physical offices, in order to make it easier for immigrants to send money back home to their poor families. </li><li>Prove that the external world exists and you’re not being fooled by an evil demon, in order to use that epistemic foundation to derive a theory of how the world works.
</li><li>Develop a synthetic biology safety protocol, in order to ensure your lab does not accidentally leak a dangerous pathogen.</li><li>Build a spaceship that travels faster than the speed of light, in order to harvest resources from outside our light cone. </li></ol><p>These examples all consist of problems that are encountered as part of work on larger projects. We can classify them by asking how we should respond when they arise, as follows: </p><span><figure><img src="http://res.cloudinary.com/dq3pms5lt/image/upload/v1534891411/Table_1_Quattro_Formaggi_rsirqy.png" class="draft-image center" style="" /></figure></span><p>1. is a problem to be solved. In this particular example, it turns out global remittances are several times larger than the combined foreign aid budgets of the Western world. Building a service avoiding the huge fees charged by e.g. Western Union is a very promising way of helping the global poor. </p><p>2. is a problem to be gotten over. You probably won’t find a solution of the kind philosophers usually demand. But, evidently, you don’t <em>have to</em> in order to make meaningful epistemic progress, such as deriving General Relativity or inventing vaccines.</p><p>3. is a crucial consideration -- a problem so important that it might force you to drop the entire project that spawned it, in order to just focus on solving this particular problem. Upon discovering that there is a non-trivial risk of <a href="https://80000hours.org/podcast/episodes/we-are-not-worried-enough-about-the-next-pandemic/">tens of millions of people dying in a natural or engineered pandemic within our lifetimes</a>, and then realising how woefully underprepared our health care systems are for this, publishing yet another paper suddenly appears less important. </p><p>4. is a defeating problem. Solving it is impossible. If a solution forms a crucial part of a project, then the problem is going to bring that project with it into the grave. Whatever we want to spend our time doing, if it requires resources from outside our light cone, we should give it up. </p><p>With this categorisation in mind, we can understand some good and bad ways of thinking about problems. </p><p>For example, I found that learning the difference between a defeating problem and a problem-to-be-solved was what was required to adopt a “hacker mindset”. Consider the remittances problem above. If someone had posed it as something to do after they graduate, they might have expected replies like: </p><p>“Sending money? Surely that’s what banks do! You can’t just... build a bank?” </p><p>“What if you get hacked? Software infrastructure for sending money has to be crazy reliable!”</p><p>“Well, if you’re going to build a startup to help the global poor, you’d have to move to Senegal.” </p><p>Now of course, none of these things <em>violate the laws of physics</em>. They might violate a few social norms. They might be scary. They might seem like the kind of problem an ordinary person would not be allowed to try to solve. However, if you <em>really</em> wanted to, you <em>could</em> do these things. And some less conformist people who did just that have now become billionaires or, well, moved to Senegal (c.f. PayPal, Stripe, <a href="https://monzo.com/">Monzo</a> and <a href="http://www.wave.com/">Wave</a>). </p><p>As Hannibal said when his generals cautioned him that it was impossible to cross the Alps by elephant: &quot;I shall either find a way or make one.&quot; </p><p>This is what’s good about startup thinking.
Philosophy, however, has a big problem which goes the other way: mistaking problems-to-be-solved for defeating problems. </p><p>For example, a frequentist philosopher might object to Bayesianism saying something like “Probabilities can’t represent the degrees of belief of agents, because in order to prove all the important theorems you have to assume the agents are logically omniscient. But that’s an unreasonable constraint. For one thing, it requires you to have an infinite number of beliefs!” (this objection is made <a href="https://www.youtube.com/watch?v=pkGFSkGzg4M">here</a>, for example). And this might convince people to drop the Bayesian framework.</p><p>However, the problem here is that it has not been <em>formally proven</em> that the important theorems of Bayesianism <em>ineliminably require </em>logical omniscience in order to work. Rather, that is often <em>assumed</em>, because people find it hard to do things formally otherwise. </p><p>As it turns out, though, <a href="https://intelligence.org/2016/09/12/new-paper-logical-induction/">the problem is solvable</a>. Philosophers did not find this out, however, as they get paid to argue and so love making objections. The proper response to that might just be “<a href="https://www.lesswrong.com/posts/nCvvhFBaayaXyuBiD/shut-up-and-do-the-impossible">shut up and do the impossible</a>”. (A funny and anecdotal example of this is the <a href="https://www.snopes.com/fact-check/the-unsolvable-math-problem/">student who solved an unsolved problem in maths because he thought it was an exam question</a>.)</p><p>Finally, we can be more systematic in classifying several of these misconceptions. I’d be happy to take more suggestions in the comments. </p><span><figure><img src="http://res.cloudinary.com/dq3pms5lt/image/upload/v1534891415/Table_2_x6oqgw.png" class="draft-image center" style="" /></figure></span><p></p><p></p> jacobjacob NgSJwki4dxJTZAWqH 2018-08-21T23:01:51.339Z Brains and backprop: a key timeline crux https://www.lesswrong.com/posts/QWyYcjrXASQuRHqC5/brains-and-backprop-a-key-timeline-crux <p>[Crossposted from <a href="https://jacoblagerros.wordpress.com/2018/03/09/brains-and-backprop-a-key-timeline-crux/">my blog</a>]</p><h1>The Secret Sauce Question</h1><p>Human brains still outperform deep learning algorithms in a wide variety of tasks, such as playing soccer or knowing that it’s a bad idea to drive off a cliff without having to try first (for more formal examples, see <u><a href="https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/article/building-machines-that-learn-and-think-like-people/A9535B1D745A0377E16C590E14B94993">Lake et al., 2017</a></u>; <u><a href="https://www.youtube.com/watch?v=rTawFwUvnLE">Hinton, 2017</a></u>; <u><a href="https://www.youtube.com/watch?v=cWzi38-vDbE">LeCun, 2018</a></u>; <u><a href="https://www.alexirpan.com/2018/02/14/rl-hard.html">Irpan, 2018</a></u>). This fact can be taken as evidence for two different hypotheses: </p><ol><li>In order to develop human-level AI, we have to develop entirely new learning algorithms. At the moment, AI is a deep conceptual problem. </li><li>In order to develop human-level AI, we basically just have to improve current deep learning algorithms (and their hardware) a lot. At the moment, AI is an engineering problem.</li></ol><p>The question of which of these views is right I call “the secret sauce question”. 
</p><p>The secret sauce question seems like one of the most important considerations in estimating how long there is left until the development of human-level artificial intelligence (“timelines”). If something like 2) is true, timelines are arguably substantially shorter than if something like 1) is true [1]. </p><p>However, it seems initially difficult to arbitrate these two vague, high-level views. It appears as though an answer requires complicated inside views stemming from deep and wide knowledge of current technical AI research. This is partly true. Yet this post proposes that there might also be a single, concrete discovery capable of settling the secret sauce question: does the human brain learn using gradient descent, by implementing backpropagation?</p><h1>The importance of backpropagation</h1><p>Underlying the success of modern deep learning is a single algorithm: gradient descent with backpropagation of error (<u><a href="https://www.nature.com/articles/nature14539">LeCun et al., 2015</a></u>). In fact, the majority of research is not focused on finding better algorithms, but rather on finding better cost functions to descend using this algorithm (<u><a href="https://www.frontiersin.org/articles/10.3389/fncom.2016.00094/full">Marblestone et al., 2016</a></u>). Yet, in stark contrast to this success, since the 1980s the key objection of neuroscientists to deep learning has been that backpropagation is not biologically plausible (Crick, 1989; Stork, 1989). </p><p>As a result, the question of whether the brain implements backpropagation provides critical evidence on the secret sauce problem. If the brain does<em> not </em>use it, and <em>still</em> outperforms deep learning while running on the energy of a laptop and training on several orders of magnitude fewer training examples than parameters, this suggests that a deep conceptual advance is necessary to build human-level artificial intelligence. There’s some other remarkable algorithm out there, and evolution found it. But if the brain <em>does</em> use backprop, then the reason deep learning works so well is that it’s somehow <em>on the right track</em>. Human researchers and evolution converged on a common solution to the problem of optimising large networks of neuron-like units. (These arguments assume that <em>if</em> a solution is biologically plausible and the best solution available, then it would have evolved). </p><p>Actually, the situation is a bit more nuanced than this, and I think it can be clarified by distinguishing between algorithms that are:</p><p><strong>Biologically actual: </strong> What the brain <em>actually </em>does. </p><p><strong>Biologically plausible: </strong>What the brain <em>might</em> have done, while still being restricted by evolutionary selection pressure towards energy efficiency etc. </p><p><em>For example, humans walk with legs, but it seems possible that evolution might have given us wings or fins instead, as those solutions work for other animals. However, evolution could not have given us wheels, as that requires a separable axle and wheel, and it&#x27;s unclear what an evolutionary path to an organism with two separable parts looks like (excluding symbiotic relationships).</em></p><p><strong>Biologically possible:</strong> What is technically possible to do with collections of cells, regardless of its relative evolutionary advantage.
</p><p><em>For example, even though evolving wheels is implausible, there might be no inherent problem with an organism having wheels (created by &quot;God&quot;, say), in the way in which there&#x27;s an inherent problem with an organism’s axons sending action potentials faster than the speed of light.</em></p><p>I think this leads to the following conclusions:</p><h2><strong>Nature of backprop: Implication for timelines </strong></h2><p><strong>Biologically impossible:</strong> Unclear, there might be multiple “secret sauces”</p><p><strong>Biologically possible, but not plausible:</strong> Same as above</p><p><strong>Biologically plausible, but not actual:</strong> Timelines are long, there’s likely a “secret sauce”</p><p><strong>Biologically actual:</strong> Timelines are short, there’s likely no “secret sauce”</p><hr class="dividerBlock"/><p>In cases where evolution could not invent backprop anyway, it’s hard to compare things. That is consistent both with backprop not being the right way to go and with it being better than whatever evolution did. </p><p>It might be objected that this question doesn’t really matter, since <em>if</em> neuroscientists found out that the brain does backprop, they would not thereby have created any new algorithm -- but merely given stronger evidence for the workability of previous algorithms. Deep learning researchers wouldn’t find this any more useful than Usain Bolt would find it useful to know that his starting stance during the sprint countdown is optimal: he’s been using it for years anyway, and is mostly just eager to go back to the gym.</p><p>However, this argument seems mistaken. </p><p>On the one hand, just because it’s not useful to deep learning practitioners does not mean it’s not useful to others trying to estimate the timelines of technological development (such as policy-makers or charitable foundations).</p><p>On the other hand, I think this knowledge <em>is</em> very practically useful for deep learning practitioners. According to my current models, the field seems unique in combining the following features:</p><ul><li>Long iteration loops (on the order of GPU-weeks to GPU-years) for testing new ideas. </li><li>High dependence of performance on hyperparameters, such that the right algorithm with slightly off hyperparameters will not work <em>at all.</em></li><li>High dependence of performance on the amount of compute accessible, such that the differences between enough and almost enough are step-like, or qualitative rather than quantitative. Too little compute and the algorithm just doesn’t work <em>at all.</em></li><li>Lack of a unified set of first principles for understanding the problems, and instead a collection of effective heuristics.</li></ul><p>This is an environment where it is critically important to develop strong priors on what <em>should </em>work, and to stick with those in the face of countless fruitless tests. Indeed, LeCun, Hinton and Bengio seem to have persevered for decades before the AI community stopped thinking they were crazy. (This is similar in some interesting ways to the state of astronomy and physics before Newton. I’ve blogged about this before <u><a href="https://jacoblagerros.wordpress.com/">here</a></u>.) There’s an asymmetry such that even though training a very powerful architecture can be quick (on the order of a GPU-day), iterating over architectures to figure out which ones to train fully in the first place can be incredibly costly.
As such, knowing whether gradient descent with backprop is or is not the way to go would enable more efficient allocation of research time (though mostly so in case backprop is <em>not</em> the way to go, as the majority of current researchers assume it anyway).</p><h1>Appendix: Brief theoretical background</h1><p>This section describes what backpropagation is, why neuroscientists have claimed it is implausible, and why some deep learning researchers think those neuroscientists are wrong. The latter arguments are basically summarised from <u><a href="https://www.youtube.com/watch?v=VIRCybGgHts">this talk by Hinton</a></u>.</p><p>Multi-layer networks with access to an error signal face the so-called “credit assignment problem”. The error of the computation will only be available at the output: a child pronouncing a word erroneously, a rodent tasting an unexpectedly nauseating liquid, a monkey mistaking a stick for a snake. However, in order for the network to improve its representations and avoid making the same mistake in the future, it has to know which representations to “blame” for the mistake. Is the monkey too prone to think long things are snakes? Or is it bad at discriminating the textures of wood and skin? Or is it bad at telling eyes from eye-sized bumps? And so forth. This problem is exacerbated by the fact that neural network models often have tens or hundreds of thousands of parameters, not to mention the human brain, which is estimated to have on the order of 10<sup>14</sup> synapses. Backpropagation proposes to solve this problem by observing that the maths of gradient descent work out such that one can essentially send the error signal from the output, back through the network towards the input, modulating it by the strength of the connections along the way. (A complementary perspective on backprop is that it is just an efficient way of computing derivatives in large computational graphs; see e.g. <u><a href="http://colah.github.io/posts/2015-08-Backprop/">Olah, 2015</a></u>).</p><p>Now why do some neuroscientists have a problem with this?</p><h2>Objection 1: </h2><p>Most learning in the brain is unsupervised, without any error signal similar to those used in supervised learning.</p><p><strong>Hinton&#x27;s reply:</strong></p><p>There are at least three ways of doing backpropagation without an external supervision signal: </p><p>1. <strong>Try to reconstruct the original input (using e.g. auto-encoders), and thereby develop representations sensitive to the statistics of the input domain </strong></p><p>2. <strong>Use the broader context of the input to train local features</strong></p><p><em>For example, in the sentence “She scromed him with the frying pan”, we can infer that the sentence as a whole doesn’t sound very pleasant, and use that to update our representation of the novel word “scrom” </em></p><p>3. <strong>Learn a generative model that assigns high probability to the input (e.g. using variational auto-encoders or the wake-sleep algorithm from the 1990s)</strong></p><p>Bengio and colleagues (<u><a href="https://www.youtube.com/watch?v=FhRW77rZUS8">2017</a></u>) have also done interesting work on this, partly reviving energy-minimising Hopfield networks from the 1980s.</p><h2>Objection 2:</h2><p>Neurons communicate using binary spikes, rather than real values (this was among the earliest objections to backprop).</p><p><strong>Hinton&#x27;s reply: </strong></p><p>First, one can just send spikes stochastically and use the expected spike rate (e.g.
with a Poisson rate, which is somewhat close to what real neurons do, although there are important differences; see e.g. <u><a href="https://www.nature.com/articles/nn1790">Ma et al., 2006</a></u>; <u><a href="https://www.cs.cmu.edu/afs/cs/academic/class/15883-f15/readings/pouget-2003.pdf">Pouget et al. 2003</a></u>). </p><p>Second, this might make evolutionary sense, as the stochasticity acts as a regularising mechanism making the network more robust to overfitting. This behaviour is in fact where Hinton got the idea for the drop-out algorithm (which has been very popular, though it recently seems to have been largely replaced by batch normalisation). </p><h2>Objection 3:</h2><p>Single neurons cannot represent two distinct kinds of quantities, as would be required to do backprop (the presence of features and gradients for training).</p><p><strong>Hinton&#x27;s reply:</strong></p><p>This is in fact possible. One can use the temporal derivative of the neuronal activity to represent gradients. </p><p>(There is interesting neuropsychological evidence supporting the idea that the temporal derivative of a neuron can <em>not</em> be used to represent changes in that feature, and that different populations of neurons are required to represent the presence and the change of a feature. Patients with certain brain damage seem able to recognise that a moving car occupies different locations at two points in time, without being able to ever detect the car changing position.)</p><h2>Objection 4: </h2><p>Cortical connections only transmit information in one direction (from soma to synapse), and the kinds of backprojections that exist are far from the perfectly symmetric ones used for backprop.</p><p><strong>Hinton&#x27;s reply: </strong></p><p>This led him to abandon the idea that the brain could do backpropagation for a decade, until “a miracle appeared”. Lillicrap and colleagues at DeepMind (<u><a href="https://www.nature.com/articles/ncomms13276">2016</a></u>) found that a network propagating gradients back through <em>random</em> and <em>fixed</em> feedback weights in the hidden layer can match the performance of one using ordinary backprop, given a mechanism for normalisation and under the assumption that the weights preserve the sign of the gradients. This is a remarkable and surprising result, and indicates that backprop is still poorly understood. (See also follow-up work by <u><a href="https://arxiv.org/pdf/1510.05067.pdf">Liao et al., 2016</a></u>).</p><hr class="dividerBlock"/><p>[1] One possible argument for this is that in a large number of plausible worlds, if 1) is true and conceptual advances are necessary, then building superintelligence will turn into an engineering problem once those advances have been made. Hence 1) requires strictly more resources than 2). </p><h1>Discussion questions</h1><p>I&#x27;d encourage discussion on:</p><p>Whether the brain does backprop (object-level discussion on the work of Lillicrap, Hinton, Bengio, Liao and others)?</p><p>Whether it&#x27;s actually important for the secret sauce question to know whether the brain does backprop?</p><p>To keep things focused and manageable, it seems reasonable to discourage discussion of what <em>other</em> secret sauces there might be.</p> jacobjacob QWyYcjrXASQuRHqC5 2018-03-09T22:13:05.432Z The Copernican Revolution from the Inside https://www.lesswrong.com/posts/JAAHjm4iZ2j5Exfo2/the-copernican-revolution-from-the-inside <p>[Crossposted from <a href="https://jacoblagerros.wordpress.com/2018/03/09/brains-and-backprop-a-key-timeline-crux/">my blog</a>]</p><p>The Copernican revolution was a pivotal event in the history of science.
Yet I believe that the lessons most often taught from this period are largely historically inaccurate and that the most important lessons are basically <em>not taught at all </em>[1]. As it turns out, the history of the Copernican revolution carries important and surprising lessons about rationality -- about what it is and is not like to figure out how the world actually works. Also, it’s relevant to deep learning, but it’ll take me about 5000 words on renaissance astronomy to make that point.</p><p>I used to view the Copernican revolution as an epic triumph of reason over superstition, of open science over closed dogma. Basically, things went as follows: Copernicus figured out that the sun rather than the earth is at the center of our planetary system. This theory immediately made sense of the available data, undermining its contorted predecessors with dazzling elegance. Yet its adoption was delayed by the Catholic Church fighting tooth and claw to keep the truth at bay. Eventually, with the emergence of Newton’s work and the dawn of the Enlightenment, heliocentrism became undeniable and its adoption inevitable [2].</p><p>This view is inaccurate. Copernicus’s system was <em>not</em> immediately superior. It was rejected by many people who were <em>not</em> puppets of the Church. And among those who did accept it, better fit to the data was <em>not</em> a main reason. What did in fact happen will become clear in a moment. But in reading that, I’d like to prompt you to consider the events from a very particular vantage point: namely what they would be like <em>from the inside</em>. Ask yourself not what these events seem like for a millennial with the overpowered benefit of historical hindsight, but for a Prussian astronomer, an English nobleman or a Dominican priest.</p><p>More precisely, there are two key questions here.</p><p>First, if you lived in the time of the Copernican revolution, would you have accepted heliocentrism? I don’t mean this as a social question, regarding whether you would have had the courage and resources to stand up to the immensely powerful Catholic Church. Rather, this is an epistemic question: based on the evidence and arguments available to you, would you have accepted heliocentrism? For most of us, I think the answer is unfortunately, emphatically, and surprisingly, <em>no</em>. The more I’ve read about the Copernican revolution, the less I’ve viewed it as a key insight followed by a social struggle. Instead I now view it as a complete mess: of inconsistent data, idiosyncratic mysticism, correct arguments, equally convincing arguments <em>that were wrong</em>, and various social and religious struggles thrown in<em> as well</em>. It seems to me an incredibly valuable exercise to try and feel this mess from the inside, in order to gain a sense of what intellectual progress, historically, has actually been like. Hence a key reason for writing this post is not to provide any clear answers -- although I will make some tentative suggestions -- but to provoke a legitimate sense of confusion.</p><p>If things were that chaotic, then this raises the second question. How should you develop intellectually, in order to become the kind of person who would have accepted heliocentrism during the Copernican revolution? Which intellectual habits, if any, unite heliocentric thinkers like Copernicus, Kepler, Galileo and Descartes, and separate them from thinkers like Ptolemy and Tycho? Once again, my answer will be tentative and limited.
But my questions, on the other hand, are arguably the right ones.</p><h2>What happened</h2><p>My view of the Copernican revolution used to be that when people finally switched to the heliocentric model, <em>something clicked</em>. The data was suddenly predictable and understandable. Something like how Andrew Wiles describes his experience of doing mathematics:</p><p>“[...] in terms of entering a dark mansion. You go into the first room and it&#x27;s dark, completely dark. You stumble around, bumping into the furniture. Gradually, you learn where each piece of furniture is. And finally, after six months or so, you find the light switch and turn it on. Suddenly, it&#x27;s all illuminated and you can see exactly where you were.”</p><p>However, this is most certainly <em>not</em> how things appeared at the time. Let’s start at the beginning.</p><h3>1. Scholasticism</h3><p>The dominant medieval theory of physics, and by extension astronomy, was Scholasticism, a combination of Aristotelian physics and Christian theology. Scholasticism was a geocentric view. It placed the earth firmly at the center of the universe, and surrounded it with a series of concentric, rotating “crystalline spheres”, to which the celestial bodies were attached.</p><h3>2. Ptolemy</h3><p>Ptolemy of Alexandria provided the mathematical foundation for geocentrism, around 100 AD. He wanted to explain two problematic observations. First, the planets appear to move at different speeds at different times, contrary to the Aristotelian thesis that they should move with a constant motion. Second, some planets, like Mars, occasionally seem to briefly move backwards in their paths before returning to their regular orbit. Like this:</p><p><img src="http://img.youtube.com/vi/5-bJGzLAq58/0.jpg" class="draft-inline-image" alt="Mars Retrograde Motion"/> </p><p><a href="http://www.youtube.com/watch?v=5-bJGzLAq58">Link to 7-second video.</a></p><p>In order to explain these phenomena, Ptolemy introduced the geometric tools of equants and epicycles. He placed the earth slightly off the center of the planetary orbits, had the planets themselves orbit in little mini-cycles -- so-called “epicycles” -- along their original orbit, and introduced another off-center point, called the equant, in relation to which the motions of the planets are uniform, and which Ptolemy also claimed “controlled” the speed of the planets along their larger orbits. Like this:</p><p><img src="http://res.cloudinary.com/dq3pms5lt/image/upload/v1509529514/Circles_eqxajv.png" class="draft-inline-image" alt=""/> [3]</p><p>Here’s how these additions make sense of retrograde motion [4]:</p><p> <img src="https://img.youtube.com/vi/ubBcO2CpOMA/0.jpg" class="draft-inline-image" alt="MarsRetrograde"/></p><p><a href="https://www.youtube.com/watch?v=ubBcO2CpOMA">Link to 9-second video.</a></p><p>The ability of the Ptolemaic system to account for these phenomena, predicting planetary positions to within a few degrees (Brown, 2016), was a key contributor to its widespread popularity. In fact, the Ptolemaic model is so good that it’s still being used to generate celestial motions in planetariums (Wilson, 2000).</p><h3>3. Copernicus</h3><p>Copernicus published his heliocentric theory while on his deathbed, in 1543. It retained the circular orbits. More importantly, it of course placed the sun at the centre of the universe and proposed that the earth rotates around its own axis.
Copernicus was keen to get rid of Ptolemy’s equants, which he abhorred, and instead introduced the notion of an epicyclet (which, to be fair, is kind of just like an equant with its own mini-orbit) [5]. Ptolemy’s system had required huge epicycles, and Copernicus was able to substantially reduce their size.</p><p>Retrograde motion falls out of his theory like this:</p><p><img src="https://img.youtube.com/vi/8c_xWokvZz4/0.jpg" class="draft-inline-image" alt=""/> </p><p><a href="https://www.youtube.com/watch?v=8c_xWokvZz4">Link to 9-second video.</a></p><p>In order to get the actual motion of the planets correct, both Ptolemy and Copernicus had to bolster their models with many more epicycles, and epicycles upon epicycles, than shown in the above figure and video. Copernicus even considered introducing an epicyclepicyclet -- “an epicyclet whose center was carried round by an epicycle, whose center in turn revolved on the circumference of a deferent concentric with the sun as the center of the universe”... (Complete Dictionary of Scientific Biography, 2008).</p><p>Pondering his creation, Copernicus concluded an early manuscript outlining his theory thus: “Mercury runs on seven circles in all, Venus on five, the earth on three with the moon around it on four, and finally Mars, Jupiter, and Saturn on five each. Thus 34 circles are enough to explain the whole structure of the universe and the entire ballet of the planets” (MacLachlan &amp; Gingerich, 2005).</p><p>These inventions might appear to be remarkably awkward -- if not ingenious -- ways of making a flawed system fit the observational data. There is however quite an elegant reason why they worked so well: they form a primitive version of Fourier analysis, a modern technique for function approximation. Thus, in the constantly expanding machinery of epicycles and epicyclets, Ptolemy and Copernicus had gotten their hands on a powerful computational tool, which would in fact have allowed them to approximate orbits of a very large number of shapes, including squares and triangles (Hanson, 1960)!</p><p>Despite these geometric acrobatics, <em>Copernicus’ theory did not fit the available data better than Ptolemy’s</em>. In the second half of the 16th century, renowned imperial astronomer Tycho Brahe produced the most rigorous astronomical observations to date -- and found that in some places they fit Copernicus’ model even worse than Ptolemy’s (Gingerich, 1973, 1975).</p><p>This point seems to have been recognized clearly by enlightenment scholars, many of whom instead chose to praise the increased simplicity and coherence of the Copernican system. However, as just described, it is unclear whether it even offered any such improvements. As Kuhn put it, Copernicus’s changes seem “great, yet strangely small”, when considering the complexity of the final system (Kuhn, 1957). The mathematician and historian Otto Neugebauer writes:</p><p>“Modern historians, making ample use of the advantage of hindsight, stress the revolutionary significance of the heliocentric system and the simplifications it had introduced. In fact, the actual computation of planetary positions follows exactly the ancient pattern and the results are the same. [...] Had it not been for Tycho Brahe and Kepler, the Copernican system would have contributed to the perpetuation of the Ptolemaic system in a slightly more complicated form but more pleasing to philosophical minds.” (Neugebauer, 1968)</p><h3>4.
Kepler and Galileo</h3><p>At the turn of the 17th century Kepler, armed with Tycho Brahe’s unprecedentedly rigorous data, revised Copernicus’ theory and introduced elliptical orbits [6]. He also stopped insisting that the planets follow uniform motions, allowing him to discard the cumbersome epicyclical machinery.</p><p>Around the same time Galileo invented the telescope. Upon examining the celestial bodies, he found irregularities that seemed to contradict the Scholastic view of the heavens as a perfect, unchanging realm. There were spots on the sun...</p><p><img src="http://res.cloudinary.com/dq3pms5lt/image/upload/v1509529651/Circles2_oaf80m.png" class="draft-inline-image" alt=""/> </p><p>...craters and mountains on the moon…</p><p><img src="http://res.cloudinary.com/dq3pms5lt/image/upload/v1509529652/Circles3_pekgmf.png" class="draft-inline-image" alt=""/> </p><p>...and four new moons orbiting Jupiter.</p><p><img src="http://res.cloudinary.com/dq3pms5lt/image/upload/v1509529651/Circles4_crmrw1.png" class="draft-inline-image" alt=""/> </p><p>Spurred on by his observations, Galileo would soon begin his ardent defense of heliocentrism. Despite the innovations of Galileo and Kepler, the path ahead wasn’t straightforward.</p><p>Galileo focused his arguments on Copernicus’ system, <em>not</em> Kepler’s. And in doing so he faced not only the problems with fitting positional planet data, which Kepler had solved, but also theoretical objections, to which Kepler was still vulnerable.</p><p>Consider the tower argument. This is a simple thought experiment: if you drop an object from a tower, it lands right below where you dropped it. But if the earth were moving, shouldn’t it instead land some distance away from where you dropped it?</p><p>You might feel shocked upon reading the argument, in the same way you might feel shocked by your grandpa making bigoted remarks at the Christmas table, or by a friend trying to recruit you to a pyramid scheme. Just <em>writing</em> it, I feel like I’m penning some kind of crackpot, flat-earth polemic. But if the reason is “well obviously it doesn’t fall like that… something something Newton…” then remind yourself of the fact that Isaac Newton <em>had not yet been born</em>. The dominant physical and cosmological theory of the day was still Aristotle’s. If your answer to the tower argument in any way has to invoke Newton, then you likely wouldn’t have been able to answer it in 1632.</p><p>Did you manage to find some other way of accounting for objects falling down in a straight line from the tower? You might want to take a few minutes to think about it.</p><p>[...time for thinking…]</p><p>Now if at the end of thinking you convinced yourself of yadda yadda straight line physics yadda yadda you were unfortunately mistaken. The tower argument is <em>correct</em>. Objects <em>do</em> drift when falling, due to the earth’s rotation -- but at a rate which is imperceptible for most plausible tower heights. This is known as the “Coriolis effect”, and wasn’t properly understood mathematically until the 19th century.</p><p>In addition a fair number of astronomical observations seemed to qualitatively contradict heliocentrism -- by leaving out predicted phenomena -- as opposed to just providing quantitative discrepancies in planetary positions. Consider the stellar parallax. A “parallax” is the effect you might have noticed while looking out of a car window, and seeing how things that are closer to you seem to fly by at a faster pace than things farther away. 
<p>In addition, a fair number of astronomical observations seemed to qualitatively contradict heliocentrism -- by leaving out predicted phenomena -- as opposed to just providing quantitative discrepancies in planetary positions. Consider the stellar parallax. A “parallax” is the effect you might have noticed while looking out of a car window: things that are closer to you seem to fly by at a faster pace than things farther away. Like this:</p><p><img src="https://img.youtube.com/vi/sGApLQTV-sI/0.jpg" class="draft-inline-image" alt=""/> </p><p><a href="https://www.youtube.com/watch?v=sGApLQTV-sI">Link to 15-second video.</a></p><p>If the earth orbits the sun, something similar should be visible in the night sky, with nearby stars changing their position substantially in relation to more distant stars. Like this:</p><p><img src="https://img.youtube.com/vi/_D7sbn27arE/0.jpg" class="draft-inline-image" alt=""/> </p><p><a href="https://www.youtube.com/watch?v=_D7sbn27arE">Link to 22-second video.</a></p><p>No one successfully detected a stellar parallax during the Renaissance. This included Tycho, who as mentioned above had gathered the most accurate and exhaustive observations to date. His conclusion was that either the distant stars were so distant that a parallax wasn’t detectable using his instruments -- which would entail that space was mostly an unfathomably vast void -- or there simply was no stellar parallax to be detected.</p><p>Once again, with the benefit of hindsight it is easy to arbitrate this debate. Space just is really, really, really vast. But it is worth noticing here the similarity to Russell’s teapot-style arguments [7]. On two points in a row, the defenders of heliocentrism have been pushed into unfalsifiable territory:</p><p><strong>Heliocentrist:</strong> “There <em>is</em> drift when objects fall from towers -- we just can’t measure it!”</p><p><strong>Geocentrist:</strong> “But provide a phenomenon we <em>can </em>measure, then.”</p><p><strong>Heliocentrist:</strong> “Well, according to my recent calculations, a stellar parallax should be observable under these conditions…”</p><p><strong>Geocentrist: </strong>“But Tycho’s data -- the best astronomical data we’ve ever had -- fails to find any semblance of a parallax. Even Tycho himself thinks the idea is crazy.”</p><p><strong>Heliocentrist:</strong> “The fact that Tycho couldn’t detect it doesn’t mean it’s not there! The stars could be too far away for it to be detected. And things aren’t absurd just because prominent scientists say they’re absurd.”</p><p><strong>Geocentrist</strong>: “Hold on… not only does your new theory contradict all of established physics, but whenever you’re asked for a way to verify it you propose a phenomenon that’s barely testable… and when the tests come out negative you blame the tests and not the theory!”</p><p><strong>Heliocentrist:</strong> “Okay okay, I’ll give you something else… heliocentrism predicts that Venus will sometimes be on the same side of the sun as Earth, and sometimes on the opposite side...”</p><p><strong>Geocentrist:</strong> “Yes?”</p><p><strong>Heliocentrist:</strong> “This means that Venus should appear to vary in size… by...” and the heliocentrist scribbles in his notebook “… as much as… six times.”</p><p>And this prediction of the change in size of Venus was indeed made by proponents of heliocentrism.</p><p>And, once again, although today we know this phenomenon does in fact appear, the available observations of the 17th century failed to detect it.</p>
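<p>Where does the heliocentrist’s “six times” come from? A quick sketch (again my own illustration in Python, assuming circular, coplanar orbits; the values are modern ones -- the relative distances needed for the ratio were already roughly known at the time, but the absolute sizes were not):</p><pre><code>
import math

AU_KM = 1.496e8          # one astronomical unit, in km
VENUS_ORBIT_AU = 0.723   # Venus' orbital radius, assuming a circular orbit
VENUS_DIAMETER_KM = 12_104

near_km = (1.0 - VENUS_ORBIT_AU) * AU_KM  # Earth-Venus distance with Venus between us and the sun
far_km = (1.0 + VENUS_ORBIT_AU) * AU_KM   # ... with Venus on the far side of the sun

def angular_size_arcsec(diameter_km, distance_km):
    return math.degrees(diameter_km / distance_km) * 3600  # small-angle approximation

print(f"predicted size ratio: {far_km / near_km:.1f}x")  # roughly 6x, as predicted
print(f"largest apparent size:  {angular_size_arcsec(VENUS_DIAMETER_KM, near_km):.0f} arcsec")
print(f"smallest apparent size: {angular_size_arcsec(VENUS_DIAMETER_KM, far_km):.0f} arcsec")
# Even the largest value is only around the ~60 arcsec resolution of the naked eye,
# which is part of why the variation was so hard to confirm observationally.
</code></pre>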
<p>This might all seem messy, complicated, disappointing. If <em>this</em> is what the history of intellectual progress actually looks like, how can we ever hope to make deliberate progress in the direction of truth?</p><p>It might be helpful to examine a few thinkers -- Copernicus, Kepler, Descartes, Galileo -- who actually accepted heliocentrism, and try to better understand their reasons for doing so.</p><p>Little is known about the intellectual development and motivations of Copernicus, as the biography written about him by his sole pupil has been lost. Nonetheless, a tentative suggestion is that he developed rigorous technical knowledge across many fields and found himself in environments which were, if not iconoclastic, at least exceptionally open-minded. According to historian Paul Knoll:</p><p>“[The arts faculty at the University of Cracow, where Copernicus studied] held the threefold promise of mathematics and astronomy which were abreast of any developments elsewhere in Europe, of philosophical questioning which undermined the foundations of much that had been characteristically medieval, and of a critical humanistic attitude which was transforming older cultural and educational values” (Knoll, 1975)</p><p>Later, when studying law at the University of Bologna, Copernicus stayed with the astronomy professor Domenico Maria Novara, described as “a mind that dared to challenge the authority of [Ptolemy], the most eminent ancient writer in his chosen fields of study” (Rabin, 2015). Copernicus was also a polymath, who studied law in addition to mathematics and astronomy, and developed an early theory of inflation. His pupil Rheticus was an excellent mathematician, and provided crucial support in helping Copernicus complete his final, major work.</p><p>Beyond that, some authors claim that Copernicus was influenced by a kind of neoplatonism that regarded the sun as a semi-divine entity, being the source of life and energy -- which made him more content to place it at the centre of the universe (Kuhn, 1957). These claims are however disputed (Rabin, 2015).</p><p>These conditions -- technical skill, interdisciplinary knowledge and open-mindedness -- seem necessary for Copernicus’ development, but they also feel glaringly insufficient.</p><p>As for Kepler and Descartes, their acceptance of heliocentrism was not motivated by careful consideration of the available data, but by commitments to larger philosophical projects. Kepler is known as a mathematician and astronomer, but in his own day he insisted that he be regarded as a philosopher, concerned with understanding the ultimate nature of the cosmos (Di Liscia, 2017). He did have access to better data -- Tycho’s observations -- than most people before him, and he pored over it with tremendous care. Nonetheless, his preference for elliptical over circular orbits was equally influenced by mystical views regarding the basic geometric harmony of the universe, in which the sun provided the primary source of motive force (Ladyman, 2001; Di Liscia, 2017; Westman, 2001).</p><p>Something similar was true of Descartes, although his underlying philosophical agenda was quite different. 
A striking example of these commitments is that both Kepler and Descartes argued that a heliocentric world-view was self-evident, in the sense of being derivable from first principles without recourse to empirical observation (Frankfurt, 1999).</p><p>Beyond that, I know too little about their respective views to be able to offer any more detailed, mechanistic account of why they preferred heliocentrism.</p><p>Galileo -- Copernicus’ bulldog -- is a confusing figure as well. Just like Copernicus, Kepler and Descartes, Galileo was not purely guided by careful experiment and analysis of the data -- despite the weight popular history often places upon these characteristics of his. As Einstein writes in his foreword to a modern edition of Galileo’s <em>Dialogue</em>:</p><p>“It has often been maintained that Galileo became the father of modern science by replacing the speculative, deductive method with the empirical, experimental method. I believe, however, that this interpretation would not stand close scrutiny. There is no empirical method without speculative concepts and systems; and there is no speculative thinking whose concepts do not reveal, on closer investigation, the empirical material from which they stem.” (Einstein, 2001)</p><p>For Galileo, this speculative system consisted in replacing the four Aristotelian elements with a single, unified theory of matter, and replacing the view of nature as a teleological process with a view of it as a deterministic, mechanistically intelligible process. Einstein later points out that in some respects this approach was inevitable given the limited experimental methods available to Galileo (for example, he could only measure time intervals longer than a second).</p><p>Galileo was also a man of courage and belligerence. One of his strengths was an absolute refusal to accept arguments from authority without experimental evidence or careful reasoning. It appears as though his belligerence aided him several times in a quite ironic way. Many of the arguments he marshalled against his opponents were either incorrect, or correct but based on incorrect observations. One example is his attempt to derive a theory of tides from the motions of the earth, a project to which he devotes about a fourth of his famous <em>Dialogue</em>. Einstein, again, writes “it was Galileo’s longing for a mechanical proof of the motion of the earth which misled him into formulating a wrong theory of the tides. [These] fascinating arguments [...] would hardly have been accepted as proofs by Galileo, had his temperament not got the better of him”.</p><p>Moreover, Galileo’s observations of sunspots and moon craters weren’t unproblematic. In both cases there is evidence to indicate that he was fooled by optical illusions. And though he <em>was</em> also right about the existence of moons orbiting Jupiter, which contradicted the uniqueness of the earth as the only planet with a moon, what he actually observed seems rather to have been Saturn’s rings (Ladyman, 2001) [8].</p><p>Nonetheless, at this point you might be aching to object that, disregarding inconsistent data, theoretical flaws, failed predictions and an incorrect theory of the tides... surely Galileo’s <em>Dialogue</em> provided other convincing arguments that finally tipped the balance in favour of heliocentrism?</p><p>Alas, history is <em>messy</em>.</p><p>Recall that Galileo defended Copernicus’ system, not Kepler’s, and hence had to deal with its flaws. 
More strikingly, in the above I still haven’t mentioned the existence of a third major theory, rivalling both Ptolemy and Copernicus: Tycho Brahe’s combined geoheliocentric theory. This theory retained the moon and sun in orbit around the earth but placed all the other planets in orbit around the sun.</p><p><img src="http://res.cloudinary.com/dq3pms5lt/image/upload/v1509529651/Circles5_dh26vk.png" class="draft-inline-image" alt=""/> </p><p>Galileo’s <em>Dialogue</em> does not engage with Tycho’s theory <em>at all</em>. One suggested explanation (given by an unknown Wikipedia contributor) is that, assuming Galileo’s theory of the tides, the Ptolemaic and Tychonic systems are identical, and hence it would suffice to rebut the former. <em>But the theory of the tides was wrong.</em></p><p>The Tychonic and Copernican systems, meanwhile, differ observationally mainly in whether we should be able to observe a stellar parallax. And as mentioned above, Tycho’s data had failed to detect one, which he saw as key evidence for his view.</p><p>Eventually though, this historical mess was straightened out, and a crucial experiment arbitrated between Galileo, Tycho and Ptolemy. The German astronomer Friedrich Bessel finally managed to observe a stellar parallax <em>in 1838</em>. About 200 years later. By <em>that point</em>, the Copernican revolution was surely already over -- even the Catholic church had removed Copernicus’ <em>De revolutionibus</em> from its Index of banned books, as it was simply accepted as true (Lakatos &amp; Zahar, 1975).</p><h3>5. Newton</h3><p>At one point Newton also came along -- though Galileo died about a year before he was born. Newton’s marriage of physics and mathematics, which implied Kepler’s laws as a special case, was crucial in demonstrating the viability of heliocentrism. But some thinkers nonetheless did something very right decades before the arrival of the Cambridge genius -- a fact of which he was well aware. For the Copernican revolution might have been completed by Newton, but in the end he still stood on the shoulders of giants.</p><h2>Now what?</h2><p>One purpose of this essay has been to portray an important historical era in a more realistic way than other popular portrayals. I asked two questions at the beginning:</p><ol><li>If you lived in the time of the Copernican revolution, would you have accepted heliocentrism?</li><li>How should you develop intellectually, in order to become the kind of person who would have accepted heliocentrism during the Copernican revolution?</li></ol><p>The preceding section argued that the answer to the first question would quite likely have been no. This section takes a closer look at the second question. I do, however, want to preface these suggestions by saying that I don’t have a good answer to this myself, and suggest you take some time to think of your own answers to these questions. I’d love to hear your thoughts in the comments.</p><h3>What about Ibn ash-Shãtir?</h3><p>There seem to have been some Islamic scholars who beat Copernicus to his own game by a few hundred years. I’d be keen to learn more about their story and intellectual habits.</p><h3>Careful with appearances</h3><p>Geocentrists liked to claim that it certainly seems like the sun orbits the earth, and not vice versa. There is something odd about this. 
Consider the following Wittgenstein anecdote:</p><p>“He [Wittgenstein] once asked me [Anscombe]: ‘Why do people say it is more logical to think that the sun turns around the Earth than Earth rotating around its own axis?’ I answered: ‘I think because it seems as if the sun turns around the Earth.’ ‘Good,’ he said, ‘but how would it have been if it had seemed as if the Earth rotates around its own axis then?’” (Anscombe, 1959)</p><p>This quote hopefully inspired in you a lovely sense of confusion. If it didn’t, try reading it again.</p><p>When I said above that it certainly seems like the sun orbits the earth and not vice versa, what I meant to say was that it certainly <em>seems like</em> it seems like the sun orbits the earth and not vice versa [9].</p><p>There’s a tendency to use the word “seems” in quite a careless fashion. For example, most people might agree that it seems like, if an astronaut were to push a bowling ball into space, it would eventually slow down and stop, because that’s what objects do. At least, most people living prior to the 20th century might have. However, we, <em>and they</em>, <em>already know</em> that this cannot be true. It suffices to think about the difference between pushing a bowling ball over a carpet, or over a clean surface like polished wood, or over ice -- there’s a slippery slope here which, if taken to its logical extreme, should make it seem reasonable that a bowling ball wouldn’t stop in space. A prompt I find useful is to try to understand why the behaviour of the bowling ball in space could not have been any other way, given how it behaves on earth. That is, trying to understand why, if we genuinely thought a bowling ball would slow down in space, this would entail that the universe was impossibly different from the way it actually is.</p><p>Something similar seems true of the feeling that the sun orbits the earth, and this is brought out in the Wittgenstein anecdote. What we think of as “it seems as though the sun orbits the earth” is actually just us carelessly imposing a mechanism upon a completely different sensation, namely the sensation of “celestial objects seeming to move <em>exactly as they would move</em> if the earth orbited the sun and not vice versa”. Whatever it would look like to live in a world where the opposite was true, it certainly wouldn’t look like this.</p><h3>Careful with your reductios</h3><p>Many of the major mistakes made by opponents of heliocentrism involved using reductio ad absurdum arguments without really considering whether the conclusion was absurd enough to actually overturn the original argument. Tycho correctly noted that either there wasn’t a stellar parallax or he couldn’t measure it, but incorrectly took the former as more plausible. Proponents of the tower argument <em>assumed</em> that objects fall down in straight lines without drift, and that any drift would have been perceptible to the naked eye. In both cases, people would just have been better off biting the bullet and accepting the implications of heliocentrism. That, of course, raises the question of which bullets one should bite -- and that question is beyond the scope of this essay.</p><h3>The data is not enough</h3><p>There’s a naïve view of science according to which the scientist first observes all the available data, then formulates a hypothesis that fits it, and finally tries to falsify this new hypothesis by performing new experiments. 
The Copernican revolution teaches us that the relation between data and theory is in fact much more subtle than this.</p><p>A true theory does not have to immediately explain all the data better than its predecessors, and can remain inconsistent with parts of the data for a long time.</p><p>The relation between data and theory is not a one-way shooting range, but an intricate two-way interplay. The data indicates which of our theories are more or less plausible. But our theories also indicate which data is more or less trustworthy [10]. This might seem like a sacrilegious claim to proponents of the naïve view described above: “ignore the data!? That’s just irrational cherry-picking!” Sure, dishonest cherry-picking is bad. Nonetheless, as the Copernican revolution shows, the act of disregarding some data in a principled manner, because it doesn’t conform to strong prior expectations, has been critical to the progress of science [11].</p><p>When Einstein famously remarked “God doesn’t play dice”, he arguably adopted the same kind of mindset. He had built a complex worldview characterised by a certain mathematical law-likeness, and was confident in it to the extent that if quantum mechanics threatened its core principles, then quantum mechanics was wrong -- not him.</p><p>Sometimes, scientists have to be bold -- or arrogant -- enough to trust their priors over the data.</p><h3>...and, finally, deep learning</h3><p>It seems apt to notice some similarities between the state of astronomy during the Copernican revolution and the current state of deep learning research.</p><p>Both are nascent fields, without a unifying theory that can account for the phenomena from first principles, like Newtonian physics eventually did for astronomy.</p><p>Both have seen researchers cling to their models for decades without encouraging data: many of the most successful current deep learning techniques (conv nets, recurrent nets and LSTMs, gradient descent, ...) were invented in the 20th century, but didn’t produce spectacular results until decades later when sufficient computing power became available. It would be interesting to find out if people like Geoffrey Hinton and Yann LeCun share intellectual habits with people like Copernicus and Galileo.</p><p>Finally, I’m particularly struck by the superficial similarities between the way Ptolemy and Copernicus happened upon a general, overpowered tool for function approximation (Fourier analysis) that enabled them to misleadingly gerrymander false theories around the data, and the way modern ML has been criticized as an inscrutable heap of linear algebra and super-efficient GPUs. I haven’t explored whether these similarities go any deeper, but one implication seems to be that the power and versatility of deep learning might allow suboptimal architectures to perform deceptively well (just like the power of epicycle-multiplication kept geocentrism alive) and hence distract us from uncovering the actual architectures underlying cognition and intelligence.</p><hr class="dividerBlock"/><p>Crossposted to my blog <a href="https://jacoblagerros.wordpress.com/2017/10/26/the-copernican-revolution-from-the-inside/">here.</a></p><hr class="dividerBlock"/><h2>Footnotes</h2><p>[1] They of course <em>are</em> taught, because that is how I learnt about them. But this was in a university course on the philosophy of science. The story of Galileo is probably taught in most middle schools [no source, my own hunch]. 
But only about 0.5% of US college students major in philosophy [<a href="http://dailynous.com/2016/04/18/philosophy-degrees-how-many-are-awarded-and-to-whom/">source</a>], and I’d guesstimate that something like a third of them take classes in philosophy of science.</p><p>[2] This last step is kind of a black box. My model was something like “a true theory was around for long enough, and gained enough support, that it was eventually adopted”. This sounds quite romantic, if not magical. It’s unclear exactly <em>how</em> this happened, and in particular what strategic mistakes the Church made that allowed it to happen.</p><p>[3] Figure credit of the Polaris Institute of Iowa State University, which provides a great tutorial on medieval and renaissance astronomy <a href="http://www.polaris.iastate.edu/EveningStar/Unit2/unit2_sub1.htm">here</a>.</p><p>[4] I spent way too long trying to understand this, but <a href="http://www.keplersdiscovery.com/Equant.html">this animation</a> was helpful.</p><p>[5] I spent two hours trying to understand the geometry of this and I won’t drag you down that rabbit-hole, but if you’re keen to explore yourself, check out these links: [<a href="https://books.google.co.uk/books?id=_NZsb6B8ATkC&amp;pg=PA19&amp;lpg=PA19&amp;dq=equants+vs+%22epicyclet%22&amp;source=bl&amp;ots=Z4fofMO5Fl&amp;sig=dt26e43FEGJ_oo4qD89YDosx3h0&amp;hl=en&amp;sa=X&amp;ved=0ahUKEwi78IWf0IzXAhWBfRoKHVvpCUIQ6AEILDAB#v=onepage&amp;q=equants%20vs%20%22epicyclet%22&amp;f=false">1</a>], [<a href="https://books.google.co.uk/books?id=RbT1CAAAQBAJ&amp;pg=PA78&amp;lpg=PA78&amp;dq=equants+vs+%22epicyclet%22&amp;source=bl&amp;ots=evHCiHDudO&amp;sig=Ft9h8RpSK88KYebtujfdmtT2slY&amp;hl=en&amp;sa=X&amp;ved=0ahUKEwi78IWf0IzXAhWBfRoKHVvpCUIQ6AEIKjAA#v=onepage&amp;q=equants%20vs%20%22epicyclet%22&amp;f=false">2</a>].</p><p>[6] It is, however, a common mistake to imagine these as clearly elongated ellipses: their eccentricity is very small. For most practical purposes apart from measurement and prediction they look like circles (Price, 1957).</p><p>[7] Russell’s teapot is a skeptical thought experiment intended to reveal the absurdity of unfalsifiable views: one postulates that there’s a teapot orbiting the sun somewhere between Earth and Mars, too small to be detectable by any instrument, but nonetheless insists that it really is there.</p><p>[8] And to think my philosopher friends think Gettier problems are nonsense!</p><p>[9] There’s of course a sense of mysticism in this, which -- like the rest of Wittgenstein’s mysticism -- I don’t like. Mysticism is mostly just a clever way of scoring social understanding-the-world-points without actually understanding the world. It might be that heliocentrism and geocentrism are genuinely indistinguishable from our vantage point, in which case the confusion here is just a linguistic sleight-of-hand, rather than an actual oddity in how we perceive the world. But this doesn’t seem correct. After all, we were able to figure out heliocentrism <em>from our</em> <em>vantage point</em>, indicating that heliocentrism is distinguishable from geocentrism from our vantage point.</p><p>[10] In Bayesian terms, your posterior is determined by both your likelihoods and your priors.</p><p>[11] And is core to rationality itself, on the Bayesian view.</p><h2>References</h2><p>Anscombe, E. (1959). <em>An Introduction to Wittgenstein’s Tractatus. </em>pp. 151.</p><p>Brown, M. (2016) “Copernicus’ revolution and Galileo’s vision: our changing view of the universe in pictures”. <em>The Conversation</em>. 
Available online <a href="https://theconversation.com/copernicus-revolution-and-galileos-vision-our-changing-view-of-the-universe-in-pictures-60103">here</a>.</p><p>&quot;Copernicus, Nicholas.&quot; Complete Dictionary of Scientific Biography. Retrieved October 26, 2017 from Encyclopedia.com, <a href="http://www.encyclopedia.com/science/dictionaries-thesauruses-pictures-and-press-releases/copernicus-nicholas">here</a>.</p><p>Di Liscia, D. A. (2017) &quot;Johannes Kepler&quot;. <em>The Stanford Encyclopedia of Philosophy </em>(Fall 2017 Edition), Zalta, E. N. (ed.).</p><p>Einstein, A. (2001). (Foreword) <em><a href="http://www.amazon.com/gp/product/037575766X?ie=UTF8&amp;tag=kenperrott&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=037575766X">Dialogue Concerning the Two Chief World Systems</a></em>: Ptolemaic and Copernican.</p><p>Frankfurt, H. (1999), <em>Necessity, Volition and Love</em>. pp. 40.</p><p>Gingerich, O. J. (1973) “The Copernican Celebration”. <em>Science Year</em>, pp. 266-267.</p><p>Gingerich, O. J. (1975) “‘Crisis’ versus aesthetic in the Copernican revolution”. <em>Vistas in Astronomy</em> 17(1), pp. 85-95.</p><p>Hanson, N. R. (1960) “The Mathematical Power of Epicyclical Astronomy”. <em>Isis</em>, 51(2), pp. 150-158.</p><p>Knoll, P. (1975) “The Arts Faculty at the University of Cracow at the end of the Fifteenth Century”. <em>The Copernican Achievement</em>, Westman, R. S. (ed.)</p><p>Kuhn, T. (1957) <em>The Copernican Revolution</em>. pp. 133.</p><p>Ladyman, J. (2001) <em>Understanding Philosophy of Science</em>. Chapter 4: Revolutions and Rationality.</p><p>Lakatos, I. &amp; Zahar, E. (1975). “Why did Copernicus’s Research Programme Supersede Ptolemy’s?”. <em>The Copernican Achievement</em>, Westman, R. S. (ed.)</p><p>MacLachlan, J. &amp; Gingerich, O. J. (2005) <em>Nicolaus Copernicus: Making the Earth a Planet</em>, pp. 76.</p><p>Neugebauer, O. (1968). “On the Planetary Theory of Copernicus”. <em>Vistas in Astronomy</em>, 10, pp. 103.</p><p>Price, D. J. (1957) “Contra Copernicus: a critical re-estimation of the mathematical Planetary Theory of Copernicus, Ptolemy and Kepler”. <em>Critical Problems in the History of Science</em>, Clagett, M. (ed.).</p><p>Rabin, S. (2015) &quot;Nicolaus Copernicus&quot;. <em>The Stanford Encyclopedia of Philosophy </em>(Fall 2015 Edition), Zalta, E. N. (ed.), available <a href="https://plato.stanford.edu/archives/fall2015/entries/copernicus/">online</a>.</p><p>Westman, R. S. (2001) &quot;Kepler&#x27;s early physical-astrological problematic.&quot; <em>Journal for the History of Astronomy, </em>32, pp. 227-236.</p><p>Wilson, L. A. (2000) “The Ptolemaic Model” in the Polaris Project, Iowa State University. Available online <a href="http://www.polaris.iastate.edu/EveningStar/Unit2/unit2_sub1.htm">here</a>.</p> jacobjacob JAAHjm4iZ2j5Exfo2 2017-11-01T10:51:50.127Z