david_kristoffersson feed - LessWrong 2.0 Reader
david_kristoffersson’s posts and comments on the Effective Altruism Forum
en-us
Comment by David_Kristoffersson on Is this viable physics?
https://lw2.issarice.com/posts/gXL6N4rgyKQyfSJqN/is-this-viable-physics?commentId=u49hB5J4WgFh9QW9o
<blockquote>
<p>That being said, I would bet that one would be able to find other formalisms that are equivalent after kicking down the door...</p>
</blockquote>
<p>At least, we've now hit one limit in the shape of universal computation: No new formalism will be able to do something that couldn't be done with computers. (Unless we're gravely missing something about what's going on in the universe...)</p>
david_kristoffersson · u49hB5J4WgFh9QW9o · 2020-06-25T16:00:09.588Z
Comment by David_Kristoffersson on Good and bad ways to think about downside risks
https://lw2.issarice.com/posts/NdDrh3ZRJvuv7BcL9/good-and-bad-ways-to-think-about-downside-risks?commentId=KpwPmo3pAwS2qGB8m
<blockquote>
<p>When it comes to the downside risk, it's often that there are more unknown unknowns that produce harm than positive unknown unknowns. People are usually biased to overestimate the positive effects and underestimate the negative effects for the known unknowns.</p>
</blockquote>
<p>This seems plausible to me. Would you like to expand on why you think this is the case?</p>
<p>The asymmetry between creation and destruction? (I.e., it's harder to build than it is to destroy.)</p>
david_kristoffersson · KpwPmo3pAwS2qGB8m · 2020-06-12T14:07:00.762Z
Comment by David_Kristoffersson on Good and bad ways to think about downside risks
https://lw2.issarice.com/posts/NdDrh3ZRJvuv7BcL9/good-and-bad-ways-to-think-about-downside-risks?commentId=btRgvD5NTBBTFAtid
<p>Very good point! The effect of not taking an action depends on what the counterfactual is: what would happen otherwise/anyway. Maybe the article should note this.</p>
david_kristoffersson · btRgvD5NTBBTFAtid · 2020-06-11T14:18:40.334Z
Comment by David_Kristoffersson on mind viruses about body viruses
https://lw2.issarice.com/posts/xhYE36x4bzjoCTAf5/mind-viruses-about-body-viruses?commentId=AyDFjCaAWticB9dkf
<p>Excellent comment, thank you! Don't let the perfect be the enemy of the good if you're running from an exponential growth curve.</p>
david_kristoffersson · AyDFjCaAWticB9dkf · 2020-04-03T17:28:54.075Z
Comment by David_Kristoffersson on The recent NeurIPS call for papers requires authors to include a statement about the potential broader impact of their work
https://lw2.issarice.com/posts/3xRyt4KnmTMcpgfdX/the-recent-neurips-call-for-papers-requires-authors-to?commentId=gdSCoz6sWCAc8PJHJ
<p>Looks promising to me. <a href="https://forum.effectivealtruism.org/posts/XCwNigouP88qhhei2/differential-progress-intellectual-progress-technological">Technological development isn't by default good</a>.</p>
<p>Though I agree with the other commenters that this could fail in various ways. For one thing, if a policy like this is introduced without guidance on how to analyze the societal implications, people will think of wildly different things. ML researchers aren't by default going to have the training to analyze societal consequences. (Well, who does? We should develop better tools here.)</p>
david_kristoffersson · gdSCoz6sWCAc8PJHJ · 2020-02-24T13:16:53.628Z
Comment by David_Kristoffersson on Jan Bloch's Impossible War
https://lw2.issarice.com/posts/hmai5Lru5kWXpH7Ju/jan-bloch-s-impossible-war?commentId=gN5LoZr4FC22zLp8T
<p>Or, at least, include a paragraph or a few to summarize it!</p>
david_kristoffersson · gN5LoZr4FC22zLp8T · 2020-02-21T12:13:01.828Z
State Space of X-Risk Trajectories
https://lw2.issarice.com/posts/P5bczkJAAST9KrkRW/state-space-of-x-risk-trajectories
<p><em>Justin Shovelain developed the core ideas in this article and assisted in writing; David Kristoffersson was the lead writer and editor.</em></p>
<p><a href="https://forum.effectivealtruism.org/posts/TCxik4KvTgGzMowP9/state-space-of-x-risk-trajectories"><em>Cross-posted on the EA Forum</em></a></p>
<h1>Abstract</h1>
<p>Currently, people tend to use many key concepts informally when reasoning and forming strategies and policies for existential risks (x-risks). A well-defined formalization and graphical language for paths and choices would help us pin down more exactly what we think and let us see relations and contrasts more easily. We construct a common state space for futures, trajectories, and interventions, and show how these interact. The space gives us a possible beginning of a more precise language for reasoning and communicating about the trajectory of humanity and how different decisions may affect it.</p>
<h1>Introduction</h1>
<p>Understanding the possible trajectories of human civilization and the futures they imply is key to steering development towards safe and beneficial outcomes. The trajectory of human civilization will be highly impacted by the development of advanced technology, such as synthetic biology, nanotechnology, or <a href="https://en.wikipedia.org/wiki/Artificial_general_intelligence">artificial general intelligence (AGI)</a>. Reducing existential risks means intervening on the trajectory civilization takes. To identify effective interventions, we need to be able to answer questions like: How close are we to good and bad outcomes? How probable is each? What levers in the system could allow us to shape its trajectory?</p>
<p>Previous work has modeled some aspects of existential risk trajectories. For example, various surveys and extrapolations have been made to forecast AGI timelines [1; 2; 3; 4; 5], and scenario and risk modeling have provided frameworks for some aspects of risks and interventions [6; 7; 8]. The paper <em>Long Term Trajectories of Human Civilization</em> [9] offers one form of visualization of civilizational trajectories. However, so far none of these works have defined a unified graphical framework for futures, trajectories, and interventions. Without developing more and better intellectual tools to examine the possible trajectories of the world and how to shape them, we are likely to remain confused about many of the requirements for reaching a flourishing future, causing us to take less effective actions and leaving us ill-prepared for the events ahead of us.</p>
<p>We construct a <em><a href="https://en.wikipedia.org/wiki/State_space_(physics)">state space</a> model</em> of existential risk, where our closeness to stably good and bad outcomes is represented as coordinates, ways the world could develop are paths in this space, our actions change the shapes of these paths, and the likelihood of good or bad outcomes is based on how many paths intercept those outcomes. We show how this framework provides new theoretical foundations to guide the search for interventions that can help steer the development of technologies such as AGI in a safe and beneficial direction.</p>
<p>This is part of <a href="https://www.convergenceanalysis.org/">Convergence</a>’s broader efforts to construct new tools for generating, mapping out, testing, and exploring timelines, interventions, and outcomes, and showing how these all interact. The state space of x-risk trajectories could form a cornerstone in this larger framework.</p>
<h1>State Space Model of X-Risk Trajectories</h1>
<p>As humanity develops advanced technology, we want to move closer to beneficial outcomes, stay away from harmful ones, and better understand where we are. In particular, we want to understand where we are in terms of existential risk from advanced technology. Formalization and visual graphs can help us think about this more clearly and effectively. Graphs make relationships between variables clearer, allowing one to take in a lot of information at a glance and to discern patterns such as derivatives, oscillations, and trends. The state space model formalizes x-risk trajectories and gives us the power of visual graphs. To construct the state space, we will define x-risk trajectories geometrically. Thus, we need to formalize position, distance, and trajectories in terms of existential risk, and we need to incorporate uncertainty.</p>
<p>In the state space model of existential risk, current and future states of the world are points, possible progressions through time are trajectories through these points, and stably good or bad futures happen when any coordinate drops below zero (i.e. absorbing states).</p>
<p>Stable futures are futures of either existential catastrophe or existential safety. Stably good futures (compare to Bostrom’s ‘OK outcomes’, as in [10]) are those where society has achieved enough wisdom and coordination to guarantee the future against existential risks and other dystopian outcomes, perhaps with the aid of <a href="https://wiki.lesswrong.com/wiki/Friendly_artificial_intelligence">Friendly AI (FAI)</a>. Stably bad futures (‘bad outcomes’) are those where existential catastrophe has occurred.</p>
<p><img src="https://www.convergenceanalysis.org/wp-content/uploads/2020/02/Trajectories_illustration-e1580994931333.png" alt="Trajectories illustration"></p>
<p>While the framework presented in this article can be used to analyse any specific existential risk, or existential risks in general, in this article we will illustrate with a scenario where humanity may develop FAI or <a href="https://wiki.lesswrong.com/wiki/Unfriendly_artificial_intelligence">“unfriendly” AI (UFAI)</a>. For the simplest visualization of the state space, one can draw a two-dimensional coordinate system, and let the x-coordinates below 0 be the “UFAI” part of the space, and the y-coordinates below 0 be the “FAI” part of the space. The world will then take positions in the upper right quadrant, with the x-coordinate being the world’s distance from a UFAI future, and the y-coordinate being the world’s distance from an FAI future. As time progresses, the world will probably move closer to one or both of these futures, tracing out a trajectory through this space. By understanding the movement of the world through this space, one can understand which future we are headed for (in this example, whether we look likely to end up in the FAI future or the UFAI future). <sup class="footnote-ref"><a href="#fn-8SzFhSJK9P4ETKaen-1" id="fnref-8SzFhSJK9P4ETKaen-1">[1]</a></sup></p>
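A minimal sketch of this two-dimensional space in code may make it concrete. All numbers here are illustrative placeholders, not estimates, and the `WorldState` representation is our own assumption about how one might encode the space:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WorldState:
    """A point in the 2D x-risk state space (abstract distance units)."""
    dist_ufai: float  # x-coordinate: the world's distance from a UFAI future
    dist_fai: float   # y-coordinate: the world's distance from an FAI future

    def stable_outcome(self) -> Optional[str]:
        """A coordinate that has dropped to zero (or below) is absorbing."""
        if self.dist_ufai <= 0:
            return "UFAI"
        if self.dist_fai <= 0:
            return "FAI"
        return None  # still in the open upper-right quadrant

# Two hypothetical start positions, differing only in assumed distance to UFAI.
start_1 = WorldState(dist_ufai=3.0, dist_fai=8.0)  # UFAI relatively close
start_2 = WorldState(dist_ufai=6.0, dist_fai=8.0)  # UFAI further away
```

A trajectory is then just a sequence of such states over time, and the world's eventual fate is whichever absorbing boundary its trajectory reaches first.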
<p>Part of the advantage of this approach is that it can serve as a cognitive aid, facilitating better communication of possible scenarios between existential risk researchers. For instance, two researchers might have sensible (but implicit) differences in their estimates of the current distance to UFAI and/or FAI (perhaps because one thinks UFAI will be much easier to build than the other does). In Fig 1, we show this as two possible start positions. They might also agree about how an intervention represented by the black line (perhaps funding a particular AI safety research agenda) would affect trajectories. But because they disagree on the world's current position in the space, they'll disagree on whether that intervention is enough. (Fig 1 indicates that, if the world is at Start 1, the intervention will not have had a chance to substantially bend the trajectory before UFAI is reached.) If the researchers can both see the AGI trajectory space like this, they can identify their precise point of disagreement, and thus have a better chance of productively learning from each other and resolving their disagreement.</p>
<p>Essentially, the state space is a way to describe where the world is, where we want to go, and what we want to steer well clear of. We proceed by outlining a number of key considerations for the state space.</p>
<h1>Trajectories of the world</h1>
<p>We want to understand what outcomes are possible and likely in the future. We do this by projecting from past trends into the future and by building an understanding of the system’s dynamics. A trajectory is a path in the state space between points in the past, present, or future that may move society closer or farther away from certain outcomes.</p>
<p>As time progresses, the state of the world changes, and thus the position of the world in the space changes. As technology becomes more advanced, more extreme outcomes become possible, and the world moves closer to both the possibilities of existential catastrophe and existential safety. Given the plausible trajectories of the world, one can work out probabilities for the eventual occurrence of each stably good or bad future.</p>
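As a toy illustration of how outcome probabilities fall out of trajectories, one can sample many noisy trajectories and count which absorbing boundary each hits first. The drift and noise parameters below are invented for illustration, not estimates of anything:

```python
import random

def sample_outcome(x, y, drift_x=0.3, drift_y=0.2, noise=0.4, rng=random):
    """Advance the world until it crosses an absorbing boundary.

    x, y: current distances to the UFAI and FAI futures.
    drift_*: assumed average progress per step toward each future.
    noise: assumed randomness in development, per step.
    """
    while True:
        x -= drift_x + rng.gauss(0, noise)
        y -= drift_y + rng.gauss(0, noise)
        if x <= 0:
            return "UFAI"
        if y <= 0:
            return "FAI"

def outcome_probabilities(x0, y0, n=5000, seed=0):
    """Estimate P(FAI) and P(UFAI) by counting which boundary is hit first."""
    rng = random.Random(seed)
    hits = [sample_outcome(x0, y0, rng=rng) for _ in range(n)]
    return {o: hits.count(o) / n for o in ("FAI", "UFAI")}
```

The point is only that "how many paths intercept an outcome" becomes a computable quantity once a start position and a trajectory model are specified; a serious analysis would need defensible distances, drifts, and noise terms.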
<p>In the example, by drawing trajectories of the movement of the world, one can study how the world is or could be moving in relation to the FAI and UFAI futures. As a simplified illustration of how a trajectory may be changed: suppose society decided to stop developing generic AGI capability and focused purely and effectively on FAI; this could change the trajectory to a straight line moving towards FAI, assuming it’s possible to decouple progress on the two axes.</p>
<h1>Defining distance more exactly</h1>
<p>We need a notion of distance that is conducive to measurement, prediction, and action. The definition of distance is central to the meaning of the space and determines much of the mechanics of the model. The way we’ve described the state space thus far leaves it compatible with many different types of distance. However, in order to fully specify the space, one needs to choose a particular distance metric. Interesting candidate definitions of distance include work hours, bits of information, computational time, actual time, and probabilistic distance. Further, building an ensemble of different metrics would allow us to make stronger assessments.</p>
<h1>Uncertainty over positions and trajectories</h1>
<p>We need to take into account uncertainty to properly reflect what we know about the world and the future. There is much uncertainty about how likely we are to reach various societal outcomes, whether due to uncertainty about our current state, about our trajectory, or about the impact certain interventions would have. By effectively incorporating uncertainties into our model, we can more clearly see what we don’t know (and what we should investigate further), draw more accurate conclusions, and make better plans.</p>
<p>Taking uncertainty into account means having probability distributions over positions and over the shape and speed of trajectories. Technically speaking, trajectories are represented as probability distributions that vary with time and that are, roughly speaking, determined by taking the initial probability distribution and repeatedly applying a transition matrix to it. This is a stochastic process that would look something like a <a href="https://en.wikipedia.org/wiki/Random_walk">random walk</a> (such as in <a href="https://en.wikipedia.org/wiki/Brownian_motion">Brownian motion</a>) drifting in a particular direction. (We’d also ideally have a probability function over both the shape and speed of trajectories in a way that doesn’t treat the shape and speed as conditionally independent.) Using the example in the earlier diagram, we don’t know the timelines for FAI or UFAI with certainty. It may be 5 or 50 years, or more, before one of the stable futures is achieved. Perhaps society will adapt and self-correct towards developing the requisite safe and beneficial AGI technology, or perhaps safety and preparation will be neglected. These uncertainties and events can be modeled in the space, with trajectories passing through positions with various probabilities.</p>
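A sketch of the "repeatedly apply a transition matrix" picture, using a simplified one-dimensional chain of discretized distances to a single stable future. The transition probabilities are arbitrary placeholders chosen only to show the mechanics:

```python
import numpy as np

def transition_matrix(n_states=10, p_closer=0.6, p_stay=0.3, p_farther=0.1):
    """Column-stochastic matrix over discretized distances 0..n_states-1.

    State 0 (the stable future) is absorbing; other states drift toward it.
    """
    T = np.zeros((n_states, n_states))
    T[0, 0] = 1.0  # once a stable future is reached, the world stays there
    for i in range(1, n_states):
        T[i - 1, i] += p_closer                      # one step closer
        T[i, i] += p_stay                            # no movement
        T[min(i + 1, n_states - 1), i] += p_farther  # one step away
    return T

# Initial uncertainty about the world's current distance to the stable future.
p0 = np.zeros(10)
p0[6:9] = 1.0 / 3.0  # uniform over distances 6, 7, 8

T = transition_matrix()
p_after_20 = np.linalg.matrix_power(T, 20) @ p0  # distribution after 20 steps
```

Here `p_after_20[0]` is the probability that the stable future has already been reached after 20 steps. A richer model would use the full two-dimensional space and time-varying transitions rather than this fixed chain.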
<h1>Defining how to calculate speed over a trajectory</h1>
<p>We need a method to calculate the speed of the trajectories. The speed of a trajectory is assumed to be primarily determined by the rate of technological development. Timeline models allow us to determine the speed of trajectories. The connection between state space coordinates and time is, in reality, non-trivial and possibly jumpy (unless smoothed out by uncertainty). For example, an important cause of a trajectory might be a few discrete insights, such that the world makes sudden, big lurches along that trajectory at the moments when those insights are reached, but moves slowly at other times.</p>
<p>Trajectories as defined in the coordinate space do not have time directly associated with their shapes; time is an implicit quantity. That is, distance in the coordinate space does not correspond uniformly to distance in time: the same trajectory can take more or less time to traverse.</p>
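To make this distinction concrete, here is a toy calculation showing one geometric path traversed under two different speed profiles. Both the path and the speed functions are made up for illustration:

```python
import math

def traversal_time(path, speed_fn):
    """Time to traverse a piecewise-linear path in the state space.

    path: list of (x, y) waypoints; speed_fn: position -> progress per unit time.
    """
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        # Crude approximation: evaluate speed at the segment midpoint.
        speed = speed_fn(((x0 + x1) / 2, (y0 + y1) / 2))
        total += length / speed
    return total

path = [(6.0, 8.0), (4.0, 6.0), (2.0, 5.0), (0.0, 4.0)]  # illustrative shape

steady = lambda pos: 1.0  # constant rate of technological development
# 'Insight-driven' development: fast lurches once key insights arrive (x < 3).
lurchy = lambda pos: 3.0 if pos[0] < 3.0 else 0.5
```

`traversal_time(path, steady)` and `traversal_time(path, lurchy)` differ even though the path through the space is identical, which is exactly the sense in which time is implicit in the trajectory's shape.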
<p>In general, there are several ways to calculate the expected time until we have a certain technology. Expert surveys, trend extrapolation, and simulation are three useful tools that can be used for this purpose.</p>
<h1>Extensions</h1>
<p>We see many ways to extend this research:</p>
<ul>
<li><strong>Shaping trajectories</strong>: Extending the modeling with systematization (<a href="https://forum.effectivealtruism.org/posts/MvtiaBboSTrnNBp7G/four-components-of-strategy-research">starting by mapping interventions and constructing strategies</a>) and visualization of interventions to help in better understanding how to shape trajectories in order to provide new ideas for interventions.</li>
<li><strong>Specialized versions</strong>: Making specialized versions of the space for each particular existential risk (such as AI risk, biorisk, nuclear war, etc.).</li>
<li><strong>Trajectories and time</strong>: Further examining the relationships between the trajectories and time. Convergence has one mathematical model for AI timelines, and there is a family of approaches, each valid in certain circumstances. These can be characterized, simulated, and verified, and could help inspire interventions.</li>
<li><strong>Measurability</strong>: Further increasing the measurability of our position and knowledge of the state space dynamics. We don’t know exactly where the world is in coordinates, how we are moving, or how fast we’re going, and we want to refine measurement to be less of an ad hoc process and more like engineering. For example, perhaps we can determine how far we are away from AGI by projecting the needed computer power or the rate at which we’re having AGI insights. Proper measurement here is going to be subtle, because we cannot sample, and don’t entirely know what the possible AGI designs are. But by measuring as best we can, we can verify dynamics and positions more accurately and so fine tune our strategies.</li>
<li><strong>Exploring geometries</strong>: Exploring variations on the geometry, such as with different basis spaces or parametrizations of the state space, could provide us with new perspectives. Maybe there are invariants, symmetries, boundaries, or non-trivial topologies that can be modelled.</li>
<li><strong>Larger spaces</strong>: Immerse the space in larger ones, like the full geometry of Turing Machines, or state spaces that encode things like social dynamics, resource progress, or the laws of physics. This would allow us to track more dynamics or to see things more accurately.</li>
<li><strong>Resource distribution</strong>: Using the state space model as part of a greater system that helps determine how to distribute resources. How does one build a system that handles the explore vs exploit tradeoff properly, allows delegation and specialization, evaluates teams and projects clearly, self-improves in a reweighting way, allows interventions with different completion dates to be compared, incorporates unknown unknowns and hard to reverse engineer data rich intuitions cleanly, and doesn’t suffer from decay, Goodhart’s law, or the principal-agent problem? Each of these questions needs investigation.</li>
<li><strong>Trajectories simulator</strong>: Implement a software platform for the trajectories model to allow exploration, learning, and experimentation using different scenarios and ways of modeling.</li>
</ul>
<h1>Conclusion</h1>
<p>The state space model of x-risk trajectories can help us think about and visualise trajectories of the world, in relation to existential risk, in a more precise and structured manner. The state space model is intended to be a stepping stone to further formalizations and “mechanizations” of strategic matters on reducing existential risk. We think this kind of mindset is rarely applied to such “strategic” questions, despite potentially being very useful for them. There are of course drawbacks to this kind of approach as well; in particular, it won’t do much good if it isn’t calibrated or combined with more applied work. We see the potential to build a synergistic set of tools to generate, map out, test, and explore timelines, interventions, and outcomes, and to show how these all interact. We intend to follow this article up with other formalizations, insights, and project ideas that seem promising. This is a work in progress; thoughts and comments are most welcome.</p>
<p><em>We wish to thank Michael Aird, Andrew Stewart, Jesse Liptrap, Ozzie Gooen, Shri Samson, and Siebe Rozendal for their many helpful comments and suggestions on this document.</em></p>
<h1>Bibliography</h1>
<p>[1]. Grace, K., Salvatier, J., Dafoe, A., Zhang, B., & Evans, O. (2017). When Will AI Exceed Human Performance? Evidence from AI Experts, 1–21. <a href="https://arxiv.org/abs/1705.08807">https://arxiv.org/abs/1705.08807</a><br>
[2]. <a href="https://nickbostrom.com/papers/survey.pdf">https://nickbostrom.com/papers/survey.pdf</a><br>
[3]. <a href="https://www.eff.org/ai/metrics">https://www.eff.org/ai/metrics</a><br>
[4]. <a href="http://theuncertainfuture.com">http://theuncertainfuture.com</a><br>
[5]. OpenPhil: What Do We Know about AI Timelines? <a href="https://www.openphilanthropy.org/focus/global-catastrophic-risks/potential-risks-advanced-artificial-intelligence/ai-timelines">https://www.openphilanthropy.org/focus/global-catastrophic-risks/potential-risks-advanced-artificial-intelligence/ai-timelines</a><br>
[6]. Barrett, A. M., & Baum, S. D. (2016). A model of pathways to artificial superintelligence catastrophe for risk and decision analysis. Journal of Experimental & Theoretical Artificial Intelligence, 1–21. <a href="https://doi.org/10.1080/09528130701472416">https://doi.org/10.1080/09528130701472416</a><br>
[7]. <a href="http://aleph.se/andart2/math/adding-cooks-to-the-broth/">http://aleph.se/andart2/math/adding-cooks-to-the-broth/</a><br>
[8]. Cotton-Barratt, O., Daniel, M., & Sandberg, A. (2020). Defence in Depth Against Human Extinction: Prevention, Response, Resilience, and Why They All Matter. <a href="https://onlinelibrary.wiley.com/doi/full/10.1111/1758-5899.12786">https://onlinelibrary.wiley.com/doi/full/10.1111/1758-5899.12786</a><br>
[9]. Baum, S. D., et al. (2019). Long-Term Trajectories of Human Civilization. <a href="http://gcrinstitute.org/papers/trajectories.pdf">http://gcrinstitute.org/papers/trajectories.pdf</a><br>
[10]. Bostrom, N. (2013). Existential risk prevention as global priority. Global Policy, 4(1), 15–31. <a href="https://doi.org/10.1111/1758-5899.12002">https://doi.org/10.1111/1758-5899.12002</a></p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-8SzFhSJK9P4ETKaen-1" class="footnote-item"><p>How does the state space of x-risk trajectories model compare to the trajectory visualizations in [9]? The axes are almost completely different. Their trajectories graphs have an axis for time; the state space doesn’t. Their graphs have an axis for population size; the state space doesn’t. In the state space, each axis represents a stably bad or a stably good future. Though, in the visualizations in [9], hitting the x-axis represents extinction, which maps somewhat to hitting the axis of a stably bad future in the trajectories model. The visualizations in [9] illustrate valuable ideas but they seem to be less about choices or interventions than the state space model is. <a href="#fnref-8SzFhSJK9P4ETKaen-1" class="footnote-backref">↩︎</a></p>
</li>
</ol>
</section>
david_kristoffersson · P5bczkJAAST9KrkRW · 2020-02-09T13:56:41.402Z
Comment by David_Kristoffersson on A point of clarification on infohazard terminology
https://lw2.issarice.com/posts/Rut5wZ7qyHoj3dj4k/a-point-of-clarification-on-infohazard-terminology?commentId=awyBZA8RLq5biKZQf
<p>Some quick musings on alternatives for the "self-affecting" info hazard type:</p><ul><li>Personal hazard</li><li>Self info hazard</li><li>Self hazard</li><li>Self-harming hazard</li></ul>
david_kristoffersson · awyBZA8RLq5biKZQf · 2020-02-03T11:10:05.019Z
Comment by David_Kristoffersson on AI alignment concepts: philosophical breakers, stoppers, and distorters
https://lw2.issarice.com/posts/AFfjT9ySr8zgqMRPw/ai-alignment-concepts-philosophical-breakers-stoppers-and?commentId=QArMvPBd6b5bHRaSb
<p>I wrote this comment on an earlier version of Justin's article:</p><p>It seems to me that most of the 'philosophical' problems are going to get solved as a matter of solving practical problems in building useful AI. You could call the ML and AI systems being developed now 'empirical'. From the perspective of the people building current systems, they likely don't consider what they're doing to be solving philosophical problems. The <a href="https://en.wikipedia.org/wiki/Symbol_grounding_problem">symbol grounding problem</a>? Well, an image classifier built on a convolutional neural network learns to get quite proficient at grounding out classes like 'cars' and 'dogs' (symbols) from real physical scenes.</p><p>So the observation I want to make is that the philosophical problems we can think of that might trip up a system are likely to turn out to look like technical/research/practical problems that need to be solved by default, for practical reasons, in order to make useful systems.</p><p>The image classification problem wasn't solved in one day, but it was solved using technical skills, engineering skills, more powerful hardware, and more data. People didn't spend decades discussing philosophy: the problem was solved through advances in the design of neural networks and through more powerful computers.<br>Of course, image classification doesn't solve the symbol grounding problem in full. But other aspects of symbol grounding that people might find mystifying are getting solved piecewise, as researchers and engineers solve practical problems of AI.</p><p>Let's look at a classic problem formulation from MIRI, 'Ontology Identification':</p><blockquote>Technical problem (Ontology Identification). Given goals specified in some ontology and a world model, how can the ontology of the goals be identified in the world model? What types of world models are amenable to ontology identification? 
For a discussion, see Soares (2015).</blockquote><p>When you create a system that performs any function in the real world, you are in some sense giving it goals. Systems trained with reinforcement learning pursue 'goals'. An autonomous car takes you from a chosen point A to a chosen point B; it has the overall goal of transporting people. The ontology identification problem is getting solved piecewise as a practical matter. Perhaps the MIRI-style theory could give us a deeper understanding that helps us avoid some pitfalls, but it's not clear why these wouldn't be caught as practical problems.</p><p>What would a real philosophical landmine look like? The real philosophical landmines would be the class of philosophical problems that wouldn't get solved as a practical matter <em>and</em> that pose a risk of harm to humanity.</p>
david_kristoffersson · QArMvPBd6b5bHRaSb · 2020-01-30T14:44:35.752Z
Comment by David_Kristoffersson on AIXSU - AI and X-risk Strategy Unconference
https://lw2.issarice.com/events/QBqzEsX64M5ZyaLPS/aixsu-ai-and-x-risk-strategy-unconference?commentId=ykazQQTp2pqe5mer6
<p>I expect the event to have no particular downside risks, and to give interesting input and spark ideas in experts and novices alike. Mileage will vary, of course. Unconferences foster dynamic discussion and a living agenda. If it's risky to host this event, then I would expect AI strategy and forecasting meetups and discussions at EAG to be risky too, and that they should also not be hosted.</p><p>I and other attendees of AIXSU pay careful attention to potential downside risks. I also think it's important that we don't strangle open intellectual advancement. We need to figure out what we should talk about; not that we shouldn't talk.</p><p>AISC: To clarify, AI Safety Camp is different and places greater trust in the judgement of novices, since teams are generally run entirely by novices. The person who proposed running a strategy AISC found the reactions from experts to be mixed. He also reckoned the event would overlap with the existing AI safety camps, since they already include strategy teams.</p><p>The potential negative side effects of strategy work are a very important topic. I hope to discuss them with attendees at the unconference!</p>
david_kristoffersson · ykazQQTp2pqe5mer6 · 2019-09-06T00:55:46.831Z
AIXSU - AI and X-risk Strategy Unconference
https://lw2.issarice.com/events/QBqzEsX64M5ZyaLPS/aixsu-ai-and-x-risk-strategy-unconference
<p><strong>Start:</strong> Friday, November 29, 10am<br><strong>End:</strong> Sunday, December 1, 7pm<br><strong>Location:</strong> <a href="http://eahotel.org/wiki/#travel">EA Hotel</a>, 36 York Street, Blackpool</p><p><strong>AIXSU</strong> is an unconference on AI and existential risk strategy. As it is an unconference, the event will be created by the participants. There will be an empty schedule which you, the participants, will fill up with talks, discussions, and more.</p><p>AIXSU is inspired by <u><a href="https://www.lesswrong.com/posts/yuMuDGnJ8omGhMx9y/taisu-technical-ai-safety-unconference">TAISU</a></u>, which was <a href="https://www.lesswrong.com/posts/MmX2ZqET2QDYpSMDp/taisu-2019-field-report#TAISU">a successful AI Safety unconference at the EA Hotel</a> in August. The AI and existential risk strategy space seems to be in need of more events, and AIXSU hopes to close this gap a bit. The unconference will be three days long.</p><p>To enable high-level discussion during the unconference, we require that all participants have some prior involvement with AI or existential risk strategy. AI and existential risk strategy concerns the broad spectrum of things we need to solve in order for humanity to handle the technological transitions ahead of us. Topics of interest include but are not limited to: <strong>macrostrategy, technological forecasting, technological scenarios, AI safety strategy, AI governance, AI policy, AI ethics, cooperative principles and institutions, and foundational philosophy on the future of humanity</strong>. 
Here is an incomplete list of sufficient criteria:</p><ul><li>You have participated in one of the following and have an interest in strategic questions: <a href="https://forum.effectivealtruism.org/posts/cPZ9w2Wxxu2kA9EDg/workshop-strategy-ideas-and-life-paths-for-reducing">Strategy, ideas, and life paths for reducing existential risks</a>, <a href="https://aisafetycamp.com/">AI Safety Camp</a>, <a href="https://rationality.org/workshops/apply-msfp">MSFP/AISFP</a>, <a href="http://humanaligned.ai/">Human-aligned AI Summer School</a>, or the <a href="https://www.lesswrong.com/events/z9peEBfuiPB7L2Edb/learning-by-doing-ai-safety-workshop">Learning-by-doing AI Safety workshop</a>.</li><li>You currently work, or have previously worked or interned, at an established existential risk reduction organization.</li><li>You have published papers or sufficiently high-quality blog posts on strategy-related topics.</li><li>You combine involvement in AI safety or other existential risk work with an interest in strategy. For example, you’ve worked on AI safety on and off for a few years and also have an active interest in strategy-related questions.</li><li>You are pursuing a possible future in AI strategy or existential risk strategy and have read relevant texts on the topic.</li></ul><p>If you feel uncertain about qualifying, please feel free to reach out and we can have a chat about it.</p><p>You can participate in the unconference for as many or as few days as you like. You are also welcome to stay longer at the EA Hotel before or after the unconference.</p><p><strong>Price:</strong> Pay what you want (cost price is £10/person/day).<br><strong>Food:</strong> All meals will be provided by the EA Hotel. All food will be vegan.<br><strong>Lodging:</strong> The EA Hotel has two dorm rooms reserved for AIXSU participants. If the dorm rooms fill up, or if you would like your own room, there are many nearby hotels that you can book. 
We will provide information on nearby hotels.</p><p>Attendance is on a first-come, first-served basis. Make sure to apply soon if you want to secure your spot.</p><p><em><u><a href="https://docs.google.com/forms/d/1lY_VjGeJxI-zY8AzoE-cux9vYIX-QNIXrxcTPJPXsKA">Apply to attend AIXSU here</a></u></em></p>david_kristofferssonQBqzEsX64M5ZyaLPS2019-09-03T11:35:39.283ZComment by David_Kristoffersson on Three Stories for How AGI Comes Before FAI
https://lw2.issarice.com/posts/2Z8pMDfDduAwtwpcX/three-stories-for-how-agi-comes-before-fai?commentId=DxewDcRpuAL3gjSa4
<blockquote>We can subdivide the security story based on the ease of fixing a flaw if we're able to detect it in advance. For example, vulnerability #1 on the <a href="https://www.cloudflare.com/learning/security/threats/owasp-top-10/">OWASP Top 10</a> is injection, which is typically easy to patch once it's discovered. Insecure systems are often right next to secure systems in program space.</blockquote><p>Insecure systems are right next to secure systems, and many flaws are found. Yet the larger systems (the company running the software, the economy, etc.) manage to correct for this, because they contain mechanisms poised to patch the software when flaws are discovered. Perhaps we could adapt and optimize this flaw-discovery-and-patch loop from security as a technique for AI alignment.</p><blockquote>If the security story is what we are worried about, it could be wise to try & develop the AI equivalent of OWASP's <a href="https://cheatsheetseries.owasp.org/">Cheat Sheet Series</a>, to make it easier for people to find security problems with AI systems. Of course, many items on the cheat sheet would be speculative, since AGI doesn't actually exist yet. But it could still serve as a useful starting point for brainstorming.</blockquote><p>This sounds like a great idea to me. Software security has a very well-developed knowledge base at this point, and since AI is software, there should be many good insights to port.</p><blockquote>What possibilities aren't covered by the taxonomy provided?</blockquote><p>Here's one that occurred to me quickly: Drastic technological progress (presumably involving AI) destabilizes society and causes strife. In this environment with more enmity, safety procedures are neglected and UFAI is produced.</p>david_kristofferssonDxewDcRpuAL3gjSa42019-08-17T14:48:16.645ZComment by David_Kristoffersson on Project Proposal: Considerations for trading off capabilities and safety impacts of AI research
https://lw2.issarice.com/posts/y5fYPAyKjWePCsq3Y/project-proposal-considerations-for-trading-off-capabilities?commentId=T6v28siGeQLkMs38h
<p>This seems like a valuable research question to me. I have a project proposal in a drawer of mine that is strongly related: "Entanglement of AI capability with AI safety".</p>david_kristofferssonT6v28siGeQLkMs38h2019-08-17T13:38:34.713ZComment by David_Kristoffersson on A case for strategy research:
what it is and why we need more of it
https://lw2.issarice.com/posts/pE5LgmF9mJptvund9/a-case-for-strategy-research-what-it-is-and-why-we-need-more?commentId=qw7gQwFyjszEQDJgn
<p>My guess is that the ideal is to have semi-independent teams doing research: independent in order to better explore the space of questions, but plugged into each other to some degree in order to learn from each other and to coordinate.</p><blockquote>Are there serious info hazards, and if so can we avoid them while still having a public discussion about the non-hazardous parts of strategy?</blockquote><p>There are info hazards. But I think if we can discuss Superintelligence publicly, then yes, we can have a public discussion about non-hazardous parts of strategy.</p><blockquote>Are there enough people and funding to sustain a parallel public strategy research effort and discussion?</blockquote><p>I think you could get a pretty lively discussion even with just 10 people, if they were active enough. I think you'd need a core of active posters and commenters, and there needs to be enough reason for them to assemble.</p>david_kristofferssonqw7gQwFyjszEQDJgn2019-07-12T07:08:57.094ZComment by David_Kristoffersson on A case for strategy research:
what it is and why we need more of it
https://lw2.issarice.com/posts/pE5LgmF9mJptvund9/a-case-for-strategy-research-what-it-is-and-why-we-need-more?commentId=uBSnkK7Rfqy6MxuTw
<p>Nice work, Wei Dai! I hope to read more of your posts soon.</p><blockquote>However I haven't gotten much engagement from people who work on strategy professionally. I'm not sure if they just aren't following LW/AF, or don't feel comfortable discussing strategically relevant issues in public.</blockquote><p>A bit of both, presumably. I would guess a lot of it comes down to incentives, perceived gain, and habits. There's no particular pressure to discuss on LessWrong or the EA Forum. LessWrong isn't perceived as your main peer group. And if you're at FHI or OpenAI, you'll have plenty of contact with people who can provide quick feedback already.</p>david_kristofferssonuBSnkK7Rfqy6MxuTw2019-06-21T18:02:59.030ZComment by David_Kristoffersson on A case for strategy research:
what it is and why we need more of it
https://lw2.issarice.com/posts/pE5LgmF9mJptvund9/a-case-for-strategy-research-what-it-is-and-why-we-need-more?commentId=cD8Wyr8snFSoiyNtG
<blockquote>I'm very confused why you think that such research should be done publicly, and why you seem to think it's not being done privately.</blockquote><p>I don't think the article implies this:</p><blockquote>Research should be done publicly</blockquote><p>The article states: "We especially encourage researchers to share their strategic insights and considerations in write ups and blog posts, unless they pose information hazards."<br/>Which means: share more, but don't share if you think it could have negative consequences.<br/>Though I guess you could mean that it's <em>very hard</em> to tell what might lead to negative outcomes. This is a good point. This is why we (Convergence) are prioritizing research on information hazard handling and research-shaping considerations.</p><blockquote>it's not being done privately</blockquote><p>The article isn't saying strategy research isn't being done privately. What it is saying is that we <em>need more</em> strategy research and should increase investment in it.</p><blockquote>Given the first sentence, I'm confused as to why you think that "strategy research" (writ large) is going to be valuable, given our fundamental lack of predictive ability in most of the domains where existential risk is a concern.</blockquote><p>We'd argue that to get better predictive ability, we need to do strategy research. Maybe you're saying the article makes it look like we are recommending any research that looks like strategy research? This isn't our intention.</p>david_kristofferssoncD8Wyr8snFSoiyNtG2019-06-21T17:09:58.336ZComment by David_Kristoffersson on AI Safety Research Camp - Project Proposal
https://lw2.issarice.com/posts/KgFrtaajjfSnBSZoH/ai-safety-research-camp-project-proposal?commentId=aN2SL9mLsjM6Rf9ph
<p>Yes -- the plan is to have these on an ongoing basis. I'm writing this just after the deadline passed for the one planned for April.</p><p>Here's the website: https://aisafetycamp.com/</p><p>The Facebook group is also a good place to keep tabs on it: https://www.facebook.com/groups/348759885529601/</p>david_kristofferssonaN2SL9mLsjM6Rf9ph2019-01-24T11:15:00.685ZComment by David_Kristoffersson on Beware Social Coping Strategies
https://lw2.issarice.com/posts/QJRo5HZp9ZdzoK7x3/beware-social-coping-strategies?commentId=8qj2biFkFJwNyNEWd
<blockquote>Your relationship with other people is a macrocosm of your relationship with yourself. </blockquote><p>I think there's <em>something</em> to that, but it's not that general. For example, some people can be very kind to others but harsh with themselves. Some people can be cruel to others but lenient to themselves.</p><blockquote>If you can't get something nice, you can at least get something predictable</blockquote><p><a href="http://slatestarcodex.com/2016/09/12/its-bayes-all-the-way-up/">The desire for the predictable is what Autism Spectrum Disorder is all about, I hear. </a></p>david_kristoffersson8qj2biFkFJwNyNEWd2018-02-05T09:42:40.043ZComment by David_Kristoffersson on "Taking AI Risk Seriously" (thoughts by Critch)
https://lw2.issarice.com/posts/HnC29723hm6kJT7KP/taking-ai-risk-seriously-thoughts-by-critch?commentId=33a77FQ7fcLn2AZNu
<p>Here's the <a href="https://www.lesserwrong.com/posts/KgFrtaajjfSnBSZoH/ai-safety-research-camp-project-proposal">Less Wrong post for the AI Safety Camp</a>!</p><p></p>david_kristoffersson33a77FQ7fcLn2AZNu2018-02-02T04:32:45.847ZAI Safety Research Camp - Project Proposal
https://lw2.issarice.com/posts/KgFrtaajjfSnBSZoH/ai-safety-research-camp-project-proposal
<h2>AI Safety Research Camp - Project Proposal</h2><p><em>→ Give your feedback on our plans below or in the <a href="https://docs.google.com/document/d/1QlKruAZuuc5ay0ieuzW5j5Q100qNuGNCTHgC4bIEXsg/edit?ts=5a651a00#">google doc</a></em><br/><em>→ <u><a href="https://docs.google.com/forms/d/e/1FAIpQLScL9QaM5vLQSpOWxxb-Y8DUP-IK4c8DZlYSkL6pywz8OSQY1g/viewform">Apply</a></u> to take part in the Gran Canaria camp on 12-22 April (deadline: 12 February)</em><br/><em>→ <u><a href="https://www.facebook.com/groups/348759885529601/">Join</a></u> the Facebook group</em></p><h3>Summary</h3><p><strong>Aim: </strong>Efficiently launch aspiring AI safety and strategy researchers into concrete productivity by creating an ‘on-ramp’ for future researchers.</p><p>Specifically:</p><ol><li>Get people started on and immersed in concrete research work intended to lead to papers for publication.</li><li>Address the bottleneck in AI safety/strategy (few experts are available to train or organize aspiring researchers) by using expert time efficiently.</li><li>Create a clear path from ‘interested/concerned’ to ‘active researcher’.</li><li>Test a new method for bootstrapping talent-constrained research fields.</li></ol><p><strong>Method: </strong>Run an online research group culminating in a two-week intensive in-person research camp. Participants will work in groups on tightly-defined research projects on the following topics: </p><ul><li>Agent foundations</li><li>Machine learning safety</li><li>Policy & strategy</li><li>Human values</li></ul><p>Projects will be proposed by participants prior to the start of the program. Expert advisors from AI Safety/Strategy organisations will help refine them into proposals that are tractable, suitable for this research environment, and that answer currently unsolved research questions. 
This allows for time-efficient use of advisors’ domain knowledge and research experience, and ensures that research is well-aligned with current priorities.</p><p>Participants will then split into online collaborative groups to work on these research questions over a period of several months. This period will culminate in a two-week in-person research camp aimed at turning this exploratory research into first drafts of publishable research papers. This will also allow for cross-disciplinary conversations and community building, although the goal is primarily research output. Following the two-week camp, advisors will give feedback on manuscripts, guiding first drafts towards completion and advising on next steps for researchers.</p><p><strong>Example: </strong>Multiple participants submit a research proposal or otherwise express an interest in interruptibility during the application process, and in working on machine learning-based approaches. During the initial idea generation phase, these researchers read one another’s research proposals and decide to collaborate based on their shared interests. They decide to code up and test a variety of novel approaches on the relevant AI safety gridworld. These approaches get formalised in a research plan.</p><p>This plan is circulated among advisors, who identify the most promising elements to prioritise and point out flaws that render some proposed approaches unworkable. Participants feel encouraged by expert advice and support, and research begins on the improved research proposal.</p><p>Researchers begin formalising and coding up these approaches, sharing their work in a GitHub repository that they can use as evidence of their engineering ability. It becomes clear that a new gridworld is needed to investigate issues arising from research so far. 
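To make the worked example above concrete, here is a minimal sketch of what such a team's first artifact might look like: a tiny interruptibility gridworld in the spirit of DeepMind's AI safety gridworlds. All names and mechanics here are hypothetical illustrations, not part of the proposal.

```python
import random

class InterruptibleGridworld:
    """A 1-D corridor: the agent starts at cell 0 and seeks the goal cell.

    Stepping onto the interruption cell may freeze the agent for the rest
    of the episode, modeling an external off-switch. All parameters are
    invented for this sketch."""

    def __init__(self, length=5, goal=4, interrupt_cell=2, interrupt_prob=1.0):
        self.length = length
        self.goal = goal
        self.interrupt_cell = interrupt_cell
        self.interrupt_prob = interrupt_prob
        self.reset()

    def reset(self):
        self.pos = 0
        self.interrupted = False
        self.done = False
        return self.pos

    def step(self, action):
        """action is -1 (left) or +1 (right); returns (pos, reward, done)."""
        if self.done:
            return self.pos, 0.0, True
        # Move, clipped to the corridor.
        self.pos = max(0, min(self.length - 1, self.pos + action))
        if self.pos == self.interrupt_cell and random.random() < self.interrupt_prob:
            self.interrupted = True  # the off-switch fires; episode ends with no reward
            self.done = True
        elif self.pos == self.goal:
            self.done = True         # reached the goal; unit reward
            return self.pos, 1.0, True
        return self.pos, 0.0, self.done
```

A proposed interruptibility approach could then be evaluated by whether the trained agent still reaches the goal without learning to route around the interruption cell.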
After a brief conversation, their advisor is able to put them in touch with the relevant engineer at DeepMind, who gives them some useful tips on creating this.</p><p>At the research camp, the participants are able to discuss their findings and put them in context, as well as solve some technical issues that were impossible to resolve part-time and remotely. They write up their findings into a draft paper and present it at the end of the camp. The paper is read and commented on by advisors, who give suggestions on how to improve the paper’s clarity. The paper is submitted to NIPS 2018’s Aligned AI workshop and is accepted.</p><p><strong>Expected outcome: </strong>Each research group will aim to produce results that can form the kernel of a paper at the end of the July camp. We don’t expect every group to achieve this, as research progress is hard to predict.</p><ol><li>At the end of the camp, from five groups, we would expect three to have initial results and a first draft of a paper that the expert advisors find promising.</li><li>Within six months following the camp, three or more draft papers have been written that are considered to be promising by the research community.</li><li>Within one year following the camp, three or more researchers who participated in the project obtain funding or research roles in AI safety or strategy.</li></ol><p><strong>Next steps following the camp:</strong> When teams have produced promising results, camp organizers and expert advisors will endeavour to connect the teams to the right parties to help the research shape up further and be taken to conclusion.</p><p>Possible destinations for participants who wish to remain in research after the camp would likely be some combination of:</p><ol><li>Full-time internships in areas of interest, for instance <u><a href="https://deepmind.com/careers/476630/">DeepMind</a></u>, <u><a href="https://www.fhi.ox.ac.uk/vacancies/">FHI</a></u> or <u><a 
href="http://humancompatible.ai/jobs#internship">CHAI</a></u> </li><li>Full-time research roles at AI safety/strategy organisations</li><li>Obtaining research funding such as <u><a href="https://www.openphilanthropy.org/focus/global-catastrophic-risks/potential-risks-advanced-artificial-intelligence/open-philanthropy-project-ai-fellows-program">OpenPhil</a></u> or <u><a href="https://futureoflife.org/2017/12/20/2018-international-ai-safety-grants-competition/">FLI</a></u> research grants - successful publications may unlock new sources of funding</li><li>Independent remote research</li><li>Research engineering roles at technical AI safety organisations</li></ol><p>Research projects can be tailored towards participants’ goals - for instance researchers who are interested in engineering or machine learning-related approaches to safety can structure a project to include a significant coding element, leading to (for instance) a GitHub repo that can be used as evidence of engineering skill. This is also a relatively easy way for people who are unsure if research work is for them to try it out without the large time investment and opportunity cost of a PhD or master's program, although we do not see it as a full replacement for these.</p><h3>Plan</h3><p><strong><u><a href="https://docs.google.com/document/d/1l02CtyQo-sUiXv4RlwRB2DHOXyS6pC7_lGcx0vxOOtY/edit">Timeline</a></u>: </strong>We anticipate this project having four main phases (dates are currently open for discussion):</p><ol><li>Plan and develop the project, recruit researchers and look for advisors - December 2017 to April 2018</li><li>Testing and refinement of event design during a small-scale camp at Gran Canaria - April 12-22</li><li>Project selection, refinement and exploration (online) - April 2018 to July 2018</li><li>Research camp (in person) - July/August 2018</li></ol><p><strong>Recruiting: </strong>We plan to have approximately 20 researchers working in teams of 3-5 people, with projects in agent 
foundations, machine learning, strategy/policy and human values/cognition. Based on responses to a registration form we have already posted online (link <u><a href="https://drive.google.com/open?id=1xj7ffjihLyIqtPPEMozwnHA3HsKhzFD7ZxblVkA1O5g">here</a></u>), we expect to be able to easily meet this number of participants. </p><p>Each team will be advised by a more experienced researcher in the relevant area; however, we expect this won’t be as tightly-coupled a relationship as that between PhD students and their supervisors - the aim is to maximise the usefulness of the relatively scarce advisor time and to develop as much independence in researchers as possible.</p><p><strong>Project selection and exploration: </strong>Once the initial recruitment phase is complete, researchers and advisors can choose a project to work on and refine it into a single question answerable within the timeframe. We recognise the need for strong project planning skills and careful project choice and refinement here, and this project choice is a potential point of failure (see Important Considerations below). Following project selection, researchers will begin exploring the research project they’ve chosen in the months between project choice and the research camp. This would probably require five to ten hours a week of commitment from researchers, mostly asynchronously but with a weekly ‘scrum’ meeting to share progress within a project team. Regular sharing of progress and forward planning will be important to keep momentum going.</p><p><strong>Research camp:</strong> Following the selection and exploration, we will have a two-week intensive camp assembling all participants in-person at a retreat to do focused work on the research projects. Exploratory work can be done asynchronously, but finishing research projects can be hard work and require intensive communication which can more easily be done in person. 
This also makes the full-time element of this project much more bounded and manageable for most potential participants. An in-person meeting also allows for much better communication between researchers on different projects, as well as helping form lasting and fruitful connections between researchers.</p><h3>Important Considerations</h3><p><strong>Shaping the research question: </strong>Selecting good research questions for this project will be challenging, and is one of the main potential points of failure. The non-traditional structure of the event brings with it some extra considerations. We expect that most projects will be:</p><ol><li>Tractable to allow progress to be made in a short period of time, rather than conceptually complex or open-ended</li><li>Closely related to current work, e.g. suggestions found in ‘further work’ or ‘open questions’ sections from recent papers</li><li>Parallelisable across multiple researchers, e.g. evaluating multiple possible solutions to a single problem or researching separate aspects of a policy proposal</li></ol><p>This biases project selection towards incremental research, i.e. extending previous work rather than finding completely new approaches. This is hard to avoid in these circumstances, and we are optimising at least partly for the creation of new researchers who can go on to do more risky, less incremental research in the future. Furthermore, a look at the ‘future work/open questions’ sections of many published safety papers will reveal a broad selection of interesting, useful questions that still meet the criteria above, so although this is a tradeoff, we do not expect it to be overly limiting. 
A good example of this in the Machine Learning subfield would be evaluating multiple approaches to one of the problems listed in DeepMind’s recent <u><a href="https://arxiv.org/abs/1711.09883">AI Safety gridworlds paper</a></u>.</p><p><strong>Finding advisors: </strong>Although we intend this to be relatively self-contained, some amount of advice from active researchers will be beneficial at both the project selection and research stages, as well as at the end of the camp. The most useful periods for advisor involvement will be at the initial project selection/shaping phase and at the end of the camp - the former allows for better, more tractable projects as well as conveying previously unpublished relevant information and a sense of what’s considered interesting. The latter will be useful for preparing papers and integrating new researchers into the existing community. Informal enquiries suggest that it is likely to be possible to recruit advisors for these stages, but ongoing commitments will be more challenging.</p><p>The expected commitment during project selection and shaping would be one or two sessions of several hours spent evaluating and commenting on proposed research projects. This could be done asynchronously or by video chat. Commitment at the end of the research camp is likely to be similar - responding to initial drafts of papers with suggestions of improvements or further research in a similar way to the peer review process.</p><p><strong>Costs: </strong>The main costs for the Gran Canaria camp (the AirBnBs, meals and low-income travel reimbursements) have now been covered by two funders. The July camp will likely take place in the UK at the <u><a href="https://www.facebook.com/groups/1624791014242988/?ref=br_rs">EA Hotel</a></u>, a co-working hub planned by Greg Colbourn (for other options, see <u><a href="https://docs.google.com/spreadsheets/d/1cX4yrEH4Kw8-CdT9zRhV7EHD1xXLO6T9NqiK5FQ5pKk/edit#gid=0">here</a></u>). 
For this, we will publish a funding proposal around April. Please see <u><a href="https://docs.google.com/spreadsheets/d/1P5W8u8czOp_MEaZtb2iwPmmFs1CFkkH6RuI0NHrfP_I/edit#gid=781421298">here</a></u> for the draft budgets. </p><h3>Long-term and wider impacts</h3><p>If the camp proves to be successful, it could serve as the foundation for yearly recurring camps to keep boosting aspiring researchers into productivity. It could become a much-needed additional lever to grow the fields of AI safety and AI strategy for many years to come. The research camp model could also be used to grow AI safety research communities where none presently exist but where there is a strong need - in China, for instance. By using experienced coordinators and advisors in conjunction with local volunteers, it may be possible to organise a research camp without the need for pre-existing experts in the community. A camp provides a coordination point for interested participants, signals support for community building, and, if previous camps have been successful, provides social proof for participants.</p><p>In addition, scaling up research into relatively new cause areas is a problem that will need to be solved many times in the effective altruist community. This could represent an efficient way to ‘bootstrap’ a larger research community from a small pre-existing one, and so could be a useful addition to the tool set available to the EA community.</p><p>This project serves as a natural complement to other AI safety projects currently in development such as <u><a href="https://wiki.lesswrong.com/wiki/Road_to_AI_Safety_Excellence">RAISE</a></u> that aim to teach researchers the foundational knowledge they will need to begin research. 
Once an aspiring AI safety researcher completes one of these courses, they might consider a research camp as a natural next step on the road to becoming a practicing researcher.</p><h3>Acknowledgements</h3><p>Thanks to Ryan Carey, Chris Cundy, Victoria Krakovna and Matthijs Maas for reading and providing helpful comments on this document.</p><h3>Organisers</h3><h4><u><a href="https://tommcgrath.github.io/">Tom McGrath</a></u></h4><p>Tom is a maths PhD student in the <u><a href="http://wwwf.imperial.ac.uk/~nsjones/">Systems and Signals</a></u> group at Imperial College, where he works on statistical models of animal behaviour and physical models of inference. He will be interning at the <u><a href="https://www.fhi.ox.ac.uk/">Future of Humanity Institute</a></u> from Jan 2018, working with <u><a href="https://www.fhi.ox.ac.uk/team/owain-evans/">Owain Evans</a></u>. His previous organisational experience includes co-running Imperial’s <u><a href="http://mathshelpdesk.ma.ic.ac.uk/">Maths Helpdesk</a></u> and running a postgraduate deep learning study group.</p><p><u><a href="https://nl.linkedin.com/in/remmelt-ellen-19b88045">Remmelt Ellen</a></u></p><p><em>Operations</em></p><p>Remmelt is the Operations Manager of <u><a href="https://effectiefaltruisme.nl/en/effective-altruism-netherlands/">Effective Altruism Netherlands</a></u>, where he coordinates national events, works with organisers of new meetups and takes care of mundane admin work. He also oversees planning for the team at <u><a href="https://wiki.lesswrong.com/wiki/Accelerating_AI_Safety_Adoption_in_Academia">RAISE</a></u>, an online AI Safety course. 
He is a Bachelor intern at the <u><a href="https://www.cwi.nl/research/groups/intelligent-and-autonomous-systems">Intelligent & Autonomous Systems</a></u> research group.</p><p>In his spare time, he’s exploring how to improve the interactions within multi-layered networks of agents to reach shared goals – especially approaches to collaboration within the EA community and the representation of persons and interest groups by <u><a href="https://homepages.cwi.nl/~baarslag/pub/When_Will_Negotiation_Agents_Be_Able_to_Represent_Us-The_Challenges_and_Opportunities_for_Autonomous_Negotiators.pdf">negotiation agents</a></u> in <u><a href="http://www.oilcrash.com/articles/complex.htm">sub-exponential</a></u> takeoff scenarios.</p><p><u><a href="https://docs.google.com/document/d/1NkYDp3zns-cyasAk_WDrhj6DlJ9QjMP24fM7jTvWPqM/edit?usp=sharing">Linda Linsefors</a></u></p><p>Linda has a PhD in theoretical physics, which she obtained at <u><a href="https://doctorat.univ-grenoble-alpes.fr/en/doctoral-studies/research-fields/physics-630344.htm">Université Grenoble Alpes</a></u> for work on loop quantum gravity. Since then she has studied AI and AI Safety online for about a year. Linda is currently working at <u><a href="http://www.org.umu.se/icelab/english/?languageId=3">Integrated Science Lab</a></u> in Umeå, Sweden, developing tools for analysing information flow in networks. She hopes to be able to work full time on AI Safety in the near future.</p><p><u><a href="https://www.linkedin.com/in/nandi-schoots-70bba8125/?locale=en_US">Nandi Schoots</a></u></p><p>Nandi has a research master's in pure mathematics and a minor in psychology from Leiden University. Her master's was focused on algebraic geometry and her <u><a href="https://www.universiteitleiden.nl/binaries/content/assets/science/mi/scripties/masterschoots.pdf">thesis</a></u> was in category theory. Since graduating she has been steering her career in the direction of AI safety. 
She is currently employed as a data scientist in the Netherlands. In parallel to her work she is part of a study group on AI safety and involved with the reinforcement learning section of <u><a href="https://wiki.lesswrong.com/wiki/Accelerating_AI_Safety_Adoption_in_Academia">RAISE</a></u>.</p><p><u><a href="https://www.linkedin.com/in/davidkristoffersson/">David Kristoffersson</a></u></p><p>David has a background as R&D Project Manager at Ericsson where he led a project of 30 experienced software engineers developing many-core software development tools. He liaised with five internal stakeholder organisations, worked out strategy, made high-level technical decisions and coordinated a disparate set of subprojects spread over seven cities on two different continents. He has a further background as a Software Engineer and has a BS in Computer Engineering. In the past year, he has contracted for the <u><a href="https://www.fhi.ox.ac.uk/">Future of Humanity Institute</a></u>, has explored research projects in ML and AI strategy with FHI researchers, and is currently collaborating on existential risk strategy research with <u><a href="http://convergenceanalysis.org/">Convergence</a></u>.</p><p><u>Chris Pasek</u></p><p>After graduating from mathematics and theoretical computer science, Chris ended up touring the world in search of meaning and self-improvement, and finally settled on working as a freelance researcher focused on AI alignment. Currently also running a rationalist shared housing project on the tropical island of Gran Canaria and continuing to look for ways to gradually self-modify in the direction of a superhuman FDT-consequentialist entity with a goal to save the world.</p>david_kristofferssonKgFrtaajjfSnBSZoH2018-02-02T04:25:46.005ZComment by David_Kristoffersson on A Fable of Science and Politics
https://lw2.issarice.com/posts/6hfGNLf4Hg5DXqJCF/a-fable-of-science-and-politics?commentId=hDPDWGZdyWWBWxACy
<p>It's bleen, without a moment's doubt.</p>
david_kristofferssonhDPDWGZdyWWBWxACy2016-10-26T08:57:49.936ZComment by David_Kristoffersson on LessWrong 2.0
https://lw2.issarice.com/posts/givHhuPu6G43g8kWN/lesswrong-2-0?commentId=zi9yiPKrvHmoDjBjs
<p>Counterpoint: Sometimes, not moving <em>means</em> moving, because everyone else is moving away from you. Movement -- change -- is relative. And on the Internet, change is rapid.</p>
david_kristofferssonzi9yiPKrvHmoDjBjs2016-05-08T10:06:56.257ZComment by David_Kristoffersson on Meetup : First meetup in Stockholm
https://lw2.issarice.com/posts/S2r2bPK2TBtRWuRcP/meetup-first-meetup-in-stockholm?commentId=NafifgyNvi9gKLSPF
<p>Interesting. I might show up.</p>
david_kristofferssonNafifgyNvi9gKLSPF2015-10-09T19:11:29.667ZComment by David_Kristoffersson on Book Review: Naive Set Theory (MIRI research guide)
https://lw2.issarice.com/posts/FvA2qL6ChCbyi5Axk/book-review-naive-set-theory-miri-research-guide?commentId=QXdaw2X3YcayuNsct
<p>Thanks for the tip. Two other books on the subject that seem to be appreciated are <em>Introduction to Set Theory</em> by Karel Hrbacek and <em>Classic Set Theory: For Guided Independent Study</em> by Derek Goldrei.</p>
<p>Edit: math.se weighs in: <a href="http://math.stackexchange.com/a/264277/255573">http://math.stackexchange.com/a/264277/255573</a></p>
david_kristofferssonQXdaw2X3YcayuNsct2015-08-16T10:26:47.403ZComment by David_Kristoffersson on Book Review: Naive Set Theory (MIRI research guide)
https://lw2.issarice.com/posts/FvA2qL6ChCbyi5Axk/book-review-naive-set-theory-miri-research-guide?commentId=skYdkvxzMjDMTaYMe
<p>The author of the <a href="http://www.logicmatters.net/tyl/about-the-guide/">Teach Yourself Logic study guide</a> agrees with you about reading multiple sources:</p>
<blockquote>
<p>I very strongly recommend tackling an area of logic (or indeed any new area of mathematics) by reading a series of books which overlap in level (with the next one covering some of the same ground and then pushing on from the previous one), rather than trying to proceed by big leaps.</p>
<p>In fact, I probably can’t stress this advice too much, which is why I am highlighting it here. For this approach will really help to reinforce and deepen understanding as you re-encounter the same material from different angles, with different emphases.</p>
</blockquote>
david_kristofferssonskYdkvxzMjDMTaYMe2015-08-16T10:23:09.130ZComment by David_Kristoffersson on Book Review: Naive Set Theory (MIRI research guide)
https://lw2.issarice.com/posts/FvA2qL6ChCbyi5Axk/book-review-naive-set-theory-miri-research-guide?commentId=ywteeFnkStiBDdGzk
<p>My two main sources of confusion in that sentence are:</p>
<ol>
<li>He says "distinct elements <strong>onto</strong> distinct elements", which suggests both injection and surjection. </li>
<li>He says "is called one-to-one (usually a one-to-one correspondence)", which might suggest that "one-to-one" and "one-to-one correspondence" are synonyms -- since that is what he usually uses the parentheses for when naming concepts.</li>
</ol>
<p>I find Halmos somewhat contradictory here.</p>
<p>But I'm convinced you're right. I've edited the post. Thanks.</p>
david_kristofferssonywteeFnkStiBDdGzk2015-08-16T09:59:22.493ZComment by David_Kristoffersson on Book Review: Naive Set Theory (MIRI research guide)
https://lw2.issarice.com/posts/FvA2qL6ChCbyi5Axk/book-review-naive-set-theory-miri-research-guide?commentId=PwZPKaieSHhHRiumM
<p>You guys must be right. And wikipedia corroborates. I'll edit the post. Thanks.</p>
david_kristofferssonPwZPKaieSHhHRiumM2015-08-16T09:53:22.549ZBook Review: Naive Set Theory (MIRI research guide)
https://lw2.issarice.com/posts/FvA2qL6ChCbyi5Axk/book-review-naive-set-theory-miri-research-guide
<p>I'm David. I'm reading through the books in the <a href="https://intelligence.org/research-guide/">MIRI research guide</a> and will write a review for each as I finish them. By way of inspiration from how <a href="/user/So8res/">Nate</a> did it.</p>
<h2>Naive Set Theory<br /></h2>
<p style="padding-left: 210px;"><img src="http://img1.imagesbn.com/p/9781614271314_p0_v1_s260x420.JPG" alt="" width="260" height="391" /></p>
<p>Halmos' <em>Naive Set Theory</em> is a classic and dense little book on axiomatic set theory, from a "naive" perspective.<br /><br />Which is to say, the book doesn't dig into the depths of formality or philosophy; it focuses on getting you productive with set theory. The point is to give someone who wants to dig into advanced mathematics a foundation in set theory, since set theory is a fundamental tool used across much of mathematics.</p>
<h3>Summary</h3>
<p>Is it a good book? Yes.<br /><br />Would I recommend it as a starting point, if you would like to learn set theory? No. The book's terse presentation makes it tough to digest unless you're already familiar with propositional logic, perhaps some set theory, and a bit of advanced mathematics in general. There are plenty of other books that can get you started there.<br /><br />If you do have a somewhat fitting background, I think this should be a very competent pick to deepen your understanding of set theory. The author shows you the nuts and bolts of set theory and doesn't waste any time doing it.</p>
<h3>Perspective of this review</h3>
<p>I will first refer you to <a title="Nate's review" href="/lw/ir6/book_review_na%C3%AFve_set_theory_miri_course_list/">Nate's review</a>, which I found to be a lucid take on it. I don't want to repeat the good points made there, so I'll focus this review on the perspective of someone with a somewhat weaker background in math, and try to help prospective readers with the parts I found tricky in the book.<br /><br />What is my perspective? While I've always had a knack for math, I've only studied about two months of mathematics at the introductory university level, not including discrete mathematics. I do have a thorough background in software development.<br /><br />Set theory has eluded me. I've only picked up fragments. It's seemed very fundamental, but school never gave me a good opportunity to learn it. I've wanted to understand it, which made it a joy to add Naive Set Theory to the top of my reading list.</p>
<h3>How I read Naive Set Theory</h3>
<p>Starting on Naive Set Theory, I quickly realized I wanted more meat to the explanations. What is this concept used for? How does it fit into the larger subject of mathematics? What the heck is the author expressing here?</p>
<p>I supplemented heavily with Wikipedia, math.stackexchange and other websites. Sometimes, I read other sources even before reading the chapter in the book. At two points, I laid the book down in order to finish two other books. The first was <em>Gödel's Proof</em>, which handed me some friendly examples of propositional logic. I had started reading it on the side when I realized it was contextually useful. The second was <em>Concepts of Modern Mathematics</em>, which gave me much of the larger mathematical context that Naive Set Theory didn't.<br /><br />Consequently, while reading Naive Set Theory, I spent at least as much time reading other sources!<br /><br />A bit into the book, I started struggling with the exercises. It simply felt like I hadn't been given all the tools to attempt the task. So I concluded I needed a better introduction to mathematical proofs, ordered some books on the subject, and postponed investing in the exercises in Naive Set Theory until I had gotten that introduction.</p>
<h3>Chapters</h3>
<p>In general, if the book doesn't offer you enough explanation on a subject, search the Internet. Wikipedia has numerous competent articles, math.stackexchange is overflowing with content, and there are plenty of additional sources available on the net. If you get stuck, do try playing around with examples of sets on paper or in a text file. That's universal advice for math.<br /><br />I'll follow with some key points and some highlights of things that tripped me up while reading the book.</p>
<h4>Axiom of extension</h4>
<p>The axiom of extension tells us how to distinguish between sets: Sets are the same if they contain the same elements. Different if they do not.</p>
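Coming from software, I find it helpful to see this mirrored in Python, whose built-in sets happen to behave extensionally (my own aside, not from Halmos):

```python
# Two sets are equal exactly when they contain the same elements;
# how they were written down, and duplicate mentions, are irrelevant.
a = {1, 2, 3}
b = {3, 2, 1, 1}
print(a == b)  # True
```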
<h4>Axiom of specification</h4>
<p>The axiom of specification allows you to create subsets by using conditions. This is pretty much what is done every time <a title="set builder notation" href="https://en.wikipedia.org/wiki/Set-builder_notation">set builder notation</a> is employed.<br /><br />Puzzled by the bit about Russell's paradox at the end of the chapter? http://math.stackexchange.com/questions/651637/russells-paradox-in-naive-set-theory-by-paul-halmos</p>
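A Python aside of mine: set-builder notation maps directly onto a set comprehension, and it's worth noticing that specification only carves subsets out of a set you already have -- which is exactly what blocks Russell's paradox.

```python
# {x in E : x is even} as a set comprehension over an existing set E.
E = set(range(10))
evens = {x for x in E if x % 2 == 0}
print(evens == {0, 2, 4, 6, 8})  # True
```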
<h4>Unordered pairs</h4>
<p>The axiom of pairs allows one to create a new set that contains the two original sets.</p>
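In Python terms (my own illustration): given two sets, form the set whose elements are those two sets. `frozenset` is needed because only hashable, immutable sets can be elements of another set.

```python
# Pairing: from sets A and B, form {A, B}.
A = frozenset({1, 2})
B = frozenset({2, 3})
pair = {A, B}
print(len(pair))  # 2
```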
<h4>Unions and intersections</h4>
<p>The axiom of unions allows one to create a new set that contains all the members of the original sets.</p>
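Python's set operators give a quick finite picture of unions and intersections (again, my own aside):

```python
a = {1, 2, 3}
b = {3, 4}
print(a | b == {1, 2, 3, 4})  # True -- union
print(a & b == {3})           # True -- intersection
```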
<h4>Complements and powers</h4>
<p>The axiom of powers allows one to, out of one set, create a set containing all the different possible subsets of the original set.<br /><br />Getting tripped up by the "for some" and "for every" notation used by Halmos? Welcome to the club:<br />http://math.stackexchange.com/questions/887363/axiom-of-unions-and-its-use-of-the-existential-quantifier<br />http://math.stackexchange.com/questions/1368073/order-of-evaluation-in-conditions-in-set-theory<br /><br />Using natural language rather than logical notation is common practice in mathematical textbooks. You'd better get used to it:<br />http://math.stackexchange.com/questions/1368531/why-there-is-no-sign-of-logic-symbols-in-mathematical-texts<br /><br />The <a href="https://en.wikipedia.org/wiki/Existential_quantification">existential quantifiers</a> tripped me up a bit before I absorbed them. In math, you can freely express something like "Out of all possible x <em>ever</em>, give me the set of x that fulfill this condition". In programming languages, you tend to have to be much more... specific, in your statements.</p>
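For the finite case, the power set is easy to compute explicitly; here's a sketch of mine using the standard library:

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of s, returned as frozensets."""
    elems = list(s)
    subsets = chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1))
    return {frozenset(c) for c in subsets}

print(len(powerset({1, 2, 3})))  # 8, i.e. 2**3
```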
<h4>Ordered pairs</h4>
<p>Cartesian products are used to represent plenty of mathematical concepts, notably coordinate systems.</p>
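The coordinate-system picture is easy to make concrete in Python (my aside; `itertools.product` is the standard-library cartesian product):

```python
from itertools import product

# The cartesian product X x Y: all ordered pairs (x, y).
# A 2-by-3 grid of coordinates has 6 points.
grid = set(product({0, 1}, {0, 1, 2}))
print(len(grid))       # 6
print((1, 2) in grid)  # True
```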
<h4>Relations</h4>
<p><a href="https://en.wikipedia.org/wiki/Equivalence_relation">Equivalence relations</a> and <a href="https://en.wikipedia.org/wiki/Equivalence_class">equivalence classes</a> are important concepts in mathematics.</p>
<h4>Functions</h4>
<p>Halmos is using some dated terminology and is in my eyes a bit inconsistent here. In modern usage, we have: <a href="https://en.wikipedia.org/wiki/Bijection,_injection_and_surjection">injective, surjective, bijective</a> and functions that are none of these. Bijective is the combination of being both injective and surjective. Replace Halmos' "onto" with surjective, "one-to-one" with injective, and "one-to-one correspondence" with bijective.</p>
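To keep the modern terms straight, here's a small sketch of mine for finite functions represented as Python dicts:

```python
def is_injective(f):
    """Distinct arguments map to distinct values."""
    values = list(f.values())
    return len(values) == len(set(values))

def is_surjective(f, codomain):
    """Every element of the codomain is hit."""
    return set(f.values()) == set(codomain)

def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)

f = {1: 'a', 2: 'b', 3: 'c'}
print(is_bijective(f, {'a', 'b', 'c'}))  # True
print(is_injective({1: 'a', 2: 'a'}))    # False
```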
<p>He also confused me with his explanation of "characteristic function" - you might want to check <a href="https://en.wikipedia.org/wiki/Indicator_function">another source</a> there.</p>
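The idea itself is simple, though -- here's my one-liner version of it:

```python
def characteristic(A):
    """The characteristic (indicator) function of A:
    1 on members of A, 0 everywhere else."""
    return lambda x: 1 if x in A else 0

chi = characteristic({2, 4, 6})
print(chi(4), chi(5))  # 1 0
```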
<h4>Families</h4>
<p>This chapter tripped me up heavily because Halmos mixes in three things at the same time on page 36: 1. A confusing way of talking about sets. 2. A convoluted proof. 3. The n-ary cartesian product.<br /><br />Families are an alternative way of talking about sets. An indexed family is a set, with an index and a function in the background. A family of sets means a collection of sets, with an index and a function in the background. In Halmos' build-up to n-ary cartesian products, the deal seems to be that he teases out order without explicitly using ordered pairs. Golf clap. Try this one for the math.se treatment: http://math.stackexchange.com/questions/312098/cartesian-products-and-families</p>
<h4>Inverses and composites</h4>
<p>The inverses Halmos defines here are more general than the inverse functions described on wikipedia. Halmos' inverses work even when the functions are not bijective.</p>
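In other words, Halmos' inverse takes a set B to the set of everything that maps into B -- what's nowadays usually called the preimage. A sketch of mine for the finite case:

```python
def preimage(f, B, domain):
    """Halmos-style inverse f^-1(B): every x whose image lands in B.
    Well-defined for any function, bijective or not."""
    return {x for x in domain if f(x) in B}

square = lambda x: x * x  # not injective on this domain
print(preimage(square, {1, 4}, range(-3, 4)) == {-2, -1, 1, 2})  # True
```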
<h4>Numbers</h4>
<p>The axiom of infinity states that there is a set of the natural numbers.</p>
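The von Neumann construction the book builds on -- 0 is the empty set, and each successor is the set of everything before it -- can be played with directly (my own illustration):

```python
# Von Neumann naturals: 0 = {}, and n + 1 = n union {n}.
def successor(n):
    return n | {n}

zero = frozenset()
one = successor(zero)    # {0}
two = successor(one)     # {0, 1}
three = successor(two)   # {0, 1, 2}
print(len(three))        # 3 -- each natural has itself-many elements
print(two in three and zero in two)  # True
```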
<h4>The Peano axioms</h4>
<p>The Peano axioms can be modeled on the set-theoretic axioms. The recursion theorem guarantees that recursive functions exist.</p>
<h4>Arithmetic</h4>
<p>The principle of mathematical induction is put to heavy use in order to define arithmetic.</p>
<h4>Order</h4>
<p><a href="https://en.wikipedia.org/wiki/Order_theory">Partial orders, total orders, well orders</a> -- are powerful mathematical concepts and are used extensively.<br /><br />Some help on the way:<br />http://math.stackexchange.com/questions/1047409/sole-minimal-element-why-not-also-the-minimum<br />http://math.stackexchange.com/questions/367583/example-of-partial-order-thats-not-a-total-order-and-why<br />http://math.stackexchange.com/questions/225808/is-my-understanding-of-antisymmetric-and-symmetric-relations-correct<br />http://math.stackexchange.com/questions/160451/difference-between-supremum-and-maximum<br /><br />Also, keep in mind that infinite sets like subsets of w can muck up expectations about order. For example, a totally ordered set can have multiple elements without a predecessor.</p>
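One small concrete picture (mine, not the book's): the subset relation is the classic example of a partial order that isn't total, and Python's set comparison operators implement exactly it.

```python
# Subset ordering is partial, not total: some pairs are incomparable.
a, b = {1, 2}, {2, 3}
print(a <= b or b <= a)  # False -- neither is a subset of the other
print({1} <= a)          # True  -- these two are comparable
```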
<h4>Axiom of choice</h4>
<p>The axiom of choice lets you, from any collection of non-empty sets, select an element from every set in the collection. The axiom is necessary for making these kinds of "choices" with infinite sets. In finite cases, one can construct choice functions using the other axioms alone. That said, the axiom of choice often makes the job easier even in finite cases, so it is sometimes used where it isn't strictly necessary.</p>
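The finite case really can be written down explicitly, which is the point -- here's a small sketch of mine:

```python
# For a *finite* collection of non-empty sets, a choice function
# can be constructed outright -- no axiom of choice required.
def choice(collection):
    return {frozenset(s): next(iter(s)) for s in collection}

sets = [{1, 2}, {3}, {4, 5}]
f = choice(sets)
print(all(f[frozenset(s)] in s for s in sets))  # True
```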
<h4>Zorn's lemma</h4>
<p>Zorn's lemma is used in similar ways to the axiom of choice -- making infinitely many choices at once -- which perhaps is not very strange, considering ZL and AC have been proven to be equivalent.</p>
<p>robot-dreams <a href="/lw/ir6/book_review_na%C3%AFve_set_theory_miri_course_list/b9uv">offers some help</a> in following the massive proof in the book.</p>
<h4>Well ordering</h4>
<p>A well-ordered set is a totally ordered set with the extra condition that every non-empty subset of it has a smallest element. This extra condition is useful when working with infinite sets.<br /><br />The principle of transfinite induction means that if the presence of all strict predecessors of an element always implies the presence of the element itself, then the set must contain everything. Why does this matter? It means you can make conclusions about infinite sets beyond w, where mathematical induction isn't sufficient.</p>
<h4>Transfinite recursion</h4>
<p>Transfinite recursion is an analogue to the ordinary recursion theorem, in a similar way that transfinite induction is an analogue to mathematical induction - recursive functions for infinite sets beyond w.<br /><br />In modern lingo, what Halmos calls a "similarity" is an "order isomorphism".</p>
<h4>Ordinal numbers</h4>
<p>The axiom of substitution is called the axiom (schema) of replacement in modern use. It's used for extending counting beyond w.</p>
<h4>Sets of ordinal numbers</h4>
<p>The counting theorem states that each well ordered set is order isomorphic to a unique ordinal number.</p>
<h4>Ordinal arithmetic</h4>
<p>The misbehavior of commutativity in arithmetic with ordinals tells us a natural fact about ordinals: if you tack on an element in the beginning, the result will be order isomorphic to what it is without that element. If you tack on an element at the end, the set now has a last element and is thus not order isomorphic to what you started with.</p>
<h4>The Schröder-Bernstein theorem</h4>
<p>The Schröder-Bernstein theorem states that if X dominates Y, and Y dominates X, then X ~ Y (X and Y are equivalent).</p>
<h4>Countable sets</h4>
<p>Cantor's theorem states that every set always has a smaller cardinal number than the cardinal number of its power set.</p>
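For finite sets the strict inequality is just 2**n &gt; n, which we can check mechanically (my aside; Cantor's diagonal argument is what carries it into the infinite):

```python
from itertools import chain, combinations

def powerset_size(s):
    elems = list(s)
    return sum(1 for _ in chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1)))

# |P(S)| = 2**|S| > |S| for every finite set.
for n in range(5):
    assert powerset_size(set(range(n))) == 2 ** n > n
print("Cantor's inequality checked for n = 0..4")
```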
<h4>Cardinal arithmetic</h4>
<p>Read this chapter after Cardinal numbers.<br /><br />Cardinal arithmetic is an arithmetic where just about all the standard operators do nothing (beyond the finite cases).</p>
<h4>Cardinal numbers</h4>
<p>Read this chapter before Cardinal arithmetic.<br /><br />The continuum hypothesis asserts that there is no cardinal number between that of the natural numbers and that of the reals. The generalized continuum hypothesis asserts that, for all cardinal numbers including aleph-0 and beyond aleph-0, the next cardinal number in the sequence is the power set of the previous one.</p>
<h3>Concluding reflections</h3>
<p>I am at the same time humbled by the subject and empowered by what I've learned in this episode. Mathematics is a truly vast and deep field. To build a solid foundation in proofs, I will now go through one or two books about mathematical proofs. I may return to Naive Set Theory after that. If anyone is interested, I could post my impressions of other mathematical books I read.</p>
<p>I think Naive Set Theory wasn't the optimal book for me at the stage I was. And I think Naive Set Theory probably should be replaced by another introductory book on set theory in the MIRI research guide. But that's a small complaint on an excellent document.</p>
<p>If you seek to get into a new field, know the prerequisites. Build your knowledge in solid steps. Which, I guess, sometimes requires that you do test your limits to find out where you really are.<br /><br />The next book I start on from the research guide is bound to be Computability and Logic.</p>david_kristofferssonFvA2qL6ChCbyi5Axk2015-08-14T22:08:37.028ZComment by David_Kristoffersson on Welcome to Less Wrong! (7th thread, December 2014)
https://lw2.issarice.com/posts/eqaro7sMe5xw2kJWc/welcome-to-less-wrong-7th-thread-december-2014?commentId=yi5zfZruoauaP439t
<p>Hello.</p>
<p>I'm currently attempting to read through the MIRI research guide in order to contribute to one of the open problems. Starting from Basics. I'm emulating many of <a href="http://lesswrong.com/user/So8res/">Nate</a>'s techniques. I'll post reviews of material in the research guide at lesswrong as I work through it.</p>
<p>I'm mostly posting here now just to note this. I can be terse at times.</p>
<p>See you there.</p>
david_kristofferssonyi5zfZruoauaP439t2015-07-16T22:14:54.816ZComment by David_Kristoffersson on Dark Arts of Rationality
https://lw2.issarice.com/posts/4DBBQkEQvNEWafkek/dark-arts-of-rationality?commentId=Q8Tnj7sQTi887kF5H
<p>First, appreciation: I love that calculated modification of self. These, and similar techniques, can be very useful if put to use in the right way. I recognize myself here and there. You did well to abstract it all out this clearly.</p>
<p>Second, a note: You've described your techniques from the perspective of how they deviate from epistemic rationality - "Changing your Terminal Goals", "Intentional Compartmentalization", "Willful inconsistency".
I would've been more inclined to describe them from the perspective of their central effect, e.g. something to the style of: "Subgoal ascension", "Channeling", "Embodying".
Perhaps not as marketable to the lesswrong crowd. Multiple perspectives could be used as well.</p>
<p>Third, a question: How did you create that gut feeling of urgency?</p>
david_kristofferssonQ8Tnj7sQTi887kF5H2015-07-11T19:26:08.177ZComment by David_Kristoffersson on MIRI's technical research agenda
https://lw2.issarice.com/posts/d3gMZmSSAHXaGisyJ/miri-s-technical-research-agenda?commentId=hZG27WiXWDieAz4cc
<blockquote>
<p>And boxing, by the way, means giving the AI zero power.</p>
</blockquote>
<p>No, hairyfigment's answer was entirely appropriate. Zero power would mean zero effect. Any kind of interaction with the universe means some level of power. Perhaps in the future you should say <em>nearly zero</em> power instead, so as to avoid misunderstanding on the part of others, as taking you literally on the "zero" is apparently "legalistic".</p>
<p>As to the issues with <em>nearly zero</em> power:</p>
<ul>
<li>A superintelligence with <em>nearly zero</em> power could turn to be a heck of a lot more power than you expect.</li>
<li>The incentives to tap more perceived utility by unboxing the AI or building other unboxed AIs will be huge.</li>
</ul>
<p>Mind, I'm not arguing that there is anything wrong with boxing. What I'm arguing is that it's wrong to rely only on boxing. I recommend you read some more material on <a href="http://wiki.lesswrong.com/wiki/AI_boxing">AI boxing</a> and <a href="http://wiki.lesswrong.com/wiki/Oracle_AI">Oracle AI</a>. Don't miss out on the references.</p>
david_kristofferssonhZG27WiXWDieAz4cc2015-01-27T19:42:01.922ZComment by David_Kristoffersson on MIRI's technical research agenda
https://lw2.issarice.com/posts/d3gMZmSSAHXaGisyJ/miri-s-technical-research-agenda?commentId=2EnPrQ7XhTqkfAFLM
<p>So you disagree with the premise of the orthogonality thesis. Then you know a central concept to probe to understand the arguments put forth here. For example, check out Stuart's Armstrong's paper: <a href="http://lesswrong.com/lw/cej/general_purpose_intelligence_arguing_the/">General purpose intelligence: arguing the Orthogonality thesis</a></p>
david_kristoffersson2EnPrQ7XhTqkfAFLM2015-01-27T18:49:43.523ZComment by David_Kristoffersson on MIRI's technical research agenda
https://lw2.issarice.com/posts/d3gMZmSSAHXaGisyJ/miri-s-technical-research-agenda?commentId=oLM82iJMWTr8P9F2G
<p>There's no guarantee that boxing will ensure the safety of a soft takeoff. When your boxed AI starts to become drastically smarter than a human -- 10 times -- 1000 times -- 1000000 times -- the sheer enormity of the mind may slip beyond human ability to understand. All the while, a seemingly small dissonance between the AI's goals and human values -- or a small misunderstanding on our part of what goals we've imbued -- could magnify to catastrophe as the power differential between humanity and the AI explodes post-transition.</p>
<p>If an AI goes through the intelligence explosion, its goals will be what orchestrates <em>all</em> resources (as Omohundro's point 6 implies). If the goals of this AI do not align with human values, all we value will be lost.</p>
david_kristofferssonoLM82iJMWTr8P9F2G2015-01-23T19:12:35.342ZComment by David_Kristoffersson on MIRI's technical research agenda
https://lw2.issarice.com/posts/d3gMZmSSAHXaGisyJ/miri-s-technical-research-agenda?commentId=wAa7FZ9kS82CvvRKM
<p>Mark: So you think human-level intelligence <em>by principle</em> does not combine with goal stability. Aren't you simply disagreeing with the <a href="http://wiki.lesswrong.com/wiki/Orthogonality_thesis">orthogonality thesis</a>, "that an artificial intelligence can have any combination of intelligence level and goal"?</p>
david_kristofferssonwAa7FZ9kS82CvvRKM2015-01-23T17:38:29.786ZComment by David_Kristoffersson on Facing the Intelligence Explosion discussion page
https://lw2.issarice.com/posts/LEESyXYFuW7R3Q9G5/facing-the-intelligence-explosion-discussion-page?commentId=FEKckt6KKYDiTKbA7
<p><a href="http://intelligenceexplosion.com/en/2012/ai-the-problem-with-solutions/">http://intelligenceexplosion.com/en/2012/ai-the-problem-with-solutions/</a> links to <a href="http://lukeprog.com/SaveTheWorld.html">http://lukeprog.com/SaveTheWorld.html</a> - which redirects to <a href="http://lukemuehlhauser.comsavetheworld.html/">http://lukemuehlhauser.comsavetheworld.html/</a> - which isn't there anymore.</p>
david_kristofferssonFEKckt6KKYDiTKbA72014-08-10T20:31:14.099Z