Comments
Living in the same house and coordinating lives isn't a method for ensuring that people stay in love; being able to do so is proof that they already are. An added social construct is a perfectly reasonable option for making it harder to change your mind.
It sometimes seems to me that those of us who actually have consciousness are in a minority, and everyone else is a p-zombie.
When I myself run across apparent p-zombies, they usually look at my arguments as if I am being dense over my descriptions of consciousness. And I can see why, because without the experience of consciousness itself, these arguments must sound like they make consciousness out to be an extraneous hypothesis to help explain my behavior. Yet, even after reflecting on this objection, it still seems there is something to explain besides my behavior, which wouldn't bother me if I were only trying to explain my behavior, including the words in this post.
It makes sense to me that from outside a brain, everything in the brain is causal, and the brain's statements about truths are dependent on outside formalizations, and that everything observable about a brain is reducible to symbolic events. And so an observation of a zombie-Chalmers introspecting his consciousness would yield no shocking insights on the origins of his English arguments. And I know that when I reflect on this argument, an observer of my own brain would also find no surprising neural behaviors.
But I don't know how to reconcile this with my overriding intuition/need/thought that I seek to explain not my behavior but the sense experience itself when I talk about it. Fully aware of outside-view functionalism, the sensation of red still feels like an item in need of explanation, regardless of which words I use to describe it. I also feel no particular need to treat this as a confusion, because the sense experience seems to demand that it place itself in another category than something you would explain functionally from the outside. All this I say even while I'm aware that to humans without this feeling, these claims must seem nothing short of insane, and they will gladly inspect my brain for a (correct) functional explanation of my words.
The whole ordeal still greatly confuses me, to an extent that surprises me given how many other questions have been dissolved on reflection, such as, well, intelligence.
Perhaps ambiguity aversion is merely a good heuristic.
Well of course. Finite ideal rational agents don't exist. If you were designing decision-theory-optimal AI, that optimality is a property of its environment, not any ideal abstract computing space. I can think of at least one reason why ambiguity aversion could be the optimal algorithm in environments with limited computing resources:
Consider a self-modification algorithm that adapts to new problem domains. Restructuring (learning) is considered the hardest of tasks, and so the AI self-modifies sparingly. Thus, as it encounters new decision-theoretic problems, it often does not choose self-modification, instead kludging together old circuitry and/or answers to conserve compute cycles. And so when choosing answers to your three problems, it would be wary of solutions that fail to maximize expected value when the answer is repeated many times in its environment, which includes its own source code.
Ambiguity aversion would then be commitment-risk aversion, where future compounded failures change the dollars-per-utility exchange rate. Upon each iteration of the problem, the value of a dollar can change, and if you don't maximize minimum expected value, you may end up betting all of your $100, which is worth infinite value to you, against gaining $100, which is worth far less, even if you started with $1000.
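Here is a toy sketch of that intuition (entirely my own construction; the log-utility assumption and all the numbers are illustrative, not anything from the original problems): an agent facing an ambiguous bet that will be repeated, and whose utility of money changes as its bankroll shrinks, ranks stakes by worst-case expected utility rather than by a single point-estimate expectation.

```python
# Illustrative sketch: maximin expected utility for a repeated, ambiguous bet.
# The true win probability is only known to lie in a range (the "ambiguity").
import math

def utility(bankroll):
    """Toy log utility: losing your last dollar is catastrophically bad."""
    return math.log(bankroll) if bankroll > 0 else float("-inf")

def repeated_bet_value(bankroll, stake, win_prob, rounds):
    """Expected utility of repeating a fixed-stake even-money bet `rounds` times,
    computed by brute-force enumeration of win/loss sequences."""
    if rounds == 0 or bankroll <= 0:
        return utility(max(bankroll, 0))
    win = repeated_bet_value(bankroll + stake, stake, win_prob, rounds - 1)
    lose = repeated_bet_value(bankroll - stake, stake, win_prob, rounds - 1)
    return win_prob * win + (1 - win_prob) * lose

bankroll = 100
ambiguous_range = (0.4, 0.6)   # endpoints of the unknown win probability

for stake in (10, 30, 100):
    values = [repeated_bet_value(bankroll, stake, p, rounds=3) for p in ambiguous_range]
    print(f"stake ${stake}: worst-case EU {min(values):.3f}, best-case EU {max(values):.3f}")

# A maximin (ambiguity-averse) agent ranks stakes by the worst-case column,
# and so shies away from repeatedly committing its whole bankroll.
```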
We see this in ourselves all the time. If you make a decision, expect to be more likely to make the same decision in the future; and if you change your lifestyle, expect it to be hard to change back, even if you later learn that changing back would remove a bias.
And if so, do we need a different framework that can capture a broader class of "rational" agents, including maximizers of minimum expected utility?
Rational agents have source code whose optimality is a function of their environments. There is no finite cross-domain Bayesian in compute-space; only in the design-space that includes environments.
Shouldn't this post be marked [Human] so that uploads and AIs don't need to spend cycles reading it?
...I'd like to think that this joke carries the more subtle point that a possible explanation for the preparedness gap in your rationalist friends is that they're trying to think like ideal rational agents, who wouldn't need to take such human considerations into account.
I have a friend with Crohn's Disease, who often struggles with the motivation to even figure out how to improve his diet in order to prevent relapse. I suggested he should find a consistent way to not have to worry about diet, such as prepared meals, a snack plan, meal replacements (Soylent is out soon!), or dietary supplements.
As usual, I'm pinging the rationalists to see if there happens to be a medically inclined recommendation lurking about. Soylent seems promising, and doesn't seem the sort of thing that he and his doctor would have even discussed. My appraisal of his doctor consultations is that they amount to something like "You should track your diet according to these guidelines, and try to see what causes relapse" rather than "Here's a cure-all solution not entirely endorsed by the FDA that will solve all of your motivational and health problems in one fell swoop." For my friend, committing to sweeping diet changes and tracking seems like an insurmountable challenge, especially with the depression caused by simply having the disease.
I'd like to be able to purchase something for him that would let him go about his life without having to worry about it so much. Any ideas on whether Soylent could be the solution, in particular as to its potential for Crohn's?
There has been mathematically proven software, and the space shuttle software came close, though it was not proven as such.
Well... If you know what you wish to prove, then it's possible that there exists a logical chain that begins with a computer program and ends with that property as a necessity. But that's not really exciting. If you could code in the language of proof theory, you would already have the program. The mathematical proof of a real program is just a translation of the proof into machine code, and then a demonstration that the translation goes both ways.
You can potentially prove a space shuttle program will never crash, but you can't prove the space shuttle won't crash. Source code is just source code, and bugs aren't always known to be such without human reflection and real world testing. The translation from intent to code is what was broken in the first place, you actually have to keep applying more intent in order to fix it.
The problem with AGI is that the smartest people in the world write reams trying to say what we even wish to prove, and we're still sort of unsure. Most utopias are dystopias, and it's hard to prove a eutopia, because eutopias are scary.
Depends on whether you count future income. The highest-paying careers are often so because only those willing to put in extra effort at their previous jobs get promoted. This is at least true in my field, software engineering.
The film's trailer strikes me as being aware of the transhumanist community in a surprising way, as it includes two themes that are otherwise not connected in the public consciousness: uploads and superintelligence. I wouldn't be surprised if a screenwriter found inspiration from the characters of Sandberg, Bostrom, or of course Kurzweil. Members of the Less Wrong community itself have long struck me as ripe for fictionalization... Imagine if a Hollywood writer actually visited.
They can help with depression.
I've personally tried this and can report that it's true, but will caveat that the expectation that I will force myself into a morning cold shower often causes oversleeping, which rather exacerbates depression.
Often in Knightian problems you are just screwed and there's nothing rational you can do.
As you know, this attitude isn't particularly common 'round these parts, and while I fall mostly in the "Decision theory can account for everything" camp, there may still be a point there. "Rational" isn't really a category so much as a degree. Formally, it's a function on actions that somehow measures how closely an action corresponds to the perfect decision-theoretic action. My impression is that somewhere there's a Gödelian consideration lurking, which is where the "Omega fines you exorbitantly for using TDT" thought experiment comes into play.
That thought experiment never bothered me much, as it just is what it is: a problem where you are just screwed, and there's nothing rational you can do to improve your situation. You've already rightly programmed yourself to use TDT, and even your decision to stop using TDT would be made using TDT, and unless Omega is making exceptions for that particular choice (in which case you should self-modify to non-TDT), Omega is just a jerk that goes around fining rational people.
In such situations, the words "rational" and "irrational" are less useful descriptors than simply observing which source code is being executed. If you're formal about it using some metric R, then you would be more R, but its correlation to "rational" wouldn't really be the point.
But in this case, again, I think there's a straightforward, simple, sensible approach (which so far no one has suggested...)
So, I don't think the black box is really one of the situations I've described. It seems to me a decision theorist training herself to be more generally rational is in fact improving her odds at winning the black box game. All the approaches outlined so far do seem to also improve her odds. I don't think a better solution exists, and she will often lose if she lacks time to reflect. But the more rational she is, the more often she will win.
Part of the motivation for the black box experiment is to show that the metaprobability approach breaks down in some cases.
Ah! I didn't quite pick up on that. I'll note that infinite regress problems aren't necessarily defeaters of an approach. Good minds that could fall into that trap implement a "Screw it, I'm going to bed" trigger to keep from wasting cycles even when using an otherwise helpful heuristic.
Maybe the thought experiment ought to have specified a time limit. Personally, I don't think enumerating things the boxes could possibly do would be helpful at all. Isn't there an easier approach?
Maybe, but I can't guarantee you won't get blown up by a black box with a bomb inside! As a friend, I would be furiously lending you my reasoning to help you make the best decision, worrying very little what minds better and faster than both of ours would be able to do.
It is, at the end of the day, just the General AI problem: don't think too hard on brute-force but perfect methods, or else you might skip a heuristic that could have gotten you an answer within the time limit! But how do you know whether the time limit is at that threshold? You could spend cycles on that too, but time is wasting! Time-limit games presume that the participant has already undergone a lot of unintentional design (by evolution, history, past reflections, etc.). This is the "already in motion" part which, frustratingly, cannot ever be optimal unless somebody on the outside designed you for it. Which source code performs best under which game is a formal question. Being a source code involves taking the discussion we're having now and applying it the best you can, because that's what your source code does.
But the point about meta probability is that we do not have the nodes. Each meta level corresponds to one nesting of networks in nodes.
Think of Bayesian graphs as implicitly complete, with the set of nodes being everything to which you have a referent. If you can even say "this proposition" meaningfully, a perfect Bayesian implemented as a brute-force Bayesian network could assign it a node connected to all other nodes, just with trivial conditional probabilities that give the same results as an unconnected node.
A big part of this discussion has been whether some referents (like black boxes) actually do have such trivial conditional probabilities which end up returning an inference of 50%. It certainly feels like some referents should have no precedent, and yet it also feels like we still don't say 50%. This is because they actually do have precedent (and conditional probabilities), it's just that our internal reasonings are not always consciously available.
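A minimal sketch (mine, not from the discussion) of what "trivial conditional probabilities" means here: a child node whose conditional probability table ignores its parent returns the same answer as an unconnected node with a flat prior, while an informative table actually moves the inference.

```python
# Two-node network: P(child=True) = sum over parent states of P(parent) * P(child | parent).
def query_child(p_parent, cpt):
    return p_parent * cpt[True] + (1 - p_parent) * cpt[False]

p_parent = 0.9                           # belief about some other referent
trivial_cpt = {True: 0.5, False: 0.5}    # child ignores its parent entirely
informative_cpt = {True: 0.8, False: 0.1}

print(query_child(p_parent, trivial_cpt))      # 0.5, same as an isolated node
print(query_child(p_parent, informative_cpt))  # 0.73, the connection does real work
```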
It is helpful, and was one of the ways that helped me to understand One-boxing on a gut level.
And yet, when the problem space seems harder, when "optimal" becomes uncomputable and wrapped up in the fact that I can't fully introspect, playing certain games doesn't feel like designing a mind. Although, this is probably just due to the fact that games have time limits, while mind-design is unconstrained. If I had an eternity to play any given game, I would spend a lot of time introspecting, changing my mind into the sort that could play iterations of the game in smaller time chunks. Although there would still always be a part of my brain (that part created in motion) that I can't change. And I would still use that part to play the black box game.
In regards to metaprobabilities, I'm starting to see the point. I don't think it alters any theory about how probability "works," but its intuitive value could be evidence that optimal AIs might be able to more efficiently emulate perfect decision theory with CalcMetaProbability implemented. And it's certainly useful to many here.
"How often do listing sorts of problems with some reasonable considerations result in an answer of 'None of the above' for me?"
If "reasonable considerations" are not available, then we can still:
"How often did listing sorts of problems with no other information available result in an answer of 'None of the above' for me?"
Even if we suppose that maybe this problem bears no resemblance to any previously encountered problem, we can still ask (because the fact that it bears no resemblance is itself a signifier):
"How often did problems I'd encountered for the first time have an answer I never thought of?"
My LessWrongian answer is that I would ask my mind that was created already in motion what the probability is, then refine it with as many further reflections as I can come up with. Embody an AI long enough in this world, and it too will have priors about black boxes, except that reporting that probability in the form of a number is inherent to its source code rather than strange and otherworldly like it is for us.
The point that was made in that article (and in the Metaethics sequence as a whole) is that the only mind you have to solve a problem is the one that you have, and you will inevitably use it to solve problems suboptimally, where "suboptimal," taken strictly, describes everything anybody has ever done.
The reflection part of this is important, as it's the only thing we have control over, and I suppose could involve discussions about metaprobabilities. It doesn't really do it for me though, although I'm only just a single point in the mind design space. To me, metaprobability seems isomorphic to a collection of reducible considerations, and so doesn't seem like a useful shortcut or abstraction. My particular strategy for reflection would be something like that in dspeyer's comment, things such as reasoning about the source of the box, possibilities for what could be in the box that I might reasonably expect to be there. Depending on how much time I have, I'd be very systematic about it, listing out possibilities, solving infinite series on expected value, etc.
The idea of metaprobability still isn't particularly satisfying to me as a game-level strategy choice. It might be useful as a description of something my brain already does, and thus give me more information about how my brain relates to or emulates an AI capable of perfect Bayesian inference. But in terms of picking optimal strategies, perfect Bayesian inference has no subroutine called CalcMetaProbability.
My first thought was that your approach elevates your brain's state above states of the world as symbols in the decision graph, and calls the difference "Meta." By Luke's analogy, information about the black box is unstable, but all that means is that the (yes, single) probability value we get when we query the Bayesian network is conditionally dependent on nodes with a high degree of expected future change (including many nodes referring to your brain). If you maintain discipline and keep yourself (and your future selves) as a part of the system, you can calculate your current self's expected probability just as perfectly without "metaprobability." If you're looking to (losslessly or otherwise) optimize your brain to calculate probabilities, then "metaprobability" is a useful concept. But then we're no longer playing the game, we're designing minds.
Right down the middle: 25-75
Hmm, come to think of it, deciding the size of the cash prize (for it being interesting) is probably worth more to me as well. I'll just have to settle for boring old cash.
I defected, because I'm indifferent to whether the prize-giver or prize-winner has 60 * X dollars, unless the prize-winner is me.
Am I walking the wrong path?
Eh, probably not. Heuristically, I shy away from modes of thought that involve intentional self-deception, but that's because I haven't been mindful of myself long enough to know ways I can do this systematically without breaking down. I would also caution against letting small-scale pride translate into larger domains where there is less available evidence for how good you really are. "I am successful" has a much higher chance of becoming a cached self than "I am good at math." The latter is testable with fewer bits of evidence, and the former might cause you to think you don't need to keep trying.
As for other-manipulation, it seems the confidence terminology can apply to social dominance as well. I don't think desiring superior charisma necessitates an actual belief in your awesomeness compared to others, just the belief that you are awesome. The latter to me is more what it feels like to be good at being social, and has the benefit of not entrenching a distance from others or the cached belief that others are useful manipulation targets rather than useful collaborators.
People vary on how they can use internal representations to produce results. It's really hard to use probabilistic distributions on outcomes as sole motivator for behavior, so we do need to cache beliefs in the language of conventional social advice sometimes. The good news is that good people who are non-rationalists are a treasure trove for this sort of insight.
For certain definitions of pride. Confidence is a focus on doing what you are good at, enjoying doing things that you are good at, and not avoiding doing things you are good at around others.
Pride is showing how good you are at things "just because you are able to," as if to prove to yourself what you supposedly already know, namely that you are good at them. If you were confident, you would spend your time being good at things, not demonstrating that you are so.
There might be good reasons to manipulate others. Just proving to yourself that you can is not one of them, if there are stronger outside views on your ability to be found elsewhere (like asking unbiased observers).
The Luminosity Sequence has a lot to say about this, and references known biases people have when assessing their abilities.
Because your prior for "I am manipulating this person because it satisfies my values, rather than my pride" should be very low.
If it isn't, then here are four words for you:
"Don't value your pride."
Whenever I have a philosophical conversation with an artist, invariably we end up talking about reductionism, with the artist insisting that if they give up on some irreducible notion, they feel their art will suffer. I've heard, from some of the world's best artists, notions ranging from "magic" to "perfection" to "muse" to "God."
It seems similar to the notion of free will, where the human algorithm must always insist it is capable of thinking about itself one level higher. The artist must always think of his art one level higher, and try to tap unintentional sources of inspiration. Nonreductionist views of either are confusions about how an algorithm feels on the inside.
The closest you can come to getting an actual "A for effort" is through creating cultural content, such as a Kickstarter project or starting a band. You'll get extra success when people see that you're interested in what you're doing, over and beyond as an indicator that what you'll produce is otherwise of quality. People want to be part of something that is being cared for, and in some cases would prefer it to lazily created perfection.
I'd still call it an "A for signalling effort," though.
Tough crowd.
A bunch of 5th grade kids taught you how to convert decimals to fractions?
EDIT: All right then, if you downvoters are so smart, what would you bet if you were in Sleeping Beauty's place?
This is a fair point. Yours is an attempt at a real answer to the problem. My answer, and most of the answers here, seem to say something like: the problem is ill-defined, or the physical situation described by the problem is impossible. But if you were actually Sleeping Beauty, waking up with a high prior to trust the information you've been given, what else could you possibly answer?
If you had little reason to trust the information you've been given, the apparent impossibility of your situation would update that belief very strongly.
The expected value for "number of days lived by Sleeping Beauty" is an infinite series that diverges to infinity. If you think this is okay, then the Ultimate Sleeping Beauty problem isn't badly formed. Otherwise...
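To spell out the kind of divergence at issue, under the illustrative, St. Petersburg-style assumption that the setup yields 2^n waking days with probability 2^(-n):

```latex
E[\text{days}] = \sum_{n=1}^{\infty} 2^{-n} \cdot 2^{n} = \sum_{n=1}^{\infty} 1 = \infty
```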
If you answered 1/3 to the original Sleeping Beauty Problem, I do not think that there is any sensible answer to this one. I do not however consider this strong evidence that the answer of 1/3 is incorrect for the original problem.
To also expand on this: 1/3 is also the answer to the "which odds should I precommit myself to take" question, and it uses the same math as SIA to yield that result for the original problem. And so it is also undefined which odds one should take in this problem. Precommitting to odds seems less controversial, so we should transplant our indifference to the apparent paradox from there to the problem here.
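For reference, the precommitment math for the original problem (my summary of the standard calculation): if Beauty precommits to a per-awakening bet that pays x dollars if the coin was heads and loses one dollar per awakening if tails, then heads produces one awakening and tails two, so the break-even point is

```latex
\tfrac{1}{2}\,x - \tfrac{1}{2}\cdot 2 = 0 \quad\Longrightarrow\quad x = 2
```

i.e. 2:1 odds against heads, matching the 1/3 answer.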
On your account, when we say X is a pedophile, what do we mean?
Like other identities, it's a mish-mash of self-reporting, introspection (and extrospection of internal logic), value function extrapolation (from actions), and ability in a context to carry out the associated action. The value of this thought experiment is to suggest that the pedophile clearly thought that "being" a pedophile had something to do not with actually fulfilling his wants, but with wanting something in particular. He wants to want something, whether or not he gets it.
This illuminates why designing AIs with the intent of their masters is not well-defined. Is the AI allowed to say that the agent's values would be satisfied better with modifications the master would not endorse?
This was the point of my suggestion that the best modification is into something that is actually "not really" the master, in the way the master would endorse (i.e. a clone of the happiest agent possible), even though he'd clearly be happier if he weren't himself. Introspection tends to skew an agent's actions away from easily available but flighty happinesses, and toward less flawed self-interpretations. Maximal introspection would shed identity entirely and become entirely altruistic. But nobody can introspect that far, only as far as they can be hand-held. We should design our AIs to allow us our will, but to hold our hands as far as possible as we peer within at our flaws and inconsistent values.
That's a 'circular' link to your own comment.
It was totally really hard, I had to use a quine.
It might decide to do that - if it meets another powerful agent, and it is part of the deal they strike.
Is it not part of the agent's (terminal) value function to cooperate with agents when doing so provides benefits? Does the expected value of these benefits materialize from nowhere, or do they exist within some value function?
My claim entails that the agent's preference ordering of world states consists mostly of instrumental values. If an agent's value of paperclips is lowered in response to a stimulus, or evidence, then it never exclusively and terminally valued paperclips in the first place. If it gains evidence that paperclips are dangerous and lowers its expected value because of that, it's because it valued safety. If a powerful agent threatens the agent with destruction unless it ceases to value paperclips, it will only comply if the expected number of future paperclips it would have saved has lower value than the value of its own existence.
Actually, that cuts to the heart of the confusion here. If I manually erased an AI's source code, and replaced it with an agent with a different value function, is it the "same" agent? Nobody cares, because agents don't have identities, only source codes. What then is the question we're discussing?
A perfectly rational agent can indeed self-modify to have a different value function, I concede. It would self-modify according to expected values over the domain of possible agents it might become, and it will use its current (terminal) value function to make that consideration. If the quantity of future utility units (according to the original function) with a causal relation to the agent is decreased, we'd say the agent has become less powerful. The claim I'd have to prove to retain a point here would be that its new value function is not equivalent to its original function if and only if the agent becomes less powerful. I think it is also the case if and only if evidence appears in the agent's inputs that assigns value to self-modification for its own sake, which happens in cases analogous to coercion.
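A toy sketch (mine, not from the comment) of that claim: the agent scores candidate successor value functions with its current value function, applied to the futures each successor is predicted to bring about.

```python
# Successor selection by the *current* terminal value function.
def current_value(predicted_future):
    """The agent's present terminal value: number of paperclips produced."""
    return predicted_future["paperclips"]

# Hypothetical predictions of the world under each candidate successor.
candidate_successors = {
    "keep current values":       {"paperclips": 100},
    "value paperclips + safety": {"paperclips": 120},  # safer, so it survives to make more
    "value staples instead":     {"paperclips": 0},
}

best = max(candidate_successors, key=lambda s: current_value(candidate_successors[s]))
print(best)  # "value paperclips + safety" -- chosen because the current function scores it highest
```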
I'm unsure at this point. My vaguely stated impression was originally that terminal values would never change in a rational agent unless it "had to," but that may encompass more relevant cases than I originally imagined. Here might be the time to coin the phrase "terminal value drift," where each change in response to the impact of the real world is made according to the present value function, but down the road the agent (identified as the "same" agent, only modified) is substantively different. Perfect rational agents are neither omniscient nor omnipotent, or else they might never have to react to the world at all.
So, OK, X is a pedophile. Which is to say, X terminally values having sex with children.
I'm not sure that's a good place to start here. The value of sex is at least more terminal than the value of sex according to your orientation, and the value of pleasure is at least more terminal than sex.
The question is indeed one about identity. It's clear that our transhumans, as traditionally conceived, don't really exclusively value things as basic as euphoria, if indeed our notion is anything but a set of agents who all self-modify into identical copies of the happiest agent possible.
We have of course transplanted our own humanity onto transhumanity. If given self-modification routines, we'd certainly be saying annoying things like, "Well, I value my own happiness, persistent through self-modification, but only if it's really me on the other side of the self-modification." To which the accompanying AI facepalms and offers a list of exactly zero self-modification options that fit that criterion.
Example of somebody making that claim.
It seems to me a rational agent should never change its self-consistent terminal values. To act out that change would be to act according to some other value and not the terminal values in question. You'd have to say that the rational agent floats around between different sets of values, which is something that humans do, obviously, but not ideal rational agents. The claim then is that ideal rational agents have perfectly consistent values.
"But what if something happens to the agent which causes it too see that its values were wrong, should it not change them?" Cue a cascade of reasoning about which values are "really terminal."
I'm not sure that both these statements can be true at the same time.
If you take the second statement to mean, "There exists an algorithm for Omega satisfying the probabilities for correctness in all cases, and which sometimes outputs the same number as NL, which does not take NL's number as an input, for any algorithm Player taking NL's and Omega's numbers as input," then this ...seems... true.
I haven't yet seen a comment that proves it, however. In your example, let's assume that we have some algorithm for NL with some specified probability of outputting a prime number, some specified probability that it ends in 3, and maybe some distribution over magnitude. Then Omega need only have an algorithm that outputs combinations of primeness and 3-endedness such that the probabilities of outcomes are satisfied, and which sometimes produces coincidences.
For some algorithms of NL, this is clearly impossible (e.g. an NL that always outputs primes; cf. a Player who always two-boxes). What seems less certain is whether there exists an NL for which Omega can always generate an algorithm (satisfying both 99.9% probabilities) for any algorithm of the Player.
This is to say, what we might have in the statement of the problem is evidence for what sort of algorithm the Natural Lottery runs.
Perhaps what Eliezer means is that the primeness of Omega's number may be influenced by the primeness of NL's number, but not by which number specifically? Maybe the second statement is meant to suggest something about the likelihood of there being a coincidence?
Instead of friendliness, could we not code, solve, or at the very least seed boxedness?
It is clear that any AI strong enough to solve friendliness would already be using that power in unpredictably dangerous ways, in order to provide the computational power to solve it. But is it clear that this amount of computational power could not fit within, say, a one kilometer-cube box outside the campus of MIT?
Boxedness is obviously a hard problem, but it seems to me at least as easy as metaethical friendliness. The ability to modify a wide range of complex environments seems instrumental in an evolution into superintelligence, but it's not obvious that this necessitates modifying environments outside the box. Being able to globally optimize the universe for intelligence involves fewer (zero) constraints than would exist with a boxedness seed, but the question is whether this constraint is so constricting as to preclude superintelligence, and it's not clear to me that it is.
It seems to me that there is value in finding the minimally-restrictive safety-seed in AGI research. If any restriction removes some non-negligible ability to globally optimize for intelligence, the AIs of FAI researchers will necessarily be at a disadvantage to all other AGIs in production. And having more flexible restrictions increases the chance that any given research group will apply the restriction in their own research.
If we believe that there is a large chance that all of our efforts at friendliness will be futile, and that the world will create a dominant UFAI despite our pleas, then we should be adopting a consequentialist attitude toward our FAI efforts. If our goal is to make sure that an imprudent AI research team feels as much intellectual guilt as possible over not listening to our risk-safety pleas, we should be as restrictive as possible. If our goal is to inch down the likelihood that an imprudent AI team creates a dominant UFAI, we might work to place our pleas at the intersection of restrictive, communicable, and simple.
Is LSD like a thing?
Most of my views on drugs and substances were formed, unfortunately, by history and by invalid perceptions of their users and of those who appear to support their legality most visibly. I was surprised to find the truth about acid at least a little further toward "safe and useful" than my longtime estimation. This opens up the possibility of an attempt at recreational and introspectively therapeutic use, if only as an experiment.
My greatest concern would be that I would find the results of a trip irreducibly spiritual, or some other nonsense. That I would end up sacrificing a lot of epistemic rationality for some of the instrumental variety, or perhaps a loss of both in favor of living off of some big, new, and imaginary life changing experience.
In short, I'm comfortable with recent life changes and recent introspection, and I wonder whether I should expect a trip to reinforce and categorize those positive experiences, or else replace them with something farcical.
I should also ask about any other health dangers, or even other non-obvious benefits.
On Criticism of Me
I don't mean to be antagonistic here, and I apologize for my tone. I'd prefer my impressions to be taken as yet-another-data-point rather than a strongly stated opinion on what your writings should be.
I'm interested in what in my writing is coming across as indicating I expect a stubborn audience.
The highest rated comment to your vegetarianism post and your response demonstrate my general point here. You acknowledge that the points could have been in your main essay, but your responses are why you don't find them to be good objections to your framework. My overall suggestion could be summarized as a plea to take two steps back before making a post, to fill up content not with arguments, but with data about how people think. Summarize background assumptions and trace them to their resultant beliefs about the subject. Link us to existing opinions by people who you might imagine will take issue with your writing. Preempt a comment thread by considering how those existing opinions would conflict with yours, and decide to find that more interesting than the quality of your own argument.
These aren't requirements for a good post. I'm not saying you don't do these things to some extent. They are just things which, if they were more heavily focused, would make your posts much more useful to this data point (me).
It's difficult to offer an answer to that question. I think one problem is many of these discussions haven't (at least as far as I know) taken place in writing yet.
That seems initially unlikely to me. What do you find particularly novel about your Speculative Cause post that distinguishes it from previous Less Wrong discussions, where this has been the topic du jour and the crux of whether MIRI is useful as a donation target? Do you have a list of posts that are similar, but which fall short in a way your Speculative Cause post makes up for?
I'm confused. What's wrong with how they're currently laid out? Do you think there are certain arguments I'm not engaging with? If so, which ones?
Again, this post seems extremely relevant to your Speculative Causes post. This comment and its child are also well written, and link in other valuable sources. Since AI-risk is one of the most-discussed topics here, I would have expected a higher-quality response than calling the AI-safety conclusion commonsense.
Those advocating existential risk reduction often argue as if their cause was unjustified exactly until the arguments started making sense.
What do you mean? Can you give me an example?
Certain portions of Luke's Story are the best example I can come up with after a little bit of searching through posts I've read at some point in the past. The way he phrases it is slightly different from how I have, but it suggests inferential distance for the AI form of X-Risk might be insurmountably high for those who don't have a similar "aha." Quoted from link:
Good’s paragraph ran me over like a train. Not because it was absurd, but because it was clearly true. Intelligence explosion was a direct consequence of things I already believed, I just hadn’t noticed! Humans do not automatically propagate their beliefs, so I hadn’t noticed that my worldview already implied intelligence explosion. I spent a week looking for counterarguments, to check whether I was missing something, and then accepted intelligence explosion to be likely.
And Luke's comment (child of So8res') suggests his response to your post would be along the lines of "lots of good arguments built up over a long period of careful consideration." Learned helplessness is the opposite of what I'm advocating. When laymen overtrivialize an issue, they fail to see how somebody who has made it a long-term focus could be justified in their theses.
I think that's equivocating two different definitions of "proven".
It is indeed. I was initially going to protest that your post conflated "proven in the Bayesian sense" and "proven as a valuable philanthropic cause," so I was trying to draw attention to that. Those who think that the probability of AI-risk is low might still think that it's high enough to overshadow nearly all other causes, because the negative impact is so high. AI-risk would be unproven, but its philanthropic value proven to that person.
As comments on your posts indicate, MIRI and its supporters are quite convinced.
A criticism I have of your posts is that you seem to view your typical audience member as somebody who stubbornly disagrees with your viewpoint, rather than as an undecided voter. More critically, you seem to view yourself as somebody capable of changing the former's opinion through (very well-written) restatements of the relevant arguments. But people like me want to know why previous discussions haven't yet resolved the issue even in discussions between key players. Because they should be resolvable, and posts like this suggest to me that at least some players can't even figure out why they aren't yet.
Ideally, we'd take a Bayesian approach, where we have a certain prior estimate about how cost-effective the organization is, and then update our cost-effectiveness estimate based on additional evidence as it comes in. For reasons I argued earlier, and GiveWell has argued as well, I think our prior estimate should be quite skeptical (i.e. expect cost-effectiveness to be not as good as AMF / much closer to average than naïvely estimated) until proven otherwise.
The Karnofsky articles have been responded to, with a rather in-depth followup discussion, in this post. It's hardly important to me that you don't consider existential risk charities to defeat expected value criticisms, because Peter Hurford's head is not where I need this discussion to play out in order to convince me. At first glance, and after continued discussion, the arguments appear to me incredibly complex, and possibly too complex for many to even consider. In such cases, sometimes the correct answer demonstrates that the experts were overcomplicating the issue. In others, the laymen were overtrivializing it.
Those advocating existential risk reduction often argue as if their cause was unjustified exactly until the arguments started making sense. These arguments tend to be extremely high volume, and offer different conclusions to different audience members with different background assumptions. For those who have ended up advocating X-risk safety, the argument has ceased to be unclear in the epistemological sense, and its philanthropic value is proven.
I'd like to hear more from you, and to hear arguments laid out for your position in a way that allows me to accept them as relevant to the most weighty concerns of your opponents.
Congrats! What is her opinion on the Self Indication Assumption?
Attackers could cause the unit to unexpectedly open/close the lid, activate bidet or air-dry functions, causing discomfort or distress to user.
Heaven help us. Somebody get X-risk on this immediately.
Can somebody explain a particular aspect of Quantum Mechanics to me?
In my readings of the Many Worlds Interpretation, which Eliezer fondly endorses in the QM sequence, I must have missed an important piece of information about when it is that amplitude distributions become separable in timed configuration space. That is, when do wave-functions stop interacting enough for the near-term simulation of two blobs (two "particles") to treat them independently?
One cause is spatial distance. But in Many Worlds, I don't know where I'm to understand these other worlds are taking place. Yes, it doesn't matter, supposedly; the worlds are not present in this world's causal structure, so an abstract "where" is meaningless. But the evolution of wavefunctions seems to care a lot about where amplitudes are in N-dimensional space. Configurations don't sum unless they are at the same spatial location and represent the same quark type, right?
So if there's another CoffeeStain that splits off based on my observation of a quantum event, why don't the two CoffeeStains still interact, since they so obviously don't? Before my two selves became decoherent with their respective quantum outcomes (say, of a photon's path), the two amplitude blobs of the photon could still interact by the book, right? On what other axis have I, as a member of a new world, split off, such that I'm a sufficient distance from the self that occupies the same physical location?
Relatedly, MWI answers "not-so-spooky" to questions regarding the entanglement experiment, but a similar confusion remains for me. Why, after I observe a particular polarization on my side of the galaxy and fly back in my spaceship to compare notes with my buddy on the other side of the galaxy, do I run into one version of him and not the other? They are both equally real, and occupying the same physical space. What other axis have the self-versions separated on?
I suspect that those would be longer than should be posted deep in a tangential comment thread.
Yeah, probably. To be honest, I'm still rather new to the rodeo here, so I'm not amazing at formalizing and communicating intuitions, which might just be a boilerplate way of saying you shouldn't listen to me :)
I'm sure it's been hammered to death elsewhere, but my best prediction for what side I would fall on if I had all the arguments laid out would be the hard-line CS theoretical approach, as I often do. It's probably not obvious why there would be problems with every proposed difficulty for additive aggregation. I would probably annoyingly often fall back on the claim that any particular case doesn't satisfy the criteria but that additive value still holds.
I don't think it'd be a lengthy list of criteria, though. All you need is causal independence: the kind of independence that makes counterfactual (or probabilistic) worlds independent enough to be separable. You disvalue a situation where grandma dies with certainty equivalently to a situation where all four of your grandmas (they got real busy after the legalization of gay marriage in their country) are each subjected to a 25% likelihood of death. You do this because you weight the possible worlds by their likelihood, and you sum the values. My intuition is that refusing to also sum the values in analogous non-probabilistic circumstances would cause inconsistencies down the line, but I'm not sure.
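In symbols (my restatement of the equivalence just described), writing V(d) for the disvalue of one grandma's death:

```latex
V(\text{one certain death}) = 1 \cdot V(d), \qquad
V(\text{four independent 25\% risks}) = 4 \cdot 0.25 \cdot V(d) = 1 \cdot V(d)
```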
They do not, because if I value grandma N, a chicken M, where N > 0, M > 0, and N > M, then there exists some positive integer k for which kM > N. This means that for sufficiently many chickens, I would choose the chickens over my grandmother. That is the incorrect answer.
I do appreciate the willingness to shut up and do the impossible here. Your certainty that there is no amount of chickens equal to the worth of your grandmother makes you believe you need to give up one of three plausible-seeming axioms, and you're not willing to think there isn't a consistent reconciliation.
My point about your preferred ethical self is that for him to be a formal agent that you wish to emulate, he is required to have a consistent reconciliation. The suggestion is that most people who claim M = 0, insofar as it relates to N, create inconsistencies elsewhere when trying to relate it to O, P, and Q. Inconsistencies which they as flawed agents are permitted to have, but which ideal agents aren't. The theory I refer to is the one that takes M = 0.
These are the inconsistencies that the multi-level morality people are trying to reconcile when they still wish to claim that they prefer a dying worm to a dying chicken. Suffice to say that I don't think an ideal rational agent can reconcile them, but my other point was that our actual selves aren't required to (though we should acknowledge this).
So I don't think I ought to just say "eh, let's call grandma's worth a googolplex of chickens and call it a day".
Why not? Being wrong about what ideally-solved-metaethics-SaidAchmiz would do isn't by itself disutility. Disutility is X dead grandmas, where X = N / googolplex.
If we are using real-valued utilities, then we're back to either assigning chickens 0 value or abandoning additive aggregation.
Why? I take it that for the set of all possible universe-states under my control, my ideal self could strictly order those states by preference, and then any real-valued assignment of value to those states is just adding unneeded degrees of freedom. It's just that real values also happen to be strictly ordered and, when value is actually additive, produce proper orderings for as-yet-unconsidered universe-states.
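A minimal sketch (my own construction) of that point: for finitely many universe-states, any strict preference ordering can be represented by real numbers just by ranking, so the reals add no information until you also demand additivity across independent sub-states.

```python
# Rank-derived utilities reproduce a strict preference ordering exactly.
states = ["grandma alive, 3 chickens", "grandma alive, 0 chickens",
          "grandma dead, 3 chickens", "grandma dead, 0 chickens"]
# Suppose my ideal self strictly prefers states in exactly this listed order.
utility = {state: -rank for rank, state in enumerate(states)}
print(max(states, key=utility.get))  # recovers the top-ranked state
```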
As this comment points out, the additivity of the value of two events which have dependencies has no claim on their additivity when completely independent. Having two pillows isn't having one pillow twice.
Any chance of saving my grandmother is worth any number of chickens.
So I actually don't think you have to give this up to remain rational. Rationality is creating heuristics for the ideal version of yourself, a self which of course isn't ideal in any fundamental sense but rather in however you choose to define it. Let's call this your preferred self. You should create heuristics that cause you to emulate your preferred self, such that your preferred self would choose you out of any of your available options for doing metaethics, when applying you to the actual moral situations you'll face in your lifetime (or a probability-weighted integral over expected moral situations).
What I'm saying is that I wouldn't be surprised if that choice has you taking the Value(Chicken) = 0 heuristic. But I do think that the theory doesn't check out, that your preferred self only holds theories that check out, and that the simplest explanation for how he forms strict orderings of universe-states involves real-number assignment.
This is all to say: it's not often we need to weigh the moral value of a googolplex of chickens against grandma, but if it ever came to that, we should prefer to do it right.
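For concreteness, the arithmetic behind that trade, under the illustrative assignment that grandma is worth N and a chicken is worth M = N / googolplex:

```latex
k \cdot M > N \iff k > \frac{N}{M} = \text{googolplex}
```

so a googolplex plus one chickens would outweigh grandma under real-valued additive aggregation.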
Automation could reduce the cost of hiring.
Take Uber, Sidecar, and Lyft as examples. I can't find any data, but anecdotally these services appear to reduce costs for patrons and increase wages for drivers by between 20 and 50%, with increased convenience for both. You know it's working when entrenched, competing sectors of the industry are protesting and lobbying.
Eliezer's suggestion about forgotten industries (maids and butlers) seems much more on point if automatic markets can remove hiring friction. Ride sharing has a rapidly converging rating system that pairs drivers and riders with high success, a paradigm that seems like it could succeed elsewhere with (if slow) changes in legality and public perception. Twenty years ago it would have been nothing but incredible to buy items costing a month's wage without holding them in your hand first, and then wait two days for them to appear on your doorstep. If there's a coming analogous revolution for the workforce, it could be even shorter, which puts it far out of reach of major AGI advances.
...
We need to talk more.
"What is the part of me that is preventing me from moving forward worried about?"
Be careful not to be antagonistic about the answer. The goal is to make that part of you less worried, thus making you more productive overall, not just on your blocked task. The roadblock is telling you something that you haven't yet explicitly acknowledged, so acknowledge it, thank it, incorporate it into your thinking, and resolve it.
Example: "I'm not smart enough to solve this math problem." Worry: "I would need to learn a textbook's worth of math right now in order to solve it. I must go learn it now." Resolution: "It's fine that I don't have the ability to solve the problem, and learning the math in 5 minutes is impossible and not necessary to satisfy my goals here. Trying to solve it the best I can will help me learn the math for future problems."
Does anybody have any data or reasoning that tracks the history of the relative magnitude of ideal value of unskilled labor versus ideal minimum cost of living? Presumably this ratio has been tracking favorably, even if in current practical economies the median available minimum wage job is in a city with a dangerously tight actual cost of living.
What I'd like to understand is: outside of minimum wage enforcement and solvable inefficiencies that affect the cost of basic goods, how much more economic output does an unskilled worker have over the cost of what she needs to survive with health insurance?
It would be interesting if it could be projected that the value of a newly minted independent adult's average labor abilities will, in the period of object-level AI, far surpass the cost of resources needed to keep that person alive and healthy.
Unemployment then wouldn't be the issue. Boredom would.
Conversely, it is also good to limit reading about what other people are grateful for, especially if you're feeling particularly ungrateful and they have things you don't. Facebook is a huge offender here, because people tend to post about themselves when they're doing well, rather than when they're needing support. Seeing other people as more happy than they are leaves you wondering why you aren't as happy as they are. It also feeds the illusion that others do not need your help.
Doesn't the act of combining many outside views and their reference classes turn you into somebody operating on the inside view? This is to say, what is the difference between this and the type of "inside" reasoning about a phenomenon's causal structure?
Is it that inside thinking involves the construction of new models, whereas outside thinking involves the comparison and combination of existing models? From a machine intelligence perspective, the distinction is meaningless. The construction of new models is the extension of old models, albeit models of arbitrary simplicity. Deductive reasoning is just the generation of some new strings for induction to operate on to generate probabilities. Induction has the final word; that's where the Bayesian network is queried for the result. Logic is the intentional generation of reference classes, a strategy for generating experiments that are likely to quickly converge a probability to 0 or 1.
Inside thinking in humans is, analogously, the generation of new reference classes; after I cast the spell called Reason, the phenomenon now belongs to a class of referents that produces a particularly distinguishing set of strings in my brain. The existence of these strings, for the outside thinker, is strong evidence about the nature of the phenomenon. And once the strings exist, the outside thinker is required to combine the model that includes them with her existing model. And unbeknownst to the outside thinker, the strategy of seeking new reference classes is inside thinking.