Comments
Thanks. In case this additional detail helps anybody: my brother says he'd take valacyclovir twice a day, and it seemed like his rash (and possibly his cognitive symptoms, though that was hard to tell) would get worse a couple of hours after each dose, then subside.
While I (a year late) tentatively agree with you (though a million years of suffering is a hard thing to swallow compared to the instinctually almost mundane matter of death), I think there's an assumption in your argument that bears inspection. Namely, I believe you are maximizing happiness at a given instant in time - the present, or the limit as time approaches infinity, etc. (Or perhaps you are predicating the calculations on the possibility of escaping the heat death of the universe and being truly immortal for eternity.)

A possible alternative optimization goal: maximize human happiness summed over time. See, I was thinking the other day, and it seems possible we may never evade the heat death of the universe. In that case, if you only value the final state, nothing we do matters, whether we suffer or go extinct tomorrow. At the very least, that metric is not helpful, because it cannot distinguish between any two histories, so a different metric must be chosen. A reasonable substitute, it seems to me, is to effectively take the integral of human happiness over time - sum it all up (there's a sketch of the contrast below). The happy week you had last week is not canceled out by a mildly depressing day today, for instance - it still counts. Conversely, suffering for a long time may not be automatically balanced out the moment the suffering stops (though I'll grant this goes a little against my instincts).

If you DO assume infinite time, your argument may return to being automatically true. I'm not sure that's an assumption that should be confidently made, though. If you don't assume infinite time, I think it matters again what precise value you put on death vs. incredible suffering, and that may simply be a matter of opinion - of precise differences in two people's terminal goals.
(Side note: I've idly speculated about extending the above optimization criterion to the case of all possible universes - I forget the exact train of thought, but it ended up more or less amounting to optimizing the probability-weighted ratio of good outcomes to bad outcomes, summed across time, I think. It needs more thought to become rigorous, etc.)
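For concreteness, here's roughly how I'd write the contrast between the two metrics; H(t) and T are my own notation, not anything from the original post:

```latex
% H(t): aggregate human happiness at time t
% T:    the end of everything (heat death), assumed finite here

% Terminal-state metric: only the end state counts. If extinction at T is
% certain, every possible history scores the same, so this metric cannot
% distinguish between them.
U_{\text{final}} = \lim_{t \to T^{-}} H(t)

% Time-summed metric: last week's happiness still counts even if today is
% mildly depressing, and long suffering is not erased the moment it stops.
U_{\text{sum}} = \int_{0}^{T} H(t)\, dt
```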
"Complex" doesn't imply "hard to emulate". We likely won't need to understand the encoded systems, just the behavior of the neurons. In high school I wrote a simple simulator of charged particles - the rules I needed to encode were simple, but it displayed behavior I hadn't programmed in, nor expected, but which were, in fact, real phenomena that really happen.
I agree with you almost perfectly. I'd been working on a (very long-shot) plan for it, myself, but having recently realized that other people may be working on it, too, I've started looking for them. Do you (or anyone reading this) know of anybody seriously willing to do, or already engaged in, this avenue of work? Specifically, working towards WBE, with the explicit intent of averting unaligned AGI disaster.
It bears mention that, compared to the median predicted unaligned AGI, I'd hands-down accept Hitler as supreme overlord. It seems probable that humans would still exist under Hitler, and in a fairly recognizable form, even if there were many troubling things about their existence. Furthermore, I suspect that an average human would be better than Hitler, and I'm fairly optimistic that most individuals striving to prevent the AGI apocalypse would make for downright pleasant overseers (or whatever).
I'm not convinced "want to modify their utility functions" is the most useful perspective. I think it might be more helpful to say that we each have multiple utility functions, which conflict to varying degrees and have voting power in different areas of the mind. I've had first-hand experience with such conflicts (as essentially everyone probably has, knowingly or not), and it feels like fighting yourself.

A hypothetical example: "Do I eat that extra donut?" Part of you wants the donut; that part feels more like an instinct, a visceral urge. Part of you knows you'll be ill afterwards and will feel guilty about cheating on your diet; that part feels more like "you" - it's the part that thinks in words. You stand there and struggle, trying to make yourself walk away, as your hand reaches out for the donut. I've been in similar situations where (though I balked at the possible philosophical ramifications) I felt like if I had a button to make me stop wanting the thing, I'd push it - yet often it was the other function that won.

I feel like if you gave an agent the ability to modify its utility functions, the one that would win depends on which one has access to the mechanism (do you merely think the thought? push a button?), and on whether it understands what the mechanism means. (The word "donut" doesn't evoke nearly as strong a reaction as a picture of a donut, for instance; your donut-craving subsystem doesn't inherently understand the word.) There's a toy sketch of this below.
Contrarily, one might argue that cravings for donuts are more hardwired instinct than part of the "mind", and so don't count... but I feel like 1. finding a true dividing line is gonna be real hard, and 2. even that aside, I expect many/most people have goals localized in the same part of the mind that nevertheless are not internally consistent, and in some cases there may be reasonable-sounding goals that turn out to be completely incompatible with more important goals. In such a case I could imagine an agent deciding it's better to stop wanting the thing it can't have.
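To make the donut example concrete, here's a toy sketch - entirely my own construction, with illustrative numbers, not a claim about real minds: two "utility functions" score the same situation differently, and which one wins depends on which control channel it has voting power over.

```python
# Toy sketch of the donut conflict (illustrative numbers only): two subsystems
# score the same options differently, and the outcome on each "channel"
# depends on which subsystem has voting power there.

# How each subsystem values the immediate options.
CRAVING = {"eat_donut": 10.0, "walk_away": 0.0}   # the visceral urge
VERBAL  = {"eat_donut": -5.0, "walk_away": 8.0}   # the part that thinks in words

def choose(values):
    """Pick the option the controlling subsystem scores highest."""
    return max(values, key=values.get)

# Channel 1: the hand. Suppose the craving has more voting power here.
print(choose(CRAVING))   # -> 'eat_donut'

# Channel 2: a hypothetical "stop wanting donuts" button, described in words.
# Only the verbal subsystem understands what pressing it would do, so only its
# valuation applies on this channel.
VERBAL_ON_BUTTON = {"press_button": 7.0, "leave_it": 0.0}
print(choose(VERBAL_ON_BUTTON))  # -> 'press_button'

# Same agent, two different "winners": which utility function gets modified
# depends on which one controls the modification mechanism and understands it.
```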
I feel like the concept of "neural address" is incompletely described, and the specifics may matter. For example, a specific point in the skull is, yeah, a bad way to address a specific concept across individuals. However, there might be particular matching structures that tend to form around certain ideas, and searching for those structures might be a better way of addressing a particular concept. (Probably still not good, but it hints that there may be better ways of formulating a neural address that maybe WOULD be sufficiently descriptive. I don't know any particularly good methods off the top of my head, though, and your point may turn out correct.)
"When you are finished reading this, you will see Bayesian problems in your dreams."
Whaddaya know; he was right.
Also, yes the other version (on Arbital) is better, with more information - though this one has a point or two that aren't in the other version, like the discussion of degrees of freedom.
Ah, thanks; that looks pretty relevant. I'll try to read it in the next day or so.
Yeah, and it would also cut out close contact with a number of people. It's actually looking pretty likely that the second of the mentioned alternatives will happen (assuming I go at all) - it's possible I'll stay at home with an immediate family member rather than go to Utah. This reduces my close contacts from ~20 people (in ~4 families) down to about 3 people (1 family), and should significantly reduce the odds of my catching it. I started writing a huge response, but moved it into an update.
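For a rough sense of scale, a back-of-the-envelope treating each close contact as an independent chance p of transmission (an oversimplification, and p = 0.05 below is purely an illustrative number, not an estimate):

```latex
P(\text{infected} \mid n \text{ contacts}) = 1 - (1 - p)^{n}
% With p = 0.05:  n = 20 \;\Rightarrow\; \approx 64\%, \qquad n = 3 \;\Rightarrow\; \approx 14\%
```

On those toy numbers the change cuts the risk by a factor of four or so, which is the intuition behind "significantly reduce the odds".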
My own spin on where the article goes wrong: I think some forms of procrastination and laziness are valuable. Sweeping every day will only make the floor so clean. Some tasks truly DO go away if you ignore them long enough.
...But overall, I do firmly agree with the intent of your article.
It's surprisingly hard to consciously notice a bad thing as being a solvable problem, I agree. Once in a while I'll notice that I've been inconvenienced by X for a long time, but am only now becoming actively aware of it as a discrete entity. It's a skill I think would be particularly worth learning/improving.
Similarly, with opportunities - this has happened a couple times in my life, but the first I remember was that beforehand, I would hear about people volunteering at animal shelters, and I'd think "that sounds nice/fun"...end of thought. One day, all of a sudden, something clicked, and the thought occurred, "There's nothing fundamentally special about those people. That thing is a Thing I Could Do. There's nothing actually stopping me from doing it." And there wasn't; I did a Google and made a few calls, and in a few weeks I was cleaning kestrel poop out of a concrete enclosure XP. (These days I hand-feed baby squirrels/possums/rabbits, mostly.) But I remember how much of a revelation it was: those people on the other side of the screen are by-and-large the same kind of people I am - if I want to do what they do I can. If I wanted to drop everything and build thatch houses in a third world country, I could. (I just don't really want to, haha.)
It's still a hard skill to learn - I doubt I do it nearly as often as I could, or even perhaps should.
This is my objection to the conclusion of the post: yes, you're unlikely to be able to patch all the leaks, but the more leaks you patch, the less likely it is that a bad solution occurs. The Device was described as "things happen, and time is reset until a solution occurs". This favors probable things over improbable things, since probable things will more likely happen before improbable things. If you add caveats - mother safe, whole, uninjured, mentally sound, low velocity - at some point the "right" solutions become significantly more probable than the "wrong" ones (there's a toy calculation of this below). As for the stated "bad" solutions - how probable is a nuclear bomb going off, or aliens abducting her, compared to firefighters showing up?
I don't even think the timing of the request matters, since the device isn't actively working to bring the events to fruition - meaning any outcome where the device resets will have always been prohibited, from the beginning of time. Which means the firefighters may have left the building five minutes ago, having seen some smoke against the skyline. Etc. ...Or, perhaps more realistically, the device was never discovered in the first place, considering the probabilistic weight it would have to bear over a lifetime of use, compared to the probability of its simply never being found.
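Here's a toy version of the leak-patching argument above (the outcomes and probabilities are made up purely for illustration): the Device effectively samples from the prior over outcomes conditioned on every caveat you stated, so each added caveat excludes the probable-but-nasty outcomes, while the one-in-a-billion ones stay negligible even when technically allowed.

```python
# Toy model of the leak-patching argument (all numbers invented for
# illustration). The Device keeps resetting time until an outcome satisfies
# every stated condition, so the outcome you actually get is drawn from the
# prior over outcomes conditioned on your caveats.

PRIOR = {
    # outcome: (prior probability, properties that outcome satisfies)
    "firefighters arrive":       (2e-2, {"out", "uninjured", "sane", "slow"}),
    "thrown clear by explosion": (1e-3, {"out"}),
    "knocked out and dragged":   (5e-3, {"out", "slow"}),
    "alien abduction":           (1e-9, {"out", "uninjured", "sane", "slow"}),
}

def conditioned(prior, required):
    """Renormalize the prior over outcomes satisfying every required property."""
    kept = {name: p for name, (p, props) in prior.items() if required <= props}
    total = sum(kept.values())
    return {name: p / total for name, p in kept.items()}

# One bare condition ("she ends up out of the building"): the nasty-but-
# plausible outcomes keep a noticeable share of the probability.
print(conditioned(PRIOR, {"out"}))

# Add the caveats (safe, whole, uninjured, mentally sound, low velocity) and
# the nasty outcomes are excluded outright, while the alien abduction, though
# still allowed, stays at its negligible one-in-a-billion weight.
print(conditioned(PRIOR, {"out", "uninjured", "sane", "slow"}))
```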