Comments
I would actually give concrete evidence in favor of what I think you're calling "Philosophy," although of course there's not much dramatic evidence we'd expect to see if the theory is wholly true.
Here, however, is a YouTube video that should really be called "Why AI May Already Be Killing You." It uses hard data to show that an algorithm accidentally created real-world money and power, that it did so by working just as intended in the narrowest sense, and that its creators opposed the logical result so much that they actively tried to stop it. (Of course, this last point had to do with their own immediate profits, not the long-term effects.)
I'd be mildly surprised, but not shocked, to find that this creation of real-world power has already unbalanced US politics, in a way which could still destroy our civilization.
What do you think of this observation, which Leah McElrath recently promoted a second time? Here are some other tweets she made on January 21 and 26, 2020:
https://twitter.com/leahmcelrath/status/1219693585731391489
https://twitter.com/leahmcelrath/status/1221316758281293825
Bonus link: https://twitter.com/gwensnyderPHL/status/1479166811220414464
Yes. There's a reason why I would specifically tell young people not to refrain from action because they fear other students' reactions, but I emphatically wouldn't tell them to ignore fear or go against it in general.
Really! I just encountered this feature, and have been more reluctant to agree than to upvote. Admittedly, the topic has mostly concerned conversations which I didn't hear.
Not sure what you just said, but according to the aforementioned physics teacher, people have absolutely brought beer money, recruited a bunch of guys, and had them move giant rocks around in a manner consistent with the non-crazy theory of pyramid construction. (I guess the brand of beer used might count as "modern technology," and perhaps the quarry tools, but I doubt the rest of it did.) You don't, in fact, need to build a full pyramid to refute crackpot claims.
I just summarized part of it, and was downvoted for doing so. Have you tried to correct that and encourage me numerically?
Ahem. You cannot argue neurotypicals into being neurodivergent, period. If they cared about logic and fairness, they'd already be allistic ND, almost by definition.
Ahem. As with F=ma (I think) it's not so much wrong, or useless, as asking the wrong question on a different level.
https://threadreaderapp.com/thread/1368966115804598275.html
When Scott pointed out (for the first time, AFAICT) that Vassar is a dangerous person in the orbit of this community, it was the second report about such a person in as many weeks. This doesn't look to me like a pair of isolated incidents. It looks like part of a pattern.
Therefore, while your technique sounds like a good one in isolation, I suspect you're encouraging your overwhelmingly neurodivergent audience to double down on their inherent flaws, making these worse. (Another clue here is that actions which could be 'justified' under very different models could be dishonest, or could be plain old expected value maximization.) The suggestion to think of an experiment that distinguishes between hypotheses - indeed, to do anything at all - is a good addition.
I should note that, as an outsider, the main point I recall Eliezer making in that vein is that he used Michael Vassar as a model for the character who was called Professor Quirrell. As an outsider, I didn't see that as an unqualified endorsement - though I think your general message should be signal-boosted.
That the same 50% of the unwilling believe both that vaccines have been shown to cause autism and that the US government is using them to microchip the population suggests that such people are not processing these statements as containing words with meanings.
Yes, but you're missing the obvious. Respondents don't have a predictive model that literally says Bill Gates wants to inject them with a tracking microchip. They do, however, have a rational expectation that he or his company will hurt them in some technical way, which they find wholly opaque.
Likewise: do you think that the mistake you mention stemmed from your impatience, which makes you seem blasé about the lives of immunocompromised people like myself? Because those lawmakers you chose to bully were all vaccinated, so they were engaging in the exact same behavior you just criticized LA for trying to ban. You also implied, earlier in the post, that if people were less impatient, we'd be largely done.
What do you believe would happen to a neurotypical forced to have self-awareness and a more accurate model of reality in general?
The idea that they become allistic neurodivergents like me is, of course, a suspicious conclusion, but I'm not sure I see a credible alternative. CEV seems like an inherently neurodivergent idea, in the sense that forcing people (or their extrapolated selves) to engage in analysis is baked into the concept.
It's suspicious, but I think people's views from long ago on whether or not they would commit suicide are very weak evidence.
Also, we know for a fact the FBI threatened Martin Luther King, Jr, and I don't think they wound up killing him?
Twitter is saying there's literally a Delta Plus variant already. We don't know what it does.
'we can probably figure something out that holds onto the majority of the future's value, and it's unlikely that we all die' camp
This disturbs me the most. I don't trust their ability to distinguish "the majority of the future's value," from "the Thing you just made thinks Thamiel is an amateur."
Hopefully, similar reasoning accounts for the bulk of the fourth camp.
How likely is it that a research lab finds a new bat coronavirus and then before publishing anything about it, decides that it's the perfect testbed for dramatic gain of function research?
In China? We're talking about a virus based on RNA, which mutates more easily, making it harder to control. China's government craves control to the point of trying to censor all mentions of Winnie the Pooh, possibly because a loss of control could mean the guillotine.
And you also want, in that last tweet, to put (at least some of) the burden on these random institutions to then allocate the vaccine based on who has suffered disproportionately? While also obeying all the official restrictions or else, and also when did that become a priority?
You know that we decide which groups are at risk largely by looking at how they've died or suffered so far. Presumably she's hoping we protect people who otherwise seem likely to die, since her remarks seem sensible given her premises.
You're not exactly wrong, but OP does tell us people are being irrational in ways that you could use to get cold hard cash.
People who have more difficulty than most - like me* - in losing weight constitute about 20% of the community. The hard cases are quite rare.
What? What community? My impression is that, if people try to lose weight, the expected result is for them to become less healthy due to weight fluctuation. It's not that they'll "have difficulty losing weight"; rather, their weight will go up and down in a way that harms their health or their expected lifespans. I thought I read statements by Scott - an actual doctor, we think - supporting this. Are you actually disputing the premise? If so, where'd you get the data?
This seems like vastly more people masked than I've observed in a crowded suburb. I can't give precise numbers, but it feels like fewer than a third are wearing them, despite distancing being very difficult if you walk in some areas.
No. First, people thinking of creating an AGI from scratch (i.e., one comparable to the sort of AI you're imagining) have already warned against this exact issue and talked about measures to prevent a simple change of one bit from having any effect. (It's the problem you don't spot that'll kill you.)
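(Purely as an illustration of the kind of measure I mean - my own toy sketch, not anything a particular team has proposed - here is one way to store an objective so that a single flipped bit cannot silently change it: keep redundant copies plus a checksum, and refuse to act if they disagree.)

```python
# Toy bit-flip protection for a stored objective (illustrative only).
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class GuardedObjective:
    def __init__(self, weights: bytes):
        self.copies = [bytes(weights) for _ in range(3)]  # redundant copies
        self.digest = checksum(weights)                   # recorded at creation

    def load(self) -> bytes:
        # Majority vote across copies, then verify against the checksum;
        # a single corrupted copy is outvoted, and any deeper corruption
        # halts the system instead of quietly changing its goal.
        candidate = max(set(self.copies), key=self.copies.count)
        if checksum(candidate) != self.digest:
            raise RuntimeError("objective corrupted; refusing to act")
        return candidate
```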
Second, GPT-2 is not near-perfect. It does pretty well at a job it was never intended to do, but if we ignore that context it seems pretty flawed. Naturally, its output was nowhere near maximally bad. The program did indeed have a silly flaw, but I assume that's because it's more of a silly experiment than a model for AGI. Indeed, if I try to imagine making GPT-N dangerous, I come up with the idea of an artificial programmer that uses vaguely similar principles to auto-complete programs and could thus self-improve. Reversing the sign of its reward function would then make it produce garbage code or non-code, rendering it mostly harmless.
Again, it's the subtle flaw you don't spot in GPT-N that could produce an AI capable of killing you.
The only way I can see this happening with non-negligible probability is if we create AGI along more human lines - e.g, uploaded brains which evolve through a harsh selection process that wouldn't be aligned with human values. In that scenario, it may be near certain. Nothing is closer to a mind design capable of torturing humans than another human mind - we do that all the time today.
As others point out, though, the idea of a sign being flipped in an explicit utility function is one that people understand and are already looking for. More than that, it would only produce minimal human-utility if the AI had a correct description of human utility. Otherwise, it would just use us for fuel and building material. The optimization part also has to work well enough. Everything about the AGI, loosely speaking, has to be near-perfect except for that one bit. This naively suggests a probability near zero. I can't imagine a counter-scenario clearly enough to make me change this estimate, if you don't count the previous paragraph.
This shows why I don't trust the categories. The ability to let talented people go in whatever direction seems best will almost always be felt as freedom from pressure.
One, the idea was to pick a fictional character I preferred, but could not easily come to believe in. (So not Taylolth, may death come swiftly to her enemies.) Two, I wanted to spend zero effort imagining what this character might say or do. I had the ability to picture Kermit.
I meant that as a caution - though it is indeed fictional evidence, and my lite version IRL seems encouraging.
I really think you'll be fine taking it slow. Still, if you have possible risk factors, I would:
- Make sure you have the ability to speak with a medical professional on fairly short notice.
- Remind yourself that you are always in charge inside your own head. People who might know tell me that hearing this makes you safer. It may be a self-proving statement.
Sy, is that you?
I started talking to Kermit the Frog, off and on, many months ago. I had this idea after seeing an article by an ex-Christian who appeared never to have made predictions about her life using a truly theistic model, but who nevertheless missed the benefits she recalls getting from her talks with Jesus. Result: Kermit has definitely comforted me once or twice (without the need for 'belief') and may have helped me to remember useful data/techniques I already knew, but mostly nothing much happens.
Now, as an occasional lucid dreamer who once decided to make himself afraid in a dream, I tend not to do anything that I think is that dumb. I have not devoted much extra effort or time to modelling Kermit the Frog. However, my lazy experiment has definitely yielded positive results. Perhaps you could try your own limited experiment first?
Pitch Meeting is at least bringing up actual problems with Game of Thrones season 8. But I dare you to tell if early Game of Thrones was better or worse than season 8, based on the Pitch Meeting.
That's gonna be super easy, barely an inconvenience. The video for S8 not only feels more critical than usual, it gives specific examples of negative changes from previous seasons, plus a causal explanation (albeit partial) that the video keeps coming back to. One might even say the characters harp on their explanation for S8 being worse than early seasons.
Also, looking at the PM video on GOT up to that point, the chief criticism I find (and for once it seems like pretty explicit criticism of the art in question) seems to be that the show is getting old, and also worse, as it goes on. Actual quote:
Well, eventually we're going to start running out of main characters, in later seasons, so they're all going to develop pretty thick plot armor.
Of course, whether that makes the early seasons better or whether their failures took a while to become obvious is a difficult question. PM seems to suggest the former, given their fairly blatant assertion that S8 is worse due to being inexplicably rushed. However, that is technically a different question. It leaves open the possibility that the showrunners' ultimate failure was determined by their approach at the start, as the earlier video suggested.
I have to add that Pitch Meeting does not, in fact, label itself "review," though it does contain implied assertions. (Some implied claims are even about media, rather than about screenwriters and studio executives.)
The specific example/challenge/instance of meta-shitting seems odd, though. For one, GOT Season 8 was a major news story rather than something you needed to learn from Pitch Meeting. I'm going to make the rest a separate comment.
Somewhat unsurprisingly, claim 1 had the least support.
Admittedly, this is what shminux objected to. Beforehand I would have expected more resistance based on people already believing the future is uncertain, casting doubt on claim 2 and especially the "tractable" part of claim 3. If I had to steelman such views, they might sound something like, 'The way to address this problem is to make sure sensible people are in charge, and a prerequisite for being sensible is not giving weird-sounding talks for 3 people.'
How sure are you that the people who showed up were objecting out of deeply-held disagreements, and not out of a sense that objections are good?
That number was presented as an example ("e.g.") - but more importantly, all the numbers in the range you offer here would argue for more AI alignment research! What we need to establish, naively, is that the probability is not super-exponentially low that we face a choice between 'inter-galactic civilization' and 'extinction of humanity within a century'. That seems easy enough if we can show that nothing in the claim contradicts established knowledge.
I would argue the probability for this choice existing is far in excess of 50%. As examples of background info supporting this: Bayesianism implies that "narrow AI" designs should be compatible on some level; we know the human brain resulted from a series of kludges; and the superior number of neurons within an elephant's brain is not strictly required for taking over the world. However, that argument is not logically necessary.
(Technically you'd have to deal with Pascal's Mugging. However, I like Hansonian adjustment as a solution, and e.g. I doubt an adult civilization would deceive its people about the nature of the world.)
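(To make the arithmetic behind that claim concrete, here is a minimal expected-value sketch. Every number below is a placeholder I chose for illustration, not an estimate from the talk or from anyone else; the only point is that nothing in the range is super-exponentially small, so Pascal's Mugging doesn't enter into it.)

```python
# Minimal expected-value sketch with made-up, clearly non-tiny probabilities.
# 'Value' is in arbitrary units; only the comparison matters.
p_choice_exists = 0.5      # P(we really face the civilization-vs-extinction choice)
p_research_helps = 1e-3    # P(alignment research flips the outcome, given the choice)
value_at_stake = 1e15      # stand-in for inter-galactic civilization vs. extinction
cost_of_research = 1e9     # stand-in for the resources alignment work consumes

expected_gain = p_choice_exists * p_research_helps * value_at_stake
print(f"expected gain: {expected_gain:.3g}  vs  cost: {cost_of_research:.3g}")
print("worthwhile?", expected_gain > cost_of_research)
```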
I almost specified, 'what would it be without the confusing term "ought" or your gerrymandered definition thereof,' but since that was my first comment in this thread I thought it went without saying.
Do you have a thesis that you argue for in the OP? If so, what is that thesis?
Are you prepared to go down the other leg of the dilemma and say that the "true oughts" do not include any goal which would require you to, eg, try to have correct beliefs? Also: the Manhattan Project.
Why define goals as ethics (knowing that definitions are tools that we can use and replace depending on our goal of the moment)? You seem to be saying that 'ought' has a structure which can also be used to annihilate humanity or bring about unheard-of suffering. That does not seem to me like a useful perspective.
Seriously, just go and watch "Sorry to Bother You."
Arguably, the numbers we care about. Set theory helpfully adds that second-order arithmetic (arithmetic using the language of sets) has only a single model (up to what is called 'isomorphism', meaning a precise analogy) and that Gödel's original sentence is 'true' within this abstraction.
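(For anyone who wants the formal statement behind 'only a single model': the load-bearing axiom is second-order induction, which quantifies over every subset of the domain. Dedekind's categoricity theorem - standard textbook material, nothing original here - then says any two models of the second-order Peano axioms are isomorphic, so the Gödel sentence has a definite truth value in that unique model.)

```latex
% Second-order induction: X ranges over all subsets of the domain.
\forall X \,\bigl[\, \bigl(X(0) \land \forall n\,(X(n) \rightarrow X(n+1))\bigr)
    \rightarrow \forall n\, X(n) \,\bigr]
% With full second-order semantics this rules out nonstandard models,
% which is what makes the model unique up to isomorphism.
```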
In part IV, can you explain more about what your examples prove?
You say FDT is motivated by an intuition in favor of one-boxing, but apparently this is false by your definition of Newcomb's Problem. FDT was ultimately motivated by an intuition that it would win. It also seems based on intuitions regarding AI, if you read that post - specifically, that a robot programmed to use CDT would self-modify to use a more reflective decision theory if given the chance, because that choice gives it more utility. Your practical objection about humans may not be applicable to MIRI.
As far as your examples go, neither my actions nor my abstract decision procedure controls whether or not I'm Scottish. Therefore, one-boxing gives me less utility in the Scot-Box Problem and I should not do it.
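(A minimal sketch of that point, with toy payoffs of my own choosing since I don't have the original Scot-Box numbers in front of me. The assumption being illustrated is exactly the one above: the opaque box is filled iff the agent is Scottish, and nothing about the agent's choice or decision theory affects that.)

```python
# Toy Scot-Box calculation with assumed Newcomb-style payoffs.
BIG = 1_000_000    # contents of the opaque box, if filled
SMALL = 1_000      # contents of the transparent box

def expected_utility(one_box: bool, p_scottish: float) -> float:
    """The opaque box is filled iff the agent is Scottish; the agent's
    decision procedure has no influence on that fact."""
    expected_opaque = p_scottish * BIG
    return expected_opaque if one_box else expected_opaque + SMALL

for p in (0.1, 0.5, 0.9):
    print(p, expected_utility(True, p), expected_utility(False, p))
# For every value of p, two-boxing yields exactly SMALL more utility.
```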
Exception: Perhaps Scotland in this scenario is known for following FDT. Then FDT might in fact say to one-box (I'm not an expert) and this may well be the right answer.
Assuming you mean the last blockquote, that would be the Google result I mentioned which has text, so you can go there, press Ctrl-F, and type "must fail" or similar.
You can also read the beginning of the PDF, which talks about what can and can't be programmed while making clear this is about hardware and not algorithms. See the first comment in this family for context.
Again, he plainly says more than that. He's challenging "the conviction that the information processing underlying any cognitive performance can be formulated in a program and thus simulated on a digital computer." He asserts as fact that certain types of cognition require hardware more like a human brain. Only two out of four areas, he claims, "can therefore be programmed." In case that's not clear enough, here's another quote of his:
since Area IV is just that area of intelligent behavior in which the attempt to program digital computers to exhibit fully formed adult intelligence must fail, the unavoidable recourse in Area III to heuristics which presuppose the abilities of Area IV is bound, sooner or later, to run into difficulties. Just how far heuristic programming can go in Area III before it runs up against the need for fringe consciousness, ambiguity tolerance, essential/inessential discrimination, and so forth, is an empirical question. However, we have seen ample evidence of trouble in the failure to produce a chess champion, to prove any interesting theorems, to translate languages, and in the abandonment of GPS.
He does not say that better algorithms are needed for Area IV, but that digital computers must fail. He goes on to falsely predict that clever search together with "newer and faster machines" cannot produce a chess champion. AFAICT this is false even if we try to interpret him charitably, as saying more human-like reasoning would be needed.
I couldn't have written an equally compelling essay on biases in favor of long timelines without lying, I think,
Then perhaps you should start here.
Hubert Dreyfus, probably the most famous historical AI critic, published "Alchemy and Artificial Intelligence" in 1965, which argued that the techniques popular at the time were insufficient for AGI.
That is not at all what the summary says. Here is roughly the same text from the abstract:
Early successes in programming digital computers to exhibit simple forms of intelligent behavior, coupled with the belief that intelligent activities differ only in their degree of complexity, have led to the conviction that the information processing underlying any cognitive performance can be formulated in a program and thus simulated on a digital computer. Attempts to simulate cognitive processes on computers have, however, run into greater difficulties than anticipated. An examination of these difficulties reveals that the attempt to analyze intelligent behavior in digital computer language systematically excludes three fundamental human forms of information processing (fringe consciousness, essence/accident discrimination, and ambiguity tolerance). Moreover, there are four distinct types of intelligent activity, only two of which do not presuppose these human forms of information processing and can therefore be programmed. Significant developments in artificial intelligence in the remaining two areas must await computers of an entirely different sort, of which the only existing prototype is the little-understood human brain.
In case you thought he just meant greater speed, he says the opposite on PDF page 71. Here is roughly the same text again from a work I can actually copy and paste:
It no longer seems obvious that one can introduce search heuristics which enable the speed and accuracy of computers to bludgeon through in those areas where human beings use more elegant techniques. Lacking any a priori basis for confidence, we can only turn to the empirical results obtained thus far. That brute force can succeed to some extent is demonstrated by the early work in the field. The present difficulties in game playing, language translation, problem solving, and pattern recognition, however, indicate a limit to our ability to substitute one kind of "information processing" for another. Only experimentation can determine the extent to which newer and faster machines, better programming languages, and cleverer heuristics can continue to push back the frontier. Nonetheless, the dramatic slowdown in the fields we have considered and the general failure to fulfill earlier predictions suggest the boundary may be near. Without the four assumptions to fall back on, current stagnation should be grounds for pessimism.
This, of course, has profound implications for our philosophical tradition. If the persistent difficulties which have plagued all areas of artificial intelligence are reinterpreted as failures, these failures must be interpreted as empirical evidence against the psychological, epistemological, and ontological assumptions. In Heideggerian terms this is to say that if Western Metaphysics reaches its culmination in Cybernetics, the recent difficulties in artificial intelligence, rather than reflecting technological limitations, may reveal the limitations of technology.
If indeed Dreyfus meant to critique 1965's algorithms - which is not what I'm seeing, and certainly not what I quoted - it would be surprising for him to get so much wrong. How did this occur?
Meandering conversations were important to him, because it gave them space to actually think. I pointed to examples of meetings that I thought had gone well, that ended with google docs full of what I thought had been useful ideas and developments. And he said "those all seemed like examples of mediocre meetings to me – we had a lot of ideas, sure. But I didn't feel like I actually got to come to a real decision about anything important."
Interesting that you choose this as an example, since my immediate reaction to your opening was, "Hold Off On Proposing Solutions." More precisely, my reaction was that I recall Eliezer saying he recommended this before any other practical rule of rationality (to a specific mostly white male audience, anyway) and yet you didn't seem to have established that people agree with you on what the problem is.
It sounds like you got there eventually, assuming "the right path for the organization" is a meaningful category.
I think this is being presented because a treacherous turn requires deception.
As I've mentioned before, that is technically false (unless you want a gerrymandered definition).
Smiler AI: I'm focusing on self-improvement. A smarter, better version of me would find better ways to fill the world with smiles. Beyond that, it's silly for me to try predicting a superior intelligence.
Mostly agree, but I think an AGI could be subhuman in various ways until it becomes vastly superhuman. I assume we agree that no real AI could consider literally every possible course of action when it comes to long-term plans. Therefore, a smiler could legitimately dismiss all thoughts of repurposing our atoms as an unprofitable line of inquiry, right up until it has the ability to kill us. (This could happen even without crude corrigibility measures, which we could remove or allow to be absent from a self-revision because we trust the AI.) It could look deceptively like human beings deciding not to pursue an Infinity Gauntlet to snap our problems away.
I deny that your approach ever has an advantage over recognizing that definitions are tools which have no truth values, and then digging into goals or desires.
The core of the disagreement between Bostrom (treacherous turn) and Goertzel (sordid stumble) is about how long steps 2. and 3. will take, and how obvious the seed AI's unalignment will look like during these steps.
Really? Does Bostrom explicitly call this the crux?
I'm worried at least in part that AGI (for concreteness, let's say a smile-maximizer) won't even see a practical way to replace humanity with its tools until it far surpasses human level. Until then, it honestly seeks to make humans happy in order to gain reward. Since this seems more benevolent than most humans - who proverbially can't be trusted with absolute power - we could become blasé about risks. This could greatly condense step 4.
Not every line in 37 Ways is my "standard Bayesian philosophy," nor do I believe much of what you say follows from anything standard.
This probably isn't our central disagreement, but humans are Adaptation-Executers, not Fitness-Maximizers. Expecting humans to always use words for Naive Bayes alone seems manifestly irrational. I would go so far as to say you shouldn't expect people to use them for Naive Bayes in every case, full stop. (This seems to border on subconsciously believing that evolution has a mind.) If you believe someone is making improper inferences, stop trying to change the subject and name an inference you think they'd agree with (that you consider false).
Comment status: I may change my mind on a more careful reading.
Other respondents have mentioned the Mathematical Macrocosm Hypothesis. My take differs slightly, I think. I believe you've subtly contradicted yourself. In order for your argument to go anywhere you had to assume that an abstract computation rule exists in the same sense as a real computer running a simulation. This seems to largely grant Tegmark's version of the MMH (and may be the first premise I reject here). ETA: the other branch of your dilemma doesn't seem to engage with the functionalist view of qualia, which says that the real internal behavior or relationships within a physical system are what matter.
Now, we're effectively certain that our world is fundamentally governed by mathematical laws of physics (whether we discover the true laws or not). Dualist philosophers like Chalmers seem to grant this point despite wanting to say that consciousness is different. I think Chalmers freely grants that your consciousness - despite being itself non-physical, on his view - is wholly controlled by physical processes in your brain. This seems undisputed among serious people. (You can just take certain chemicals or let yourself get hungry, and see how your thoughts change.)
So, on the earlier Tegmark IV premise, there's no difference between you and a simulation. You are a simulation within an abstract mathematical process, which exists in exactly the same way as an arithmetical sequence or the computational functions you discuss. You are isomorphic to various simulations of yourself within abstract computations.
Chalmers evidently postulates a "bridging law" in the nature of reality which makes some simulations conscious and not others. However, this seems fairly arbitrary, and in any case I also recall Chalmers saying that a person uploaded (properly) to a computer would be conscious. I certainly don't see anything in his argument to prevent this. If you don't like the idea of this applying to more abstract computations, I recommend you reject Tegmark and admit that the nature of reality is still technically an open problem.
Using epsilons can in principle allow you to update. However, the situation seems slightly worse than jimrandomh describes. It looks like you need P(E|¬H) - the probability of the evidence if H is false - in order to get a precise answer. Also, the missing info that jim mentioned is already enough in principle to let the final answer be any probability whatsoever.
If we use log odds (the framework in which we could literally start with "infinite certainty") then the answer could be anywhere on the real number line. We have infinite (or at least unbounded) confusion until we make our assumptions more precise.
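(A quick numerical sketch of why the answer is unconstrained; the prior and likelihood below are arbitrary placeholders of mine, not numbers from the thread.)

```python
# How the posterior P(H|E) and the log-odds shift depend on the missing
# likelihood P(E|~H), holding the prior and P(E|H) fixed.
import math

def posterior(p_h: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes' theorem: P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]."""
    num = p_e_given_h * p_h
    return num / (num + p_e_given_not_h * (1 - p_h))

p_h, p_e_given_h = 0.5, 0.9     # arbitrary fixed placeholders
for p_e_given_not_h in (0.9, 0.1, 1e-3, 1e-9):
    post = posterior(p_h, p_e_given_h, p_e_given_not_h)
    shift = math.log(p_e_given_h / p_e_given_not_h)  # grows without bound as P(E|~H) -> 0
    print(f"P(E|~H)={p_e_given_not_h:g}  posterior={post:.6f}  log-odds shift={shift:+.2f}")
```

The log-odds shift is just the log likelihood ratio, so leaving P(E|~H) unspecified leaves that shift unbounded - the 'unbounded confusion' in numerical form.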
One is phrased or presented as knowledge. I don't know the best way to approach this, but to a first approximation the belief is the one that has an explicit probability attached. I know you talked about a Boolean, but there the precise claim given a Boolean value was "these changes have happened", described as an outside observer would, and in my example the claim is closer to just being the changes.
Your example could be brought closer by having mAIry predict the pattern of activation, create pointers to memories that have not yet been formed, and thus formulate the claim, "Purple looks like n_p." Here she has knowledge beforehand, but the specific claim under examination is incomplete or undefined because that node doesn't exist.