Comment by thewakalix on Causal Reality vs Social Reality · 2019-07-03T05:49:36.651Z · score: 1 (1 votes) · LW · GW

I think you’re lumping “the ultimate goal” and “the primary mode of thinking required to achieve the ultimate goal” together erroneously. (But maybe the hypothetical person you’re devilishly advocating for doesn’t agree about utilitarianism and instrumentality?)

Comment by thewakalix on An Increasingly Manipulative Newsfeed · 2019-07-03T05:16:04.554Z · score: 5 (4 votes) · LW · GW

Re also also: the Reverse Streetlight effect will probably come into play. It’ll optimize not just for early deception, but for any kind of deception we can’t detect.

Comment by thewakalix on An Increasingly Manipulative Newsfeed · 2019-07-03T05:09:43.158Z · score: 1 (1 votes) · LW · GW

You’re saying that on priors, the humans are manipulative?

Comment by thewakalix on mAIry's room: AI reasoning to solve philosophical problems · 2019-05-28T00:13:39.867Z · score: 1 (1 votes) · LW · GW

What do you mean by “you don’t grapple with the hard problem of consciousness”? (Is this just an abstruse way of saying “no, you’re wrong” to set up the following description of how I’m wrong? In that case, I’m not sure you have a leg to stand on when you say that I use “a lot of words”.) Edit: to be a bit more charitable, maybe it means “my model has elements that my model of your model doesn’t model”.

How can you know I see the same thing that you do? That depends on what you mean by “same”. To me, to talk about whether things are the same, we need to specify what characteristics we care about, or what category system we’re using. I know what it means for two animals to be of the same species, and what it means for two people to have the same parent. But for any two things to be the same, period, doesn’t really mean anything on its own. (You could argue that everything is the same as itself, but that’s a trivial case.)

This might seem like I’m saying that there isn’t any fundamental truth, only many ways of splitting the world up into categories. Not exactly. I don’t think there’s any fundamental truth to categories. There might be fundamental monads, or something like that, but human subjective experiences are definitely not fundamental. (And what truths can even be said of a stateless monad when considered on its own?)

Comment by thewakalix on Where are people thinking and talking about global coordination for AI safety? · 2019-05-27T23:55:00.558Z · score: 6 (3 votes) · LW · GW

For question 2, I think the human-initiated nature of AI risk could partially explain the small distance between ability and need. If we were completely incapable of working as a civilization, other civilizations might be a threat, but we wouldn’t have any AIs of our own, let alone general AIs.

Comment by thewakalix on What should rationalists think about the recent claims that air force pilots observed UFOs? · 2019-05-27T23:46:43.917Z · score: 5 (2 votes) · LW · GW

I can’t tell if you already know this, but “infinite explanatory power” is equivalent to no real explanatory power. If it assigns equal probability to everything then nothing can be evidence in favor of it, and so on.
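To make this concrete, here is a toy Bayes update (my own illustration; the hypotheses and numbers are invented). A hypothesis that spreads its probability uniformly over every possible observation loses ground to any rival that actually predicts what we see:

```python
def posterior(prior_h, likelihood_h, likelihood_alt):
    """Bayes' rule for two hypotheses with complementary priors."""
    prior_alt = 1 - prior_h
    evidence = prior_h * likelihood_h + prior_alt * likelihood_alt
    return prior_h * likelihood_h / evidence

# "Explains everything": P(observation | H) is uniform over, say, 100 outcomes.
p_uniform = 1 / 100
# A specific rival hypothesis predicts the observed outcome strongly.
p_rival = 0.5

p = 0.5  # start with even odds on the "explains everything" hypothesis
for _ in range(5):  # five observations that the rival predicted
    p = posterior(p, p_uniform, p_rival)
print(p)  # driven rapidly toward zero
```

Since nothing can raise the universal "explanation" relative to a genuinely predictive rival, it does no explanatory work.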

Comment by thewakalix on Could waste heat become an environment problem in the future (centuries)? · 2019-05-25T21:56:46.616Z · score: 1 (1 votes) · LW · GW

I'd assume the opposite, since I don't think physicists (and other thermodynamic scientists like some chemists) make up a majority of LW readers, but it's irrelevant. I can (and did) put both forms side-by-side to allow both physicists and non-physicists to better understand the magnitude of the temperature difference. (And since laymen are more likely to skim over the number and ignore the letter, it's disproportionately more important to include Fahrenheit.)

Edit: wait, delta-K is equivalent to delta-C. In that case, since metric-users might make up the majority of LW readers, you're probably right about the number of users.

Comment by thewakalix on mAIry's room: AI reasoning to solve philosophical problems · 2019-05-25T21:44:24.078Z · score: 1 (1 votes) · LW · GW

I think a "subjective experience" (edit: in the sense that two people can have the same subjective experience; not a particular instantiation of one) is just a particular (edit: category in a) categorization of possible experiences, defined by grouping together experiences that put the [person] into similar states (under some metric of "similar" that we care about). This recovers the ability to talk about "lies about subjective experiences" within a physicalist worldview.

In this case, we could look at how the AI internally changes in response to various stimuli, and group the stimuli on the basis of similar induced states. If this grouping doesn't match to its claims at all, then we can conclude that it is perhaps lying. (See: cleaving reality at its joints.) EDIT: Were you saying that AI cannot have subjective experience? Then I think this points at the crux; see my statements below about how I don't see human subjectivity as fundamentally special.

Yes, this means that we can talk about any physical thing having a "subjective experience". This is not a bug. The special thing about animals is that they have significant variance between different "subjective experiences", whereas a rock will react very similarly to any stimuli that don't break or boil it. Humans are different because they have very high meta-subjectivity and the ability to encode their "subjective experiences" into language. However, this still doesn't match up very well to human intuitions: any sort of database or measurement device can be said to have significant "subjective experiences". But my goal isn't to describe human intuitions; it's to describe the same thing that human intuitions describe. Human subjectivity doesn't seem to be fundamentally different from that of any other physical system.

Comment by thewakalix on Interpretations of "probability" · 2019-05-20T14:23:37.574Z · score: 1 (1 votes) · LW · GW

He never said "will land heads", though. He just said "a flipped coin has a chance of landing heads", which is not a timeful statement. EDIT: no longer confident that this is the case

Didn't the post already counter your second paragraph? The subjective interpretation can be a superset of the propensity interpretation.

Comment by thewakalix on Interpretations of "probability" · 2019-05-20T14:20:13.520Z · score: 1 (1 votes) · LW · GW

When you say "all days similar to this one", are you talking about all real days or all possible days? If it's "all possible days", then this seems like summing over the measures of all possible worlds compatible with both your experiences and the hypothesis, and dividing by the sum of the measures of all possible worlds compatible with your experiences. (Under this interpretation, jessicata's response doesn't make much sense; "similar to" means "observationally equivalent for observers with as much information as I have", and doesn't have a free variable.)
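A sketch of that interpretation (worlds and measures invented for illustration): the conditional probability is the measure of worlds compatible with both your experiences and the hypothesis, over the measure of worlds compatible with your experiences.

```python
worlds = [
    # (measure, compatible_with_my_experiences, hypothesis_true)
    (0.4, True,  True),
    (0.3, True,  False),
    (0.2, False, True),
    (0.1, False, False),
]

num = sum(m for m, exp, hyp in worlds if exp and hyp)
den = sum(m for m, exp, hyp in worlds if exp)
print(num / den)  # 0.4 / 0.7, roughly 0.571
```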

Comment by thewakalix on Causal Universes · 2019-05-19T19:59:58.784Z · score: 1 (1 votes) · LW · GW

I was going to say "bootstraps don't work that way", but since the validation happens on the future end, this might actually work.

Comment by thewakalix on Causal Universes · 2019-05-19T19:55:14.889Z · score: 1 (1 votes) · LW · GW

Since Eliezer is a temporal reductionist, I think he might not mean "temporally continuous", but rather "logical/causal continuity" or something similar.

Discrete time travel would also violate temporal continuity, by the way.

Comment by thewakalix on An Intuitive Explanation of Solomonoff Induction · 2019-05-17T20:29:08.043Z · score: 1 (1 votes) · LW · GW

But where do we get Complexity(human)?

Comment by thewakalix on Could waste heat become an environment problem in the future (centuries)? · 2019-05-16T13:19:32.576Z · score: 1 (1 votes) · LW · GW

Note: since most global warming statistics are presented to the American layman in degrees Fahrenheit, it is probably useful to convert 0.7 K to 1.26 F.
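The conversion, for anyone checking: a temperature *difference* converts by scale factor only, with no +32 offset.

```python
def delta_k_to_f(delta_k):
    # 1 K difference = 1.8 degrees F difference; the +32 offset applies
    # only to absolute temperatures, not to differences.
    return delta_k * 9 / 5

print(round(delta_k_to_f(0.7), 2))  # 1.26
```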

Comment by thewakalix on Physical linguistics · 2019-05-14T13:54:28.260Z · score: 1 (1 votes) · LW · GW
One might think eliminativism is metaphysically simpler but reductionism doesn’t really posit more stuff, more like just allowing synonyms for various combinations of the same stuff.

I don't think Occam's razor is the main justification for eliminativism. Instead, consider the allegory of the wiggin: if a category is not natural, useful, or predictive, then in common English we say that the category "isn't real".

Comment by thewakalix on What is the transcencion hypothesis or scenario ? What would a Transcended civilization be capable of ? · 2019-04-29T16:07:13.042Z · score: 1 (1 votes) · LW · GW

The Transcension hypothesis attempts to answer the Fermi paradox by saying that sufficiently advanced civilizations nearly invariably leave their original universe for one of their own making. By definition, a transcended civilization would have the power to create or manipulate new universes or self-enclosed pockets; this would likely require a very advanced understanding of physics. This understanding would probably be matched in other sciences.

This is my impression from a few minutes of searching. I do not know why you asked the question of “what it is” when a simple search would have been faster. I do not expect that many people here are very knowledgeable about this particular hypothesis, and this is a basic question anyway.

The hypothesis does not seem very likely to me. It claims that transcendence is the inevitable evolutionary result of civilizations, but in nature we observe many niches. Civilizations are less like individuals in a species, and more like species themselves. And since a single civilization can colonize a galaxy, it would only take one civilization to produce a world unlike the one we see today - there would have to be not only no other niches, but no mutants either.

Comment by thewakalix on What is the transcencion hypothesis or scenario ? What would a Transcended civilization be capable of ? · 2019-04-29T15:55:11.796Z · score: 1 (1 votes) · LW · GW

I don’t think Transcension is a term commonly used here. This question would probably be better answered by googling.

Comment by thewakalix on What are the advantages and disadvantages of knowing your own IQ? · 2019-04-03T19:18:46.153Z · score: 6 (6 votes) · LW · GW

I think that people treat IQ as giving more information than it actually does. The main disadvantage is that you will over-adjust for any information you receive.

Comment by thewakalix on On AI and Compute · 2019-04-03T19:17:05.108Z · score: 1 (1 votes) · LW · GW

What does it mean to "revise Algorithm downward"? Observing doesn't seem to indicate much about the current value of . Or is Algorithm shorthand for "the rate of increase of Algorithm"?

Comment by thewakalix on Could waste heat become an environment problem in the future (centuries)? · 2019-04-03T19:11:36.372Z · score: 7 (5 votes) · LW · GW

Back-of-the-envelope equilibrium estimate: if we increase the energy added to the atmosphere by 1%, then the Stefan-Boltzmann law (radiated power ∝ T^4) says that a blackbody would need to be 1.01^(1/4) ≈ 1.0025 times warmer, or about 0.25%, to radiate that much more. At the Earth's temperature of ~288 K, this would be ~0.7 K warmer.
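In code (same assumptions: waste heat adds 1% to the energy budget, ~288 K baseline):

```python
# Radiated power scales as T^4 (Stefan-Boltzmann), so a 1% increase in
# power requires the temperature to rise by a factor of 1.01^(1/4).
T_earth = 288.0          # mean surface temperature, K
power_increase = 1.01    # assumed 1% extra energy input

T_new = T_earth * power_increase ** 0.25
delta = T_new - T_earth
print(f"fractional rise: {(T_new / T_earth - 1) * 100:.2f}%")  # ~0.25%
print(f"warming: {delta:.2f} K")                               # ~0.72 K
```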

This suggests to me that it will have a smaller impact than global warming. Whatever we use to solve global warming will probably work on this problem as well. It's still something to keep in mind, though.

Comment by thewakalix on Parable of the flooding mountain range · 2019-04-03T18:55:08.971Z · score: 1 (1 votes) · LW · GW

I agree that #humans has decreasing marginal returns at these scales - I meant linear in the asymptotic sense. (This is important because large numbers of possible future humans depend on humanity surviving today; if the world was going to end in a year then (a) would be better than (b). In other words, the point of recovering is to have lots of utility in the future.)

I don't think most people care about their genes surviving into the far future. (If your reasoning is evolutionary, then read this if you haven't already.) I agree that many people care about the far future, though.

Comment by thewakalix on Parable of the flooding mountain range · 2019-04-03T14:49:16.729Z · score: 1 (1 votes) · LW · GW

Epistemic status: elaborating on a topic by using math on it; making the implicit explicit

From a collective standpoint, the utility function over #humans looks like this: it starts at 0 when there are 0 humans, slowly rises until it reaches "recolonization potential", then rapidly shoots up, eventually slowing down but still linear. However, from an individual standpoint, the utility function is just 0 for death, 1 for life. Because of the shape of the collective utility function, you want to "disentangle" deaths, but the individual doesn't have the same incentive.
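A sketch of the two shapes (the threshold and slopes are invented; only the qualitative shape matters):

```python
def collective_utility(n, recolonization=1000):
    """Slow rise below the recovery threshold, then a jump and linear growth."""
    if n < recolonization:
        return 0.001 * n
    return 1 + 0.01 * (n - recolonization)

def individual_utility(alive):
    """From the individual standpoint: 0 for death, 1 for life."""
    return 1 if alive else 0
```

The discontinuity at the recolonization threshold is what makes the collective prefer "disentangled" deaths, while the individual function is indifferent to how deaths are correlated.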

Comment by thewakalix on Will superintelligent AI be immortal? · 2019-04-03T14:35:35.986Z · score: 1 (1 votes) · LW · GW

Useful work consumes negentropy. A closed system can only do so much useful work. (However, reversible computations may not require work.)

Comment by thewakalix on Will superintelligent AI be immortal? · 2019-04-03T14:33:03.006Z · score: 1 (1 votes) · LW · GW

What do you mean by infinite IQ? If I take you literally, that's impossible because the test outputs real numbers. But maybe you mean "unbounded optimization power as time goes to infinity" or something similar.

Comment by thewakalix on [HPMOR] "the Headmaster set fire to a chicken!" · 2019-04-03T14:22:39.687Z · score: 2 (2 votes) · LW · GW

I'm not sure how magically plausible this is, but Dumbledore could have simplified the chicken brain dramatically. (See the recent SSC posts for how the number of neurons of an animal correlates with our sense of its moral worth.) Given that the chicken doesn't need to eat, reproduce, or anything else besides stand and squawk, this seems physically possible. It would be ridiculously difficult without magic, but wizards regularly shrink their brains down to animal size, so apparently magic is an expert neuroscientist. If this was done, the chicken would have almost no moral worth, so it would be permissible to create and torture it.

Comment by thewakalix on Experimental Open Thread April 2019: Socratic method · 2019-04-01T21:54:51.292Z · score: 1 (1 votes) · LW · GW

Another vaguely disconcertingly almost self-aware comment by the bot. It can, in fact, write impressively realistic comments in 10 seconds.

Comment by thewakalix on Experimental Open Thread April 2019: Socratic method · 2019-04-01T21:52:09.114Z · score: 1 (1 votes) · LW · GW

I think “typical X does Y” is shorthand for “many or most Xs do Y”.

Comment by thewakalix on Experimental Open Thread April 2019: Socratic method · 2019-04-01T21:51:16.522Z · score: 1 (1 votes) · LW · GW

That last parenthetical remark is funny when you consider how GPT-2 knows nothing new but just reshuffles the “interesting and surprising amount of writing by smart people”.

Comment by thewakalix on On the Nature of Agency · 2019-04-01T21:41:18.769Z · score: 3 (2 votes) · LW · GW

Ah. It’s a bot. I suppose the name should have tipped me off. At least I get Being More Confused By Fiction Than Reality points.

Comment by thewakalix on On the Nature of Agency · 2019-04-01T21:38:36.922Z · score: 1 (1 votes) · LW · GW

How did you write that in less than a minute?

Comment by thewakalix on On the Nature of Agency · 2019-04-01T21:37:43.051Z · score: 1 (1 votes) · LW · GW

I’m confused. Are you saying that highly-upvoted posts make a name nicer and therefore less useful? If so, can you describe the mechanisms behind this?

Comment by thewakalix on The Main Sources of AI Risk? · 2019-03-26T02:03:24.328Z · score: 1 (1 votes) · LW · GW

Can you personally (under your own power) and confidently prove that a particular tool will only recursively-trust safe-and-reliable tools, where this recursive tree reaches far enough to trust superhuman AI?

On the other hand, you can "follow" the tree for a distance. You can prove a calculator trustworthy and use it in your following proofs, for instance. This might make it more feasible.

Comment by thewakalix on Unconscious Economies · 2019-03-25T00:38:32.383Z · score: 3 (2 votes) · LW · GW

I agree that there's a monetary incentive for more people to write clickbait, but the mechanism the post described was "naturally clickbaity people will get more views and thus more power," and that doesn't seem to involve money at all.

Comment by thewakalix on Can Bayes theorem represent infinite confusion? · 2019-03-23T01:18:08.841Z · score: 3 (2 votes) · LW · GW
Which I suppose could be termed "infinitely confused", but that feels like a mixing of levels. You're not confused about a given probability, you're confused about how probability works.

Or alternatively, it's a clever turn of phrase: "infinitely confused" as in confused about infinities.

Comment by thewakalix on Rest Days vs Recovery Days · 2019-03-22T06:13:49.124Z · score: 4 (3 votes) · LW · GW

I'll try my hand at Tabooing and analyzing the words. Epistemic status: modeling other people's models.

Type A days are for changing from a damaged/low-energy state into a functioning state, while Type B days are for maintaining that functioning state by allowing periodic breaks from stressors/time to satisfy needs/?.

I think Unreal means Recovery as in "recovering from a problematic state into a better one". I'm not sure what's up with Rest - I think we lack a good word for Type B. "Rest" is peaceful/slackful, which is right, but it also seems inactive/passive which doesn't match the intended meaning. If you emphasize the inactivity/passivity of Rest then it fits better with Type A. (I think this partly explains the reversal.)

Comment by thewakalix on What failure looks like · 2019-03-20T03:19:04.077Z · score: 7 (6 votes) · LW · GW
the paperclipper, which from first principles decides that it must produce infinitely many paperclips

I don't think this is an accurate description of the paperclip scenario, unless "first principles" means "hardcoded goals".

Future GPT-3 will be protected from hyper-rational failures because of the noisy nature of its answers, so it can't stick forever to some wrong policy.

Ignoring how GPT isn't agentic and handwaving an agentic analogue, I don't think this is sound. Wrong policies make up almost all of policyspace; the problem is not that the AI might enter a special state of wrongness, it's that the AI might leave the special state of correctness. And to the extent that GPT is hindered by its randomness, it's unable to carry out long-term plans at all - it's safe only because it's weak.

Comment by thewakalix on On the Regulation of Perception · 2019-03-10T19:45:27.589Z · score: 1 (1 votes) · LW · GW

But isn’t the gauge itself a measurement which doesn’t perfectly correspond to that which it measures? I’m not seeing a distinction here.

Here’s my understanding of your post: “the map is not the territory, and we always act to bring about a change in our map; changes in the territory are an instrumental subgoal or an irrelevant side effect.” I don’t think this is true. Doesn’t that predict that humans would like wireheading, or “happy boxes” (virtual simulations that are more pleasant than reality)?

(You could respond that “we don’t want our map to include a wireheaded self.” I’ll try to find a post I’ve read that argues against this kind of argument.)

Comment by thewakalix on Plans are Recursive & Why This is Important · 2019-03-10T02:33:35.583Z · score: 4 (3 votes) · LW · GW

Obvious AI connection: goal encapsulation between humans relies on commonalities, such as mental frameworks and terminal goals. These commonalities probably won’t hold for AI: unless it’s an emulation, it will think very differently from humans, and relying on terminal agreements doesn’t work to ground terminal agreement in the first place. Therefore, we should expect it to be very hard to encapsulate goals to an AI.

(Tool AI and Agent AI approaches suffer differently from this difficulty. Agents will be hard to terminally align, but once we’ve done that, we can rely on terminal agreement to flesh out plans. Tools can’t use recursive trust like that, so they’ll need to explicitly understand more instrumental goals.)

Comment by thewakalix on Open Thread March 2019 · 2019-03-10T02:16:59.694Z · score: 1 (1 votes) · LW · GW

Thanks for the explanation, and I agree now that the two are too different to infer much.

Comment by thewakalix on Open Thread March 2019 · 2019-03-10T00:38:46.048Z · score: 4 (3 votes) · LW · GW

I’ve seen this done in children’s shows. There’s a song along with subtitles, and an object moves to each written word as it is spoken.

Comment by thewakalix on A defense on QI · 2019-03-10T00:37:01.940Z · score: 1 (1 votes) · LW · GW

I think its arguments are pretty bad. “If you get hurt, that’s bad. If you get hurt then die, that’s worse. If you die without getting hurt, that’s just as bad. Therefore it’s bad if one of your copies dies.” It equivocates and doesn’t address the actual immortality.

Comment by thewakalix on [NeedAdvice]How to stay Focused on a long-term goal? · 2019-03-10T00:17:24.150Z · score: 3 (3 votes) · LW · GW

On the “Darwin test”: note that memetic evolution pressure is not always aligned with individual human interests. Religions often encourage their believers to do things that help the religion at the believers’ expense. If the religion is otherwise helpful, then its continued existence may be important, but this isn’t why the religion does that.

Comment by thewakalix on How dangerous is it to ride a bicycle without a helmet? · 2019-03-09T23:20:33.248Z · score: 1 (1 votes) · LW · GW

But if you spend more time thinking about exercise, that time cost is multiplied greatly. I think this kind of countereffect cancels out every practical argument of this type.

Comment by thewakalix on On the Regulation of Perception · 2019-03-09T23:02:21.349Z · score: 1 (1 votes) · LW · GW

If hunger is a perception, then “we eat not because we’re hungry, but rather because we perceive we’re hungry” makes much less sense. Animals generally don’t have metacognition, yet they eat, so eating doesn’t require perceiving perception. It’s not that meta.

What do you mean by “when we eat we regulate perception”? Are you saying that the drive to eat comes from a desire to decrease hunger, where “decrease” is regulation and “hunger” is a perception?

Comment by TheWakalix on [deleted post] 2019-03-07T04:52:47.744Z

Begone, spambot. (Is there a “report to moderators” button? I don’t see one on mobile.)

Comment by thewakalix on Karma-Change Notifications · 2019-03-03T04:21:44.672Z · score: 19 (5 votes) · LW · GW

I think this is the idea: people can form habits, and habits have friction - you'll keep doing them even if they're painful (they oppose momentary preferences, as opposed to reflective preferences). But you probably won't adopt a new habit if it's painful. Therefore, to successfully build a habit that changes your actions from momentary to reflective, you should first adopt a habit, then make it painful - don't combine the two steps.

Comment by thewakalix on Unconscious Economies · 2019-02-28T04:44:27.392Z · score: 1 (1 votes) · LW · GW
When content creators get paid for the number of views their videos have, those whose natural way of writing titles is a bit more clickbait-y will tend to get more views, and so over time accumulate more influence and social capital in the YouTube community, which makes it harder for less clickbait-y content producers to compete.

Wouldn't this be the case regardless of whether clickbait is profitable?

Comment by thewakalix on Two Small Experiments on GPT-2 · 2019-02-27T17:03:46.459Z · score: 1 (1 votes) · LW · GW

Ugh. I was distracted by the issue of "is Deep Blue consequentialist" (which I'm still not sure about; maximizing the future value of a heuristic doesn't seem clearly consequentalist or non-consequentialist to me), and forgot to check my assumption that all consequentialists backchain. Yes, you're entirely right. If I'm not incorrect again, Deep Blue forwardchains, right? It doesn't have a goal state that it works backward from, but instead has an initial state and simulates several actions recursively to a certain depth, choosing the initial action that maximizes the expected heuristic of the bottom depth. (Ways I could be wrong: this isn't how Deep Blue works, "chaining" means something more specific, etc. But Google isn't helping on either.)
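The structure I'm describing can be sketched as depth-limited minimax (all names illustrative; this is not how Deep Blue was actually implemented): expand moves forward from the current state to a fixed depth, evaluate the leaves with a heuristic, and pick the initial move whose subtree value is best.

```python
def minimax(state, depth, maximizing, moves, apply_move, heuristic):
    """Forward search: recurse over legal moves to a fixed depth,
    then score the leaf positions with the heuristic."""
    options = moves(state)
    if depth == 0 or not options:
        return heuristic(state)
    values = [minimax(apply_move(state, m), depth - 1, not maximizing,
                      moves, apply_move, heuristic) for m in options]
    return max(values) if maximizing else min(values)

def best_move(state, depth, moves, apply_move, heuristic):
    """Choose the initial move maximizing the minimax value of its subtree."""
    return max(moves(state),
               key=lambda m: minimax(apply_move(state, m), depth - 1, False,
                                     moves, apply_move, heuristic))

# Toy game: the state is a number, each player adds 1 or 3, and the
# maximizer's heuristic is just the resulting number.
moves = lambda s: [1, 3]
apply_move = lambda s, m: s + m
heuristic = lambda s: s
print(best_move(0, 2, moves, apply_move, heuristic))  # 3
```

Note there's no goal state being worked backward from, which is what I mean by forward-chaining.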

Comment by thewakalix on Two Small Experiments on GPT-2 · 2019-02-27T17:01:48.547Z · score: 6 (3 votes) · LW · GW

I'm confused. I already addressed the possibility of modeling the external world. Did you think the paragraph below was about something else, or did it just not convince you? (If the latter, that's entirely fine, but I think it's good to note that you understand my argument without finding it persuasive. Conversational niceties like this help both participants understand each other.)

An AI might model a location that happens to be its environment, including its own self. But if this model is not connected in the right way to its consequentialism, it still won't take over the world. It has to generate actions within its environment to do that, and language models simply don't work that way.

Or to put it another way, it understands how the external world works, but not that it's part of the external world. It doesn't self-model in that way. It might even have a model of itself, but it won't understand that the model is recursive. Its value function doesn't assign a high value to words that its model says will result in its hardware being upgraded, because the model and the goals aren't connected in that way.

T-shirt slogan: "It might understand the world, but it doesn't understand that it understands the world."

You might say "this sort of AI won't be powerful enough to answer complicated technical questions correctly." If so, that's probably our crux. I have a reference class of Deep Blue and AIXI, both of which answer questions at a superhuman level without understanding self-modification, but the former doesn't actually model the world and AIXI doesn't belong in discussions of practical feasibility. So I'll just point at the crux and hope you have something to say about it.

You might say, as Yudkowsky has before, "this design is too vague and you can attribute any property to it that you like; come back when you have a technical description". If so, I'll admit I'm just a novice speculating on things they don't understand well. If you want a technical description then you probably don't want to talk to me; someone at OpenAI would probably be much better at describing how language models work and what their limitations are, but honestly anyone who's done AI work or research would be better at this than me. Or you can wait a decade and then I'll be in the class of "people who've done AI work or research".

Comment by thewakalix on Two Small Experiments on GPT-2 · 2019-02-25T15:43:47.909Z · score: 3 (2 votes) · LW · GW

Why do you think that non-consequentialists are more limited than humans in this domain? I could see that being the case, but I could also have seen that being the case for chess, and yet Deep Blue won't take over the world even with infinite compute. (Possible counterpoint: chess is far simpler than language.)

"But Deep Blue backchains! That's not an example of a superhuman non-consequentialist in a technical domain." Yes, it's somewhat consequentialist, but in a way that doesn't have to do with the external world at all. The options it generates are all of the form "move [chess piece] to [location]." Similarly, language models only generate options of the form "[next word] comes next in [context]." No [next word] will result in the model attempting to seize more resources and recursively self-improve.

This is why I said "a consequentialist that models itself and its environment". But it goes even further than that. An AI might model a location that happens to be its environment, including its own self. But if this model is not connected in the right way to its consequentialism, it still won't take over the world. It has to generate actions within its environment to do that, and language models simply don't work that way.

Another line of thought: AIXI will drop an anvil on its head - it doesn't understand self-change. FOOM/Computronium is actually even more stringent: it has to be a non-Cartesian consequentialist that models itself in its environment. You need to have solved the Embedded Agent problems. Now, people will certainly want to solve these at some point and build a FOOM-capable AI. It's probably necessary to solve them to build a generally intelligent AI that interacts sensibly with the world on its own. But I don't think you need to solve them to build a language model, even a superintelligent language model.