Comment by charlie-steiner on Training human models is an unsolved problem · 2019-05-18T22:37:08.265Z · score: 2 (1 votes) · LW · GW

> 2. I'm not saying seatbelt-free driving is always rational, but on what grounds is it irrational?

Individual actions are not a priori irrational, because we're always talking about a conflict between at least two things. Furthermore, you can always describe humans as perfect rational agents - just use the microphysical description of all their atoms to predict their next action, and say that they assign that action high utility. (This is basically Rohin's point)

Ways of choosing actions are what we think can be irrational (there are problems with this but I'd rather ignore them), but these ways of choosing actions are only associated with humans within some particular way of describing humans (the intentional stance). Like, if you're describing humans as collections of atoms, your description will never label anything as a value conflict or an inconsistency. You have to describe humans in terms of values and choices and so on before you can say that an action "conflicts with their values and is therefore irrational" or whatever.

Long story short, when I say that driving without a seatbelt is usually dumb because people don't want to die, there is no further or more universal sense I know of in which anything is irrational. I do not mean that you cannot assign values to humans in which driving without a seatbelt is always right - in fact, the problem is that I'm worried that a poor AI design might do just such a thing! But in the values that I actually do assign to humans, driving without a seatbelt is usually dumb.

Comment by charlie-steiner on Which scientific discovery was most ahead of its time? · 2019-05-17T07:23:15.243Z · score: 2 (1 votes) · LW · GW

There are plenty of accidental discoveries that we might imagine happening much later - but I don't feel like this should be enough, because it's not that they were surprisingly early, they were just drawn out of a very broad probability distribution.

I'm more satisfied with discoveries that not only could have happened later, but happened when they did for sensible local reasons. Example: Onnes' discovery of superconductivity. Not just because superconductivity was discovered very rapidly (3 years) after the necessary liquefaction of helium, when it conceivably could have taken a lot longer to properly measure the resistance of mercury or lead at low temperatures. But because Onnes' lab in Leiden was the first place ever to make liquid helium to cool superconductors with, and it took 15 years for anyone else in the world (in this case, Toronto) to start liquefying helium!

In short, to my mind being ahead of your time is the opposite of multiple discovery - we push back the luck one step by asking not for a lucky break, but for a sensible and straightforward discovery that could only have happened in a very unusual place.

Comment by charlie-steiner on Programming Languages For AI · 2019-05-12T00:52:53.554Z · score: 2 (1 votes) · LW · GW

Disclaimer: I don't know anything about designing programming languages.

I don't think this programming language can neatly precede the AI design, like your example of chesslang. In fact, it might be interesting to look at history to see which tended to come first - support for features in programming languages, or applications that implemented those features in a more roundabout way.

Like the proof reasoning support, for example, might or might not be in any particular AI design.

Another feature is support for reasoning about probability distributions, which shows up in probabilistic programming languages. Maybe your AI is a big neural net and doesn't need to use your language to do its probabilistic reasoning.

Or maybe it's some trained model family that's not quite neural nets - in which case it's probably still in python, using some packages designed to make it easy to write these models.

Basically, I think there are a lot of possible ways to end up with AI, and if you try to shove all of them into one programming language, it'll end up bloated - your choice of what to do well has to be guided by what the AI design will need. Maybe just making good python packages is the answer.

Comment by charlie-steiner on Open Thread May 2019 · 2019-05-12T00:11:10.932Z · score: 2 (1 votes) · LW · GW

Well, you are free to use the word "color" in such a way that there is no fact of the matter about whether Neptune has a color if you wish. But I think that this directs us into a definitional argument that I don't feel like having - in fact, almost a perfect analogy to "If a tree falls in a forest and there's no one to hear it, does it make a sound?"

So yeah, it's been fun :)

Comment by charlie-steiner on Open Thread May 2019 · 2019-05-11T23:26:47.551Z · score: 2 (1 votes) · LW · GW
What does philosophical sophistication matter? Are the quoted facts false? Is “A blue-violet laser blasts the atom, which then absorbs and re-emits enough light particles to be photographed with conventional equipment” a lie?

The sentence you quote is true. And "The sun blasts the cover of the book, which then absorbs and re-emits photons to be photographed with conventional equipment" is an equally valid description of taking a picture of a book. I'm honestly confused as to what you're expecting me to find inconsistent here. Are you implying that since you can describe scattering of light using scientific talk like "photons" and "re-emission," you can't describe it using unscientific talk like "the atom was purple?"

Ah, or maybe you're expecting me to know that you mean something like "What if the atom is only purple because it's being 'blasted' with purple photons, making this photo useless in terms of understanding its color?" That's a reasonable thing to think if you've never had reason to study atomic spectra, but it turns out the direction of causation is just the opposite - scientists chose a purple laser because that's one of the few colors the atom would absorb and re-emit.

I think what this shows is that my perception of what's "everyday" has definitely been skewed by a physics degree, and things that just seem like the way everything works to me might seem like unusual and rare sciencey phenomena to everyone else.

Like, you ask me to refute this fact or that fact, but I already quoted precisely the statements I thought deserved highlighting from the articles, and none of them were facts about how light or atoms work. That should tell you that I have zero intention of overthrowing our basic understanding of optics. But since we might have different understandings of the same sentences, I will go through the specific ones you highlight and try to read your mind.

the purple speck at the center of this photo is not the true size of the strontium atom itself

This is due to the limitations of the camera - even if the light came from a single point, the camera wouldn't be able to focus it back down to a single pixel on the detector. The blur comes from a combination of imperfections in the lenses plus physics reasons (diffraction) that make all pictures a little blurry. The same thing happens in the human eye - even if we were looking at a point source of purple light, it would look to our eye like a little round dot of purple light.

Mind reading: Maybe you are implying that if the camera can't resolve the true size of the atom, it's not "really" a picture of the atom. This is sort of true - if the atom was blue on the left and red on the right (let's ignore for a moment the physical impossibility), it would still look purple in this photo despite never emitting purple photons. Of course, we do know that it emits purple photons. But I think the thing that is actually mistaken about worrying about whether this is "really" the atom is that it's forgetting the simplicity of just looking at things to tell what color they are.

Even when I argue that things invisible to the naked eye, like the planet Neptune, can have color, it's not necessarily because of complicated sciencey reasoning, but more some heuristic arguments about processes that preserve color. You can look at Neptune through a telescope and it has color then - so since telescopes preserve color, Neptune is blue. Or if I took a red teapot and shot it into space somewhere between the orbits of Mars and Jupiter and we never saw it again, it would still be red, because in color-logic, moving something somewhere else is a process that preserves color.

In short, I didn't bother to actually do the math on whether you could see an atom with the naked eye until today, because I think of time-lapse photography or using lenses to look at something as processes that preserve "intuitive color."

When bathed in a specific wavelength of blue light, strontium creates a glow hundreds of times wider than the radius of the atom itself (which is about a quarter of a nanometer, or 2.5x10 to the −7 meters, Nadlinger said).

I'm not sure what the reporter is getting at here. I think they are comparing the wavelength of the light to the electron radius of the atom, and the wavelength of the light is what they mean by "creates a glow." It's definitely inaccurate - for one thing, a quarter of a nanometer is 2.5×10^-10 meters, not 2.5×10^-7. But I'm not sure I could do better at explaining what's going on to an audience that doesn't know quantum mechanics. Maybe a water analogy? The atom emitting a photon is like a tiny stone tossed into a pond. The size of the ripples emitted by the tiny stone is a lot bigger than the size of the stone itself.

Mind reading: Maybe you read this as something like "We're not actually seeing the atom, we're just seeing 'a glow created by the atom' that's a lot bigger than the atom, therefore this isn't a picture of the atom." That's false. This goes back to why I was making fun of the reporter who said "We're not seeing the atom, we're just seeing the light emitted by the atom." Seeing the light emitted by something is what seeing is. You could just as well say "I'm not actually seeing my hand, I'm just seeing the light emitted by my hand," and you'd be just as wrong. Learning about photons should not destroy your ability to see things, and if it does you're doing philosophy very wrong.

This glow would be barely perceptible with the naked eye but becomes apparent with a little camera manipulation.

Yup, this is that "long exposure to the atom + short flash of the surroundings" technique I talked about in the previous comment. Maybe you see the word "manipulation" and assume that everything is meaningless and this "doesn't count"? I think this question stops being useful when you understand what was actually done to get the photo.

“The apparent size you see in the picture is what we’d call optical aberration,” Nadlinger said.

See above about the limitations of cameras.

Comment by charlie-steiner on Open Thread May 2019 · 2019-05-11T21:13:04.383Z · score: 2 (1 votes) · LW · GW
So, technically, you’re seeing light emitted from an atom and not the atom itself.

Hm. Not sure you want to pick this reporter as your example of philosophical sophistication. Next.

Without the long exposure effect, the atom wouldn't be visible to the naked eye.

I think this is an important caveat in the context of talking about how the photograph was taken - a long exposure (on the order of 10 seconds) was used to capture the atom, and then a flash was used to illuminate the surroundings. Without this technique, the atom would not be visible in an image that also included the surroundings. What it is not, however, is a pronouncement on the inherent invisibility of atoms. But the next quote is.

So, seeing a single atom with the naked eye is impossible.

Congrats, you found a reporter saying the thing. However, this random reporter's opinion isn't necessarily true. To be honest, I'm not really sure if an atom really is visible to the naked eye (at least not as a continuous object, since humans can detect even single photons but only stochastically), or if you'd need binoculars to see it. Let's do some math. From here, let's suppose our trapped atom emits 20 million photons per second, spread across all angles. The faintest visible stars are about magnitude +6. How many photons per second is that?

Well, the sun shines down with about 1000 W/m^2 and it's magnitude -27. Since magnitude is a logarithmic scale (each 5 magnitudes is a factor of 100 in flux), the 33-magnitude difference means a magnitude +6 star shines about 6 × 10^-11 W/m^2 on us.

For the 400 nm light used in the stackexchange comment, 20 million photons per second is just about 10^-11 watts. This means the sphere around the atom is only allowed to have a surface area of about 1/6 of a m^2, which means a radius of about 12 cm (about 5 inches). So if you were to put your eye 12 cm away from the atom, it would be as bright as the faintest stars humans can see in the sky. Farther than that and I'm happy to agree you couldn't see it with your naked eye.
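This arithmetic is easy to check with a few lines of Python. The photon rate and the solar-flux calibration below are the assumptions from above, not measured values for any particular experiment:

```python
import math

# Back-of-envelope check of the numbers above.
h, c = 6.626e-34, 3.0e8        # Planck's constant (J s), speed of light (m/s)
wavelength = 400e-9            # meters, as in the stackexchange comment
photon_rate = 20e6             # photons emitted per second, over all angles

power = photon_rate * h * c / wavelength        # total radiated power (W)

# Calibrate the magnitude scale: the sun is magnitude -27 at ~1000 W/m^2,
# and each magnitude step is a factor of 100**(1/5) in flux.
flux_mag6 = 1000 * 100 ** ((-27 - 6) / 5)       # flux of a mag +6 star (W/m^2)

# Find the radius where the atom's light is spread thin enough to match:
# power / (4 * pi * r**2) = flux_mag6
r = math.sqrt(power / (4 * math.pi * flux_mag6))

print(f"radiated power: {power:.1e} W")          # roughly 1e-11 W
print(f"mag +6 flux:    {flux_mag6:.1e} W/m^2")  # roughly 6e-11 W/m^2
print(f"radius:         {r * 100:.0f} cm")       # roughly 11-12 cm
```

Small differences from the 12 cm figure in the text come from rounding the flux to one significant digit.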

Comment by charlie-steiner on Open Thread May 2019 · 2019-05-11T20:21:35.408Z · score: 2 (1 votes) · LW · GW

Trapped atoms are always illuminated by a laser that picks out one single wavelength emitted by the atom. This isn't necessarily the same color you'd see if these atoms were scattering sunlight - in addition to the color used in the lab, you might see a few other wavelengths as well, along with a generic bluish color due to Rayleigh scattering. But since each atom emits / absorbs different wavelengths, each will look different both under sunlight and when trapped in the lab.

Here's an example of trapped atoms emitting green light - figure 3 is a photo taken through an optical microscope:

Comment by charlie-steiner on Open Thread May 2019 · 2019-05-11T19:20:15.224Z · score: 0 (2 votes) · LW · GW

Turns out, yes.

Comment by charlie-steiner on Open Thread May 2019 · 2019-05-11T19:09:10.293Z · score: 2 (1 votes) · LW · GW

Suppose you are handed a book. How do you figure out what color the cover of the book is?

Suppose you were handed a single atom. Could you judge it the same way that you did the book?

Comment by charlie-steiner on Disincentives for participating on LW/AF · 2019-05-10T21:39:05.331Z · score: 28 (10 votes) · LW · GW

As someone basically thinking alone (cue George Thorogood), I definitely would value more comments / discussion. But if someone has access to research retreats where they're talking face to face as much as they want, I'm not surprised that they don't post much.

Talking is a lot easier than writing, and more immediately rewarding. It can be an activity among friends. It's higher-bandwidth to have a discussion face to face than over the internet. You can assume a lot more about your audience, which saves a ton of effort. When talking, you're allowed to bullshit and guess and handwave and collaboratively think with the other person and still be interesting, whereas when writing your audience usually expects you to be confident in what you've written. Writing is hard, reading is hard, and understanding what people have written is harder than understanding what people have said - and if you ask for clarification, that might get misunderstood in turn. This all applies to comments almost as much as to posts, particularly on technical subjects.

The two advantages writing has for me are that I can communicate in writing with people who I couldn't talk to, and that when you write something out you get a good long chance to make sure it's not stupid. When talking it's very easy to be convincing, including to yourself, even when you're confused. That's a lot harder in writing.

To encourage more discussion in writing, one could try to change the format to reduce these barriers as much as possible - trying to foster one-to-one or small-group threads rather than one-to-many, fostering/enabling knowledge about other posters, and creating a context that allows for more guesswork and collaborative thinking. Maybe one underutilized tool on current LW is the question thread. Question threads are great excuses to let people bullshit on a topic and then engage them in small-group threads.

Training human models is an unsolved problem

2019-05-10T07:17:26.916Z · score: 16 (6 votes)
Comment by charlie-steiner on Open Thread May 2019 · 2019-05-07T20:57:49.844Z · score: 2 (1 votes) · LW · GW

If you cover a brown chair with blue paint, it becomes a blue chair. There is no answer to the question "What color is a chair?", because how a chair scatters light depends on context.

Chairs are non-fundamentally colored, so the only question you can even try to answer is "What color is this chair?"

Y'all are trying to rely on a dichotomy between "Fundamental particles are fundamentally colored" and "Fundamental particles have no color." That is a false dichotomy. The color of an electron depends on context - congrats, you have shown that it is not fundamentally colored, we agree.

Comment by charlie-steiner on Open Thread May 2019 · 2019-05-07T01:43:51.849Z · score: -6 (5 votes) · LW · GW

I think this is becoming much too abstract. "Can an atom be a color?" was not supposed to be one of those troll questions like "Is a hotdog a sandwich?"

If you want to know whether an atom can have color, you can just look at one. Here. That's an atom. As you can see, it is purple. If you wish to claim that this atom is not purple, because color depends on context, and if you encased a blade of grass in tungsten and threw it into the sun, would it still be green?, please jump in a lake whose color depends on context.

If you wish to claim that this atom is not purple because the objects invoked to explain the color of materials must themselves be colorless, please forward your mail to:

Robert Nozick

242 Emerson Hall

Harvard University


Comment by charlie-steiner on Value learning for moral essentialists · 2019-05-06T22:02:09.653Z · score: 5 (3 votes) · LW · GW

Yeah, I'm not 100% sure my caricature of a person actually exists or is worth addressing. They're mostly modeled on Robert Nozick, who is dead and cannot be reached for comment on value learning. But I had most of these thoughts and the post was really easy to write, so I decided to post it. Oh well :)

The person I am hypothetically thinking about is not very systematic - on average, they would admit that they don't know where morality comes from. But they feel like they learn about morality by interacting in some mysterious way with an external moral reality, and that an AI is going to be missing something important - maybe even be unable to do good - if they don't do that too. (So 90% overlap with your description of strong moral essentialism.)

I think these people plausibly should be for value learning, but are going to be dissatisfied with it and feel like it sends the wrong philosophical message.

Comment by charlie-steiner on Open Thread May 2019 · 2019-05-06T21:29:25.026Z · score: 2 (1 votes) · LW · GW
Consider the possibility that you are not being as clear as you think you are.

Fair enough.

are atoms colorless? No! Very no! They are colored in exactly the mundane way that everything is, which is rather the entire point.

Maybe it would help if I said that atoms are non-fundamentally colored, in exactly the same way that a chair, or grass, or the sky is non-fundamentally colored. The everyday meaning of "the grass is green" or "the sun is yellowish-white" extends nicely down to the atomic scale. Even a free electron (being a thing that scatters light) has a color.

When you make a reductionist explanation of color, it's the explanation that lacks color, not the atoms.

Value learning for moral essentialists

2019-05-06T09:05:45.727Z · score: 13 (5 votes)
Comment by charlie-steiner on Open Thread May 2019 · 2019-05-05T20:09:11.113Z · score: 2 (1 votes) · LW · GW
If sodium is intrinsically orange, why is sodium chloride white?

Are you expecting an answer other than the obvious one? Are you just trying to smuggle fundamental color back into the question by using the word "intrinsically?"

Try investigating the claim that you can explain a property in terms of itself.

Are you willfully misreading me, or is it just accidental?

Comment by charlie-steiner on Open Thread May 2019 · 2019-05-05T09:41:54.359Z · score: 2 (7 votes) · LW · GW

I'm finishing up Nozick's big book 'o philosophy (mostly motivated by spite at this point), and I ran into an interesting misunderstanding of reductionism, disguised as a misunderstanding of physics.

Here's the whole quote:

To explain why something has a certain feature or property, we can refer to something else with that property, but a fundamental explanation of (the nature of) the property, or what gives rise to it or how it functions, will not refer to other things with the very same property; the possession and functioning of that property is what is to be explained, not only in this instance but in all its occurrences. This point was made by the philosopher of science N. R. Hanson, who pointed out that we should expect atoms and subatomic particles to lack the features that they fundamentally explain - only something itself colorless could constitute a fundamental explanation of color.

Ignore the philosophy for a second, and think about the science - are atoms colorless? No! Very no! They are colored in exactly the mundane way that everything is, which is rather the entire point.

The conclusion being false, then, what's wrong with the premises? Are you really allowed to use colored things in a reductionist explanation of color?

Turns out, yes you are. It's perfectly fine for the fundamental things to be colored. As long as none of the fundamental things are color itself, it's fine. We explain the orangeness of sodium lamps in terms of orange sodium atoms, and we explain the orangeness of sodium atoms in terms of properties (energy levels) that are not color. But the properties are, in a sense, just part of our map, just bookkeeping, and the actual stuff, the atoms, remain orange.

Perhaps Nozick knew this, and was just being a bit broad with the word "things." But given the scientific howler of an example, I'm not sure. Also, the rest of this chapter is basically just him making the same mistake repeatedly, except rather than "Reductionist explanation of color means modeling the world as made of colorless things," it's "reductionist explanation of values means modeling the world as made of valueless things."

This doesn't sound howlingly wrong, until you remember that he's going to pull the "and therefore reductionists think atoms are literally colorless" thing on you once or twice per paragraph.

Comment by charlie-steiner on Nash equilibriums can be arbitrarily bad · 2019-05-01T17:30:53.379Z · score: 4 (2 votes) · LW · GW

Yup. Though you don't necessarily need to imagine the money "changing hands" - if both people get paid 2 extra pennies if they tie, and the person who bids less gets paid 4 extra pennies, the result is the same.

The point is exactly what it says in the title. Relative to the maximum cooperative payoff, the actual equilibrium payoff can be arbitrarily small. And as you change the game, the transition from low payoff to high payoff can be sharp - jumping straight from pennies to millions just by changing the payoffs by a few pennies.
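The penny variant described above can be checked by brute force. This is a sketch with an illustrative bid range (2 to 100 pennies, my assumption): both players receive the lower of the two bids, ties earn each player 2 bonus pennies, and a strict underbidder earns 4.

```python
from itertools import product

LOW, HIGH = 2, 100  # bid range in pennies (illustrative assumption)

def payoff(my_bid, their_bid):
    # Both players receive the lower bid; bonuses reward undercutting.
    base = min(my_bid, their_bid)
    if my_bid < their_bid:
        return base + 4   # strict underbidder's bonus
    if my_bid == their_bid:
        return base + 2   # tie bonus
    return base

def is_nash(b1, b2):
    # Neither player can gain by unilaterally changing their own bid.
    best1 = max(payoff(b, b2) for b in range(LOW, HIGH + 1))
    best2 = max(payoff(b, b1) for b in range(LOW, HIGH + 1))
    return payoff(b1, b2) >= best1 and payoff(b2, b1) >= best2

equilibria = [(b1, b2)
              for b1, b2 in product(range(LOW, HIGH + 1), repeat=2)
              if is_nash(b1, b2)]
print(equilibria)          # only (2, 2): both bid the minimum
print(payoff(2, 2))        # equilibrium payoff: 4 pennies each
print(payoff(HIGH, HIGH))  # cooperative payoff: 102 pennies each
```

The unique pure-strategy equilibrium pins both players at the minimum bid regardless of how high the cooperative ceiling is raised, which is the "arbitrarily bad" gap in the title.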

Comment by charlie-steiner on Book review: The Sleepwalkers by Arthur Koestler · 2019-04-24T23:26:37.626Z · score: 2 (1 votes) · LW · GW

I think using ellipses only really gets you good mileage once you have the planets moving around the sun. If, like Aristotle, you have the planets moving around the earth, then epicycles are just a very general way of representing periodic motion phenomenologically.

Comment by charlie-steiner on AI Alignment Problem: “Human Values” don’t Actually Exist · 2019-04-24T09:59:30.772Z · score: 4 (2 votes) · LW · GW

Thanks for this! Definitely some themes that are in the zeitgeist right now for whatever reason.

One thing I'll have to think about more is the idea of natural limits (e.g. the human stomach's capacity for tasty food) as a critical part of "human values," that keeps them from exhibiting abstractly bad properties like monomania. At first glance one might think of this as an argument for taking abstract properties (meta-values) seriously, or taking actual human behavior (which automatically includes physical constraints) seriously, but it might also be regarded as an example of where human values are indeterminate when we go outside the everyday regime. If someone wants to get surgery to make their stomach 1000x bigger (or whatever), and this changes the abstract properties of their behavior, maybe we shouldn't forbid this a priori.

Comment by charlie-steiner on Conditional revealed preference · 2019-04-17T00:14:09.371Z · score: 5 (3 votes) · LW · GW

In other words, "real preferences" are a functional part of a larger model of humans that supports counterfactual reasoning, and if you want to infer the preferences, you should also make sure that your larger model is a good model of humans. (Where "good" doesn't just mean highly predictive - it includes some other criteria that involve making talking about preferences a good idea, and maybe not deviating too far from our intuitive model.)

Comment by charlie-steiner on The Cacophony Hypothesis: Simulation (If It is Possible At All) Cannot Call New Consciousnesses Into Existence · 2019-04-15T08:38:27.943Z · score: 17 (8 votes) · LW · GW

Comment status: long.

Before talking about your (quite fun) post, I first want to point out a failure mode exemplified by Scott's "The View From Ground Level." Here's how he gets into trouble (or begins trolling): first he is confused about consciousness. Then he postulates a unified thing - "consciousness" proper - that he's confused about. Finally he makes an argument that manipulates this thing as if it were a substance or essence. These sorts of arguments never work. Just because there's a cloud, doesn't mean that there's a thing inside the cloud precisely shaped like the area obscured by the cloud.

Okay, on to my reaction to this post.

When trying to ground weird questions about point-of-view and information, one useful question is "what would a Solomonoff inductor think?" The really short version of why we can take advice from a Solomonoff inductor is that there is no such thing as a uniform prior over everything - if you try to put a uniform prior over everything, you're trying to assign each hypothesis a probability of 1/infinity, which is zero, which is not a good probability to give everything. (You can play tricks that effectively involve canceling out this infinite entropy with some source of infinite information, but let's stick to the finite-information world.) To have a probability distribution over infinite hypotheses, you need to play favorites. And this sounds a lot like Solomonoff's "hypotheses that are simple to encode for some universal Turing machine should be higher on the list."

So what would a Solomonoff inductor think about themselves? Do they think they're the "naive encoding," straightforwardly controlling a body in some hypothesized "real world?" Or are they one of the infinitely many "latent encodings," where the real world isn't what it seems and the inductor's perceptions are instead generated by some complicated mapping from the state of the world to the memories of the inductor?

The answer is that the Solomonoff inductor prefers the naive encoding. We're pretty sure my memories are (relatively) simple to explain if you hypothesize my physical body. But if you hypothesize that my memories are encoded in the spray from a waterfall, the size of the Turing machine required to translate waterfall-spray into my memories gets really big. One of the features of Solomonoff inductors that's vital to their nice properties is that hypotheses become more unlikely faster than they become more numerous. There are an infinite number of ways that my memories might be encoded in a waterfall, or in the left foot of George Clooney, or even in my own brain. But arranged in order of complexity of the encoding, these infinite possibilities get exponentially unlikely, so that their sum remains small.
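The "unlikely faster than numerous" point can be illustrated numerically. The growth and decay rates here are made up for illustration: suppose the number of encodings of complexity k grows like 2^k, while the prior weight of each individual encoding decays like 4^-k.

```python
# Made-up rates for illustration: 2**k hypotheses at complexity k, each
# with prior weight 4**-k. The per-level mass is (2/4)**k = 2**-k, so
# the tail beyond complexity K has total mass about 2**-(K-1).
def tail_mass(k_min, k_max=60):
    return sum((2 ** k) * (4.0 ** -k) for k in range(k_min, k_max))

print(tail_mass(0))   # total mass stays bounded (close to 2)
print(tail_mass(20))  # everything of complexity >= 20: about 2e-6
```

This is the structure that lets the inductor shrug off the infinitely many waterfall-spray encodings: each is individually possible, but collectively they're negligible.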

So the naive encoding comes out unscathed when it comes to myself. But what about other people? Here I agree the truth has to be unintuitive, but I'd be a bit more eliminativist than you. You say "all those experiences exist"; I'd say "in that sense, none of them exist."

From the point of view of the Solomonoff inductor, there is just the real world hypothesized to explain our data. Other people are just things in the real world. We presume that they exist because that presumption has explanatory power.

You might say that the Solomonoff inductor is being hypocritical here. It assumes that my body has some special bridging law to some sort of immaterial soul, some Real Self that is doing the Solomonoff-inducting, but it doesn't extend that assumption to other people. To be cosmopolitan, you'd say, we should speculate about the bridging laws that might connect experiences to our world like hairs on a supremely shaggy dog.

I'd say that maybe this is the point where the Solomonoff inductor and I part ways, because I don't think I actually have an immaterial soul - it's just a useful perspective to take sometimes. I'd like to think I'm actually doing some kind of naturalized induction that we don't quite know how to formalize yet, one that allows for the fact that the thing doing the inducting might actually be part of the real world, not floating outside it, attached only by an umbilical cord.

I don't just care about people because I think they have bridging laws that connect them to their Real Experiences; any hypotheses about Real Experiences in my description of the world are merely convenient fictions that could be disposed of if only I was Laplace's demon.

I think that in the ultimate generalization of how we care about things, the one that works even when all the weirdnesses of the world are allowed, things that are fictional will not be made fundamental. Which is to say, the reason I don't care about all the encodings of me that could be squeezed into every mundane object I encounter isn't because they all cancel out by some phenomenal symmetry argument, it's because I don't care about those encodings at all. They are, in some deep sense, so weird I don't care about them, and I think that such a gradient that fades off into indifference is a fundamental part of any realistic account of what physical systems we care about.

Comment by charlie-steiner on Defeating Goodhart and the "closest unblocked strategy" problem · 2019-04-03T22:07:37.497Z · score: 2 (1 votes) · LW · GW

One further issue is that if the AI deduces this within one human-model (as in CIRL), it may follow this model off a metaphorical cliff when trying to maximize modeled reward.

Merely expanding the family of models isn't enough, because the best-predicting model is something like a microscopic, non-intentional model of the human - a "nearest unblocked model" problem. The solution should be similar - get the AI to score models so that the sort of model we want it to use is scored highly (or perhaps something more complicated where human morality is undefined). This isn't just a prior - we want predictive quality to be only one of several (as yet ill-defined) criteria.

Comment by charlie-steiner on Announcing the Center for Applied Postrationality · 2019-04-02T06:31:04.038Z · score: 15 (9 votes) · LW · GW

Train postrationality by commenting on Tumblr. By figuring out how Donald Trump's latest move was genius. By living a virtuous life. By defecting in a Prisoner's Dilemma against yourself. By starting your own political campaign. By reading Kierkegaard. By regretting defecting against yourself in the Prisoner's Dilemma and finding a higher power to punish you for it. By humming "The Ballad of Big Yud" to yourself in the shower. By becoming a Scientologist for 89 days and getting your money back with the 90-day money-back guarantee.

Comment by charlie-steiner on User GPT2 is Banned · 2019-04-02T06:23:28.983Z · score: 4 (2 votes) · LW · GW

If GPT2 was from the mod team, 5/10, with mod tools we could have upped the absurdity game a lot. If it was an independent effort, 8/10, you got me :)

Comment by charlie-steiner on Humans aren't agents - what then for value learning? · 2019-03-19T23:33:00.158Z · score: 3 (2 votes) · LW · GW

Sure. It describes how humans aren't robust to distributional shift.

Comment by charlie-steiner on Humans aren't agents - what then for value learning? · 2019-03-18T16:56:21.431Z · score: 2 (1 votes) · LW · GW

I hope so! IRL and CIRL are really nice frameworks for learning from general behavior, and as far as I can tell, learning from verbal behavior requires a simultaneous model of verbal and general behavior, with some extra parts that I don't understand yet.

Comment by charlie-steiner on Humans aren't agents - what then for value learning? · 2019-03-17T21:55:17.297Z · score: 2 (1 votes) · LW · GW

I mostly agree, though you'll only really be able to tell me we have the right answer once we can program it into a computer :) Human introspection is good at producing verbal behavior, but is less good at giving you a utility function on states of the universe. Part of the problem is that it's not like we have "a part of ourselves that does introspection" like it's some kind of orb inside our skulls - breaking human cognition into parts like that is yet another abstraction that has some free parameters to it.

Comment by charlie-steiner on Humans aren't agents - what then for value learning? · 2019-03-16T19:06:11.806Z · score: 2 (1 votes) · LW · GW
Does it seem clear to you that if you model a human as a somewhat complicated thermostat (perhaps making decisions according to some kind of flowchart) then you aren't going to predict that a human would write a post about humans being somewhat complicated thermostats?

Is my flowchart model complicated enough to emulate an RNN? Then I'm not sure.

Or one might imagine a model that has psychological parts, but distributes the function fulfilled by "wants" in an agent model among several different pieces, which might conflict or reinforce each other depending on context. This model could reproduce human verbal behavior about "wanting" with no actual component in the model that formalizes wanting.

If this kind of model works well, it's a counterexample (less compute-intensive than a microphysical model) of the idea I think you're gesturing towards, which is that the data really privileges models in which there's an agent-like formalization of wanting.

Comment by charlie-steiner on Humans aren't agents - what then for value learning? · 2019-03-16T18:01:46.344Z · score: 2 (1 votes) · LW · GW

Person A isn't getting it quite right :P Humans want things, in the usual sense that "humans want things" indicates a useful class of models I use to predict humans. But they don't Really Want things, the sort of essential Wanting that requires a unique, privileged function from a physical state of the human to the things Wanted.

So here's the dialogue with A's views more of an insert of my own:

A: Humans aren't agents, by which I mean that humans don't Really Want things. It would be bad to make an AI that assumes they do.

B: What do you mean by "bad"?

A: I mean that there wouldn't be such a privileged Want for the AI to find in humans - humans want things, but can be modeled as wanting different things depending on the environment and level of detail of the model.

B: No, I mean how could you cash out "bad" if not in terms of what you Really Want?

A: Just in terms of what I regular, contingently want - how I'm modeling myself right now.

B: But isn't that a privileged model that the AI could figure out and then use to locate your wants? And since these wants are so naturally privileged, wouldn't that make them what you Really Want?

A: The AI could do something like that, but I don't like to think of that as finding out what I Really Want. The result isn't going to be truly unique because I use multiple models of myself, and they're all vague and fallible. And maybe more importantly, programming an AI to understand me "on my own terms" faces a lot of difficult challenges that don't make sense if you think the goal is just to translate what I Really Want into the AI's internal ontology.

B: Like what?

A: You remember the Bay Area train analogy at the end of The Tails Coming Apart as Metaphor for Life? When the train lines diverge, thinking of the problem as "figure out what train we Really Wanted" doesn't help, and might divert people from the possible solutions, which are going to be contingent and sometimes messy.

B: But eventually you actually do follow one of the train lines, or program it into the AI, which uniquely specifies that as what you Really Want! Problem solved.

A: "Whatever I do is what I wanted to do" doesn't help you make choices, though.

Comment by charlie-steiner on Humans aren't agents - what then for value learning? · 2019-03-16T02:09:31.888Z · score: 5 (3 votes) · LW · GW

Could you elaborate on what you mean by "if your model of humans is generative enough to generate itself, then it will assign agency to at least some humans?" I think the obvious extreme is a detailed microscopic model that reproduces human behavior without using the intentional stance - is this a model that doesn't generate itself, or is this a model that assigns agency to some humans?

It seems to me that you're relying on the verb "generate" here to involve some sort of human intentionality, maybe? But the argument of this post is that our intentionality is inexact and doesn't suffice.

Suppose you are building an AI and want something from it. Then you are an agent with respect to that thing, since you want it.

There's wanting, and then there's Wanting. The AI's model of me isn't going to regenerate my Real Wanting, which requires the Essence of True Desire. It's only going to regenerate the fact that I can be modeled as wanting the thing. But I can be modeled as wanting lots of things, is the entire point.

Comment by charlie-steiner on A theory of human values · 2019-03-15T22:55:59.178Z · score: 2 (1 votes) · LW · GW

This has prompted me to get off my butt and start publishing the more useful bits of what I've been thinking about. Long story short, I disagree with you while still almost entirely agreeing with you.

This isn't really the full explanation of why I think the AI can't just be given a human model and told to fill it in, though. For starters, there's also the issue about whether the human model should "live" in the AI's native ontology, or whether it should live in its own separate, "fictional" ontology.

I've become more convinced of the latter - that if you tell the AI to figure out "human values" in a model that's interacting with whatever its best-predicting ontology is, it will come up with values that include things as strange as "Charlie wants to emit CO2" (though not necessarily in the same direction). Instead, its model of my values might need to be described in a special ontology in which human-level concepts are simple but the AI's overall predictions are worse, in order for a predictive human model to actually contain what I'd consider to be my values.

Comment by charlie-steiner on A cognitive intervention for wrist pain · 2019-03-15T22:28:50.898Z · score: 4 (2 votes) · LW · GW

Sure. And my comment is more aimed at the audience than at Richard - I don't know him, and I agree that reducing stress can help, and can help more the more you're stressed. Maybe some parts of his story seem like they could also fit with a story of injury and healing (did you know that wrists feeling strange, swollen or painful at night or after other long periods of stillness can be because of reduced flow of lymph fluid through inflamed wrists?), but they could also fit with his story of stress. I think this is one of those posts that has novelty precisely because the common view is actually right most of the time, and my past self probably needed to take the common view into account more.

Humans aren't agents - what then for value learning?

2019-03-15T22:01:38.839Z · score: 20 (6 votes)
Comment by charlie-steiner on A cognitive intervention for wrist pain · 2019-03-15T20:15:46.610Z · score: 6 (6 votes) · LW · GW

You say " there would be an epidemic of wrist pain at typing-heavy workplaces" as if there isn't a ton of wrist pain at typing-heavy workplaces. And, like, funny how stress is making your wrists hurt rather than your toes or elbows, right?

I think, as one grows old, one gets a better sense that the human body just breaks down sometimes, and doesn't repair itself perfectly. Those horribly injured soldiers you bring up probably had aches and pains sometimes for the rest of their lives that they never really talked about, because other people wouldn't understand. My mom has pain in her left foot sometimes from where she broke it 40 years ago. And eventually, our bodies will just accumulate injuries more and more until we die.

If you have pain that you think is due to wrist inflammation, check out the literature and take action to the degree you can. The mind can control pain quite well, and the human body is tough, but if you do manage to injure yourself you'll regret it.

Comment by charlie-steiner on Open Thread February 2019 · 2019-03-05T18:08:55.086Z · score: 4 (2 votes) · LW · GW

Definitely depends on the field. For experimental papers in the field I'm already in, it only takes like half an hour, and then following up on the references for things I need to know the context for takes an additional 0.5-2 hours. For theory papers 1-4 hours is more typical.

Comment by charlie-steiner on Syntax vs semantics: alarm better example than thermostat · 2019-03-04T20:32:09.354Z · score: 2 (1 votes) · LW · GW

Sure. "If it's smart, it won't make simple mistakes." But I'm also interested in the question of whether, given the first few in this sequence of approximate agents, one could do a good job at predicting the next one.

It seems like you could - like there is a simple rule governing these systems ("check whether there's a human in the greenhouse") that might involve difficult interaction with the world in practice but is much more straightforward when considered from the omniscient third-person view of imagination. And given that this rule is (arguendo) simple within a fairly natural (though not by any means unique) model of the world, and that it helps predict the sequence, one might be able to guess that this rule was likely just from looking at the sequence of systems.

(This also relies on the distinction between just trying to find likely or good-enough answers, and the AI doing search to find weird corner cases. The inferred next step in the sequence might be expected to give similar likely answers, with no similar guarantee for corner-case answers.)

Comment by charlie-steiner on To understand, study edge cases · 2019-03-03T14:00:29.777Z · score: 5 (3 votes) · LW · GW

Is this contra ?

To repeat my example from there: to understand superconductors, it doesn't help much to smash them into their components, even though it helps a lot for understanding atoms. A non-philosophical example from your list where people went to the "extremist" view for a little too long might be mental health before the rise of positive psychology.

Comment by charlie-steiner on So You Want to Colonize The Universe Part 4: Velocity Changes and Energy · 2019-02-27T23:28:04.012Z · score: 2 (1 votes) · LW · GW

Minor nitpick: diamond is only metastable, especially at high temperatures. It will slowly turn to graphite. After sufficient space travel, all diamond parts will be graphite parts.

Comment by charlie-steiner on So You Want to Colonize the Universe Part 2: Deep Time Engineering · 2019-02-27T19:11:45.015Z · score: 5 (3 votes) · LW · GW
There are exceptions. The sea walls in the Netherlands are sized for 10,000-year flood numbers, and I got a pleasant chill up my back when I read that, because there's something really nice about seeing a civilization build for thousands of years in the future.

Or at least people taking expected value mildly seriously and not just copying the engineering standards acceptable for roads.

Comment by charlie-steiner on So You Want to Colonize The Universe · 2019-02-27T18:59:43.916Z · score: 3 (2 votes) · LW · GW

How about deliberately launching probes that you could catch up with by expending more resources, but are still adequate to reach our local group of galaxies? That sounds like a pretty interesting idea. Like "I'm pretty sure moral progress has more or less converged, but in case I really want to change my mind about the future of far-off galaxies, I'll leave myself the option to spend lots of resources to send a faster probe."

If we imagine that the ancient Greeks had the capability to launch a Von Neumann probe to a receding galaxy, I'd rather they do it (and end up with a galaxy full of ancient Greek morality) than they let it pass over the cosmic horizon.

Comment by charlie-steiner on Is LessWrong a "classic style intellectual world"? · 2019-02-27T18:54:16.772Z · score: 4 (3 votes) · LW · GW

Different people write with different goals. Writing is useful for forcing me to think, and to the extent I want attention I want it from a fairly small group. On the other hand, high readership and karma naturally accrues to people who write the sorts of things that get high reader counts.

I have literally zero problems with this natural scale of karma-numbers as long as it's not actively interfering with my goals on the site. Maybe if I was a reader who wanted to use karma-number to sort posts, I would be inconvenienced by the stratification by topic.

Comment by charlie-steiner on How to get value learning and reference wrong · 2019-02-27T15:32:33.855Z · score: 2 (1 votes) · LW · GW

Definitely a parallel sort of move! I would have already said that I was pretty rationality anti-realist, but I seem to have become even more so.

If I had to describe briefly how I've changed my mind: before, I thought that to learn an AI stand-in for human preferences, you should look at its effect on the real world. Now, I take much more seriously the idea that human preferences "live" in a model that is itself a useful fiction.

How to get value learning and reference wrong

2019-02-26T20:22:43.155Z · score: 35 (10 votes)
Comment by charlie-steiner on The Case for a Bigger Audience · 2019-02-23T20:07:05.097Z · score: 5 (3 votes) · LW · GW

I think people are just writing about less accessible things on average. I wouldn't want to have more comments just by not talking about abstruse topics, at the moment. I love you all, but I also love AI safety :P

I briefly looked at the comments within 1 year on some old LW posts, and the numbers seem to match from then too - only ~25 comments on the more rareified meta-ethics posts, well below average.

Comment by charlie-steiner on Can an AI Have Feelings? or that satisfying crunch when you throw Alexa against a wall · 2019-02-23T19:27:53.956Z · score: 5 (3 votes) · LW · GW

I'm reminded of Dennett's passage on the color red. Paraphrased:

To judge that you have seen red is to have a complicated visual discrimination process send a simple message to the rest of your brain. There is no movie theater inside the brain that receives this message and then obediently projects a red image on some inner movie screen, just so that your inner homunculus can see it and judge it all over again. Your brain only needs to make the judgment once!

Similarly, if I think I'm conscious, or feel emotion, it's not like my brain notices and then passes this message "further inside," to the "real me" (the homunculus). My brain only has to go into the "angry state" once - it doesn't have to then send an angry message to the homunculus so that I can really feel anger.

Comment by charlie-steiner on Thoughts on Human Models · 2019-02-22T11:57:19.389Z · score: 4 (2 votes) · LW · GW

I actually think this is pretty wrong (posts forthcoming, but see here for the starting point). You make a separation between the modeled human values and the real human values, but "real human values" are a theoretical abstraction, not a basic part of the world. In other words, real human values were always a subset of modeled human values.

In the example of designing a transit system, there is an unusually straightforward division between things that actually make the transit system good (by concise human-free metrics like reliability or travel time), and things that make human evaluators wrongly think it's good. But there's not such a concise human-free way to write down general human values.

The pitfall of optimization here happens when the AI is searching for an output that has a specific effect on humans. If you can't remove the fact that there is a model of humans involved, then the AI has to be evaluating its output in some other way than modeling the human's reaction to it.

Comment by charlie-steiner on Probability space has 2 metrics · 2019-02-10T20:42:49.752Z · score: 3 (2 votes) · LW · GW

Awesome idea! I think there might be something here, but I think the difference between "no chance" and "0.01% chance" is more of a discrete change from not tracking something to tracking it. We might also expect neglect of "one in a million" vs. "one in a trillion" in both updates and decision-making, which would cause a mistake opposite to the one this model predicts in the decision-making case.
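For concreteness, here's a small sketch of the two metrics on probability space - the ordinary linear metric and the log-odds metric - applied to "one in a million" vs. "one in a trillion" (the example numbers are my own):

```python
import math

def log_odds(p: float) -> float:
    """Log-odds coordinate; distances in it measure nats of evidence.
    It diverges as p -> 0, so "no chance" is infinitely far from any p > 0."""
    return math.log(p / (1 - p))

p, q = 1e-6, 1e-12  # "one in a million" vs. "one in a trillion"

linear_distance = abs(p - q)                        # negligible, ~1e-6
log_odds_distance = abs(log_odds(p) - log_odds(q))  # large, ~13.8 nats
```

In the linear metric the two probabilities are essentially identical (so neglecting the difference costs almost nothing in a one-shot decision), while in the log-odds metric they're nearly 14 nats apart - a huge amount of evidence to ignore when updating.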

Comment by charlie-steiner on Philosophy as low-energy approximation · 2019-02-10T03:14:54.835Z · score: 2 (1 votes) · LW · GW

About 95%. Because philosophy is easy* and full of obvious confusions.

(* After all, anyone can do it well enough that they can't see their own mistakes. And with a little more effort, you can't even see your mistakes when they're pointed out to you. That's, like, the definition of easy, right?)

95% isn't all that high a confidence, if we put aside "how dare you rate yourself so highly?" type arguments for a bit. I wouldn't trust a parachute that had a 95% chance of opening. Most of the remaining 5% is not dualism being true or us needing a new kind of science, it's just me having misunderstood something important.

Comment by charlie-steiner on Philosophy as low-energy approximation · 2019-02-08T19:53:40.625Z · score: 4 (2 votes) · LW · GW

Anyhow, I agree that we have long since been rehashing standard arguments here :P

Comment by charlie-steiner on Philosophy as low-energy approximation · 2019-02-07T22:41:12.593Z · score: 7 (3 votes) · LW · GW

Seeing red is more than a role or disposition. That is what you have left out.

Suppose epiphenomenalism is true. We would still need two separate explanations - one explanation of your epiphenomenal activity in terms of made-up epiphenomenology, and a different explanation for how your physical body thinks it's really seeing red and types up these arguments on LessWrong, despite having no access to your epiphenomena.

The mere existence of that second explanation makes it wrong to have absolute confidence in your own epiphenomenal access. After all, we've just described approximate agents that think they have epiphenomenal access, and type and make facial expressions and release hormones as if they do, without needing any epiphenomena at all.

We can imagine the approximate agent made out of atoms, and imagine just what sort of mistake it's making when it says "no, really, I see red in a special nonphysical way that you have yet to explain" even when it doesn't have access to the epiphenomena. And then we can endeavor not to make that mistake.

If I, the person typing these words, can Really See Redness in a way that is independent or additional to a causal explanation of my thoughts and actions, my only honest course of action is to admit that I don't know about it.

Comment by charlie-steiner on Philosophy as low-energy approximation · 2019-02-06T19:19:07.482Z · score: 7 (3 votes) · LW · GW

I'm supposing that we're conceptualizing people using a model that has internal states. "Agency" of humans is shorthand for "conforms to some complicated psychological model."

I agree that I do see red. That is to say, the collection of atoms that is my body enters a state that plays the same role in the real world as "seeing red" plays in the folk-psychological model of me. If seeing red makes the psychological model more likely to remember camping as a child, exposure to a red stimulus makes the atoms more likely to go into a state that corresponds to remembering camping.

"No, no," you say. "That's not what seeing red is - you're still disagreeing with me. I don't mean that my atoms are merely in a correspondence with some state in an approximate model that I use to think about humans, I mean that I am actually in some difficult to describe state that actually has parts like the parts of that model."

"Yes," I say "- you're definitely in a state that corresponds to the model."

"Arrgh, no! I mean when I see red, I really see it!"

"When I see red, I really see it too."


It might at this point be good for me to reiterate my claim from the post, that rather than taking things in our notional world and asking "what is the true essence of this thing?", it's more philosophically productive to ask "what approximate model of the world has this thing as a basic object?"

Comment by charlie-steiner on Philosophy as low-energy approximation · 2019-02-06T18:31:45.357Z · score: 4 (2 votes) · LW · GW

Then the thought experiment is a useful negative result telling us we need something more comprehensive.

Paradigms also outline which negative results are merely noise :P I know it's not nice to pick on people, but look at the negative utilitarians. They're perfectly nice people, they just kept looking for The Answer until they found something they could see no refutation of, and look where that got them.

I'm not absolutely against thought experiments, but I think that high-energy philosophy as a research methodology is deeply flawed.

Philosophy as low-energy approximation

2019-02-05T19:34:18.617Z · score: 38 (20 votes)

Can few-shot learning teach AI right from wrong?

2018-07-20T07:45:01.827Z · score: 16 (5 votes)

Boltzmann Brains and Within-model vs. Between-models Probability

2018-07-14T09:52:41.107Z · score: 19 (7 votes)

Is this what FAI outreach success looks like?

2018-03-09T13:12:10.667Z · score: 53 (13 votes)

Book Review: Consciousness Explained

2018-03-06T03:32:58.835Z · score: 101 (27 votes)

A useful level distinction

2018-02-24T06:39:47.558Z · score: 26 (6 votes)

Explanations: Ignorance vs. Confusion

2018-01-16T10:44:18.345Z · score: 18 (9 votes)

Empirical philosophy and inversions

2017-12-29T12:12:57.678Z · score: 8 (3 votes)

Dan Dennett on Stances

2017-12-27T08:15:53.124Z · score: 8 (4 votes)

Philosophy of Numbers (part 2)

2017-12-19T13:57:19.155Z · score: 11 (5 votes)

Philosophy of Numbers (part 1)

2017-12-02T18:20:30.297Z · score: 25 (9 votes)

Limited agents need approximate induction

2015-04-24T21:22:26.000Z · score: 1 (1 votes)