TurnTrout's shortform feed

post by TurnTrout · 2019-06-30T18:56:49.775Z · score: 29 (6 votes) · LW · GW · 145 comments


Comments sorted by top scores.

comment by TurnTrout · 2019-12-17T06:37:41.969Z · score: 30 (11 votes) · LW(p) · GW(p)

My maternal grandfather was the scientist in my family. I was young enough that my brain hadn't decided to start doing its job yet, so my memories with him are scattered and inconsistent and hard to retrieve. But there's no way that I could forget all of the dumb jokes he made; how we'd play Scrabble and he'd (almost surely) pretend to lose to me [? · GW]; how, every time he got to see me, his eyes would light up with boyish joy.

My greatest regret took place in the summer of 2007. My family celebrated the first day of the school year at an all-you-can-eat buffet, delicious food stacked high as the eye could fathom under lights of green, red, and blue. After a particularly savory meal, we made to leave the surrounding mall. My grandfather asked me to walk with him.

I was a child who thought to avoid being seen too close to uncool adults. I wasn't thinking. I wasn't thinking about hearing the cracking sound of his skull against the ground. I wasn't thinking about turning to see his poorly congealed blood flowing from his forehead out onto the floor. I wasn't thinking I would nervously watch him bleed for long minutes while shielding my seven-year-old brother from the sight. I wasn't thinking that I should go visit him in the hospital, because that would be scary. I wasn't thinking he would die of a stroke the next day.

I wasn't thinking the last thing I would ever say to him would be "no[, I won't walk with you]".

Who could think about that? No, that was not a foreseeable mistake. Rather, I wasn't thinking about how precious and short my time with him was. I wasn't appreciating how fragile my loved ones are. I didn't realize that something as inconsequential as an unidentified ramp in a shopping mall was allowed to kill my grandfather.

I miss you, Joseph Matt.

comment by TurnTrout · 2019-12-17T21:34:08.992Z · score: 11 (6 votes) · LW(p) · GW(p)

My mother told me my memory was indeed faulty. He never asked me to walk with him; instead, he asked me to hug him during dinner. I said I'd hug him "tomorrow".

But I did, apparently, want to see him in the hospital; it was my mother and grandmother who decided I shouldn't see him in that state.

comment by Raemon · 2019-12-17T22:44:45.087Z · score: 8 (3 votes) · LW(p) · GW(p)

<3

comment by habryka (habryka4) · 2019-12-17T18:44:48.154Z · score: 6 (3 votes) · LW(p) · GW(p)

Thank you for sharing.

comment by TurnTrout · 2019-06-30T18:57:46.543Z · score: 30 (8 votes) · LW(p) · GW(p)

comment by TurnTrout · 2020-04-26T22:24:15.587Z · score: 23 (9 votes) · LW(p) · GW(p)

If you want to read Euclid's Elements, look at this absolutely gorgeous online rendition:

comment by Ben Pace (Benito) · 2020-04-27T00:31:17.426Z · score: 6 (3 votes) · LW(p) · GW(p)

Wow.

comment by William Walker (william-walker) · 2020-04-28T22:38:44.714Z · score: 1 (5 votes) · LW(p) · GW(p)

Nice! Thanks!

comment by TurnTrout · 2020-02-12T01:51:02.670Z · score: 22 (6 votes) · LW(p) · GW(p)

For quite some time, I've disliked wearing glasses. However, my eyes are sensitive, so I dismissed the possibility of contacts.

Over break, I realized I could still learn to use contacts; it would just take me longer. Sure enough, it took me an hour and five minutes to put in my first contact, and I couldn't get it out on my own. An hour of practice later, I put in a contact on my first try, and took it out a few seconds later. I'm very happily wearing contacts right now, as a matter of fact.

I'd suffered glasses for over fifteen years because of a cached decision – because I didn't think to rethink something literally right in front of my face every single day.

What cached decisions have you not reconsidered?

comment by TurnTrout · 2020-06-29T00:46:47.566Z · score: 21 (8 votes) · LW(p) · GW(p)

For the last two years, typing for 5+ minutes hurt my wrists. I tried a lot of things: shots, physical therapy, trigger-point therapy, acupuncture, massage tools, wrist and elbow braces at night, exercises, stretches. Sometimes it got better. Sometimes it got worse.

No Beat Saber, no lifting weights, and every time I read a damn book I would start translating the punctuation into Dragon NaturallySpeaking syntax.

Text: "Consider a bijection "

My mental narrator: "Cap consider a bijection space dollar foxtrot colon cap x backslash tango oscar cap y dollar"

Have you ever tried dictating a math paper in LaTeX? Or dictating code? Telling your computer "click" and waiting a few seconds while resisting the temptation to just grab the mouse? Dictating your way through a computer science PhD?

And then.... and then, a month ago, I got fed up. What if it was all just in my head, at this point? I'm only 25. This is ridiculous. How can it possibly take me this long to heal such a minor injury?

I wanted my hands back - I wanted it real bad. I wanted it so bad that I did something dirty: I made myself believe something. Well, actually, I pretended to be a person who really, really believed his hands were fine and healing and the pain was all psychosomatic.

And... it worked, as far as I can tell. It totally worked. I haven't dictated in over three weeks. I play Beat Saber as much as I please. I type for hours and hours a day with only the faintest traces of discomfort.

What?

comment by DanielFilan · 2020-09-11T23:09:16.052Z · score: 4 (2 votes) · LW(p) · GW(p)

Is the problem still gone?

comment by TurnTrout · 2020-09-12T02:40:12.239Z · score: 6 (3 votes) · LW(p) · GW(p)

Totally 100% gone. Sometimes I go weeks forgetting that pain was ever part of my life. 

comment by Vanessa Kosoy (vanessa-kosoy) · 2020-06-29T12:12:17.028Z · score: 4 (2 votes) · LW(p) · GW(p)

I'm glad it worked :) It's not that surprising given that pain is known to be susceptible to the placebo effect. I would link the SSC post, but, alas...

comment by Teerth Aloke · 2020-06-29T01:46:42.987Z · score: 3 (2 votes) · LW(p) · GW(p)

This is unlike anything I have heard!

comment by mingyuan · 2020-06-29T01:54:14.151Z · score: 6 (3 votes) · LW(p) · GW(p)

It's very similar to what John Sarno (author of Healing Back Pain and The Mindbody Prescription) preaches, as well as Howard Schubiner. There's also a rationalist-adjacent dude who started a company (Axy Health) based on these principles. Fuck if I know how any of it works though, and it doesn't work for everyone. Congrats though TurnTrout!

comment by Teerth Aloke · 2020-06-29T03:52:52.824Z · score: 1 (1 votes) · LW(p) · GW(p)

My Dad, it seems, might have a psychosomatic stomach ache. How can I convince him to convince himself that he has no problem?

comment by mingyuan · 2020-06-29T04:52:34.336Z · score: 4 (2 votes) · LW(p) · GW(p)

If you want to try out the hypothesis, I recommend that he (or you, if he's not receptive to it) read Sarno's book. I want to reiterate that it does not work in every situation, but you're welcome to take a look.

comment by avturchin · 2020-06-29T10:46:34.826Z · score: 2 (1 votes) · LW(p) · GW(p)

Looks like reverse stigmata effect.

comment by Raemon · 2020-06-29T02:34:23.704Z · score: 2 (1 votes) · LW(p) · GW(p)

Woo faith healing! 

(hope this works out longterm, and doesn't turn out to be secretly hurting still)

comment by TurnTrout · 2020-06-29T03:16:21.709Z · score: 4 (2 votes) · LW(p) · GW(p)

aren't we all secretly hurting still?

comment by mingyuan · 2020-06-29T04:54:01.028Z · score: 2 (1 votes) · LW(p) · GW(p)

....D:

comment by TurnTrout · 2020-01-13T02:15:39.463Z · score: 18 (8 votes) · LW(p) · GW(p)

While reading Focusing today, I thought about the book and wondered how many exercises it would have. I felt a twinge of aversion. In keeping with my goal of increasing internal transparency, I said to myself: "I explicitly and consciously notice that I felt averse to some aspect of this book".

I then Focused on the aversion. Turns out, I felt a little bit disgusted, because a part of me reasoned thusly:

If the book does have exercises, it'll take more time. That means I'm spending reading time on things that aren't math textbooks. That means I'm slowing down.

(Transcription of a deeper Focusing on this reasoning)

I'm afraid of being slow. Part of it is surely the psychological remnants of the RSI I developed in the summer of 2018. That is, slowing down is now emotionally associated with disability and frustration. There was a period of meteoric progress as I started reading textbooks and doing great research, and then there was pain. That pain struck even when I was just trying to take care of myself, sleep, open doors. That pain then left me on the floor of my apartment, staring at the ceiling, desperately willing my hands to just get better. They didn't (for a long while), so I just lay there and cried. That was slow, and it hurt. No reviews, no posts, no typing, no coding. No writing, slow reading. That was slow, and it hurt.

Part of it used to be a sense of "I need to catch up and learn these other subjects which [Eliezer / Paul / Luke / Nate] already know". Through internal double crux, I've nearly eradicated this line of thinking, which is neither helpful nor relevant nor conducive to excitedly learning the beautiful settled science of humanity. Although my most recent post [LW · GW] touched on impostor syndrome, that isn't really a thing for me. I feel reasonably secure in who I am, now (although part of me worries that others wrongly view me as an impostor?).

However, I mostly just want to feel fast, efficient, and swift again. I sometimes feel like I'm in a race with Alex, and I feel like I'm losing.

comment by TurnTrout · 2019-07-05T23:00:58.761Z · score: 17 (12 votes) · LW(p) · GW(p)

I passed a homeless man today. His face was wracked with pain, body rocking back and forth, eyes clenched shut. A dirty sign lay forgotten on the ground: "very hungry".

This man was once a child, with parents and friends and dreams and birthday parties and maybe siblings he'd get in arguments with and snow days he'd hope for.

And now he's just hurting.

And now I can't help him without abandoning others. So he's still hurting. Right now.

Reality is still allowed to make this happen. This is wrong. This has to change.

comment by Said Achmiz (SaidAchmiz) · 2019-07-06T03:12:51.584Z · score: 8 (6 votes) · LW(p) · GW(p)

How would you help this man, if having to abandon others in order to do so were not a concern? (Let us assume that someone else—someone whose competence you fully trust, and who will do at least as good a job as you will—is going to take care of all the stuff you feel you need to do.)

What is it you had in mind to do for this fellow—specifically, now—that you can’t (due to those other obligations)?

comment by TurnTrout · 2019-07-06T05:02:37.715Z · score: 13 (5 votes) · LW(p) · GW(p)

Suppose I actually cared about this man with the intensity he deserved - imagine that he were my brother, father, or best friend.

The obvious first thing to do before interacting further is to buy him a good meal and a healthy helping of groceries. Then, I need to figure out his deal. Is he hurting, or is he also suffering from mental illness?

If the former, I'd go the more straightforward route of befriending him, helping him purchase a sharp business professional outfit, teaching him to interview and present himself with confidence, secure an apartment, and find a job.

If the latter, this gets trickier. I'd still try and befriend him (consistently being a source of cheerful conversation and delicious food would probably help), but he might not be willing or able to get the help he needs, and I wouldn't have the legal right to force him. My best bet might be to enlist the help of a psychological professional for these interactions. If this doesn't work, my first thought would be to influence the local government to get the broader problem fixed (I'd spend at least an hour considering other plans before proceeding further, here). Realistically, there's likely a lot of pressure in this direction already, so I'd need to find an angle from which few others are pushing or pulling where I can make a difference. I'd have to plot out the relevant political forces, study accounts of successful past lobbying, pinpoint the people I need on my side, and then target my influencing accordingly.

(All of this is without spending time looking at birds-eye research and case studies of poverty reduction; assume counterfactually that I incorporate any obvious improvements to these plans, because I'd care about him and dedicate more than like 4 minutes of thought).

comment by Said Achmiz (SaidAchmiz) · 2019-07-06T05:53:48.328Z · score: 2 (4 votes) · LW(p) · GW(p)

Well, a number of questions may be asked here (about desert, about causation, about autonomy, etc.). However, two seem relevant in particular:

First, it seems as if (in your latter scenario) you’ve arrived (tentatively, yes, but not at all unreasonably!) at a plan involving systemic change. As you say, there is quite a bit of effort being expended on this sort of thing already, so, at the margin, any effective efforts on your part would likely be both high-level and aimed in an at-least-somewhat-unusual direction.

… yet isn’t this what you’re already doing?

Second, and unrelatedly… you say:

Suppose I actually cared about this man with the intensity he deserved—imagine that he were my brother, father, or best friend.

Yet it seems to me that, empirically, most people do not expend the level of effort which you describe, even for their siblings, parents, or close friends. Which is to say that the level of emotional and practical investment you propose to make (in this hypothetical situation) is, actually, quite a bit greater than that which most people invest in their family members or close friends.

The question, then, is this: do you currently make this degree of investment (emotional and practical) in your actual siblings, parents, and close friends? If so—do you find that you are unusual in this regard? If not—why not?

comment by TurnTrout · 2019-07-06T06:08:46.662Z · score: 17 (4 votes) · LW(p) · GW(p)

… yet isn’t this what you’re already doing?

I work on technical AI alignment, so some of those I help (in expectation) don't even exist yet. I don't view this as what I'd do if my top priority were helping this man.

The question, then, is this: do you currently make this degree of investment (emotional and practical) in your actual siblings, parents, and close friends? If so—do you find that you are unusual in this regard? If not—why not?

That's a good question. I think the answer is yes, at least for my close family. Recently, I've expended substantial energy persuading my family to sign up for cryonics with me, winning over my mother, brother, and (I anticipate) my aunt. My father has lingering concerns which I think he wouldn't have upon sufficient reflection, so I've designed a similar plan for ensuring he makes what I perceive to be the correct, option-preserving choice. For example, I made significant targeted donations to effective charities on his behalf to offset (what he perceives as) a considerable drawback of cryonics: his inability to also be an organ donor.

A universe in which humanity wins but my dad is gone would be quite sad to me, and I'll take whatever steps necessary to minimize the chances of that.

I don't know how unusual this is. This reminds me of the relevant Harry-Quirrell exchange; most people seem beaten-down and hurt themselves, and I can imagine a world in which people are in better places and going to greater lengths for those they love. I don't know if this is actually what would make more people go to these lengths (just an immediate impression).

comment by Raemon · 2019-07-06T03:30:46.593Z · score: 2 (6 votes) · LW(p) · GW(p)

I predict that this comment is not helpful to Turntrout.

comment by Raemon · 2019-07-05T23:07:11.852Z · score: 7 (2 votes) · LW(p) · GW(p)

:(

Song I wrote about this once (not very polished)

comment by TurnTrout · 2020-07-07T23:04:03.243Z · score: 15 (5 votes) · LW(p) · GW(p)

I think instrumental convergence also occurs in the model space for machine learning. For example, many different architectures likely learn edge detectors in order to minimize classification loss on MNIST. But wait - you'd also learn edge detectors to maximize classification loss on MNIST (loosely, getting 0% on a multiple-choice exam requires knowing all of the right answers). I bet you'd learn these features for a wide range of cost functions. I wonder if that's already been empirically investigated?

And, same for adversarial features. And perhaps, same for mesa optimizers (understanding how to stop mesa optimizers from being instrumentally convergent seems closely related to solving inner alignment). 

What can we learn about this?
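A minimal sketch of the experiment this speculation suggests, assuming PyTorch and torchvision are available (the tiny architecture, single epoch, and hyperparameters are illustrative choices, not anything prescribed above): train the same small conv net to minimize vs. maximize cross-entropy on MNIST, then compare the learned first-layer filters.

```python
# Sketch: do first-layer features look similar when we minimize vs. maximize
# classification loss on MNIST? (Assumes PyTorch + torchvision.)
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms

def make_net():
    # 8 first-layer 5x5 filters; these are what we inspect afterwards.
    return nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=5),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(8 * 24 * 24, 10),
    )

def train(net, loader, maximize_loss=False, epochs=1):
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(net(x), y)
            if maximize_loss:
                loss = -loss  # gradient ascent on the classification loss
            opt.zero_grad()
            loss.backward()
            opt.step()
    return net

data = datasets.MNIST(".", train=True, download=True, transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(data, batch_size=128, shuffle=True)

net_min = train(make_net(), loader, maximize_loss=False)
net_max = train(make_net(), loader, maximize_loss=True)

# Compare the learned 5x5 filters (shape [8, 1, 5, 5]) visually or with a
# similarity metric of your choice.
print(net_min[0].weight.detach().shape, net_max[0].weight.detach().shape)
```

Whether the maximize-loss filters actually turn out edge-detector-like is exactly the empirical question raised above.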

comment by evhub · 2020-07-07T23:36:08.711Z · score: 6 (3 votes) · LW(p) · GW(p)

A lot of examples of this sort of stuff show up in OpenAI clarity's circuits analysis work. In fact, this is precisely their Universality hypothesis. See also my discussion here [LW · GW].

comment by TurnTrout · 2019-12-04T00:50:30.859Z · score: 15 (4 votes) · LW(p) · GW(p)

Listening to Eneasz Brodski's excellent reading of Crystal Society, I noticed how curious I am about how AGI will end up working. How are we actually going to do it? What are those insights? I want to understand quite badly, which I didn't realize until experiencing this (so far) intelligently written story.

Similarly, how do we actually "align" agents, and what are good frames for thinking about that?

Here's to hoping we don't sate the former curiosity too early.

comment by TurnTrout · 2020-04-24T15:38:04.997Z · score: 13 (4 votes) · LW(p) · GW(p)

Weak derivatives

In calculus, the product rule says $(fg)' = f'g + fg'$. The fundamental theorem of calculus says that the Riemann integral acts as the anti-derivative.[1] Combining these two facts, we derive integration by parts:

$$\int_a^b f(x)g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)g(x)\,dx.$$

It turns out that we can use these two properties to generalize the derivative to match some of our intuitions on edge cases. Let's think about the absolute value function $f(x) = |x|$:

Image from Wikipedia

The boring old normal derivative isn't defined at $x = 0$, but it seems like it'd make sense to be able to say that the derivative there is e.g. 0. Why might this make sense?

Taylor's theorem (and its generalizations) characterize first derivatives as tangent lines with slope $f'(a)$ which provide good local approximations of $f$ around $a$: $f(x) \approx f(a) + f'(a)(x - a)$. You can prove that this is the best approximation you can get using only $f(a)$ and $f'(a)$! In the absolute value example, defining the "derivative" to be zero at $x = 0$ would minimize approximation error on average in neighborhoods around the origin.

In multivariable calculus, the Jacobian is a tangent plane which again minimizes approximation error (with respect to the Euclidean distance, usually) in neighborhoods around the function. That is, having a first derivative means that the function can be locally approximated by a linear map. It's like a piece of paper that you glue onto the point in question.


This reasoning even generalizes to the infinite-dimensional case with functional derivatives (see my recent functional analysis textbook review [LW(p) · GW(p)]). All of these cases are instances of the Fréchet derivative.

Complex analysis provides another perspective on why this might make sense, but I think you get the idea and I'll omit that for now.

We can define a weaker notion of differentiability which lets us do this – in fact, it lets us define the weak derivative to be anything at $x = 0$! Now that I've given some motivation, here's a great explanation of how weak derivatives arise from the criterion of "satisfy integration by parts for all relevant functions".
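For reference, the criterion itself is the standard definition (assuming $u$ and $v$ are locally integrable): $v$ is a weak derivative of $u$ if, for every smooth, compactly supported test function $\varphi$,

$$\int_{-\infty}^{\infty} u(x)\,\varphi'(x)\,dx = -\int_{-\infty}^{\infty} v(x)\,\varphi(x)\,dx.$$

For $u(x) = |x|$, any $v$ with $v(x) = -1$ for $x < 0$ and $v(x) = 1$ for $x > 0$ satisfies this, no matter which value we pick at $x = 0$ (a single point doesn't change the integrals) – which is why the weak derivative can be "anything at $x = 0$".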


  1. As far as I can tell, the indefinite Riemann integral being the anti-derivative means that it's the inverse of the differentiation operator $\frac{d}{dx}$ in the group-theoretic sense – with respect to composition in the $\mathbb{R}$-vector space of operators on real-valued functions. You might not expect this, because integration maps an integrable function $f$ to a set of functions $\{F + C \mid C \in \mathbb{R}\}$. However, this doesn't mean that the inverse isn't unique (as it must be), because the inverse is taken in operator-space. ↩︎

comment by TurnTrout · 2020-04-24T15:47:08.477Z · score: 2 (1 votes) · LW(p) · GW(p)

The reason the derivative is undefined at $x = 0$ for the absolute value function is that you need the limit of the difference quotient to be the same for all sequences converging to 0 – both from the left and from the right. There's a nice way to motivate this in higher-dimensional settings by thinking about the action of e.g. complex multiplication, but this is a much stronger notion than real differentiability and I'm not quite sure how to think about motivating the single-valued real case yet. Of course, you can say things like "the theorems just work out nicer if you require both the lower and upper limits be the same"...

comment by TurnTrout · 2019-09-18T21:57:15.893Z · score: 13 (4 votes) · LW(p) · GW(p)

Good, original thinking feels present to me - as if mental resources are well-allocated.

The thought which prompted this:

Sure, if people are asked to solve a problem and say they can't after two seconds, yes - make fun of that a bit. But that two seconds covers more ground than you might think, due to System 1 precomputation.

Reacting to a bit of HPMOR here, I noticed something felt off about Harry's reply to the Fred/George-tried-for-two-seconds thing. Having a bit of experience noticing confusion, I did not think "I notice I am confused" (although this can be useful). I did not think "Eliezer probably put thought into this", or "Harry is kinda dumb in certain ways - so what if he's a bit unfair here?". Without resurfacing, or distraction, or wondering if this train of thought is more fun than just reading further, I just thought about the object-level exchange.

People need to allocate mental energy wisely; this goes far beyond focusing on important tasks. Your existing mental skillsets already optimize and auto-pilot certain mental motions for you, so you should allocate less deliberation to them. In this case, the confusion-noticing module was honed; by not worrying about how well I noticed confusion, I was able to quickly have an original thought.

When thought processes derail or brainstorming sessions bear no fruit, inappropriate allocation may be to blame. For example, if you're anxious, you're interrupting the actual thoughts with "what-if"s.

To contrast, non-present thinking feels like a controller directing thoughts to go from here to there: do this and then, check that, come up for air over and over... Present thinking is a stream of uninterrupted strikes, the train of thought chugging along without self-consciousness. Moving, instead of thinking about moving while moving.

I don't know if I've nailed down the thing I'm trying to point at yet.

comment by TurnTrout · 2019-09-19T16:04:52.767Z · score: 6 (3 votes) · LW(p) · GW(p)

Sure, if people are asked to solve a problem and say they can't after two seconds, yes - make fun of that a bit. But that two seconds covers more ground than you might think, due to System 1 precomputation.

Expanding on this, there is an aspect of Actually Trying that is probably missing from S1 precomputation. So, maybe the two-second "attempt" is actually useless for most people because subconscious deliberation isn't hardass enough at giving its all, at making desperate and extraordinary efforts to solve the problem.

comment by TurnTrout · 2019-11-29T02:52:46.899Z · score: 12 (6 votes) · LW(p) · GW(p)

From my Facebook

My life has gotten a lot more insane over the last two years. However, it's also gotten a lot more wonderful, and I want to take time to share how thankful I am for that.

Before, life felt like... a thing that you experience, where you score points and accolades and check boxes. It felt kinda fake, but parts of it were nice. I had this nice cozy little box that I lived in, a mental cage circumscribing my entire life. Today, I feel (much more) free.

I love how curious I've become, even about "unsophisticated" things. Near dusk, I walked the winter wonderland of Ogden, Utah with my aunt and uncle. I spotted this gorgeous red ornament hanging from a tree, with a hunk of snow stuck to it at north-east orientation. This snow had apparently decided to defy gravity. I just stopped and stared. I was so confused. I'd kinda guessed that the dry snow must induce a huge coefficient of static friction, hence the winter wonderland. But that didn't suffice to explain this. I bounded over and saw the smooth surface was iced, so maybe part of the snow melted in the midday sun, froze as evening advanced, and then the part-ice part-snow chunk stuck much more solidly to the ornament.

Maybe that's right, and maybe not. The point is that two years ago, I'd have thought this was just "how the world worked", and it was up to physicists to understand the details. Whatever, right? But now, I'm this starry-eyed kid in a secret shop full of wonderful secrets. Some secrets are already understood by some people, but not by me. A few secrets I am the first to understand. Some secrets remain unknown to all. All of the secrets are enticing.

My life isn't always like this; some days are a bit gray and draining. But many days aren't, and I'm so happy about that.

Socially, I feel more fascinated by people in general, more eager to hear what's going on in their lives, more curious what it feels like to be them that day. In particular, I've fallen in love with the rationalist and effective altruist communities, which was totally a thing I didn't even know I desperately wanted until I already had it in my life! There are so many kind, smart, and caring people, inside many of whom burns a similarly intense drive to make the future nice, no matter what. Even though I'm estranged from the physical community much of the year, I feel less alone: there's a home for me somewhere.

Professionally, I'm working on AI alignment, which I think is crucial for making the future nice. Two years ago, I felt pretty sidelined - I hadn't met the bars I thought I needed to meet in order to do Important Things, so I just planned for a nice, quiet, responsible, normal life, doing little kindnesses. Surely the writers of the universe's script would make sure things turned out OK, right?

I feel in the game now. The game can be daunting, but it's also thrilling. It can be scary, but it's important. It's something we need to play, and win. I feel that viscerally. I'm fighting for something important, with every intention of winning.

I really wish I had the time to hear from each and every one of you. But I can't, so I do what I can: I wish you a very happy Thanksgiving. :)

comment by TurnTrout · 2019-11-20T21:52:55.015Z · score: 12 (6 votes) · LW(p) · GW(p)

I feel very excited by the AI alignment discussion group I'm running at Oregon State University. Three weeks ago, most attendees didn't know much about "AI security mindset"-ish considerations. This week, I asked the question "what, if anything, could go wrong with a superhuman reward maximizer which is rewarded for pictures of smiling people? Don't just fit a bad story to the reward function. Think carefully."

There was some discussion and initial optimism, after which someone said "wait, those optimistic solutions are just the ones you'd prioritize! What's that called, again?" (It's called anthropomorphic optimism)

I'm so proud.

comment by TurnTrout · 2019-11-13T17:18:29.555Z · score: 12 (4 votes) · LW(p) · GW(p)

Yesterday, I put the finishing touches on my chef d'œuvre, a series of important safety-relevant proofs I've been striving for since early June. Strangely, I felt a great exhaustion come over me. These proofs had been my obsession for so long, and now - now, I'm done.

I've had this feeling before; three years ago, I studied fervently for a Google interview. The literal moment the interview concluded, a fever overtook me. I was sick for days. All the stress and expectation and readiness-to-fight which had been pent up, released.

I don't know why this happens. But right now, I'm still a little tired, even after getting a good night's sleep.

comment by Hazard · 2019-11-13T19:01:23.568Z · score: 2 (1 votes) · LW(p) · GW(p)

This happens to me sometimes. I know several people who have this happen at the end of a Uni semester. Hope you can get some rest.

comment by TurnTrout · 2020-10-09T16:17:36.093Z · score: 11 (6 votes) · LW(p) · GW(p)

I went to the doctor's yesterday. This was embarrassing for them on several fronts.

First, I had to come in to do an appointment which could be done over telemedicine, but apparently there are regulations against this.

Second, while they did temp checks and required masks (yay!), none of the nurses or doctors actually wore anything stronger than a surgical mask. I'm coming in here with a KN95 + goggles + face shield because why not take cheap precautions to reduce the risk, and my own doctor is just wearing a surgical? I bought 20 KN95s for, like, 15 bucks on Amazon.

Third, and worst of all, my own doctor spouted absolute nonsense. The mildest insinuation was that surgical facemasks only prevent transmission, but I seem to recall that many kinds of surgical masks halve your chances of infection as well.

Then, as I understood it, he first claimed that coronavirus and the flu have comparable case fatality rates. I wasn't sure if I'd heard him correctly - this was an expert talking about his area of expertise, so I felt like I had surely misunderstood him. I was taken aback. But, looking back, that's what he meant.

He went on to suggest that we can't expect COVID immunity to last (wrong) but also that we need to hit 70% herd immunity (wrong). How could you even believe both of these things at the same time? Under those beliefs, are we all just going to get sick forever? Maybe he didn't notice the contradiction because he made the claims a few minutes apart.

Next, he implied that it's not a huge deal that people have died because a lot of them had comorbidities. Except that's not how comorbidities and counterfactual impact works. "No one's making it out of here alive", he says. An amusing rationalization.

He also claimed that nursing homes have an average stay length of 5 months. Wrong. AARP says it's 1.5 years for men, 2.5 years for women, but I've seen other estimates elsewhere, all much higher than 5 months. Not sure what the point of this was - old people are 10 minutes from dying anyways? What?

Now, perhaps I misunderstood or misheard one or two points. But I'm pretty sure I didn't mishear all of them. Isn't it great that I can correct my doctor's epidemiological claims after reading Zvi's posts and half of an epidemiology textbook? I'm glad I can trust my doctor and his epistemology.

comment by mingyuan · 2020-10-09T17:09:26.778Z · score: 7 (4 votes) · LW(p) · GW(p)

Eli just took a plane ride to get to CA and brought a P100, but they told him he had to wear a cloth mask, that was the rule. So he wore a cloth mask under the P100, which of course broke the seal. I feel you.

comment by ChristianKl · 2020-10-09T17:32:20.872Z · score: 3 (3 votes) · LW(p) · GW(p)

I don't think that policy is unreasonable for a plane ride. Just because someone wears a P100 mask doesn't mean that their mask filters outgoing air, as that's not the design goal for most of the use cases of P100 masks.

Checking on a case-by-case basis whether a particular P100 mask is not designed like an average P100 mask is likely not feasible in that context. 

comment by Dagon · 2020-10-09T19:06:29.704Z · score: 4 (2 votes) · LW(p) · GW(p)

What do you call the person who graduates last in their med school class?  Doctor.   And remember that GPs are weighted toward the friendly area of doctor-quality space rather than the hyper-competent.   Further remember that consultants (including experts on almost all topics) are generally narrow in their understanding of things - even if they are well above the median at their actual job (for a GP, dispensing common medication and identifying situations that need referral to a specialist), that doesn't indicate they're going to be well-informed even for adjacent topics.

That said, this level of misunderstanding on topics that impact patient behavior and outcome (mask use, other virus precautions) is pretty sub-par.  The cynic in me estimates it's the bottom quartile of front-line medical providers, but I hope it's closer to the bottom decile.  Looking into an alternate provider seems quite justified.

comment by ChristianKl · 2020-10-09T20:11:41.072Z · score: 2 (1 votes) · LW(p) · GW(p)

What do you call the person who graduates last in their med school class?  Doctor.  

In the US that isn't the case. There are limited places for internships and the worst person in medical school might not get a place for an internship and thus is not allowed to be a doctor. The medical system is heavily gated to keep out people.

comment by TurnTrout · 2020-07-14T02:26:18.654Z · score: 11 (5 votes) · LW(p) · GW(p)

When I notice I feel frustrated, unproductive, lethargic, etc, I run down a simple checklist:

  • Do I need to eat food?
  • Am I drinking lots of water?
  •  Have I exercised today?
  • Did I get enough sleep last night? 
    • If not, what can I do now to make sure I get more tonight?
  • Have I looked away from the screen recently?
  • Have I walked around in the last 20 minutes?

It's simple, but 80%+ of the time, it fixes the issue.

comment by Viliam · 2020-07-14T19:32:08.987Z · score: 2 (1 votes) · LW(p) · GW(p)

There is a "HALT: hungry? angry? lonely? tired?" mnemonic, but I like that your list includes water and walking and exercise. Now just please make it easier to remember.

comment by AllAmericanBreakfast · 2020-07-15T22:33:10.546Z · score: 1 (1 votes) · LW(p) · GW(p)

How about THREES: Thirsty Hungry Restless Eyestrain Exercise?

comment by mr-hire · 2020-07-14T03:38:33.959Z · score: 2 (1 votes) · LW(p) · GW(p)

Hey can I steal this for a course I'm teaching? (I'll give you credit).

comment by TurnTrout · 2020-07-14T11:49:27.742Z · score: 2 (1 votes) · LW(p) · GW(p)

sure!

comment by TurnTrout · 2020-01-05T02:27:54.205Z · score: 11 (2 votes) · LW(p) · GW(p)

Suppose you could choose how much time to spend at your local library, during which:

  • you do not age. Time stands still outside; no one enters or exits the library (which is otherwise devoid of people).
  • you don't need to sleep/eat/get sunlight/etc
  • you can use any computers, but not access the internet or otherwise bring in materials with you
  • you can't leave before the requested time is up

Suppose you don't go crazy from solitary confinement, etc. Remember that value drift is a potential thing.

How long would you ask for?

comment by FactorialCode · 2020-01-06T19:38:59.584Z · score: 1 (1 votes) · LW(p) · GW(p)

How good are the computers?

comment by TurnTrout · 2020-01-06T20:15:42.012Z · score: 2 (1 votes) · LW(p) · GW(p)

Windows machines circa ~2013. Let’s say 128GB hard drives which magically never fail, for 10 PCs.

comment by FactorialCode · 2020-01-07T17:01:17.079Z · score: 1 (1 votes) · LW(p) · GW(p)

Probably 3-5 years then. I'd use it to get a stronger foundation in low level programming skills, math and physics. The limiting factors would be entertainment in the library to keep me sane and the inevitable degradation of my social skills from so much spent time alone.

comment by TurnTrout · 2019-12-25T23:07:04.811Z · score: 11 (3 votes) · LW(p) · GW(p)

Judgment in Managerial Decision Making says that (subconscious) misapplication of e.g. the representativeness heuristic causes insensitivity to base rates and to sample size, failure to reason about probabilities correctly, failure to consider regression to the mean, and the conjunction fallacy. My model of this is that representativeness / availability / confirmation bias work off of a mechanism somewhat similar to attention in neural networks: due to how the brain performs time-limited search, more salient/recent memories get prioritized for recall.

The availability heuristic goes wrong when our saliency-weighted perception of the frequency of events is a biased estimator of the real frequency, or maybe when we just happen to be extrapolating off of a very small sample size. Concepts get inappropriately activated in our mind, and we therefore reason incorrectly. Attention also explains anchoring: you can more readily bring to mind things related to your anchor due to salience.

The case for confirmation bias seems to be a little more involved: first, we had evolutionary pressure to win arguments, which means our search is meant to find supportive arguments and avoid even subconsciously signalling that we are aware of the existence of counterarguments. This means that those supportive arguments feel salient, and we (perhaps by "design") get to feel unbiased - we aren't consciously discarding evidence, we're just following our normal search/reasoning process! This is what our search algorithm feels like from the inside. [LW · GW]

This reasoning feels clicky, but I'm just treating it as an interesting perspective for now.

comment by TurnTrout · 2019-11-04T01:29:44.252Z · score: 11 (4 votes) · LW(p) · GW(p)

With respect to the integers, 2 is prime. But with respect to the Gaussian integers, it's not: it has factorization $2 = (1+i)(1-i)$. Here's what's happening.

You can view complex multiplication as scaling and rotating the complex plane. So, when we take our unit vector 1 and multiply by $1+i$, we're scaling it by $\sqrt{2}$ and rotating it counterclockwise by $\frac{\pi}{4}$:

This gets us to the purple vector. Now, we multiply by $1-i$, scaling it up by $\sqrt{2}$ again (in green), and rotating it clockwise again by the same amount. You can even deal with the scaling and rotations separately (scale twice by $\sqrt{2}$, with zero net rotation).
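Spelled out, the arithmetic behind the picture (assuming the factorization above):

$$(1+i)(1-i) = 1 - i + i - i^2 = 2, \qquad 1+i = \sqrt{2}\,e^{i\pi/4}, \qquad 1-i = \sqrt{2}\,e^{-i\pi/4},$$

so the two $\frac{\pi}{4}$ rotations cancel and the scalings multiply to $\sqrt{2} \cdot \sqrt{2} = 2$.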

comment by TurnTrout · 2020-08-28T03:12:31.419Z · score: 9 (3 votes) · LW(p) · GW(p)

Does Venting Anger Feed or Extinguish the Flame? Catharsis, Rumination, Distraction, Anger, and Aggressive Responding

Does distraction or rumination work better to diffuse anger? Catharsis theory predicts that rumination works best, but empirical evidence is lacking. In this study, angered participants hit a punching bag and thought about the person who had angered them (rumination group) or thought about becoming physically fit (distraction group). After hitting the punching bag, they reported how angry they felt. Next, they were given the chance to administer loud blasts of noise to the person who had angered them. There also was a no punching bag control group. People in the rumination group felt angrier than did people in the distraction or control groups. People in the rumination group were also most aggressive, followed respectively by people in the distraction and control groups. Rumination increased rather than decreased anger and aggression. Doing nothing at all was more effective than venting anger. These results directly contradict catharsis theory.

Interesting. A cursory !scholar search indicates these results have replicated, but I haven't done an in-depth review.

comment by MakoYass · 2020-08-31T02:39:48.128Z · score: 5 (3 votes) · LW(p) · GW(p)

It would be interesting to see a more long-term study about habits around processing anger.

For instance, randomly assigning people different advice about processing anger (likely to have quite an impact on them, I don't think the average person receives much advice in that class) and then checking in on them a few years later and asking them things like, how many enemies they have, how many enemies they've successfully defeated, how many of their interpersonal issues they resolve successfully?

comment by Raemon · 2020-08-28T04:49:24.751Z · score: 4 (2 votes) · LW(p) · GW(p)

Boggling a bit at the "can you actually reliably find angry people and/or make people angry on purpose?"

comment by capybaralet · 2020-09-15T08:27:19.399Z · score: 1 (1 votes) · LW(p) · GW(p)

I found this fascinating... it's rare these days that I see some fundamental assumption in my thinking that I didn't even realize I was making laid bare like this... it is particularly striking because I think I could easily have realized that my own experience contradicts catharsis theory... I know that I can distract myself to become less angry, but I usually don't want to, in the moment.

I think that desire is driven by emotion, but rationalized via something like catharsis theory. I want to try and rescue catharsis theory by saying that maybe there are negative long-term effects of being distracted from feelings of anger (e.g. a build up of resentment). I wonder how much this is also a rationalization.

I also wonder how accurately the authors have characterized catharsis theory, and how much to identify it with the "hydraulic model of anger"... I would imagine that there are lots of attempts along the lines of what I suggested to try and rescue catharsis theory by refining or moving away from the hydraulic model. A highly general version might claim: "over a long time horizon, not 'venting' anger is net negative".

comment by TurnTrout · 2019-10-01T01:07:11.804Z · score: 9 (5 votes) · LW(p) · GW(p)

An exercise in the companion workbook to the Feynman Lectures on Physics asked me to compute a rather arduous numerical simulation. At first, this seemed like a "pass" in favor of an exercise more amenable to analytic and conceptual analysis; arithmetic really bores me. Then, I realized I was being dumb - I'm a computer scientist.

Suddenly, this exercise became very cool, as I quickly figured out the equations and code, crunched the numbers in an instant, and churned out a nice scatterplot. This seems like a case where cross-domain competence is unusually helpful (although it's not like I had to bust out any esoteric theoretical CS knowledge). I'm wondering whether this kind of thing will compound as I learn more and more areas; whether previously arduous or difficult exercises become easy when attacked with well-honed tools and frames from other disciplines.

comment by TurnTrout · 2020-07-29T03:00:37.560Z · score: 8 (4 votes) · LW(p) · GW(p)

This might be the best figure I've ever seen in a textbook. Talk about making a point! 

Molecular Biology of the Cell, Alberts.

comment by TurnTrout · 2020-07-22T14:21:33.404Z · score: 8 (4 votes) · LW(p) · GW(p)

I never thought I'd be seriously testing the reasoning abilities of an AI in 2020 [LW · GW]. 

Looking back, history feels easy to predict; hindsight + the hard work of historians makes it (feel) easy to pinpoint the key portents. Given what we think about AI risk, in hindsight, might this have been the most disturbing development of 2020 thus far? 

I personally lean towards "no", because this scaling seemed somewhat predictable from GPT-2 (flag - possible hindsight bias), and because 2020 has been so awful so far. But it seems possible, at least. I don't really know what update GPT-3 is to my AI risk estimates & timelines.

comment by gwern · 2020-07-22T16:42:14.009Z · score: 14 (6 votes) · LW(p) · GW(p)

DL so far has been easy to predict - if you bought into a specific theory of connectionism & scaling espoused by Schmidhuber, Moravec, Sutskever, and a few others, as I point out in https://www.gwern.net/newsletter/2019/13#what-progress & https://www.gwern.net/newsletter/2020/05#gpt-3 . Even the dates are more or less correct! The really surprising thing is that that particular extreme fringe lunatic theory turned out to be correct. So the question is, was everyone else wrong for the right reasons (similar to the Greeks dismissing heliocentrism for excellent reasons yet still being wrong), or wrong for the wrong reasons, and why, and how can we prevent that from happening again and spending the next decade being surprised in potentially very bad ways?

comment by TurnTrout · 2020-07-26T18:33:52.588Z · score: 7 (4 votes) · LW(p) · GW(p)

An additional consideration for early work on interpretability: it slightly increases the chance we actually get an early warning shot. If a system misbehaves, we can inspect its cognition and (hopefully) find hints of intentional deception. Could motivate thousands of additional researcher-hours being put into alignment.

comment by Raemon · 2020-07-26T21:20:13.825Z · score: 2 (1 votes) · LW(p) · GW(p)

That's an interesting point.

comment by TurnTrout · 2020-07-19T17:58:52.262Z · score: 7 (4 votes) · LW(p) · GW(p)

Today, let's read about GPT-3's obsession with Shrek

As for me, I think Shrek is important because the most valuable thing in life is happiness. I mean this quite literally. There's a mountain of evidence for it, if you're willing to look at the research. And I think movies can help us get there. Or at least not get in the way.

Now, when I say "happiness," I'm not talking about the transient buzz that you get from, say, heroin. I'm talking about a sense of fulfillment. A sense that you are where you're meant to be. That you are doing what you're meant to do. That you are who you're meant to be. And, most importantly, that you like who you are.

It's hard to find this sense of fulfillment in life. For some people, it comes from family. For some, it comes from career. For some, it comes from a hobby. For some, it comes from religion. For some, it comes from drugs.

The problem is, these things are not always enough. And this is where Shrek comes in.

See, the first time I watched Shrek, I knew something was wrong with it. Not with the movie itself, of course—that's a classic and a timeless masterpiece—but with me. And the problem was that I couldn't figure out what was wrong.

You see, watching Shrek for the first time is an experience that everyone should have. You sit there in the dark, watching the story unfold on the screen in front of you. And as you watch it, you find yourself actually caring about the characters. You laugh when they laugh. You want them to succeed. You feel sad when something bad happens to them.

Now, I'll be the first to admit that this experience isn't unique to Shrek. A lot of movies can do this. The difference is that with most movies, you watch them once and that's the end of it. You may remember certain scenes or jokes or what-have-you, but you don't dwell on them.

But with Shrek, it's different. After you've watched it once, you'll probably want to watch it again. And again. And again. Before you know it, you've seen it fifty times. And each time, you pick up on something new.

I'll give another example. Let's say you've just watched the movie Tron, and you really liked it. So, you watch it again. This time, you pay attention to the cinematography. The way the light cycles chase each other on the screen. The way the discs explode when they hit something. The way the digitized effects blend into the real-life footage. The way the scenes are set to an electronic version of Liszt's 2nd Hungarian Rhapsody.

This attention to detail only increases your enjoyment of the movie. In fact, you enjoy it so much that you want to share this experience with others. So, the next time you're with a group of friends, you tell them how you watched Tron and how much you liked the cinematography.

They stare at you blankly.

You try again. You say, "You know, the way they did the light cycles and stuff."

Still nothing.

Finally, one of your friends gets it. "Oh yeah!" he says. "I remember that. It was cool how they did that."

But he doesn't really remember it. Not the way you remember it. To him, it's just a vague idea of something that happened, not an ingrained memory seared into his brain like it is for you. You see his reaction and you try to forget about it. After all, what does it matter? You know what you saw, and in your mind, that's all that matters.

But it's this mindset that keeps you going back to Shrek. And it's this mindset that will lead you to other movies, and then other TV shows, and then books, and then games, and then pictures of bunny rabbits with misplaced captions on Tumblr.

But I'm getting ahead of myself. This is a story about how I lost myself, but it's not my story. It's my brother's. My brother—let's call him Michael—had a similar experience with Shrek, except his was even more powerful because it was the first time he'd experienced it.

At the time, our family had just gotten cable, and one of the channels happened to be MTV. At this point in time, MTV was still playing music videos, so my brother and I would always sit in front of the TV watching music videos whenever we could. One day, Shrek came on. We didn't know anything about it. We hadn't read the book it was based on, and we hadn't seen the trailers. All we knew is that there was a movie with a bunch of animals talking.

When the movie ended, we were speechless. In fact, our jaws were on the floor. We didn't know movies could make you feel this way. For the next few days, all we could talk about was Shrek. We told our parents, our friends, anyone who would listen about this movie we'd seen. Of course, none of them understood. I mean, how could they? They hadn't seen it.

But something else happened when we watched that movie. It got under our skin in a way nothing else ever had. After the first time, we had to watch it again. And again. And again. Soon, we knew every line in the movie. Not just the main ones, but every single line. And we didn't just watch it. We analyzed it. We took scenes apart and put them back together again. We tried to find all the little details that the creators had hidden in the background artwork.

As the years passed, this process never changed. Shrek became a part of us. I remember getting sick one year and missing a week of school. I stayed in bed and watched Shrek at least once every day that week.

A few years later, a sequel was released. My brother and I went to see it on opening night. We saw it again the next day, and again the next day, and again the day after that… well, you get the idea.

We never did anything with other kids our age. Our lives were Shrek, and Shrek alone. When people would ask us what we were into, we always had the same answer: Shrek. They usually laughed and made fun of us, but we didn't care. As far as we were concerned, they just didn't get it.

When high school came around, I decided to change things up a bit. Instead of watching Shrek, I listened to music and read books. Michael didn't like these changes too much. He stuck with the Shrek stuff. I sometimes wonder where we would be now if I had encouraged him to listen to music and read books instead.

comment by ChristianKl · 2020-07-25T18:05:03.628Z · score: 2 (1 votes) · LW(p) · GW(p)

What's the input that produced the text from GPT-3?

comment by TurnTrout · 2020-07-25T19:40:45.378Z · score: 2 (1 votes) · LW(p) · GW(p)

Two Sequences posts... lol... Here's the full transcript

comment by TurnTrout · 2020-03-19T20:24:38.419Z · score: 7 (4 votes) · LW(p) · GW(p)

Broca’s area handles syntax, while Wernicke’s area handles the semantic side of language processing. Subjects with damage to the latter can speak in syntactically fluent jargon-filled sentences (fluent aphasia) – and they can’t even tell their utterances don’t make sense, because they can’t even make sense of the words leaving their own mouth!

It seems like GPT2 : Broca’s area :: ??? : Wernicke’s area. Are there any cog psych/AI theories on this?

comment by TurnTrout · 2020-03-06T02:06:02.268Z · score: 7 (4 votes) · LW(p) · GW(p)

Cool Math Concept You Never Realized You Wanted: Fréchet distance.

Imagine a man traversing a finite curved path while walking his dog on a leash, with the dog traversing a separate one. Each can vary their speed to keep slack in the leash, but neither can move backwards. The Fréchet distance between the two curves is the length of the shortest leash sufficient for both to traverse their separate paths. Note that the definition is symmetric with respect to the two curves—the Frechet distance would be the same if the dog was walking its owner.

The Fréchet distance between two concentric circles of radius $r_1$ and $r_2$ respectively is $|r_1 - r_2|$. The longest leash is required when the owner stands still and the dog travels to the opposite side of the circle ($r_1 + r_2$), and the shortest leash when both owner and dog walk at a constant angular velocity around the circle ($|r_1 - r_2|$).
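For reference, the standard formal definition: for curves $A, B\colon [0,1] \to \mathbb{R}^2$,

$$d_F(A, B) = \inf_{\alpha, \beta}\, \max_{t \in [0,1]} \big\| A(\alpha(t)) - B(\beta(t)) \big\|,$$

where $\alpha$ and $\beta$ range over continuous, non-decreasing surjections of $[0,1]$ onto itself – the reparameterizations encoding "each can vary their speed, but neither can move backwards".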

comment by TurnTrout · 2020-01-02T16:18:14.203Z · score: 7 (4 votes) · LW(p) · GW(p)

Earlier today, I became curious why extrinsic motivation tends to preclude or decrease intrinsic motivation. This phenomenon is known as overjustification. There's likely agreed-upon theories for this, but here's some stream-of-consciousness as I reason and read through summarized experimental results. (ETA: Looks like there isn't consensus on why this happens)

My first hypothesis was that recognizing external rewards somehow precludes activation of curiosity-circuits in our brain. I'm imagining a kid engrossed in a puzzle. Then, they're told that they'll be given $10 upon completion. I'm predicting that the kid won't become significantly less engaged, which surprises me?

third graders who were rewarded with a book showed more reading behaviour in the future, implying that some rewards do not undermine intrinsic motivation.

Might this be because the reward for reading is more reading, which doesn't undermine the intrinsic interest in reading? You aren't looking forward to escaping the task, after all.

While the provision of extrinsic rewards might reduce the desirability of an activity, the use of extrinsic constraints, such as the threat of punishment, against performing an activity has actually been found to increase one's intrinsic interest in that activity. In one study, when children were given mild threats against playing with an attractive toy, it was found that the threat actually served to increase the child's interest in the toy, which was previously undesirable to the child in the absence of threat.

A few experimental summaries:

1 Researchers at Southern Methodist University conducted an experiment on 188 female university students in which they measured the subjects' continued interest in a cognitive task (a word game) after their initial performance under different incentives.

The subjects were divided into two groups. Members of the first group were told that they would be rewarded for competence. Above-average players would be paid more and below-average players would be paid less. Members of the second group were told that they would be rewarded only for completion. Their pay was scaled by the number of repetitions or the number of hours playing. Afterwards, half of the subjects in each group were told that they over-performed, and the other half were told that they under-performed, regardless of how well each subject actually did.

Members of the first group generally showed greater interest in the game and continued playing for a longer time than the members of the second group. "Over-performers" continued playing longer than "under-performers" in the first group, but "under-performers" continued playing longer than "over-performers" in the second group. This study showed that, when rewards do not reflect competence, higher rewards lead to less intrinsic motivation. But when rewards do reflect competence, higher rewards lead to greater intrinsic motivation.

2 Richard Titmuss suggested that paying for blood donations might reduce the supply of blood donors. To test this, a field experiment with three treatments was conducted. In the first treatment, the donors did not receive compensation. In the second treatment, the donors received a small payment. In the third treatment, donors were given a choice between the payment and an equivalent-valued contribution to charity. None of the three treatments affected the number of male donors, but the second treatment almost halved the number of female donors. However, allowing the contribution to charity fully eliminated this effect.

From a glance at the Wikipedia page, it seems like there's not really expert consensus on why this happens. However, according to self-perception theory,

a person infers causes about his or her own behavior based on external constraints. The presence of a strong constraint (such as a reward) would lead a person to conclude that he or she is performing the behavior solely for the reward, which shifts the person's motivation from intrinsic to extrinsic.

This lines up with my understanding of self-consistency effects.

comment by TurnTrout · 2019-12-15T21:26:26.150Z · score: 7 (4 votes) · LW(p) · GW(p)

Virtue ethics seems like model-free consequentialism to me.

comment by JohnSteidley · 2020-05-25T20:20:32.795Z · score: 5 (3 votes) · LW(p) · GW(p)

I was thinking along similar lines!

From my notes from 2019-11-24: "Deontology is like the learned policy of bounded rationality of consequentialism"

comment by TurnTrout · 2020-09-11T18:59:28.916Z · score: 6 (3 votes) · LW(p) · GW(p)

When I imagine configuring an imaginary pile of blocks, I can feel the blocks in front of me in this fake imaginary plane of existence. I feel aware of their spatial relationships to me, in the same way that it feels different to have your eyes closed in a closet vs in an empty auditorium. 

But what is this mental workspace? Is it disjoint and separated from my normal spatial awareness, or does my brain copy/paste->modify my real-life spatial awareness. Like, if my brother is five feet in front of me, and then I imagine a blade flying five feet in front of me in my imaginary mental space where he doesn't exist, do I reflexively flinch? Does my brain overlay these two mental spaces, or are they separate? 

I don't know. When I run the test, I at least flinch at the thought of such a thing happening. This isn't a good experiment because I know what I'm testing for; I need to think of a better test.

comment by TurnTrout · 2020-07-17T14:30:23.330Z · score: 6 (3 votes) · LW(p) · GW(p)

AIDungeon's subscriber-only GPT-3 can do some complex arithmetic, but it's very spotty. Bold text is me.

You say "What happens if I take the square root of 3i?" 

The oracle says: "You'll get a negative number. [wrong] So, for example, the square root of  is ." [correct]
"What?" you say.
 "I just said it," the oracle repeats. 
"But that's ridiculous! The square root of  is not . It's complex. It's  plus a multiple of ." [wrong, but my character is supposed to be playing dumb here]

The oracle frowns. "I'm just telling you what the answer is," she says. 

The oracle explains, "(5+i)x(5+i) = 24+10i. 7i x 8i=-56. 7i x 30=210." [wrong]

"What?" you say. "That's crazy!  isn't even a real number. The square root of  is  with a whole integer  attached."

 The oracle frowns. "I'm just telling you what the answer is," she says. 

The oracle explains, ".

The last line is the impressive part - while  and  is definitely correct. 

Its proofs are definitely awful in the pattern-matching kind of way. Not surprising. 

The oracle explains, " The group axioms are 

(1) a x a = a for all a in the group; (wrong)

(2) a x (b x c) = (a x b) x c for all a, b, c in the group; 

(3) a x (b + c) = (a x b) + (a x c) for all a, b, c in the group." (wrong, that's a field axiom)

The oracle explains, "here's how you prove that any finite abelian group is isomorphic to the direct product of a finite number of finite cyclic groups. First, you notice that if G is a finite abelian group of order n, then G can be described as a set of generators of the form , where the  are the elements of G and no  is equal to zero. (uh... sure)

Now, we want to prove that this is a direct product. To do this, we must show that given any element g in the group, there exists an index i such that . Then we must show that the sequence can be reversed. Given any index j, we must show that there exists an element g such that . Then we're done, because g will be equal to , which is the product of the 's." (nice try, GPT-3)

comment by TurnTrout · 2020-06-10T21:29:33.673Z · score: 6 (3 votes) · LW(p) · GW(p)

The new "Broader Impact" NeurIPS statement is a good step, but incentives are misaligned. Admitting fatally negative impact would set a researcher back in their career, as the paper would be rejected. 

Idea: Consider a dangerous paper which would otherwise have been published. What if that paper were published title-only on the NeurIPS website, so that the researchers can still get career capital?

Problem: How do you ensure resubmission doesn't occur elsewhere?

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-06-10T22:54:12.979Z · score: 4 (2 votes) · LW(p) · GW(p)

The people at NeurIPS who reviewed the paper might notice if resubmission occurred elsewhere? Automated tools might help with this, by searching for specific phrases.

There's been talk of having a Journal of Infohazards. Seems like an idea worth exploring to me. Your suggestion sounds like a much more feasible first step.

Problem: Any entity with halfway decent hacking skills (such as a national government, or clever criminal) would be able to peruse the list of infohazardy titles, look up the authors, cyberstalk them, and then hack into their personal computer and steal the files. We could hope that people would take precautions against this, but I'm not very optimistic. That said, this still seems better than the status quo.

comment by TurnTrout · 2020-05-26T18:01:51.657Z · score: 6 (3 votes) · LW(p) · GW(p)

Sentences spoken aloud are a latent space embedding of our thoughts; when trying to move a thought from our mind to another's, our thoughts are encoded with the aim of minimizing the other person's decoder error.

comment by TurnTrout · 2019-12-12T17:14:12.363Z · score: 6 (3 votes) · LW(p) · GW(p)

Going through an intro chem textbook, it immediately strikes me how this should be as appealing and mysterious as the alchemical magic system of Fullmetal Alchemist. "The law of equivalent exchange" "conservation of energy/elements/mass (the last two holding only for normal chemical reactions)", etc. If only it were natural to take joy in the merely real...

comment by Hazard · 2019-12-12T17:39:17.386Z · score: 4 (2 votes) · LW(p) · GW(p)

Have you been continuing your self-study schemes into realms beyond math stuff? If so I'm interested in both the motivation and how it's going! I remember having little interest in other non-physics science growing up, but that was also before I got good at learning things and my enjoyment was based on how well it was presented.

comment by TurnTrout · 2019-12-12T17:55:54.030Z · score: 4 (2 votes) · LW(p) · GW(p)

Yeah, I've read a lot of books since my reviews fell off last year, most of them still math. I wasn't able to type reliably until early this summer, so my reviews kinda got derailed. I've read Visual Group Theory, Understanding Machine Learning, Computational Complexity: A Conceptual Perspective, Introduction to the Theory of Computation, An Illustrated Theory of Numbers, most of Tadellis' Game Theory, the beginning of Multiagent Systems, parts of several graph theory textbooks, and I'm going through Munkres' Topology right now. I've gotten through the first fifth of the first Feynman lectures, which has given me an unbelievable amount of mileage for generally reasoning about physics.

I want to go back to my reviews, but I just have a lot of other stuff going on right now. Also, I run into fewer basic confusions than when I was just starting at math, so I generally have less to talk about. I guess I could instead try and re-present the coolest concepts from the book.

My "plan" is to keep learning math until the low graduate level (I still need to at least do complex analysis, topology, field / ring theory, ODEs/PDEs, and something to shore up my atrocious trig skills, and probably more)[1], and then branch off into physics + a "softer" science (anything from microecon to psychology). CS ("done") -> math -> physics -> chem -> bio is the major track for the physical sciences I have in mind, but that might change. I dunno, there's just a lot of stuff I still want to learn. :)


  1. I also still want to learn Bayes nets, category theory, get a much deeper understanding of probability theory, provability logic, and decision theory. ↩︎

comment by Hazard · 2019-12-12T19:08:03.098Z · score: 4 (2 votes) · LW(p) · GW(p)

Yay learning all the things! Your reviews are fun, also completely understandable putting energy elsewhere. Your energy for more learning is very useful for periodically bouncing myself into more learning.

comment by TurnTrout · 2019-10-01T20:57:16.698Z · score: 6 (3 votes) · LW(p) · GW(p)

We can think about how consumers respond to changes in price by considering the elasticity of the quantity demanded at a given price - how quickly does demand decrease as we raise prices? Price elasticity of demand is defined as $\epsilon = \frac{\%\,\Delta Q}{\%\,\Delta P}$; in other words, for price $P$ and quantity $Q$, this is $\frac{dQ}{dP}\cdot\frac{P}{Q}$ (this looks kinda weird, and it wasn't immediately obvious what's happening here...). Revenue is the total amount of cash changing hands: $R = P \cdot Q$.

What's happening here is that raising prices is a good idea when the revenue gained (the "price effect") outweighs the revenue lost to falling demand (the "quantity effect"). A lot of words so far for an easy concept:

If the magnitude of the price elasticity is greater than 1, demand is elastic and price hikes decrease revenue (and you should probably have a sale). However, if it's less than 1, demand is inelastic and boosting the price increases revenue - demand isn't dropping off quickly enough to drag down the revenue. You can just look at the area of the revenue rectangle for each effect!
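A minimal numerical sketch of the price effect vs. quantity effect trade-off (the linear demand curve and all numbers below are my own invention, not from the original comment):

```python
# Hypothetical linear demand curve Q(P) = 100 - 2P (purely illustrative numbers).
def quantity_demanded(price):
    return max(0.0, 100 - 2 * price)

def revenue(price):
    return price * quantity_demanded(price)

def elasticity(price, dp=1e-6):
    # Point elasticity: (dQ/dP) * (P/Q). Negative for downward-sloping demand;
    # the "greater/less than 1" talk above refers to its magnitude.
    q = quantity_demanded(price)
    dq_dp = (quantity_demanded(price + dp) - quantity_demanded(price - dp)) / (2 * dp)
    return dq_dp * price / q

for p in [10, 25, 40]:
    print(f"P={p}: elasticity={elasticity(p):.2f}, "
          f"revenue={revenue(p):.0f}, revenue after a +1 price hike={revenue(p + 1):.0f}")
```

At P=10 demand is inelastic (magnitude 0.25) and the hike raises revenue; at P=40 it is elastic (magnitude 4) and the hike lowers it.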

comment by TurnTrout · 2019-09-22T02:28:49.631Z · score: 6 (3 votes) · LW(p) · GW(p)

How does representation interact with consciousness? Suppose you're reasoning about the universe via a partially observable Markov decision process, and that your model is incredibly detailed and accurate. Further suppose you represent states as numbers, as their numeric labels.

To get a handle on what I mean, consider the game of Pac-Man, which can be represented as a finite, deterministic, fully-observable MDP. Think about all possible game screens you can observe, and number them. Now get rid of the game screens. From the perspective of reinforcement learning, you haven't lost anything - all policies yield the same return they did before, the transitions/rules of the game haven't changed - in fact, there's a pretty strong isomorphism I can show between these two MDPs. All you've done is changed the labels - representation means practically nothing to the mathematical object of the MDP, although many algorithms (eg deep RL ones) should be able to exploit regularities in the representation to reduce sample complexity.

So what does this mean? If you model the world as a partially observable MDP whose states are single numbers... can you still commit mindcrime via your deliberations? Is the structure of the POMDP in your head somehow sufficient for consciousness to be accounted for (like how the theorems of complexity theory govern computers both of flesh and of silicon)? I'm confused.
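A tiny sketch of the relabeling claim above, using a three-state toy MDP of my own (not Pac-Man): value iteration on the original MDP and on a label-permuted copy yields the same optimal values, just indexed by different state names.

```python
# Toy three-state MDP (mine, not Pac-Man). Relabeling the states as opaque
# numbers leaves optimal values untouched.
def value_iteration(states, actions, T, R, gamma=0.9, iters=500):
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        V = {s: max(sum(p * (R[(s, a, s2)] + gamma * V[s2])
                        for s2, p in T[(s, a)].items())
                    for a in actions)
             for s in states}
    return V

states = ["screen_A", "screen_B", "screen_C"]
actions = ["left", "right"]
T = {("screen_A", "left"): {"screen_A": 1.0}, ("screen_A", "right"): {"screen_B": 1.0},
     ("screen_B", "left"): {"screen_A": 1.0}, ("screen_B", "right"): {"screen_C": 1.0},
     ("screen_C", "left"): {"screen_B": 1.0}, ("screen_C", "right"): {"screen_C": 1.0}}
R = {(s, a, s2): (1.0 if s2 == "screen_C" else 0.0)
     for (s, a), d in T.items() for s2 in d}

relabel = {"screen_A": 0, "screen_B": 1, "screen_C": 2}  # strip away the "screens"
T2 = {(relabel[s], a): {relabel[s2]: p for s2, p in d.items()} for (s, a), d in T.items()}
R2 = {(relabel[s], a, relabel[s2]): r for (s, a, s2), r in R.items()}

V = value_iteration(states, actions, T, R)
V2 = value_iteration(list(relabel.values()), actions, T2, R2)
assert all(abs(V[s] - V2[relabel[s]]) < 1e-9 for s in states)  # identical values
```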

comment by G Gordon Worley III (gworley) · 2019-09-23T17:31:16.845Z · score: 8 (4 votes) · LW(p) · GW(p)

I think a reasonable and related question we don't have a solid answer for is if humans are already capable of mind crime.

For example, maybe Alice is mad at Bob and imagines causing harm to Bob. How well does Alice have to model Bob for her imaginings to be mind crime? If Alice has low cognitive empathy is it not mind crime but if her cognitive empathy is above some level is it then mind crime?

I think we're currently confused enough about what mind crime is such that it's hard to even begin to know how we could answer these questions based on more than gut feelings.

comment by Vladimir_Nesov · 2019-09-22T05:55:58.633Z · score: 2 (1 votes) · LW(p) · GW(p)

I suspect that it doesn't matter how accurate or straightforward a predictor is in modeling people. What would make prediction morally irrelevant is that it's not noticed by the predicted people, irrespective of whether this happens because it spreads the moral weight conferred to them over many possibilities (giving inaccurate prediction), keeps the representation sufficiently baroque, or for some other reason. In the case of inaccurate prediction or baroque representation, it probably does become harder for the predicted people to notice being predicted, and I think this is the actual source of moral irrelevance, not those things on their own. A more direct way of getting the same result is to predict counterfactuals where the people you reason about don't notice the fact that you are observing them, which also gives a form of inaccuracy (imagine that your predicting them is part of their prior, that'll drive the counterfactual further from reality).

comment by TurnTrout · 2019-09-16T22:19:19.496Z · score: 6 (3 votes) · LW(p) · GW(p)

I seem to differently discount different parts of what I want. For example, I'm somewhat willing to postpone fun to low-probability high-fun futures, whereas I'm not willing to do the same with romance.

comment by TurnTrout · 2020-08-02T00:33:10.521Z · score: 5 (3 votes) · LW(p) · GW(p)

If you measure death-badness from behind the veil of ignorance, you’d naively prioritize well-liked, famous people with large families.

comment by Pattern · 2020-08-02T02:18:30.194Z · score: 2 (1 votes) · LW(p) · GW(p)

Would you prioritize the young from behind the veil of ignorance?

comment by TurnTrout · 2020-06-26T19:41:44.688Z · score: 5 (3 votes) · LW(p) · GW(p)

Idea: learn by making conjectures (math, physical, etc) and then testing them / proving them, based on what I've already learned from a textbook. 

Learning seems easier and faster when I'm curious about one of my own ideas.

comment by NaiveTortoise (An1lam) · 2020-06-27T14:19:53.202Z · score: 7 (3 votes) · LW(p) · GW(p)

For what it's worth, this is very true for me as well.

I'm also reminded of a story of Robin Hanson from Cryonics magazine:

Robin’s attraction to the more abstract ideas supporting various fields of interest was similarly shown in his approach – or rather, lack thereof – to homework. “In the last two years of college, I simply stopped doing my homework, and started playing with the concepts. I could ace all the exams, but I got a zero on the homework… Someone got scatter plots up there to convince people that you could do better on exams if you did homework.” But there was an outlier on that plot, courtesy of Robin, that said otherwise.

comment by Rudi C (rudi-c) · 2020-06-27T14:12:40.714Z · score: 1 (1 votes) · LW(p) · GW(p)

How do you estimate how hard your invented problems are?

comment by TurnTrout · 2020-02-05T16:41:01.779Z · score: 5 (2 votes) · LW(p) · GW(p)

AFAICT, the deadweight loss triangle from eg price ceilings is just a lower bound on lost surplus. Inefficient allocation to consumers means that people who value the good less than the market equilibrium price can buy it, while the DWL triangle optimistically assumes that the consumers with the highest willingness to pay eat up the limited supply.
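A rough simulation of this point, with made-up numbers (uniform valuations, a hypothetical ceiling price and quantity): the DWL triangle corresponds to the optimistic case where the scarce units go to the highest-value buyers, while random rationing among everyone willing to pay the ceiling loses strictly more surplus.

```python
# Rough sketch (all numbers invented). With a binding price ceiling, the
# textbook DWL triangle assumes the scarce units go to the highest-value
# buyers; random rationing among everyone willing to pay the ceiling price
# destroys more consumer surplus, so the triangle is only a lower bound.
import random

random.seed(0)
values = sorted((random.uniform(0, 100) for _ in range(10_000)), reverse=True)
ceiling = 30.0             # hypothetical ceiling price
supply_at_ceiling = 3_000  # hypothetical reduced quantity supplied

# Optimistic allocation assumed by the DWL triangle: highest-value buyers served.
surplus_best = sum(v - ceiling for v in values[:supply_at_ceiling])

# Random rationing among all buyers who value the good at or above the ceiling.
willing = [v for v in values if v >= ceiling]
surplus_random = sum(v - ceiling for v in random.sample(willing, supply_at_ceiling))

print(f"consumer surplus, efficient rationing: {surplus_best:,.0f}")
print(f"consumer surplus, random rationing:    {surplus_random:,.0f}")  # noticeably smaller
```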

comment by Wei_Dai · 2020-02-07T22:14:31.405Z · score: 5 (2 votes) · LW(p) · GW(p)

Good point. By searching for "deadweight loss price ceiling lower bound" I was able to find a source (see page 26) that acknowledges this, but most explications of price ceilings do not seem to mention that the triangle is just a lower bound for lost surplus.

comment by Dagon · 2020-02-07T00:02:40.045Z · score: 2 (1 votes) · LW(p) · GW(p)

Lost surplus is definitely a loss - it's not linear with utility, but it's not uncorrelated. Also, if supply is elastic over any relevant timeframe, there's an additional source of loss. And I'd argue that for most goods, over timeframes smaller than most price-fixing proposals are expected to last, there is significant price elasticity.

comment by TurnTrout · 2020-02-07T13:10:46.833Z · score: 2 (1 votes) · LW(p) · GW(p)

Lost surplus is definitely a loss - it's not linear with utility, but it's not uncorrelated.

I don't think I was disagreeing?

comment by Dagon · 2020-02-07T17:13:50.111Z · score: 2 (1 votes) · LW(p) · GW(p)

Ah, I took the "just" in "just a lower bound on lost surplus" as an indicator that it's less important than other factors. And I lightly believe (meaning: for the cases I find most available, I believe it, but I don't know how general it is) that the supply elasticity _is_ the more important effect of such distortions.

So I wanted to reinforce that I wasn't ignoring that cost, only pointing out a greater cost.

comment by TurnTrout · 2019-09-30T00:29:52.406Z · score: 5 (3 votes) · LW(p) · GW(p)

I had an intuition that attainable utility preservation (RL but you maintain your ability to achieve other goals) points at a broader template for regularization. AUP regularizes the agent's optimal policy to be more palatable towards a bunch of different goals we may wish we had specified. I hinted at the end of Towards a New Impact Measure [LW · GW] that the thing-behind-AUP might produce interesting ML regularization techniques.

This hunch was roughly correct; Model-Agnostic Meta-Learning tunes the network parameters such that they can be quickly adapted to achieve low loss on other tasks (the problem of few-shot learning). The parameters are not overfit on the scant few data points to which the parameters are adapted, which is also interesting.

comment by TurnTrout · 2020-10-16T18:24:48.881Z · score: 4 (2 votes) · LW(p) · GW(p)

Epistemic status: not an expert

Understanding Newton's second law, $F = \frac{dp}{dt}$.

Consider the vector-valued velocity as a function of time, $\vec{v}(t)$. Scale this by the object's mass and you get the momentum function over time. Imagine this momentum function wiggling around over time, the vector from the origin rotating and growing and shrinking.

The second law says that force is the derivative of this rescaled vector function - if an object is more massive, then the same displacement of this rescaled arrow is a proportionally smaller velocity modification, because of the rescaling!

And also, forces have opposite reactions (by conservation of momentum) and equal reactions (by conservation of energy).
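A tiny numeric sketch of $F = \frac{dp}{dt}$, with a made-up mass and velocity profile (mine, not from the comment):

```python
# Toy numeric check of F = dp/dt: scale the velocity by the mass to get
# momentum, then differentiate in time.
mass = 2.0  # kg

def velocity(t):            # made-up smooth velocity profile, m/s
    return 3.0 * t ** 2 - t

def momentum(t):
    return mass * velocity(t)

def force(t, dt=1e-6):      # numerical derivative of momentum
    return (momentum(t + dt) - momentum(t - dt)) / (2 * dt)

# Analytically dp/dt = m * (6t - 1); at t = 2 s both give about 22 N.
print(force(2.0), mass * (6 * 2.0 - 1))
```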

comment by TurnTrout · 2020-09-27T22:52:55.668Z · score: 4 (2 votes) · LW(p) · GW(p)

What is "real"? I think about myself as a computation embedded in some other computation (i.e. a universe-history). I think "real" describes hypotheses about the environment where my computation lives. What should I think is real? That which an "ideal embedded reasoner" would assign high credence. However that works.

This sensibly suggests that Gimli-in-actual-Ea (LOTR) should believe he lives in Ea, and that Ea is real, even though it isn't our universe's Earth. Also, the notion accounts for indexical uncertainty by punting it to how embedded reasoning should work (a la radical probabilism), without being tautological. Also, it supports both the subjective nature of what one should call "real", and the notion of an actual out-there-somewhere shared reality (multiple computations can be embedded within the same universe-history).

comment by TurnTrout · 2020-07-17T23:01:50.436Z · score: 4 (2 votes) · LW(p) · GW(p)

Tricking AIDungeon's GPT-3 model into writing HPMOR:

You start reading Harry Potter and the Methods of Rationality by Eliezer Yudkowsky:

" "It said to me," said Professor Quirrell, "that it knew me, and that it would hunt me down someday, wherever I tried to hide." His face was rigid, showing no fright.
"Ah," Harry said. "I wouldn't worry about that, Professor Quirrell." It's not like Dementors can actually talk, or think; the structure they have is borrowed from your own mind and expectations...
Now everyone was giving him very strange looks. The Aurors were glancing nervously at each other, at the Dementor, at Harry.
And they stood directly before the Dementor's cage.
"They are wounds in the world," Harry said. "It's just a wild guess, but I'm guessing the one who said that was Godric Gryffindor."
"Yes..." said Dumbledore. "How did you know?"
It is a common misconception, thought Harry, that all the best rationalists are Sorted into Ravenclaw, leaving none for other Houses. This is not so; being Sorted into Ravenclaw indicates that your strongest virtue is curiosity, wondering and desiring to know the true answer. And this is not the only virtue a rationalist needs. Sometimes you have to work hard on a problem, and stick to it for a while. Sometimes you need a clever plan for finding out. And sometimes what you need more than anything else to see an answer, is the courage to face it...
Harry's gaze went to what lay beneath the cloak, the horror far worse than any decaying mummy. Rowena Ravenclaw might also have known, for it was an obvious enough riddle once you saw it as a riddle.
And it was also obvious why the Patronuses were animals. The animals didn't know, and so were sheltered from the fear.
But Harry knew, and would always know, and would never be able to forget. He'd tried to teach himself to face reality without flinching, and though Harry had not yet mastered that art, still those grooves had been worn into his mind, the learned reflex to look toward the painful thought instead of away. Harry would never be able to forget by thinking warm happy thoughts about something else, and that was why the spell hadn't worked for him.
So Harry would think a warm happy thought that wasn't about something else.
Harry drew forth his wand that Professor Flitwick had returned to him, put his feet into the beginning stance for the Patronus Charm.
Within his mind, Harry discarded the last remnants of the peace of the phoenix, put aside the calm, the dreamlike state, remembered instead Fawkes's piercing cry, and roused himself for battle. Called upon all the pieces and elements of himself to awaken. Raised up within himself all the strength that the Patronus Charm could ever draw upon, to put himself into the right frame of mind for the final warm and happy thought; remembered all bright things.
The books his father had bought him.
Mum's smile when Harry had handmade her a mother's day card, an elaborate thing that had used half a pound of spare electronics parts from the garage to flash lights and beep a little tune, and had taken him three days to make.
Professor McGonagall telling him that his parents had died well, protecting him. As they had.
Realizing that Hermione was keeping up with him and even running faster, that they could be true rivals and friends.
Coaxing Draco out of the darkness, watching him slowly move toward the light.
Neville and Seamus and Lavender and Dean and everyone else who looked up to him, everyone that he would have fought to protect if anything threatened Hogwarts.
Everything that made life worth living.
His wand rose into the starting position for the Patronus Charm.
Harry thought of the stars, the image that had almost held off the Dementor even without a Patronus. Only this time, Harry added the missing ingredient, he'd never truly seen it but he'd seen the pictures and the video....

Harry thought of the image of his family, Mum and Dad and Liliana, standing outside the house on Gardiners Street, smiling at him. He thought of them in the back garden of that house, laughing as he chased the gnomes, and that was his final warm and happy thought.
And Harry cast the Patronus Charm.
A burst of silver erupted from the end of his wand and took the form of a deer, shining brightly.
...His mother's eyes.
And the final piece clicked into place.
Harry's mind was flooded with warmth, and the world seemed to still. Time itself seemed to stop, sound itself ceased to be of concern to Harry, all that remained was himself, and the terrible terrible void that surrounded him. But even as he felt himself begin to be pulled into it, a new force intervened.
A new shape of white, a beautiful silver serpent, coiled around him, protecting him from the darkness. The beautiful creature seemed to both give off warmth and energy, while also absorbing it from the void itself.
Harry felt himself pulled backward, away from the Dementor, and he found himself in the dark halls of Azkaban, being shielded by his wonderful silver Patronus.
...He thought of the night sky, of the moon, shining down upon him and his friends, and the stars beyond. His wand began to glow. "Expecto Patronum!" The glow burst from the tip of his wand and formed into a shape, and the shape was glowing white like a shadowless bright light, yet it had form and shape and figure, and that shape was the shadow of the memory of a star.
...He thought of the sunlight. Of warm air and a bright sky. Of happy people and happy times. Of hope and joy. Of a future worth living for, not just worth dying for. And Harry's wand began to glow. And then the golden stag erupted from the tip of his wand, and charged the Dementor with a cry that echoed through the tunnel.


[...] Only this time, Harry added the missing ingredient,
...the reason to fight for. "Everything I love," Harry hissed out, "everything I love, Reflect if you're worth saving." And with that, the first glimmer of white silver started appearing in the air. "What is that?" whispered Hermione. "My Patronus," Harry said softly. "It's not working." The silver was fading, like a bubble that threatened to burst. "I can't cast the Patronus Charm..."


...He imagined the stars as they looked from Earth, imagined his true home, thought of it being there with him, holding off the Dementor. [?!]

The cold was already fading. That wouldn't help against this fear, but another warm memory would. Harry brought to mind a summer's day in primary school, where he'd first learned to cast a spell.
... the most important part. His father, and his mother, waving to him from a dock as a ship bound for Egypt sailed away. Harry filled his mind with the happy thought, let it overtake him, let it become him.

Let it carry him, like an airship on a favorable wind. "Expecto Patronum!" A burst of pure light, the blazing shadow of an immensely tall humanoid, armored and bearing a helm and a sword. [?!]

... a wish beyond the thought, a belief that he could not truly be saved. "Don't die," Harry whispered to himself, as he had when facing Quirrell, a prayer to whatever Gods might be listening, an order to himself rather than request of them. "Don't die, don't die, don't die..." Concentrating solely upon that wish, the image of a star exploding into existence upon his outstretched wand.


... that he'd been too sad to think of before. His mother's love. He had to believe. He had to believe, even if she wasn't really there, that she loved him. He had to believe, or else the Patronus wouldn't come. A few feet away from him, Draco fell to his knees. "No," he whispered. "No!"
That was what broke Harry's concentration. In front of him, the shimmering humanoid shape collapsed, the mist and light fading away. There was silence for a moment. Then a howl of mad laughter shook the chamber, as Mother's face contorted with rage. "Fool! Idiot! You thought you could beat me, with such a pitiful attempt!" Her voice was rising to a shriek.


...the reason why it had been almost enough, and not quite. Hope. I have seen the brightest futures of the world, and I have seen them swept away. "Expecto Patronum!" From the tip of his wand burst a shining Unicorn, white as snow and bright as the sun.
And for just one heartbeat it stood there, before the world went black.

The Dark Lord is gone. The Ministry has fallen. London is burning. But you have a mission, and you will see it through, because this is the last light in the darkness. This is the last hope for the world.

And you will see it done.

comment by Pattern · 2020-07-19T00:16:46.256Z · score: 2 (1 votes) · LW(p) · GW(p)

I love the ending. It's way more exciting,

and terrifying.

comment by TurnTrout · 2020-07-18T15:39:29.058Z · score: 2 (1 votes) · LW(p) · GW(p)

... that which he had thought was absent. Love. He didn't think of the books, or his parents or Professor McGonagall. He thought of Hermione, and how she had always believed in him. He thought of how she'd helped him in so many ways, not just with homework, not just with fighting the Dark Arts. How she'd tried to help him every day since they'd first met on the Hogwarts Express.

comment by habryka (habryka4) · 2020-07-18T01:18:56.707Z · score: 2 (1 votes) · LW(p) · GW(p)

Mod note: Spoilerified, to shield the eyes of the innocent.

comment by TurnTrout · 2020-07-18T01:34:34.938Z · score: 4 (2 votes) · LW(p) · GW(p)

My bad! Thanks.

comment by TurnTrout · 2020-07-13T13:20:18.523Z · score: 4 (2 votes) · LW(p) · GW(p)

ARCHES distinguishes between single-agent / single-user and single-agent/multi-user alignment scenarios. Given assumptions like "everyone in society is VNM-rational" and "societal preferences should also follow VNM rationality", and "if everyone wants a thing, society also wants the thing", Harsanyi's utilitarian theorem shows that the societal utility function is a linear non-negative weighted combination of everyone's utilities. So, in a very narrow (and unrealistic) setting, Harsanyi's theorem tells you how the single-multi solution is built from the single-single solutions. 

This obviously doesn't actually solve either alignment problem. But, it seems like an interesting parallel for what we might eventually want.

comment by TurnTrout · 2020-05-13T15:25:36.291Z · score: 4 (2 votes) · LW(p) · GW(p)

From FLI's AI Alignment Podcast: Inverse Reinforcement Learning and Inferring Human Preferences with Dylan Hadfield-Menell:

Dylan: There’s one example that I think about, which is, say, you’re cooperating with an AI system playing chess. You start working with that AI system, and you discover that if you listen to its suggestions, 90% of the time, it’s actually suggesting the wrong move or a bad move. Would you call that system value-aligned?

Lucas: No, I would not.

Dylan: I think most people wouldn’t. Now, what if I told you that that program was actually implemented as a search that’s using the correct goal test? It actually turns out that if it’s within 10 steps of a winning play, it always finds that for you, but because of computational limitations, it usually doesn’t. Now, is the system value-aligned? I think it’s a little harder to tell here. What I do find is that when I tell people the story, and I start off with the search algorithm with the correct goal test, they almost always say that that is value-aligned but stupid.

There’s an interesting thing going on here, which is we’re not totally sure what the target we’re shooting for is. You can take this thought experiment and push it further. Supposed you’re doing that search, but, now, it says it’s heuristic search that uses the correct goal test but has an adversarially chosen heuristic function. Would that be a value-aligned system? Again, I’m not sure. If the heuristic was adversarially chosen, I’d say probably not. If the heuristic just happened to be bad, then I’m not sure.

Consider the optimizer/optimized distinction: the AI assistant is better described as optimized to either help or stop you from winning the game. This optimization may or may not have been carried out by a process which is "aligned" with you; I think that ascribing intent alignment to the assistant's creator makes more sense. In terms of the adversarial heuristic case, intent alignment seems unlikely.

But, this also feels like passing the buck – hoping that at some point in history, there existed something to which we are comfortable ascribing alignment and responsibility.

comment by TurnTrout · 2020-05-06T17:19:57.756Z · score: 4 (2 votes) · LW(p) · GW(p)

On page 22 of Probabilistic reasoning in intelligent systems, Pearl writes:

Raw experiential data is not amenable to reasoning activities such as prediction and planning; these require that data be abstracted into a representation with a coarser grain. Probabilities are summaries of details lost in this abstraction...

An agent observes a sequence of images displaying either a red or a blue ball. The balls are drawn according to some deterministic rule depending on the time step. Reasoning directly from the experiential data leads to ~Solomonoff induction. What might Pearl's "coarser grain" look like for a real agent?

Imagine an RNN trained with gradient descent and binary cross-entropy loss function ("given the data so far, did it correctly predict the next draw?"), and suppose the learned predictive accuracy is good. How might this happen?

  1. The network learns to classify whether the most recent input image contains a red or blue ball, for instrumental predictive reasons, and

  2. A recurrent state records salient information about the observed sequence, which could be arbitrarily long. The RNN + learned weights form a low-complexity function approximator in the space of functions on arbitrary-length sequences. My impression is that gradient descent has simplicity as an inductive bias (cf double descent debate).

Being an approximation of some function over arbitrary-length sequences, the network outputs a prediction for the next color, a specific feature of the next image in the sequence. Can this prediction be viewed as nontrivially probabilistic? In other words, could we use the output to learn about the network's "beliefs" over hypotheses which generate the sequence of balls?

The RNN probably isn't approximating the true (deterministic) hypothesis which explains the sequence of balls. Since it's trained to minimize cross-entropy loss, it learns to hedge, essentially making it approximate a distribution over hypotheses. This implicitly defines its "posterior probability distribution".

Under this interpretation, the output is just the measure of hypotheses predicting blue versus the measure predicting red.
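Here is a small sketch of the hedging claim, using an explicit Bayesian mixture over toy deterministic rules rather than an actual trained RNN (the rules, data, and names are all invented for illustration): the cross-entropy-minimizing prediction coincides with the posterior measure of hypotheses that predict blue.

```python
# Toy demonstration (no RNN is trained here). If several deterministic rules
# remain consistent with the observed colors, the constant prediction that
# minimizes expected cross-entropy equals the posterior measure of rules
# predicting "blue" next.
import math

rules = {  # time step -> color, with 1 = blue, 0 = red
    "alternate":      lambda t: (t + 1) % 2,
    "blue_except_t1": lambda t: 0 if t == 1 else 1,
    "always_red":     lambda t: 0,
}

observed = [1, 0, 1]  # colors seen at t = 0, 1, 2

# Uniform prior; rules contradicting the data get zero posterior mass.
consistent = [name for name, rule in rules.items()
              if all(rule(t) == c for t, c in enumerate(observed))]
posterior = {name: 1 / len(consistent) for name in consistent}

t_next = len(observed)
p_blue = sum(prob for name, prob in posterior.items() if rules[name](t_next) == 1)

def expected_cross_entropy(q):
    # Expected loss of predicting "blue with probability q" for the next draw.
    return -(p_blue * math.log(q) + (1 - p_blue) * math.log(1 - q))

grid = [i / 100 for i in range(1, 100)]
best_q = min(grid, key=expected_cross_entropy)
print(p_blue, best_q)  # the optimal hedge matches the posterior measure on "blue"
```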

comment by TurnTrout · 2020-05-06T17:37:19.042Z · score: 2 (1 votes) · LW(p) · GW(p)

In particular, the coarse-grain is what I mentioned in 1) – beliefs are easier to manage with respect to a fixed featurization of the observation space.

comment by NaiveTortoise (An1lam) · 2020-05-06T19:43:08.114Z · score: 1 (1 votes) · LW(p) · GW(p)

Only related to the first part of your post, I suspect Pearl!2020 would say the coarse-grained model should be some sort of causal model on which we can do counterfactual reasoning.

comment by TurnTrout · 2020-04-28T19:23:47.063Z · score: 4 (2 votes) · LW(p) · GW(p)

We can imagine aliens building a superintelligent agent which helps them get what they want. This is a special case of aliens inventing tools. What kind of general process should these aliens use – how should they go about designing such an agent?

Assume that these aliens want things in the colloquial sense (not that they’re eg nontrivially VNM EU maximizers) and that a reasonable observer would say they’re closer to being rational than antirational. Then it seems[1] like these aliens eventually steer towards reflectively coherent rationality (provided they don’t blow themselves to hell before they get there): given time, they tend to act to get what they want, and act to become more rational. But, they aren’t fully “rational”, and they want to build a smart thing that helps them. What should they do?

In this situation, it seems like they should build an agent which empowers them & increases their flexible control over the future, since they don’t fully know what they want now. Lots of flexible control means they can better error-correct and preserve value for what they end up believing they actually want. This also protects them from catastrophe and unaligned competitor agents.


  1. I don’t know if this is formally and literally always true, I’m just trying to gesture at an intuition about what kind of agentic process these aliens are. ↩︎

comment by TurnTrout · 2020-04-15T22:12:03.952Z · score: 4 (2 votes) · LW(p) · GW(p)

ordinal preferences just tell you which outcomes you like more than others: apples more than oranges.

Interval scale preferences assign numbers to outcomes, which communicates how close outcomes are in value: kiwi 1, orange 5, apple 6. You can say that apples have 5 times the advantage over kiwis that they do over oranges, but you can't say that apples are six times as good as kiwis. Fahrenheit and Celsius are also like this.

Ratio scale ("rational"? 😉) preferences do let you say that apples are six times as good as kiwis, and you need this property to maximize expected utility. You have to be able to weigh off the relative desirability of different outcomes, and ratio scale is the structure which lets you do it – the important content of a utility function isn't in its numerical values, but in the ratios of the valuations.

comment by mr-hire · 2020-04-16T14:54:35.550Z · score: 4 (2 votes) · LW(p) · GW(p)

Isn't the typical assumption in game theory that preferences are ordinal? This suggests that you can make quite a few strategic decisions without bringing in ratio.

comment by Dagon · 2020-04-16T18:09:32.389Z · score: 3 (2 votes) · LW(p) · GW(p)

From what I have read, and from self-introspection, humans mostly have ordinal preferences. Some of them we can interpolate to interval scales or ratios (or higher-order functions) but if we extrapolate very far, we get odd results.

It turns out you can do a LOT with just ordinal preferences. Almost all real-world decisions are made this way.

comment by TurnTrout · 2020-03-26T23:51:48.935Z · score: 4 (2 votes) · LW(p) · GW(p)

It seems to me that Zeno's paradoxes leverage incorrect, naïve notions of time and computation. We exist in the world, and we might suppose that the world is being computed in some way. If time is continuous, then the computer might need to do some pretty weird things to determine our location at an infinite number of intermediate times. However, even if that were the case, we would never notice it – we exist within time and we would not observe the external behavior of the system which is computing us, nor its runtime.

comment by Pattern · 2020-03-28T06:18:45.520Z · score: 2 (1 votes) · LW(p) · GW(p)

What are your thoughts on infinitely small quantities?

comment by TurnTrout · 2020-03-28T13:14:46.395Z · score: 2 (1 votes) · LW(p) · GW(p)

Don't have much of an opinion - I haven't rigorously studied infinitesimals yet. I usually just think of infinite / infinitely small quantities as being produced by limiting processes. For example, the intersection of all the $\varepsilon$-balls around a real number is just that number (under the standard topology), which set has 0 measure and is, in a sense, "infinitely small".

comment by TurnTrout · 2020-03-18T18:36:37.285Z · score: 4 (2 votes) · LW(p) · GW(p)

Very rough idea

In 2018, I started thinking about corrigibility as "being the kind of agent lots of agents would be happy to have activated". This seems really close to a more ambitious version of what AUP tries to do (not be catastrophic for most agents).

I wonder if you could build an agent that rewrites itself / makes an agent which would tailor the AU landscape towards its creators' interests, under a wide distribution of creator agent goals/rationalities/capabilities. And maybe you then get a kind of generalization, where most simple algorithms which solve this solve ambitious AI alignment in full generality.

comment by TurnTrout · 2020-02-06T16:56:15.771Z · score: 4 (2 votes) · LW(p) · GW(p)

My autodidacting has given me a mental reflex which attempts to construct a gears-level explanation of almost any claim I hear. For example, when listening to “Listen to Your Heart” by Roxette:

Listen to your heart,

There’s nothing else you can do

I understood what she obviously meant and simultaneously found myself subvocalizing “she means all other reasonable plans are worse than listening to your heart - not that that’s literally all you can do”.

This reflex is really silly and annoying in the wrong context - I’ll fix it soon. But it’s pretty amusing that this is now how I process claims by default, and I think it usually serves me well.

comment by TurnTrout · 2019-12-26T16:51:41.378Z · score: 4 (2 votes) · LW(p) · GW(p)

The framing effect & aversion to losses generally cause us to execute more cautious plans. I’m realizing this is another reason to reframe my x-risk motivation from “I won’t let the world be destroyed” to “there’s so much fun we could have, and I want to make sure that happens”. I think we need more exploratory thinking in alignment research right now.

(Also, the former motivation style led to me crashing and burning a bit when my hands were injured and I was no longer able to do much.)

ETA: actually, i’m realizing I had the effect backwards. Framing via losses actually encourages more risk-taking plans. Oops. I’d like to think about this more, since I notice my model didn’t protest when I argued the opposite of the experimental conclusions.

comment by TurnTrout · 2019-12-26T19:49:07.630Z · score: 7 (3 votes) · LW(p) · GW(p)

I’m realizing how much more risk-neutral I should be:

Paul Samuelson... offered a colleague a coin-toss gamble. If the colleague won the coin toss, he would receive $200, but if he lost, he would lose $100. Samuelson was offering his colleague a positive expected value with risk. The colleague, being risk-averse, refused the single bet, but said that he would be happy to toss the coin 100 times! The colleague understood that the bet had a positive expected value and that across lots of bets, the odds virtually guaranteed a profit. Yet with only one trial, he had a 50% chance of regretting taking the bet.

Notably, Samuelson‘s colleague doubtless faced many gambles in life… He would have fared better in the long run by maximizing his expected value on each decision... all of us encounter such “small gambles” in life, and we should try to follow the same strategy. Risk aversion is likely to tempt us to turn down each individual opportunity for gain. Yet the aggregated risk of all of the positive expected value gambles that we come across would eventually become infinitesimal, and potential profit quite large.

comment by Isnasene · 2019-12-27T01:36:36.046Z · score: 3 (2 votes) · LW(p) · GW(p)

For what it's worth, I tried something like the "I won't let the world be destroyed"->"I want to make sure the world keeps doing awesome stuff" reframing back in the day and it broadly didn't work. This had less to do with cautious/uncautious behavior and more to do with status quo bias. Saying "I won't let the world be destroyed" treats "the world being destroyed" as an event that deviates from the status quo of the world existing. In contrast, saying "There's so much fun we could have" treats "having more fun" as the event that deviates from the status quo of us not continuing to have fun.

When I saw the world being destroyed as status quo, I cared a lot less about the world getting destroyed.



comment by TurnTrout · 2019-11-26T18:43:56.755Z · score: 4 (2 votes) · LW(p) · GW(p)

I was having a bit of trouble holding the point of quadratic residues in my mind. I could effortfully recite the definition, give an example, and walk through the broad-strokes steps of proving quadratic reciprocity. But it felt fake and stale and memorized.

Alex Mennen suggested a great way of thinking about it. For some odd prime $p$, consider the multiplicative group $(\mathbb{Z}/p\mathbb{Z})^{\times}$. This group is abelian and has even order $p-1$. Now, consider a primitive root / generator $g$. By definition, every element of the group can be expressed as $g^k$. The quadratic residues are those expressible by even $k$ (this is why, for prime numbers, half of the group is square mod $p$). This also lets us easily see that the residual subgroup is closed under multiplication by $g^2$ (which generates it), that two non-residues multiply to make a residue, and that a residue and non-residue make a non-residue. The Legendre symbol then just tells us, for $a = g^k$, whether $k$ is even.

Now, consider composite numbers whose prime decomposition only contains $0$ or $1$ in the exponents. By the fundamental theorem of finite abelian groups and the Chinese remainder theorem, we see that a number is square mod $n$ iff it is square mod all of $n$'s prime factors.

I'm still a little confused about how to think of squares mod prime powers.
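A short computational check of the primitive-root picture for a small odd prime (p = 13 and generator g = 2 are my choices for illustration):

```python
# Quick check for p = 13, g = 2 (2 happens to be a primitive root mod 13):
# the quadratic residues are exactly the even powers of g, and Euler's
# criterion recovers the same set.
p, g = 13, 2

powers = {pow(g, k, p): k for k in range(p - 1)}            # g^k mod p  ->  k
residues_via_even_k = {x for x, k in powers.items() if k % 2 == 0}
residues_via_squares = {pow(x, 2, p) for x in range(1, p)}  # direct definition

assert residues_via_even_k == residues_via_squares
for a, k in powers.items():
    # Euler's criterion / Legendre symbol: a^((p-1)/2) = 1 mod p  iff  k is even.
    assert (pow(a, (p - 1) // 2, p) == 1) == (k % 2 == 0)

print(sorted(residues_via_squares))  # [1, 3, 4, 9, 10, 12] - half of the group
```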

comment by AlexMennen · 2019-11-26T23:28:49.532Z · score: 4 (2 votes) · LW(p) · GW(p)

The theorem: where $a$ is relatively prime to an odd prime $p$ and $k < n$, $p^k a$ is a square mod $p^n$ iff $a$ is a square mod $p$ and $k$ is even.

The real meat of the theorem is the case $k = 0$ (i.e. a square mod $p$ that isn't a multiple of $p$ is also a square mod $p^n$). Deriving the general case from there should be fairly straightforward, so let's focus on this special case.

Why is it true? This question has a surprising answer: Newton's method for finding roots of functions. Specifically, we want to find a root of $f(x) = x^2 - a$, except in $\mathbb{Z}/p^n\mathbb{Z}$ instead of $\mathbb{R}$.

To adapt Newton's method to work in this situation, we'll need the p-adic absolute value on $\mathbb{Z}$: $|p^k a|_p := p^{-k}$ for $a$ relatively prime to $p$. This has lots of properties that you should expect of an "absolute value": it's positive ($|x|_p \geq 0$, with $|x|_p = 0$ only when $x = 0$), multiplicative ($|xy|_p = |x|_p |y|_p$), symmetric ($|-x|_p = |x|_p$), and satisfies a triangle inequality ($|x+y|_p \leq |x|_p + |y|_p$; in fact, we get more in this case: $|x+y|_p \leq \max(|x|_p, |y|_p)$). Because of positivity, symmetry, and the triangle inequality, the p-adic absolute value induces a metric (in fact, ultrametric, because of the strong version of the triangle inequality) $d(x,y) := |x-y|_p$. To visualize this distance function, draw giant circles, and sort integers into circles based on their value mod $p$. Then draw smaller circles inside each of those giant circles, and sort the integers in the big circle into the smaller circles based on their value mod $p^2$. Then draw even smaller circles inside each of those, and sort based on value mod $p^3$, and so on. The distance between two numbers corresponds to the size of the smallest circle encompassing both of them. Note that, in this metric, $p^n$ converges to $0$ as $n \to \infty$.

Now on to Newton's method: if $a$ is a square mod $p$, let $x_0$ be one of its square roots mod $p$. Then $|f(x_0)|_p \leq p^{-1}$; that is, $x_0$ is somewhat close to being a root of $f$ with respect to the p-adic absolute value. $f'(x_0) = 2x_0$ is relatively prime to $p$, so $|f'(x_0)|_p = 1$; that is, $f$ is steep near $x_0$. This is good, because starting close to a root and the slope of the function being steep enough are things that help Newton's method converge; in general, it might bounce around chaotically instead. Specifically, it turns out that, in this case, $|f(x_0)|_p < |f'(x_0)|_p^2$ is exactly the right sense of being close enough to a root with steep enough slope for Newton's method to work.

Now, Newton's method says that, from $x_i$, you should go to $x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}$. $f'(x_i)$ is invertible mod $p^n$, so we can do this. Now here's the kicker: $|f(x_{i+1})|_p \leq |f(x_i)|_p^2$, so $|f(x_{i+1})|_p < |f(x_i)|_p$. That is, $x_{i+1}$ is closer to being a root of $f$ than $x_i$ is. Now we can just iterate this process until we reach some $x_m$ with $|f(x_m)|_p \leq p^{-n}$, and we've found our square root of $a$ mod $p^n$.

Exercise: Do the same thing with cube roots. Then with roots of arbitrary polynomials.
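A small sketch of the lifting procedure described above for $f(x) = x^2 - a$ (my own toy values of $a$, $p$, $n$; requires Python 3.8+ for modular inverses via pow):

```python
# Start from a square root mod p and apply the Newton update
# x <- x - f(x)/f'(x), working mod p^n.
def lift_sqrt(a, p, n):
    """Return x with x^2 = a (mod p^n), assuming a is a QR mod the odd prime p."""
    x = next(r for r in range(p) if (r * r - a) % p == 0)  # square root mod p
    mod = p ** n
    for _ in range(n):                   # each step improves the p-adic accuracy
        fx = (x * x - a) % mod
        dfx_inv = pow(2 * x, -1, mod)    # f'(x) = 2x is a unit mod p^n
        x = (x - fx * dfx_inv) % mod
    return x

a, p, n = 2, 7, 8
x = lift_sqrt(a, p, n)
assert (x * x - a) % p ** n == 0
print(x, "squares to", a, "mod", p ** n)
```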

comment by AlexMennen · 2019-11-26T23:37:25.091Z · score: 2 (1 votes) · LW(p) · GW(p)

The part about derivatives might have seemed a little odd. After all, you might think, $\mathbb{Z}$ is a discrete set, so what does it mean to take derivatives of functions on it. One answer to this is to just differentiate symbolically using polynomial differentiation rules. But I think a better answer is to remember that we're using a different metric than usual, and $\mathbb{Z}$ isn't discrete at all! Indeed, for any number $x$, $\lim_{n \to \infty} (x + p^n) = x$, so no points are isolated, and we can define differentiation of functions on $\mathbb{Z}$ in exactly the usual way with limits.

comment by TurnTrout · 2019-10-22T19:22:03.859Z · score: 4 (2 votes) · LW(p) · GW(p)

I noticed I was confused and liable to forget my grasp on what the hell is so "normal" about normal subgroups. You know what that means - colorful picture time!

First, the classic definition. A subgroup $N$ is normal when, for all group elements $g$, $gN = Ng$ (this is trivially true for all subgroups of abelian groups).

ETA: I drew the bounds a bit incorrectly; is most certainly within the left coset ().

Notice that nontrivial cosets aren't subgroups, because they don't have the identity $e$.

This "normal" thing matters because sometimes we want to highlight regularities in the group by taking a quotient. Taking an example from the excellent Visual Group Theory, the integers have a quotient group consisting of the congruence classes , each integer slotted into a class according to its value mod 12. We're taking a quotient with the cyclic subgroup .

So, what can go wrong? Well, if the subgroup isn't normal, strange things can happen when you try to take a quotient.

Here's what's happening:

Normality means that when you form the new Cayley diagram, the arrows behave properly. You're at the origin, $N$. You travel to the coset $gN$ using $g$. What we need for this diagram to make sense is that if you follow any $n \in N$ you please, applying $g^{-1}$ means you go back to $N$. In other words, $gng^{-1} \in N$ for every $n \in N$. In other words, $gNg^{-1} = N$. In other other words (and using a few properties of groups), $gN = Ng$.

comment by TurnTrout · 2019-09-30T00:45:44.638Z · score: 4 (2 votes) · LW(p) · GW(p)

One of the reasons I think corrigibility might have a simple core principle is: it seems possible to imagine a kind of AI which would make a lot of different possible designers happy. That is, if you imagine the same AI design deployed by counterfactually different agents with different values and somewhat-reasonable rationalities, it ends up doing a good job by almost all of them. It ends up acting to further the designers' interests in each counterfactual. This has been a useful informal way for me to think about corrigibility, when considering different proposals.

This invariance also shows up (in a different way) in AUP, where the agent maintains its ability to satisfy many different goals. In the context of long-term safety, AUP agents are designed to avoid gaining power, which implicitly ends up respecting the control of other agents present in the environment (no matter their goals).

I'm interested in thinking more about this invariance, and why it seems to show up in a sensible way in two different places.

comment by TurnTrout · 2020-05-06T15:46:45.897Z · score: 3 (2 votes) · LW(p) · GW(p)

Continuous functions can be represented by their rational support; in particular, for each real number $x$, choose a sequence of rational numbers $(q_i)$ converging to $x$, and let $f(x) := \lim_{i \to \infty} f(q_i)$.

Therefore, there is an injection from the vector space of continuous functions $\mathbb{R} \to \mathbb{R}$ to the vector space of all real sequences: since the rationals are countable, enumerate them by $q_1, q_2, q_3, \ldots$. Then the sequence $(f(q_1), f(q_2), f(q_3), \ldots)$ represents the continuous function $f$.

comment by itaibn0 · 2020-05-06T20:30:33.088Z · score: 3 (2 votes) · LW(p) · GW(p)

This map is not a surjection because not every map from the rational numbers to the real numbers is continuous, and so not every sequence represents a continuous function. It is injective, and so it shows that a basis for the latter space is at least as large in cardinality as a basis for the former space. One can construct an injective map in the other direction, showing that both spaces have bases of the same cardinality, and so they are isomorphic.

comment by TurnTrout · 2020-05-06T20:59:50.419Z · score: 2 (1 votes) · LW(p) · GW(p)

Fixed, thanks.

comment by TurnTrout · 2019-10-01T02:44:33.918Z · score: 3 (2 votes) · LW(p) · GW(p)

(Just starting to learn microecon, so please feel free to chirp corrections)

How diminishing marginal utility helps create supply/demand curves: think about the uses you could find for a pillow. Your first few pillows are used to help you fall asleep. After that, maybe some for your couch, and then a few spares to keep in storage. You prioritize pillow allocation in this manner; the value of the latter uses is much less than the value of having a place to rest your head.

How many pillows do you buy at a given price point? Well, if you buy any, you'll buy some for your bed at least. Then, when pillows get cheap enough, you'll start buying them for your couch. At what price, exactly? Depends on the person, and their utility function. So as the price goes up or down, it does or doesn't become worth it to buy pillows for different levels of the "use hierarchy".

Then part of what the supply/demand curve is reflecting is the distribution of pillow use valuations in the market. It tracks when different uses become worth it for different agents, and how significant these shifts are!
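A toy sketch of the use-hierarchy story with invented valuations: each person buys every pillow whose marginal use-value exceeds the price, and summing those choices traces out a downward-sloping market demand curve.

```python
# All valuations are made up for illustration: each person values successive
# pillows less and less, buys every pillow worth more than the price, and
# market demand is the sum of those choices.
people = {
    "alice": [50, 20, 8, 3],      # value of her 1st, 2nd, 3rd, 4th pillow
    "bob":   [35, 15, 5],
    "carol": [60, 25, 12, 6, 2],
}

def quantity_demanded(price):
    return sum(sum(1 for v in values if v > price) for values in people.values())

for price in [40, 22, 10, 4]:
    print(f"price {price:>2}: pillows demanded = {quantity_demanded(price)}")
# Lower prices make lower-ranked uses (couch pillows, spares) worth buying,
# so quantity demanded rises as the price falls.
```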

comment by TurnTrout · 2020-10-21T02:37:18.929Z · score: 2 (1 votes) · LW(p) · GW(p)
From unpublished work.

The answer to this seems obvious in isolation: shaping helps with credit assignment, rescaling doesn't (and might complicate certain methods in the advantage vs Q-value way). But I feel like maybe there's an important interaction here that could inform a mathematical theory of how a reward signal guides learners through model space?

comment by TurnTrout · 2020-09-26T22:24:17.184Z · score: 2 (1 votes) · LW(p) · GW(p)

Reasoning about learned policies via formal theorems on the power-seeking incentives of optimal policies

One way instrumental subgoals might arise in actual learned policies: we train a proto-AGI reinforcement learning agent with a curriculum including a variety of small subtasks. The current theorems show sufficient conditions for power-seeking tending to be optimal in fully-observable environments; many environments meet these sufficient conditions; optimal policies aren't hard to compute for the subtasks. One highly transferable heuristic would therefore be to gain power in new environments, and then figure out what to do for the specific goal at hand. This may or may not take the form of an explicit mesa-objective embedded in e.g. the policy network.

Later, the heuristic has the agent seek power for the "real world" environment.

(Optimal Farsighted Agents Tend to Seek Power is rather dated and will be updated soon.)

comment by TurnTrout · 2020-07-23T22:45:56.637Z · score: 2 (1 votes) · LW(p) · GW(p)

In order to reduce bias (halo effect, racism, etc), shouldn't many judicial proceedings generally be held over telephone, and/or through digital audio-only calls with voice anonymizers? 

comment by Mark Xu (mark-xu) · 2020-07-25T04:11:22.213Z · score: 3 (2 votes) · LW(p) · GW(p)

I don't see strong reasons why this isn't a good idea. I have heard that technical interviews sometimes get conducted with voice anonymizers.

comment by TurnTrout · 2020-07-22T16:22:01.647Z · score: 2 (1 votes) · LW(p) · GW(p)

I prompted GPT-3 with modified versions of Eliezer's Beisutsukai stories, where I modified the "class project" to be about solving intent alignment instead of quantum gravity. 

... Taji looked over his sheets. "Okay, I think we've got to assume that every avenue that Eld science was trying is a blind alley, or they would have found it. And if this is possible to do in one month, the answer must be, in some sense, elegant. So no human mistake models. If we start doing anything that looks like we should call it 'utility function patching', we'd better stop."

He turned to Brennan. "Do you have any ideas?"

"I don't think so," she said. She glanced at her paper, frowning. "There are two problems here."

One problem was simple enough: she didn't understand what it meant for an AI to "want" to help a human. This was hard because as far as she could tell, there hadn't been much research done on this question. It seemed to be something that only humans did, but there wasn't any evidence that this was true. What was it about wanting? Was it a biological drive or a social instinct? A combination of both? More importantly, since it was a human concept, how could she measure if an AI behaved the way humans would in certain situations when the AI was obviously not human?

Then there was the question of practicality. The most elegant solution was not always the best. This question—could an AI desire to help its overseer?—was highly suited to explore this problem, but it still wouldn't be a real test. Humans operated on a "helpfulness scale" which started from things like cashiers wishing you "have a nice day" and descending to Nazi concentration camp guards.

She wanted to craft an AI that would at least be a good clerk in today's society. That was hardly an ambitious goal for a team of four highly intelligent teenagers. And yet, and yet...

"We can look at this differently," Hyori said. "People are born with a blank mind. We can shape them however we want. AIs on the other hand, are born with 800 terabytes of behavioral examples from the moment they're activated. The only data we have about unprogrammed AIs is that they either stay still or randomly move around. All this ... it's not making any confident in how easy this will be." [?]

Brennan stopped writing and turned to look at her, frowning. "So what are you saying?"

"I don't want to approach this problem by trying to divert the AI from its goal," she said. "What if, instead of changing the mind of an AI, we instead changed the environment that an AI found itself in?"

The team fell silent. 

Styrlyn broke the silence. "Uh..."

"What I mean is," she said, "what if, instead of trying to divert the AI from one task, we created a situation where accomplishing two tasks would be more beneficial than accomplishing just one? We don't need to patch new programs into the mind of an AI to make it want to help us. We can literally make helping us the most logical decision for it."

Full transcript.

comment by TurnTrout · 2020-07-10T21:20:58.213Z · score: 2 (1 votes) · LW(p) · GW(p)

Transparency Q: how hard would it be to ensure a neural network doesn't learn any explicit NANDs?
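
A minimal sketch of one way to even operationalize the question (the toy model, the Boolean-input probe, and the mean-activation threshold below are all my assumptions, not anything from the original question): check whether any hidden unit's thresholded activations reproduce the NAND truth table.

```python
import itertools

import numpy as np
import torch
import torch.nn as nn

# Toy, untrained model purely for illustration.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))

# All four Boolean inputs for two bits: (0,0), (0,1), (1,0), (1,1).
inputs = torch.tensor(list(itertools.product([0.0, 1.0], repeat=2)))

with torch.no_grad():
    hidden = torch.relu(model[0](inputs))  # hidden-unit activations, shape (4, 8)

# NAND truth table over the same input ordering.
nand_table = np.array([1, 1, 1, 0])

for j in range(hidden.shape[1]):
    acts = hidden[:, j].numpy()
    # Crude operationalization: binarize each unit at its mean activation.
    pattern = (acts > acts.mean()).astype(int)
    if np.array_equal(pattern, nand_table):
        print(f"hidden unit {j} matches the NAND truth table under this threshold")
```

Even this crude check only covers individual units at a single layer; ruling out NANDs computed by arbitrary directions in activation space, at every point during training, looks much harder.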

comment by TurnTrout · 2020-06-27T20:29:07.335Z · score: 2 (1 votes) · LW(p) · GW(p)

Physics has existed for hundreds of years. Why can you reach the frontier of knowledge with just a few years of study? Think of all the thousands of insights, ideas, and breakthroughs that have accumulated - yet I do not imagine you need most of them to grasp the modern consensus.

Idea 1: the tech tree is rather horizontal - for any given question, several approaches and frames are tried, and some inevitably turn out to be more attractive or useful than others. You can view a Markov decision process in several ways - through the Bellman equations, through the structure of the state visitation distribution functions, through the environment's topology, through the Markov chains induced by different policies. Almost everyone thinks about MDPs in terms of Bellman equations; there were thousands of papers on that frame pre-2010, and you don't need to know most of them to understand how deep Q-learning works.
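
For concreteness, here is the dominant frame written out (a standard equation; the notation P, R, gamma is mine, not the comment's). For a finite MDP with transition kernel P, reward function R, and discount factor gamma, the Bellman optimality equation is:

```latex
V^*(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\big[ R(s, a, s') + \gamma \, V^*(s') \big]
```

Deep Q-learning is essentially a sampled, function-approximated version of this fixed-point equation, which is part of why you can understand it without reading most of the pre-2010 Bellman-frame literature.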

Idea 2: some "insights" are wrong (phlogiston) or approximate (Newtonian mechanics) and so are later discarded. The insights become historical curiosities and/or pedagogical tools and/or numerical approximations of a deeper phenomenon. 

Idea 3: most work is on narrow questions which end up being dead ends or not generalizing. As a dumb example, I could construct increasingly precise torsion balance pendulums in order to measure the mass of my copy of Dune to ever-greater accuracy. I would be learning new facts about the world using a rigorous and accepted methodology. But no one would care.

More realistically, perhaps only a few other algorithms researchers care about my refinement of a specialized sorting algorithm's asymptotic bound, but the contribution is still quite publishable and legible.

I'm not sure what publishing incentives were like before the second half of the 20th century, so perhaps this kind of research was less incentivized in the past.

comment by Viliam · 2020-06-27T22:47:34.606Z · score: 8 (4 votes) · LW(p) · GW(p)

Could this depend on your definition of "physics"? Like, if you use a narrow definition like "general relativity + quantum mechanics", you can learn that in a few years. But if you include things like electricity, the expansion of the universe, fluid mechanics, particle physics, superconductors, optics, string theory, acoustics, aerodynamics... most of them may be relatively simple to learn, but all of them together is too much.

comment by TurnTrout · 2020-06-28T04:07:40.864Z · score: 4 (2 votes) · LW(p) · GW(p)

Maybe. I don't feel like that's the key thing I'm trying to point at here, though. The fact that you can understand any one of those in a reasonable amount of time is still surprising, if you step back far enough.

comment by TurnTrout · 2020-06-24T01:16:46.096Z · score: 2 (1 votes) · LW(p) · GW(p)

Under moral uncertainty, rational expected-value maximization will look a lot like preserving attainable utility / choiceworthiness for your different moral theories / utility functions while you resolve that uncertainty.
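
A minimal formalization sketch (the notation p_i, T_i, and CW_i is mine, assuming the standard "maximize expected choiceworthiness" setup rather than anything stated in the comment): with credence p_i in moral theory T_i and corresponding choiceworthiness function CW_i, the rule is

```latex
a^* \in \arg\max_{a} \; \sum_{i} p_i \, \mathrm{CW}_i(a)
```

In a sequential setting where the credences p_i will later be updated, actions that keep each theory's attainable value max_a CW_i(a) high preserve option value, so expected-value maximization tends to favor them while the uncertainty is being resolved - which seems to be the resemblance being pointed at.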

comment by MichaelA · 2020-06-24T03:05:06.050Z · score: 3 (2 votes) · LW(p) · GW(p)

This seems right to me, and I think it's essentially the rationale for the idea of the Long Reflection [EA · GW].

comment by TurnTrout · 2020-03-27T13:30:45.421Z · score: 2 (1 votes) · LW(p) · GW(p)

To make my medicine stores last 200% longer, I've mixed similar-looking iron supplement pills in with my real medication as placebos. (To be clear, nothing serious happens to me if I miss days.)