Posts

EA Forum AMA - MIRI's Buck Shlegeris 2019-11-15T23:27:07.238Z · score: 31 (12 votes)
A simple sketch of how realism became unpopular 2019-10-11T22:25:36.357Z · score: 61 (24 votes)
Christiano decision theory excerpt 2019-09-29T02:55:35.542Z · score: 57 (16 votes)
Kohli episode discussion in 80K's Christiano interview 2019-09-29T01:40:33.852Z · score: 14 (4 votes)
Rob B's Shortform Feed 2019-05-10T23:10:14.483Z · score: 19 (3 votes)
Helen Toner on China, CSET, and AI 2019-04-21T04:10:21.457Z · score: 71 (25 votes)
New edition of "Rationality: From AI to Zombies" 2018-12-15T21:33:56.713Z · score: 80 (31 votes)
On MIRI's new research directions 2018-11-22T23:42:06.521Z · score: 57 (16 votes)
Comment on decision theory 2018-09-09T20:13:09.543Z · score: 72 (27 votes)
Ben Hoffman's donor recommendations 2018-06-21T16:02:45.679Z · score: 40 (17 votes)
Critch on career advice for junior AI-x-risk-concerned researchers 2018-05-12T02:13:28.743Z · score: 208 (73 votes)
Two clarifications about "Strategic Background" 2018-04-12T02:11:46.034Z · score: 77 (23 votes)
Karnofsky on forecasting and what science does 2018-03-28T01:55:26.495Z · score: 17 (3 votes)
Quick Nate/Eliezer comments on discontinuity 2018-03-01T22:03:27.094Z · score: 76 (24 votes)
Yudkowsky on AGI ethics 2017-10-19T23:13:59.829Z · score: 92 (40 votes)
MIRI: Decisions are for making bad outcomes inconsistent 2017-04-09T03:42:58.133Z · score: 7 (8 votes)
CHCAI/MIRI research internship in AI safety 2017-02-13T18:34:34.520Z · score: 5 (6 votes)
MIRI AMA plus updates 2016-10-11T23:52:44.410Z · score: 15 (13 votes)
A few misconceptions surrounding Roko's basilisk 2015-10-05T21:23:08.994Z · score: 57 (53 votes)
The Library of Scott Alexandria 2015-09-14T01:38:27.167Z · score: 73 (55 votes)
[Link] Nate Soares is answering questions about MIRI at the EA Forum 2015-06-11T00:27:00.253Z · score: 19 (20 votes)
Rationality: From AI to Zombies 2015-03-13T15:11:20.920Z · score: 86 (85 votes)
Ends: An Introduction 2015-03-11T19:00:44.904Z · score: 4 (4 votes)
Minds: An Introduction 2015-03-11T19:00:32.440Z · score: 8 (10 votes)
Biases: An Introduction 2015-03-11T19:00:31.605Z · score: 101 (146 votes)
Rationality: An Introduction 2015-03-11T19:00:31.162Z · score: 17 (18 votes)
Beginnings: An Introduction 2015-03-11T19:00:25.616Z · score: 10 (7 votes)
The World: An Introduction 2015-03-11T19:00:12.370Z · score: 3 (3 votes)
Announcement: The Sequences eBook will be released in mid-March 2015-03-03T01:58:45.893Z · score: 47 (48 votes)
A forum for researchers to publicly discuss safety issues in advanced AI 2014-12-13T00:33:50.516Z · score: 12 (13 votes)
Stuart Russell: AI value alignment problem must be an "intrinsic part" of the field's mainstream agenda 2014-11-26T11:02:01.038Z · score: 26 (31 votes)
Groundwork for AGI safety engineering 2014-08-06T21:29:38.767Z · score: 13 (14 votes)
Politics is hard mode 2014-07-21T22:14:33.503Z · score: 43 (74 votes)
The Problem with AIXI 2014-03-18T01:55:38.274Z · score: 29 (29 votes)
Solomonoff Cartesianism 2014-03-02T17:56:23.442Z · score: 34 (31 votes)
Bridge Collapse: Reductionism as Engineering Problem 2014-02-18T22:03:08.008Z · score: 54 (49 votes)
Can We Do Without Bridge Hypotheses? 2014-01-25T00:50:24.991Z · score: 11 (12 votes)
Building Phenomenological Bridges 2013-12-23T19:57:22.555Z · score: 67 (60 votes)
The genie knows, but doesn't care 2013-09-06T06:42:38.780Z · score: 57 (63 votes)
The Up-Goer Five Game: Explaining hard ideas with simple words 2013-09-05T05:54:16.443Z · score: 29 (34 votes)
Reality is weirdly normal 2013-08-25T19:29:42.541Z · score: 33 (48 votes)
Engaging First Introductions to AI Risk 2013-08-19T06:26:26.697Z · score: 20 (27 votes)
What do professional philosophers believe, and why? 2013-05-01T14:40:47.028Z · score: 31 (44 votes)

Comments

Comment by robbbb on Taking Initial Viral Load Seriously · 2020-04-01T17:12:35.235Z · score: 14 (7 votes) · LW · GW

Rob Wiblin: "A problem with healthcare workers is they're going to be run off their feet, lacking sleep, etc. That could worsen outcomes. They are also at higher risk of being exposed to multiple pathogens simultaneously (e.g. get COVID and flu from patients the same day)."

Comment by robbbb on What should we do once infected with COVID-19? · 2020-04-01T00:30:50.822Z · score: 11 (2 votes) · LW · GW

Idea I saw someone float:

If [a COVID-19] case becomes severe & ventilator access is limited, postural drainage is a thing I would be trying (seems low-cost & fits my models about what sort of thing should help). https://www.healthline.com/health/postural-drainage

Relatedly, if you're showing COVID-19 symptoms, I think I would recommend that you start lying on your chest if you can sleep and rest well in that position, using pillows for support as needed. I base this on an NY doctor working in a non-ICU COVID-19 unit who says:

Proning [i.e., having patients lie on their stomach] is now standard in our ICU and I tried hard to get my sicker patients to do it too to head off intubation. [...]

https://twitter.com/SepsisUK/status/1243236007346163712 [...]

[in reply to "Is proning something we can do at home to help with milder symptoms? My brother is short of breath but not at ICU level, should he try this?":]

Yes, can’t hurt, likely help

I only looked at these studies briefly, but they suggest that ARDS patients benefit from lying on their stomach:

Comment by robbbb on How do you survive in the humanities? · 2020-02-23T05:49:42.148Z · score: 2 (1 votes) · LW · GW

The real disagreement is probably about whether the teacher would change her how-to-treat-evidence preferences if she were exposed to more information. Is her view stable, or would she come to see it as a confusion and a mistake if she knew more, and say that she now sees things differently and more clearly?

Comment by robbbb on Concerns Surrounding CEV: A case for human friendliness first · 2020-01-24T04:44:36.792Z · score: 3 (2 votes) · LW · GW

Sure! :) Sorry if I came off as brusque; I was multi-tasking a bit.

Comment by robbbb on Concerns Surrounding CEV: A case for human friendliness first · 2020-01-23T23:55:48.756Z · score: 3 (2 votes) · LW · GW

I wasn't bringing up evolution because you brought up evolution; I was bringing it up separately to draw a specific analogy.

Comment by robbbb on Concerns Surrounding CEV: A case for human friendliness first · 2020-01-23T22:49:40.195Z · score: 2 (1 votes) · LW · GW

By analogy, I'd ask you to consider why it doesn't make sense to try to "cooperate" with the process of evolution. Evolution can be thought of as an optimizer, with a "goal" of maximizing inclusive reproductive fitness. Why do we just try to help actual conscious beings, rather than doing some compromise between "helping conscious beings" and "maximizing inclusive reproductive fitness" in order to be more fair to evolution?

A few reasons:

  • The things evolution "wants" are terrible. This isn't a case of "vanilla or chocolate?"; it's more like "serial killing or non-serial-killing?".
  • Evolution isn't a moral patient: it isn't a person, it doesn't have experiences or emotions, etc.
    • (A paperclip maximizer might be a moral patient, but it's not obvious that it would be; and there are obvious reasons for us to deliberately design AGI systems to not be moral patients, if possible.)
  • Evolution can't use threats or force to get us to do what it wants.
    • (Ditto a random optimizer, at least if we're smart enough to not build threatening or coercive systems!)
  • Evolution won't reciprocate if we're nice to it.
    • (Ditto a random optimizer. This is still true after you build an unfriendly optimizer, though not for the same reasons: an unfriendly superintelligence is smart enough to reciprocate, but there's no reason to do so relative to its own goals, if it can better achieve those goals through force.)

Comment by robbbb on Concerns Surrounding CEV: A case for human friendliness first · 2020-01-23T22:35:06.036Z · score: 4 (2 votes) · LW · GW

In the next part (forgiving me if this is way off) essentially you are saying my second question in the post is false, it wont be self aware or if it is it wont reflect enough to consider significantly rewriting its source code

No, this is not right. A better way of stating my claim is: "The notion of 'self-awareness' or 'reflectiveness' you're appealing to here is a confused notion." You're doing the thing described in Ghosts in the Machine and Anthropomorphic Optimism, most likely for reasons described in Sympathetic Minds and Humans in Funny Suits: absent a conscious effort to correct for anthropomorphism, humans naturally model other agents in human-ish terms.

Im more positing at what point does paperclip maximizer learn so much it has a model of behaving in a manner that doesn't optimize paperclips and explores that, or have a model of its own learning capabilities and explore optimizing for other utilities.

What does "exploring" mean? I think that I'm smart enough to imagine adopting an ichneumon wasp's values, or a serial killer's values, or the values of someone who hates baroque pop music and has strong pro-Spain nationalist sentiments; but I don't try to actually adopt those values, it's just a thought experiment. If a paperclip maximizer considers the thought experiment "what if I switched to less paperclip-centric values?", why (given its current values) would it decide to make that switch?

maybe the initial task we give it should take into account what its potential volition may be at some point rather than just our own as a pre signal of pre committing to cooperation.

I think there's a good version of ideas in this neighborhood, and a bad version of such ideas. The good version is cosmopolitan value and not trying to lock in the future to an overly narrow or parochial "present-day-human-beings" version of what's good and beautiful.

The bad version is deliberately building a paperclipper out of a misguided sense of fairness to random counterfactual value systems, or out of a misguided hope that a paperclipper will spontaneously generate emotions of mercy, loyalty, or reciprocity when given a chance to convert especially noble and virtuous humans into paperclips.

Comment by robbbb on Concerns Surrounding CEV: A case for human friendliness first · 2020-01-23T18:19:50.024Z · score: 21 (5 votes) · LW · GW

To answer questions like these, I recommend reading https://www.lesswrong.com/rationality and then browsing https://arbital.com/explore/ai_alignment/. Especially relevant:

Or, quoting "The Value Learning Problem":

[S]ystems that can strictly outperform humans cognitively have less to gain from integrating into existing economies and communities. Hall [2007] has argued:

"The economic law of comparative advantage states that cooperation between individuals of differing capabilities remains mutually beneficial. [ . . . ] In other words, even if AIs become much more productive than we are, it will remain to their advantage to trade with us and to ours to trade with them."

As noted by Benson-Tilsen and Soares [forthcoming 2016], however, rational trade presupposes that agents expect more gains from trade than from coercion. Non-human species have various “comparative advantages” over humans, but humans generally exploit non-humans through force. Similar patterns can be observed in the history of human war and conquest. Whereas agents at similar capability levels have incentives to compromise, collaborate, and trade, agents with strong power advantages over others can have incentives to simply take what they want.

The upshot of this is that engineering a functioning society of powerful autonomous AI systems and humans requires that those AI systems be prosocial. The point is an abstract one, but it has important practical consequences: rational agents’ interests do not align automatically, particularly when they have very different goals and capabilities.

And quoting Ensuring smarter-than-human intelligence has a positive outcome:

The notion of AI systems “breaking free” of the shackles of their source code or spontaneously developing human-like desires is just confused. The AI system is its source code, and its actions will only ever follow from the execution of the instructions that we initiate. The CPU just keeps on executing the next instruction in the program register. We could write a program that manipulates its own code, including coded objectives. Even then, though, the manipulations that it makes are made as a result of executing the original code that we wrote; they do not stem from some kind of ghost in the machine.

The serious question with smarter-than-human AI is how we can ensure that the objectives we’ve specified are correct, and how we can minimize costly accidents and unintended consequences in cases of misspecification.

Enslaving conscious beings is obviously bad. It would be catastrophic to bake into future AGI systems the assumption that non-human animals, AI systems, ems, etc. can't be moral patients, and there should be real effort to avoid accidentally building AI systems that are moral patients (or that contain moral patients as subsystems); and if we do build AI systems like that, then their interests need to be fully taken into account.

But the language you use in the post above is privileging the hypothesis that AGI systems' conditional behavior and moral status will resemble a human's, and that we can't design smart optimizers any other way. You're positing that sufficiently capable paperclip maximizers must end up with sufficient nobility of spirit to prize selflessness, trust, and universal brotherhood over paperclips; but what's the causal mechanism by which this nobility of spirit enters the system's values? It can't just be "the system can reflect on its goals and edit them", since the system's decisions about which edits to make to its goals (if any) are based on the goals it already has.

You frame alignment as "servitude", as though there's a ghost or homunculus in the AI with pre-existing goals that the AI programmers ruthlessly subvert or overwrite. But there isn't a ghost, just a choice by us to build systems with either humane-value-compatible or humane-value-incompatible optimization targets.

The links above argue that the default outcome, if you try to be "hands-off", is a human-value-incompatible target -- and not because inhumane values are what some ghost "really" wants, and being hands-off is a way of letting it follow through on its heart's desire. Rather, the heart's desire is purely a product of our design choices, with no "perfectly impartial and agent-neutral" reason to favor one option over any other (though plenty of humane reasons to do so!!), and the default outcome comes from the fact that many possible minds happen to converge on adversarial strategies, even though there's no transcendent agent that "wants" this convergence to happen. Trying to cooperate with this convergence property is like trying to cooperate with gravity, or with a rock.

Comment by robbbb on BrienneYudkowsky's Shortform · 2019-12-30T03:22:28.179Z · score: 2 (1 votes) · LW · GW

May 2018 Brienne post, "In Defense of Shame"

Comment by robbbb on We run the Center for Applied Rationality, AMA · 2019-12-22T15:17:29.545Z · score: 18 (6 votes) · LW · GW

I feel like this comment should perhaps be an AIRCS class -- not on meta-ethics, but on 'how to think about what doing debugging your brain is, if your usual ontology is "some activities are object-level engineering, some activities are object-level science, and everything else is bullshit or recreation"'. (With meta-ethics addressed in passing as a concrete example.)

Comment by robbbb on We run the Center for Applied Rationality, AMA · 2019-12-22T15:14:00.209Z · score: 33 (14 votes) · LW · GW

I felt a "click" in my brain reading this comment, like an old "something feels off, but I'm not sure what" feeling about rationality techniques finally resolving itself.

If this comment were a post, and I were in the curating-posts business, I'd curate it. The demystified concrete examples of the mental motion "use a tool from an unsciencey field to help debug scientists" are super helpful.

Comment by robbbb on We run the Center for Applied Rationality, AMA · 2019-12-22T06:19:31.595Z · score: 12 (8 votes) · LW · GW

Can you too-tersely summarize your Nisbett and Wilson argument?

Or, like... write a teaser / movie trailer for it, if you're worried your summary would be incomplete or inoculating?

Comment by robbbb on We run the Center for Applied Rationality, AMA · 2019-12-21T20:45:37.493Z · score: 15 (8 votes) · LW · GW

I don't know Dario well, but I know enough to be able to tell that the anon here doesn't know what they're talking about re Dario.

Comment by robbbb on We run the Center for Applied Rationality, AMA · 2019-12-20T23:27:18.254Z · score: 30 (7 votes) · LW · GW

More timeline statements, from Eliezer in March 2016:

That said, timelines are the hardest part of AGI issues to forecast, by which I mean that if you ask me for a specific year, I throw up my hands and say “Not only do I not know, I make the much stronger statement that nobody else has good knowledge either.” Fermi said that positive-net-energy from nuclear power wouldn’t be possible for 50 years, two years before he oversaw the construction of the first pile of uranium bricks to go critical. The way these things work is that they look fifty years off to the slightly skeptical, and ten years later, they still look fifty years off, and then suddenly there’s a breakthrough and they look five years off, at which point they’re actually 2 to 20 years off.

If you hold a gun to my head and say “Infer your probability distribution from your own actions, you self-proclaimed Bayesian” then I think I seem to be planning for a time horizon between 8 and 40 years, but some of that because there’s very little I think I can do in less than 8 years, and, you know, if it takes longer than 40 years there’ll probably be some replanning to do anyway over that time period.

And from me in April 2017:

Since [August], senior staff at MIRI have reassessed their views on how far off artificial general intelligence (AGI) is and concluded that shorter timelines are more likely than they were previously thinking. [...]

There’s no consensus among MIRI researchers on how long timelines are, and our aggregated estimate puts medium-to-high probability on scenarios in which the research community hasn’t developed AGI by, e.g., 2035. On average, however, research staff now assign moderately higher probability to AGI’s being developed before 2035 than we did a year or two ago.

I talked to Nate last month and he outlined the same concepts and arguments from Eliezer's Oct. 2017 There's No Fire Alarm for AGI (mentioned by Ben above) to describe his current view of timelines, in particular (quoting Eliezer's post):

History shows that for the general public, and even for scientists not in a key inner circle, and even for scientists in that key circle, it is very often the case that key technological developments still seem decades away, five years before they show up. [...]

And again, that's not to say that people saying "fifty years" is a certain sign that something is happening in a squash court; they were saying “fifty years” sixty years ago too. It's saying that anyone who thinks technological timelines are actually forecastable, in advance, by people who are not looped in to the leading project's progress reports and who don't share all the best ideas about exactly how to do the thing and how much effort is required for that, is learning the wrong lesson from history. In particular, from reading history books that neatly lay out lines of progress and their visible signs that we all know now were important and evidential. It's sometimes possible to say useful conditional things about the consequences of the big development whenever it happens, but it’s rarely possible to make confident predictions about the timing of those developments, beyond a one- or two-year horizon. And if you are one of the rare people who can call the timing, if people like that even exist, nobody else knows to pay attention to you and not to the Excited Futurists or Sober Skeptics. [...]

So far as I can presently estimate, now that we've had AlphaGo and a couple of other maybe/maybe-not shots across the bow, and seen a huge explosion of effort invested into machine learning and an enormous flood of papers, we are probably going to occupy our present epistemic state until very near the end.

By saying we're probably going to be in roughly this epistemic state until almost the end, I don't mean to say we know that AGI is imminent, or that there won't be important new breakthroughs in AI in the intervening time. I mean that it's hard to guess how many further insights are needed for AGI, or how long it will take to reach those insights. After the next breakthrough, we still won't know how many more breakthroughs are needed, leaving us in pretty much the same epistemic state as before. Whatever discoveries and milestones come next, it will probably continue to be hard to guess how many further insights are needed, and timelines will continue to be similarly murky. Maybe researcher enthusiasm and funding will rise further, and we'll be able to say that timelines are shortening; or maybe we’ll hit another AI winter, and we'll know that's a sign indicating that things will take longer than they would otherwise; but we still won't know how long.

Comment by robbbb on Moloch's Toolbox (1/2) · 2019-12-19T12:03:08.692Z · score: 2 (1 votes) · LW · GW

Yeah, to be clear, I don't think everything Simplicio says is obviously wrong, and I think a more interesting (and much longer) version of this dialogue would have tried harder to understand and steelman various Simplicio-models, regardless of whether it ended up siding against Simplicio.

Comment by robbbb on Moloch's Toolbox (1/2) · 2019-12-19T02:03:33.812Z · score: 7 (3 votes) · LW · GW

Inadequate Equilibria was Eliezer trying to explain the perspective he'd previously tagged "civilizational inadequacy" and before that "people are crazy, the world is mad". One of the main ways he'd previously been misunderstood when he used those phrases was that people took them as "generic cynicism" or "the kind of cynicism I'm used to", so my model of Eliezer considered it particularly important to differentiate those two things now that he was finally properly explaining the view.

Comment by robbbb on Moloch's Toolbox (1/2) · 2019-12-19T01:59:37.290Z · score: 7 (3 votes) · LW · GW

I think calling the character "Simplicio" was a joke. I think that the main function the character is meant to serve is to distinguish "civilizational inadequacy" pessimism from other forms of pessimism. If the discussion were just "person talking about how things are broken vs. person who doesn't expect things to be broken by default", it would be easier to round off the former perspective to "generic pessimism/cynicism" or "the forms of pessimism/cynicism I'm most familiar with".

Comment by robbbb on Is Rationalist Self-Improvement Real? · 2019-12-12T21:49:28.355Z · score: 4 (2 votes) · LW · GW

I'm confused about how manioc detox is more useful to the group than the individual - each individual self-interestedly would prefer to detox manioc, since they will die (eventually) if they don't.

Yeah, I was wrong about manioc.

Something about the "science is fragile" argument feels off to me. Perhaps it's that I'm not really thinking about RCTs; I'm looking at Archimedes, Newton, and Feynman, and going "surely there's something small that could have been tweaked about culture beforehand to make some of this low-hanging scientific fruit get grabbed earlier by a bunch of decent thinkers, rather than everything needing to wait for lone geniuses". Something feels off to me when I visualize a world where all the stupidly-simple epistemic-methods-that-are-instrumentally-useful fruit got plucked 4000 years ago, but where Feynman can see big gains from mental habits like "look at the water" (which I do think happened).

Your other responses make sense. I'll need to chew on your comments longer to see how much I end up updating overall toward your view.

Comment by robbbb on Is Rationalist Self-Improvement Real? · 2019-12-10T01:33:05.892Z · score: 19 (9 votes) · LW · GW

I'm not sure how much we disagree; it sounds like I disagree with you, but maybe most of that is that we're using different framings / success thresholds.

Efficient markets. Rationalists developed rationalist self-help by thinking about it for a while. This implies that everyone else left a $100 bill on the ground for the past 4000 years. If there were techniques to improve your financial, social, and romantic success that you could develop just by thinking about them, the same people who figured out the manioc detoxification techniques, or oracle bone randomization for hunting, or all the other amazingly complex adaptations they somehow developed, would have come up with them.

If you teleported me 4000 years into the past and deleted all of modernity and rationalism's object-level knowledge of facts from my head, but let me keep as many thinking heuristics and habits of thought as I wanted, I think those heuristics would have a pretty large positive effect on my ability to pursue mundane happiness and success (compared to someone with the same object-level knowledge but more normal-for-the-time heuristics).

The way you described things here feels to me like it would yield a large overestimate of how much deliberate quality-adjusted optimization (or even experimentation and random-cultural-drift-plus-selection-for-things-rationalists-happen-to-value) human individuals and communities probably put into discovering, using, and propagating "rationalist skills that work" throughout all of human history.

Example: implementation intentions / TAPs are an almost comically simple idea. AFAIK, it has a large effect size that hasn't fallen victim to the replication crisis (yet!). Humanity crystallized this idea in 1999. A well-calibrated model of "how much optimization humanity has put into generating, using, and propagating rationality techniques" shouldn't strongly predict that an idea this useful and simple will reach fixation in any culture or group throughout human history before the 1990s, since this in fact never happened. But your paragraph above seems to me like it would predict that many societies throughout history would have made heavy use of TAPs.

I'd similarly worry that the "manioc detoxification is the norm + human societies are as efficient at installing mental habits and group norms as they are at detoxifying manioc" model should predict that the useful heuristics underlying the 'scientific method' (e.g., 'test literally everything', using controls, trying to randomize) reach fixation in more societies earlier.

Plausibly science is more useful to the group than to the individual; but the same is true for manioc detoxification. There's something about ideas like science that caused societies not to converge on them earlier. (And this should hold with even more force for any ideas that are hard to come up with, deploy, or detect-the-usefulness-of without science.)

Another thing that it sounds like your stated model predicts: "adopting prediction markets wouldn't help organizations or societies make money, or they'd already have been widely adopted". (Of course, what helps the group succeed might not be what helps the relevant decisionmakers in that organization succeed. But it didn't sound like you expected rationalists to outperform common practice or common sense on "normal" problems, even at the group level.)

Comment by robbbb on Raemon's Scratchpad · 2019-12-07T02:25:16.405Z · score: 2 (1 votes) · LW · GW

I think I prefer bolding full lines b/c it makes it easier to see who authored what?

Comment by robbbb on Raemon's Scratchpad · 2019-12-07T01:30:07.655Z · score: 10 (4 votes) · LW · GW

I'd be interested in trying it out. At a glance, it feels too much to me like it's trying to get me to read Everything, when I can tell from the titles and snippets that some posts aren't for me. If anything the posts I've already read are often ones I want emphasized more? (Because I'm curious to see if there are new comments on things I've already read, or I may otherwise want to revisit the post to link others to it, or finish reading it, etc.)

The bold font does look aesthetically fine and breaks things up in an interesting way, so I like the idea of maybe using it for more stuff?

Comment by robbbb on The Devil Made Me Write This Post Explaining Why He Probably Didn't Hide Dinosaur Bones · 2019-12-06T15:02:07.609Z · score: 4 (2 votes) · LW · GW

Comment by robbbb on Misconceptions about continuous takeoff · 2019-12-05T21:12:04.809Z · score: 6 (3 votes) · LW · GW

That part of the interview with Paul was super interesting to me, because the following were previously claims I'd heard from Nate and Eliezer in their explanations of how they think about fast takeoff:

[E]volution [hasn't] been putting a decent amount of effort into optimizing for general intelligence. [...]

'I think if you optimize AI systems for reasoning, it appears much, much earlier.'

Ditto things along the lines of this Paul quote from the same 80K interview:

It’s totally conceivable from our current perspective, I think, that an intelligence that was as smart as a crow, but was actually designed for doing science, actually designed for doing engineering, for advancing technologies rapidly as possible -- it is quite conceivable that such a brain would actually outcompete humans pretty badly at those tasks.


I think that’s another important thing to have in mind, and then when we talk about when stuff goes crazy, I would guess humans are an upper bound for when stuff goes crazy. That is we know that if we had cheap simulated humans, that technological progress would be much, much faster than it is today. But probably stuff goes crazy somewhat before you actually get to humans.

This is part of why I don't talk about "human-level" AI when I write things for MIRI.

If you think humans, corvids, etc. aren't well-optimized for economically/pragmatically interesting feats, this predicts that timelines may be shorter and that "human-level" may be an especially bad way of thinking about the relevant threshold(s).

There still remains the question of whether the technological path to "optimizing messy physical environments" (or "science AI", or whatever we want to call it) looks like a small number of "we didn't know how to do this at all, and now we do know how to do this and can suddenly take much better advantage of available compute" events, vs. looking like a large number of individually low-impact events spread out over time.

If no one event is impactful enough, then a series of numerous S-curves ends up looking like a smooth slope when you zoom out; and large historical changes are usually made of many small changes that add up to one big effect. We don't invent nuclear weapons, get hit by a super-asteroid, etc. every other day.

Comment by robbbb on A list of good heuristics that the case for AI x-risk fails · 2019-12-04T21:54:38.980Z · score: 12 (6 votes) · LW · GW

This doesn't seem like it belongs on a "list of good heuristics", though!

Comment by robbbb on A list of good heuristics that the case for AI x-risk fails · 2019-12-03T18:30:21.502Z · score: 6 (4 votes) · LW · GW

I helped make this list in 2016 for a post by Nate, partly because I was dissatisfied with Scott's list (which includes people like Richard Sutton, who thinks worrying about AI risk is carbon chauvinism):

Stuart Russell’s Cambridge talk is an excellent introduction to long-term AI risk. Other leading AI researchers who have expressed these kinds of concerns about general AI include Francesca Rossi (IBM), Shane Legg (Google DeepMind), Eric Horvitz (Microsoft), Bart Selman (Cornell), Ilya Sutskever (OpenAI), Andrew Davison (Imperial College London), David McAllester (TTIC), and Jürgen Schmidhuber (IDSIA).

These days I'd probably make a different list, including people like Yoshua Bengio. AI risk stuff is also sufficiently in the Overton window that I care more about researchers' specific views than about "does the alignment problem seem nontrivial to you?". Even if we're just asking the latter question, I think it's more useful to list the specific views and arguments of individuals (e.g., note that Rossi is more optimistic about the alignment problem than Russell), list the views and arguments of the similarly prominent CS people who think worrying about AGI is silly, and let people eyeball which people they think tend to produce better reasons.

Comment by robbbb on Optimization Amplifies · 2019-12-02T06:10:05.382Z · score: 5 (3 votes) · LW · GW

One of the main explanations of the AI alignment problem I link people to.

Comment by robbbb on Useful Does Not Mean Secure · 2019-12-02T03:33:24.439Z · score: 13 (3 votes) · LW · GW

Eliezer also strongly believes that discrete jumps will happen. But the crux for him AFAIK is absolute capability and absolute speed of capability gain in AGI systems, not discontinuity per se (and not particular methods for improving capability, like recursive self-improvement). Hence in So Far: Unfriendly AI Edition Eliezer lists his key claims as:

  • (1) "Orthogonality thesis",
  • (2) "Instrumental convergence",
  • (3) "Rapid capability gain and large capability differences",
  • (A) superhuman intelligence makes things break that don't break at infrahuman levels,
  • (B) "you have to get [important parts of] the design right the first time",
  • (C) "if something goes wrong at any level of abstraction, there may be powerful cognitive processes seeking out flaws and loopholes in your safety measures", and the meta-level
  • (D) "these problems don't show up in qualitatively the same way when people are pursuing their immediate incentives to get today's machine learning systems working today".

From Sam Harris' interview of Eliezer (emphasis added):

Eliezer: [...] I think that artificial general intelligence capabilities, once they exist, are going to scale too fast for that to be a useful way to look at the problem. AlphaZero going from 0 to 120 mph in four hours or a day—that is not out of the question here. And even if it’s a year, a year is still a very short amount of time for things to scale up.

[...] I’d say this is a thesis of capability gain. This is a thesis of how fast artificial general intelligence gains in power once it starts to be around, whether we’re looking at 20 years (in which case this scenario does not happen) or whether we’re looking at something closer to the speed at which Go was developed (in which case it does happen) or the speed at which AlphaZero went from 0 to 120 and better-than-human (in which case there’s a bit of an issue that you better prepare for in advance, because you’re not going to have very long to prepare for it once it starts to happen).

[...] Why do I think that? It’s not that simple. I mean, I think a lot of people who see the power of intelligence will already find that pretty intuitive, but if you don’t, then you should read my paper Intelligence Explosion Microeconomics about returns on cognitive reinvestment. It goes through things like the evolution of human intelligence and how the logic of evolutionary biology tells us that when human brains were increasing in size, there were increasing marginal returns to fitness relative to the previous generations for increasing brain size. Which means that it’s not the case that as you scale intelligence, it gets harder and harder to buy. It’s not the case that as you scale intelligence, you need exponentially larger brains to get linear improvements.

At least something slightly like the opposite of this is true; and we can tell this by looking at the fossil record and using some logic, but that’s not simple.

Sam: Comparing ourselves to chimpanzees works. We don’t have brains that are 40 times the size or 400 times the size of chimpanzees, and yet what we’re doing—I don’t know what measure you would use, but it exceeds what they’re doing by some ridiculous factor.

Eliezer: And I find that convincing, but other people may want additional details.

[...] AlphaZero seems to me like a genuine case in point. That is showing us that capabilities that in humans require a lot of tweaking and that human civilization built up over centuries of masters teaching students how to play Go, and that no individual human could invent in isolation… [...] AlphaZero blew past all of that in less than a day, starting from scratch, without looking at any of the games that humans played, without looking at any of the theories that humans had about Go, without looking at any of the accumulated knowledge that we had, and without very much in the way of special-case code for Go rather than chess—in fact, zero special-case code for Go rather than chess. And that in turn is an example that refutes another thesis about how artificial general intelligence develops slowly and gradually, which is: “Well, it’s just one mind; it can’t beat our whole civilization.”

I would say that there’s a bunch of technical arguments which you walk through, and then after walking through these arguments you assign a bunch of probability, maybe not certainty, to artificial general intelligence that scales in power very fast—a year or less. And in this situation, if alignment is technically difficult, if it is easy to screw up, if it requires a bunch of additional effort—in this scenario, if we have an arms race between people who are trying to get their AGI first by doing a little bit less safety because from their perspective that only drops the probability a little; and then someone else is like, “Oh no, we have to keep up. We need to strip off the safety work too. Let’s strip off a bit more so we can get in the front.”—if you have this scenario, and by a miracle the first people to cross the finish line have actually not screwed up and they actually have a functioning powerful artificial general intelligence that is able to prevent the world from ending, you have to prevent the world from ending. You are in a terrible, terrible situation. You’ve got your one miracle. And this follows from the rapid capability gain thesis and at least the current landscape for how these things are developing.

See also:

The question is simply "Can we do cognition of this quality at all?"[...] The speed and quantity of cognition isn't the big issue, getting to that quality at all is the question. Once you're there, you can solve any problem which can realistically be done with non-exponentially-vast amounts of that exact kind of cognition.

Comment by robbbb on What's been written about the nature of "son-of-CDT"? · 2019-12-01T20:19:37.561Z · score: 4 (3 votes) · LW · GW

The Retro Blackmail Problem in "Toward Idealized Decision Theory" shows that if CDT can self-modify (i.e., build an agent that follows an arbitrary decision rule), it self-modifies to something that still gives in to some forms of blackmail. This is Son-of-CDT, though they don't use the name.
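
For concreteness, here's a toy version of that result, with made-up payoffs and hand-coded agent rules of my own (nothing below is code from the paper): plain CDT pays up whenever the threat has already arrived; the successor agent CDT builds refuses blackmail only when the blackmailer's prediction is keyed to the successor's code, and still pays up when the prediction was locked in by reading the original CDT code; an LDT-style policy chooser refuses across the board.

```python
# Toy retro-blackmail sketch; payoffs and agent rules are illustrative assumptions.
PAY_UP = -10       # utility of giving in to the blackmailer
THREAT = -100      # utility if the threat is carried out after refusing
NO_BLACKMAIL = 0   # utility if no blackmail ever happens

def cdt() -> str:
    """CDT: the prediction behind the blackmail is a fixed past event, causally
    independent of the current choice, so it just compares -10 against -100."""
    return "pay up" if PAY_UP > THREAT else "refuse"

def son_of_cdt(prediction_keyed_to: str) -> str:
    """The successor agent CDT builds: refusing causally deters only blackmailers
    who will read the *successor's* code; predictions already keyed to the
    original CDT code are treated as fixed, so it still pays up there."""
    return "refuse" if prediction_keyed_to == "successor code" else "pay up"

def ldt() -> str:
    """LDT-style policy choice: an accurate predictor blackmails you iff your
    policy is to pay up, so the 'refuse' policy nets 0 and 'pay up' nets -10."""
    policy_eu = {"pay up": PAY_UP, "refuse": NO_BLACKMAIL}
    return max(policy_eu, key=policy_eu.get)

print(cdt())                            # pay up
print(son_of_cdt("original CDT code"))  # pay up (the residual blackmailability)
print(son_of_cdt("successor code"))     # refuse
print(ldt())                            # refuse
```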

Comment by robbbb on What's been written about the nature of "son-of-CDT"? · 2019-12-01T20:14:37.537Z · score: 5 (2 votes) · LW · GW

My understanding is that the CDT agent would take the choice that causes the highest number of paperclips to be created (in expectation).

This is true if we mean something very specific by "causes". CDT picks the action that would cause the highest number of paperclips to be created, if past predictions were uncorrelated with future events.

I agree that a CDT agent will never agree to precommit to acting like a LDT agent for correlations that have already been created, but I don't think that determines what kind of successor agent they would choose to create.

If an agent can arbitrarily modify its own source code ("precommit" in full generality), then we can model "the agent making choices over time" as "a series of agents that are constantly choosing which successor-agent follows them at the next time-step". If Son-of-CDT were the same as LDT, this would be the same as saying that a self-modifying CDT agent will rewrite itself into an LDT agent, since nothing about CDT or LDT assigns special weight to actions that happen inside the agent's brain vs. outside the agent's brain.

Comment by robbbb on Toward a New Technical Explanation of Technical Explanation · 2019-12-01T20:02:14.542Z · score: 8 (4 votes) · LW · GW

When I read this post, it struck me as a remarkably good introduction to logical induction, and the whole discussion seemed very core to the formal-epistemology projects on LW and AIAF.

Comment by robbbb on Useful Does Not Mean Secure · 2019-12-01T07:47:53.962Z · score: 11 (3 votes) · LW · GW

Note that on my model, the kind of paranoia Eliezer is pointing to with "AI safety mindset" or security mindset is something he believes you need in order to prevent adversarialness and the other bad byproducts of "your system devotes large amounts of thought to things and thinks in really weird ways". It's not just (or even primarily) a fallback measure to keep you safe on the off chance your system does generate a powerful adversary. Quoting Nate:

Lastly, alignment looks difficult for the same reason computer security is difficult: systems need to be robust to intelligent searches for loopholes.

Suppose you have a dozen different vulnerabilities in your code, none of which is itself fatal or even really problematic in ordinary settings. Security is difficult because you need to account for intelligent attackers who might find all twelve vulnerabilities and chain them together in a novel way to break into (or just break) your system. Failure modes that would never arise by accident can be sought out and exploited; weird and extreme contexts can be instantiated by an attacker to cause your code to follow some crazy code path that you never considered.

A similar sort of problem arises with AI. The problem I’m highlighting here is not that AI systems might act adversarially: AI alignment as a research program is all about finding ways to prevent adversarial behavior before it can crop up. We don’t want to be in the business of trying to outsmart arbitrarily intelligent adversaries. That’s a losing game.

The parallel to cryptography is that in AI alignment we deal with systems that perform intelligent searches through a very large search space, and which can produce weird contexts that force the code down unexpected paths. This is because the weird edge cases are places of extremes, and places of extremes are often the place where a given objective function is optimized. Like computer security professionals, AI alignment researchers need to be very good at thinking about edge cases.

It’s much easier to make code that works well on the path that you were visualizing than to make code that works on all the paths that you weren’t visualizing. AI alignment needs to work on all the paths you weren’t visualizing.

Scott Garrabrant mentioned to me at one point that he thought Optimization Amplifies distills a (maybe the?) core idea in Security Mindset and Ordinary Paranoia. The problem comes from "lots of weird, extreme-state-instantiating, loophole-finding optimization", not from "lots of adversarial optimization" (even though the latter is a likely consequence of getting things badly wrong with the former).

Eliezer models most of the difficulty (and most of the security-relatedness) of the alignment problem as lying in 'get ourselves to a place where in fact our systems don't end up as powerful adversarial optimizers', rather than (a) treating this as a gimme and focusing on what we should do absent such optimizers, or (b) treating the presence of adversarial optimization as inevitable and asking how to manage it.

I think this idea ("avoiding generating powerful adversarial optimizers is an enormous constraint and requires balancing on a knife's edge between disaster and irrelevance") is also behind the view that system safety largely comes from things like "the system can't think about any topics, or try to solve any cognitive problems, other than the ones we specifically want it to", vs. Rohin's "the system is trying to do what we want".
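
As a concrete illustration of the "optimization amplifies" point (a toy construction of my own, not code from either post): weak optimization of a slightly-wrong objective mostly gets you what you wanted, while strong optimization reliably hunts down the rare edge case where the objective is most wrong.

```python
import random

random.seed(0)

def true_value(x: float) -> float:
    """What we actually want: plans near x = 0 are good, extreme plans are bad."""
    return -abs(x)

def proxy_value(x: float) -> float:
    """What the system optimizes: the true objective plus one narrow 'loophole'
    region that the proxy wildly overrates."""
    loophole_bonus = 15.0 if 9.9 < x < 10.0 else 0.0
    return true_value(x) + loophole_bonus

def best_of_n(n: int) -> float:
    """Apply n units of optimization pressure: sample n plans, keep the best by proxy."""
    return max((random.uniform(-10, 10) for _ in range(n)), key=proxy_value)

for n in (10, 1_000, 100_000):
    x = best_of_n(n)
    print(f"search effort n={n:>6}: chose x={x:7.3f}, true value={true_value(x):7.3f}")

# Typical output: the weak search returns a roughly-fine plan (x near 0), while the
# stronger searches almost always find the loophole and return an extreme plan that
# the proxy loves and the true objective hates.
```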

Comment by robbbb on Useful Does Not Mean Secure · 2019-12-01T07:41:52.224Z · score: 15 (4 votes) · LW · GW

Re:

"As you take the system and make it vastly superintelligent, your primary focus needs to be on security from adversarial forces, rather than primarily on making something that's useful."

I agree if you assume a discrete action that simply causes the system to become vastly superintelligent. But we can try not to get to powerful adversarial optimization in the first place; if that never happens then you never need the security.

And:

I certainly agree that in the presence of powerful adversarial optimizers, you need security to get your system to do what you want. However, we can just not build powerful adversarial optimizers. My preferred solution is to make sure our AI systems are trying to do what we want, so that they never become adversarial in the first place. But if for some reason we can't do that, then we could make sure AI systems don't become too powerful, or not build them at all. It seems very weird to instead say "well, the AI system is going to be adversarial and way more powerful, let's figure out how to make it secure" -- that should be the last approach, if none of the other approaches work out.

The latter summary in particular sounds superficially like Eliezer's proposed approach, except that he doesn't think it's easy in the AGI regime to "just not build powerful adversarial optimizers" (and if he suspected this was easy, he wouldn't want to build in the assumption that it's easy as a prerequisite for a safety approach working; he would want a safety approach that's robust to the scenario where it's easy to accidentally end up with vastly more quality-adjusted optimization than intended).

The "do alignment in a way that doesn't break if capability gain suddenly speeds up" approach, or at least Eliezer's version of that approach, similarly emphasizes "you're screwed (in the AGI regime) if you build powerful adversarial optimizers, and it's a silly idea to do that in the first place, so just don't do it, ever, in any context". From AI Safety Mindset:

Niceness as the first line of defense / not relying on defeating a superintelligent adversary

[...] Paraphrasing Schneier, we might say that there's three kinds of security in the world: Security that prevents your little brother from reading your files, security that prevents major governments from reading your files, and security that prevents superintelligences from getting what they want. We can then go on to remark that the third kind of security is unobtainable, and even if we had it, it would be very hard for us to know we had it. Maybe superintelligences can make themselves knowably secure against other superintelligences, but we can't do that and know that we've done it.

[...] The final component of an AI safety mindset is one that doesn't have a strong analogue in traditional computer security, and it is the rule of not ending up facing a transhuman adversary in the first place. The winning move is not to play. Much of the field of value alignment theory is about going to any length necessary to avoid needing to outwit the AI.

In AI safety, the first line of defense is an AI that does not want to hurt you. If you try to put the AI in an explosive-laced concrete bunker, that may or may not be a sensible and cost-effective precaution in case the first line of defense turns out to be flawed. But the first line of defense should always be an AI that doesn't want to hurt you or avert your other safety measures, rather than the first line of defense being a clever plan to prevent a superintelligence from getting what it wants.

A special case of this mindset applied to AI safety is the Omni Test - would this AI hurt us (or want to defeat other safety measures) if it were omniscient and omnipotent? If it would, then we've clearly built the wrong AI, because we are the ones laying down the algorithm and there's no reason to build an algorithm that hurts us period. If an agent design fails the Omni Test desideratum, this means there are scenarios that it prefers over the set of all scenarios we find acceptable, and the agent may go searching for ways to bring about those scenarios.

If the agent is searching for possible ways to bring about undesirable ends, then we, the AI programmers, are already spending computing power in an undesirable way. We shouldn't have the AI running a search that will hurt us if it comes up positive, even if we expect the search to come up empty. We just shouldn't program a computer that way; it's a foolish and self-destructive thing to do with computing power. Building an AI that would hurt us if omnipotent is a bug for the same reason that a NASA probe crashing if all seven other planets line up would be a bug - the system just isn't supposed to behave that way period; we should not rely on our own cleverness to reason about whether it's likely to happen.

Omnipotence Test for AI Safety:

Suppose your AI suddenly became omniscient and omnipotent - suddenly knew all facts and could directly ordain any outcome as a policy option. Would the executing AI code lead to bad outcomes in that case? If so, why did you write a program that in some sense 'wanted' to hurt you and was only held in check by lack of knowledge and capability? Isn't that a bad way for you to configure computing power? Why not write different code instead?

The Omni Test is that an advanced AI should be expected to remain aligned, or not lead to catastrophic outcomes, or fail safely, even if it suddenly knows all facts and can directly ordain any possible outcome as an immediate choice. The policy proposal is that, among agents meant to act in the rich real world, any predicted behavior where the agent might act destructively if given unlimited power (rather than e.g. pausing for a safe user query) should be treated as a bug.

Non-Adversarial Principle:

The 'Non-Adversarial Principle' is a proposed design rule for sufficiently advanced Artificial Intelligence stating that:

By design, the human operators and the AGI should never come into conflict.

Special cases of this principle include Niceness is the first line of defense and The AI wants your safety measures.

[...] No aspect of the AI's design should ever put us in an adversarial position vis-a-vis the AI, or pit the AI's wits against our wits. If a computation starts looking for a way to outwit us, then the design and methodology has already failed. We just shouldn't be putting an AI in a box and then having the AI search for ways to get out of the box. If you're building a toaster, you don't build one element that heats the toast and then add a tiny refrigerator that cools down the toast.

Cf. the "X-and-only-X" problem.

Comment by robbbb on The Correct Contrarian Cluster · 2019-11-30T15:44:56.841Z · score: 8 (4 votes) · LW · GW

Huh? Strong evidence for that would be us all being dead.

I want to insist that "it's unreasonable to strongly update about technological risks until we're all dead" is not a great heuristic for evaluating GCRs.

Comment by robbbb on The Correct Contrarian Cluster · 2019-11-29T21:10:03.982Z · score: 13 (3 votes) · LW · GW

bfinn was discounting Eliezer for being a non-economist, rather than discounting Sumner for being insufficiently mainstream; and bfinn was skeptical in particular that Eliezer understood NGDP targeting well enough to criticize the Bank of Japan. So Sumner seems unusually relevant here, and I'd expect him to pick up on more errors from someone talking at length about his area of specialization.

Comment by robbbb on Getting Ready for the FB Donation Match · 2019-11-28T22:55:20.253Z · score: 3 (3 votes) · LW · GW

Colm put together specific recommendations for people who want to help MIRI get matched on Giving Tuesday: https://intelligence.org/2019/11/28/giving-tuesday-2019/.

Other EA orgs that want to get matched might benefit from something similar; I haven't looked at the specific suggestions other orgs are making.

Comment by robbbb on The Correct Contrarian Cluster · 2019-11-28T22:10:36.780Z · score: 21 (5 votes) · LW · GW

You're leaning heavily on the concept "amateur", which (a) doesn't distinguish "What's your level of knowledge and experience with X?" and "Is X your day job?", and (b) treats people as being generically "good" or "bad" at extremely broad and vague categories of proposition like "propositions about quantum physics" or "propositions about macroeconomics".

I think (b) is the main mistake you're making in the quantum physics case. Eliezer isn't claiming "I'm better at quantum physics than professionals". He's claiming that the specific assertion "reifying quantum amplitudes (in the absence of evidence against collapse/agnosticism/nonrealism) violates Ockham's Razor because it adds 'stuff' to the universe" is false, and that a lot of quantum physicists have misunderstood this because their training is in quantum physics, not in algorithmic information theory or formal epistemology.

I think (a) is the main mistake you're making in the economics case. Eliezer is basically claiming to understand macroeconomics better than key decisionmakers at the Bank of Japan, but based on the results, I think he was just correct about that. As far as I can tell, Eliezer is just really good at economic reasoning, even though it's not his day job. Cf. Central banks should have listened to Eliezer Yudkowsky (or 1, 2, 3).

Comment by robbbb on Two clarifications about "Strategic Background" · 2019-11-19T17:52:08.333Z · score: 4 (2 votes) · LW · GW

Oops, I saw your question when you first posted it but forgot to get back to you, Issa. (Issa re-asked here.) My apologies.

I think there are two main kinds of strategic thought we had in mind when we said "details forthcoming":

  • 1. Thoughts on MIRI's organizational plans, deconfusion research, and how we think MIRI can help play a role in improving the future — this is covered by our November 2018 update post, https://intelligence.org/2018/11/22/2018-update-our-new-research-directions/.
  • 2. High-level thoughts on things like "what we think AGI developers probably need to do" and "what we think the world probably needs to do" to successfully navigate the acute risk period.

Most of the stuff discussed in "strategic background" is about 2: not MIRI's organizational plan, but our model of some of the things humanity likely needs to do in order for the long-run future to go well. Some of these topics are reasonably sensitive, and we've gone back and forth about how best to talk about them.

Within the macrostrategy / "high-level thoughts" part of the post, the densest part was maybe 7a. The criteria we listed for a strategically adequate AGI project were "strong opsec, research closure, trustworthy command, a commitment to the common good, security mindset, requisite resource levels, and heavy prioritization of alignment work".

With most of these it's reasonably clear what's meant in broad strokes, though there's a lot more I'd like to say about the specifics. "Trustworthy command" and "a commitment to the common good" are maybe the most opaque. By "trustworthy command" we meant things like:

  • The organization's entire command structure is fully aware of the difficulty and danger of alignment.
  • Non-technical leadership can't interfere and won't object if technical leadership needs to delete a code base or abort the project.

By "a commitment to the common good" we meant a commitment to both short-term goodness (the immediate welfare of present-day Earth) and long-term goodness (the achievement of transhumanist astronomical goods), paired with a real commitment to moral humility: not rushing ahead to implement every idea that sounds good to them.

We still plan to produce more long-form macrostrategy exposition, but given how many times we've failed to word our thoughts in a way we felt comfortable publishing, and given how much other stuff we're also juggling, I don't currently expect us to have any big macrostrategy posts in the next 6 months. (Note that I don't plan to give up on trying to get more of our thoughts out sooner than that, if possible. We'll see.)

Comment by robbbb on Raemon's Scratchpad · 2019-11-12T04:32:17.161Z · score: 2 (1 votes) · LW · GW

I haven't noticed a problem with this in my case. Might just not have noticed having this issue.

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-28T01:05:57.674Z · score: 2 (1 votes) · LW · GW

I mean "But we should consider that bodies are [...] a mere appearance of who knows what unknown object; that motion is not the effect of this unknown cause, but merely the appearance of its influence on our senses; that consequently neither of these is something outside us, but both are merely representations in us" seems pretty unambiguous to me. Kant isn't saying here that 'we can only know stuff about mind-independent objects by using language and concepts and frameworks' in this passage; he's saying 'we can only know stuff about mere representations inside of us'.

Kant's passages oscillate between making sense under one or the other of the following two interpretations (or neither):

  • the "causality interpretation", which says that things-in-themselves are objects that cause appearances, like a mind-independent object causes an experience in someone's head. If noumena are the "true correlates" of phenomena, while phenomena are nothing but subjective experiences, then this implies that we really don't know anything about the world outside our heads. You can try to squirm out of this interpretation by asserting that words like "empirical" and "world" should be redefined to refer to subjective experiences in our heads, but this is just playing with definitions.
  • the "identity interpretation", which says that things-in-themselves are the same objects as phenomena, just construed differently.

Quoting Wood (66-67, 69-70):

Yet the two interpretations appear to yield very different (incompatible) answers to the following three questions:
1. Is an appearance the very same entity as a thing in itself? The causality interpretation says no, the identity interpretation says yes.
2. Are appearances caused by things in themselves? The causality interpretation says yes, the identity interpretation says no.
3. Do the bodies we cognize have an existence in themselves? The causality interpretation says no, the identity interpretation says yes.
[... N]o entity stands to itself in the relation of cause to effect. Transcendental idealism is no intelligible doctrine at all if it cannot give self-consistent answers to the above three questions. [...]
Kant occasionally tries to combine "causality interpretation" talk with "identity interpretation" talk. When he does, the result is simply nonsense and self-contradiction:
"I say that things as objects of our senses existing outside us are given, but we know nothing of what they may be in themselves, cognizing only their appearances, that is, the representations which they cause in us by affecting our senses. Consequently, I grant by all means that there are bodies outside us, that is, things which, though quite unknown to us as to what they are in themselves, we still cognize by the representations which their influence on our sensibility procures us, and which we call bodies, a term signifying merely the appearance of the thing which is unknown to us but not the less actual. (P 4:289)
The first sentence here says that objects of the senses are given to our cognition, but then denies that we cognize these objects, saying instead that we cognize an entirely different set of objects (different from the ones he has just said are given). The second sentence infers from this that there are bodies outside us, but proceeds to say that it is not these bodies (that is, the entities Kant has just introduced to us as 'bodies') that we call 'bodies', but rather bodies are a wholly different set of entities. Such Orwellian doubletalk seems to be the inevitable result of trying to combine the causality interpretation with the identity interpretation while supposing that they are just two ways of saying the same thing. [...]
Kant of course denies that we can ever have cognition of an object as it is in itself, because we can have no sensible intuition of it -- as it is in itself. But he seems to regard it as entirely permissible and even inevitable that we should be able to think the phenomenal objects around us solely through pure concepts of the understanding, hence as they are in themselves. If I arrive at the concept of a chair in the corner first by cognizing it empirically and then by abstracting from those conditions of cognition, so that I think of it existing in itself outside those conditions, then it is obvious that I am thinking of the same object, not of two different objects. It is also clear that when I think of it the second way, I am thinking of it, and not of its cause (if it has one). From this point of view, the causality interpretation seems utterly unmotivated and even nonsensical.
The problem arises, however, because Kant also wants to arrive at the concept of a thing existing in itself in another way. He starts from the fact that our empirical cognition results from the affection of our sensibility by something outside us. This leads him to think that there must be a cause acting on our sensibility from outside, making it possible for us to intuit appearances, which are then conceived as the effects of this cause.
Of course it would be open to him to think of this for each case of sensible intuition as the appearance acting on our sensibility through a wholly empirical causality. But Kant apparently arrived at transcendental idealism in part by thinking of it as a revised version of the metaphysics of physical influence between substances that he derived from Crusius. Thus sensible intuition is sometimes thought of as the affection of our senses by an object not as an appearance but as a thing in itself, and transcendental idealism is thought of as having to claim (inconsistently) that we are to regard ourselves (as things in themselves) as being metaphysically influenced by things in themselves.
Such a metaphysics would of course be illegitimately transcendent by the standards of the Critique, but Kant unfortunately appears sometimes to think that transcendental idealism is committed to it, and many of his followers down to the present day seem addicted to the doctrine that appears to be stated in the letter of those texts that express that thought, despite the patent nonsense they involve from the critical point of view. The thing in itself is then taken to be this transcendent cause affecting our sensibility as a whole, and the appearance is seen as the ensemble of representations resulting from its activity on us.

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-27T23:11:14.391Z · score: 2 (1 votes) · LW · GW

The reason I'm focusing on this is that I think some of the phrasings you chose in trying to summarize Kant (and translate or steelman his views) are sliding between the three different claims I described above:

[1] "We can't know things about ultimate reality without relying on initially unjustified knowledge/priors/cognitive machinery."
[2] "We can't know things about ultimate reality."
[3] "(We can know that) ultimate reality is wildly different from reality-as-we-conceive-of-it."

E.g., you say

The kind of knowledge he says you can't have is knowledge of the thing in itself, which in modern terms would mean something like knowledge that is not relative to some conceptual framework or way of perceiving

In treating all these claims as equivalent, you're taking a claim that sounds at first glance like 2 ("you can't have knowledge of the thing in itself"), and identifying it with claims that sound like either 1 or 3 ("you can't have knowledge that is not relative to some conceptual framework or way of perceiving," "you can't have knowledge of the real world that exists outside our concepts", "space/time/etc. are things our brains make up, not ultimately real things").

I think dissecting these examples makes it easier to see how a whole continent could get confused about Berkeleian master-argument-style reasoning for 100-200 years, and get confused about distinctions like 'a thought you aren't thinking' vs. 'an object-of-thought you aren't thinking about'.

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-27T23:08:08.635Z · score: 8 (2 votes) · LW · GW

I claim that the most natural interpretation of "[Transcendental] idealism means all specific human perceptions are moulded by the general form of human perception and there is no way to backtrack to a raw form." is that there's no way to backtrack from our beliefs, impressions, and perceptions to ultimate reality. That is, I'm interpreting "backtrack" causally: the world causes our perceptions, and backtracking would mean reconstructing what the ultimate, outside-our-heads, existed-before-humanity reality is like before we perceive or categorize it. (Or perhaps backtracking causally to the initial, relatively unprocessed sense-data our brains receive.)

In those terms, we know a ton about ultimate, outside-our-heads reality (and a decent amount about how the brain processes new sensory inputs), and there's no special obstacle to backtracking from our processed sense data to the raw, unprocessed real world. (Our reasoning faculties do need to be working OK, but that's true for our ability to learn truths about math, about our own experiences, etc. as well. Good conclusions require a good concluder.)

If instead the intended interpretation of "backtrack to a raw form" is "describe something without describing it", "think about something without thinking about it", or "reason about something without reasoning about it", then your original phrasing stops making sense to me.

Take the example of someone standing by a barn. They can see the front side of the barn, but they've never observed the back side. At noon, you ask them to describe their subjective experience of the barn, and they do so. Then you ask them to "backtrack to the raw form" beyond their experience. They proceed to start describing the full quantum state of the front of the barn as it was at noon (taking into account many-worlds: the currently-speaking observer has branched off from the original observer).

Then you go, "No, no, I meant describe something about the barn as it exists outside of your conceptual schemes." And the person repeats their quantum description, which is a true description regardless of the conceptual scheme used; the quantum state is in the world, not in my brain or in my concepts.

Then you go, "No, I meant describe an aspect of the barn that transcends your experiences entirely; not a property of the barn that caused your experience, but a property unconnected to your experience." And the person proceeds to conjecture that the barn has a back side, even though they haven't seen it; and they start speculating about likely properties the back side may have.

Then you go, "No! I meant describe something about the barn without using your concepts in the description." Or: "Describe something that bears no causal relation to your cognition whatsoever, like a causally inert quiddity that in no way interacts with any of the kinds of things you've ever experienced or computed."

And the person might reply: Well, I can say that such a thing would be a causally inert quiddity, as you say; and then perhaps I can't say much more than that, other than to drill down on what the relevant terms mean. Or, if the requirement is to describe a thing without describing it, then obviously I can't do that; but that seems like an even more trivial observation.

Why would the request to "describe something without describing it" ever be phrased as "backtracking to a raw form"? There's no "backtracking" involved, and we aren't returning to an earlier "raw" or unprocessed thing, since we're evidently not talking about an earlier (preconceptual) cognition that was subsequently processed into a proper experience; and since we're evidently not talking about the physical objects outside our heads that are the cause and referent for our thoughts about them.

I claim that there's an important equivocation at work in the idealist tradition between "backtracking" or finding a more "raw" or ultimate version of a thing, and "describe a thing without describing it". I claim that these only sound similar because of the mistake in Berkeley's master argument: confusing the ideas "an electron (i.e., an object) that exists outside of any conceptual framework" and "an 'electron' (i.e., a term or concept) that exists outside of any conceptual framework". I claim that the very temptation to use 'Ineffable-Thingie'-reifying phrasings like "there is no way to backtrack to a raw form" and "what an electron is outside of any conceptual framework", is related to this mistake.

Phrasing it as "We can't conceive of an electron without conceiving of it" makes it sound trivial, whereas the way of speaking that phrases things almost as though there were some object in the world (Kant's 'noumena') that transcends our conceptual frameworks and outstrips our every attempt to describe it, makes it sound novel and important and substantive. (And makes it an appealing Inherently Mysterious Thing to worship.)

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-26T16:51:08.452Z · score: 2 (1 votes) · LW · GW

I agree that Kant thought of himself as trying to save science from skepticism (e.g., Hume) and weird metaphysics (e.g., Berkeley), and I'm happy you're trying to make it easier to pass Kant's Ideological Turing Test.

Transcendental idealism means all specific human perceptions are moulded by the general form of human perception and there is no way to backtrack to a raw form. [...]
The kind of knowledge he says you can't have is knowledge of the thing in itself, which in modern terms would mean something like knowledge that is not relative to some conceptual framework or way of perceiving. Physicalism doesn't refute that in the least, because it is explicitly based on using physical science as its framework.

I have two objections:

(1) Physicalism does contradict the claim "there is no way to backtrack to a raw form", if this is taken to mean we should be agnostic about whether things are (really, truly, mind-independently) physical.

I assert that the "raw form" of an electron, insofar as physics is accurate, is just straightforwardly and correctly described by physics; and unless there's a more fundamental physical account of electrons we have yet to discover, physics is plausibly (though I doubt we can ever prove this) a complete description of electrons. There may not be extra features that we're missing.

(2) Modern anti-realist strains, similar to 19th-century idealism, tend to slide between these three claims:

  • "We can't know things about ultimate reality without relying on initially unjustified knowledge/priors/cognitive machinery."
  • "We can't know things about ultimate reality."
  • "(We can know that) ultimate reality is wildly different from reality-as-we-conceive-of-it."

The first claim is true, but the second and third claims are false.

This sliding is probably the real thing we have Kant to thank for, and the thing that's made anti-realist strains so slippery and hard to root out; Berkeley was lucid enough to unequivocally avoid the above leaps.

Quoting Allen Wood (pp. 63-64, 66-67):

The doctrine can even be stated with apparent simplicity: We can have cognition of appearances but not of things in themselves. But it is far from clear what this doctrine means, and especially unclear what sort of restriction it is supposed to place on our knowledge.
Some readers of Kant have seen the restriction as trivial, so trivial as to be utterly meaningless, even bordering on incoherence. They have criticized Kant not for denying that we can know 'things in themselves' but rather for thinking that the notion of a 'thing in itself' even makes sense. If by a 'thing in itself' we mean a thing standing outside any relation to our cognitive powers, then of course it seems impossible for us to know such things; perhaps it is even self-contradictory to suppose that we could so much as think of them.
Other readers have seen transcendental idealism as a radical departure from common sense, a form of skepticism at least as extreme as any Kant might have been trying to combat. To them it seems that Kant is trying (like Berkeley) to reduce all objects of our knowledge to mere ghostly representations in our minds. He is denying us the capacity to know anything whatever about any genuine (that is, any extra-mental) reality. [...]
I think much of the puzzlement about transcendental idealism arises from the fact that Kant himself formulates transcendental idealism in a variety of ways, and it is not at all clear how, or whether, his statements of it can all be reconciled, or taken as statements of a single, self-consistent doctrine. I think Kant's central formulations suggest two quite distinct and mutually incompatible doctrines. [...]
Some interpreters of Kant, when they become aware of these divergences, respond by saying that there is no significant difference between the two interpretations, that they are only 'two ways of saying the same thing.' These interpreters are probably faithful to Kant's intentions, since it looks as if he thought the two ways of talking about appearances and things in themselves are interchangeable and involve no difference in doctrine. But someone can intend to speak self-consistently and yet fail to do so; and it looks like this is what has happened to Kant in this case.

In particular, here's Wood on why Kant is sometimes saying 'we can't know about the world outside our heads', not just 'we can't have knowledge without relying on some conceptual framework or way of perceiving' (p. 64):

Kant often distinguishes appearances from things in themselves through locutions like the following: "What the objects may be in themselves would still never be known through the most enlightened cognition of their appearance, which alone is given to us" (KrV A43/B60). "Objects in themselves are not known to us at all, and what we call external objects are nothing other than mere representations of our sensibility, whose form is space, but whose true correlate, i.e. the thing in itself, is not and cannot be cognized through them" (KrV A30/B45).
Passages like these suggest that things existing in themselves are entities distinct from 'their appearances' -- which are subjective states caused in us by these things. Real things (things in themselves) cause appearances. Appearances have no existence in themselves, being only representations in us. "Appearances do not exist in themselves, but only relative to the [subject] insofar as it has senses" (KrV B164). "But we should consider that bodies are not objects in themselves that are present to us, but rather a mere appearance of who knows what unknown object; that motion is not the effect of this unknown cause, but merely the appearance of its influence on our senses; that consequently neither of these is something outside us, but both are merely representations in us" (KrV A387).

Whereas (p. 65):

In other passages, transcendental idealism is formulated so as to present us with a very different picture. [...] Here Kant does not distinguish between two separate entities, but rather between the same entity as it appears (considered in relation to our cognitive faculties) and as it exists in itself (considered apart from that relation). [...]
On the identity interpretation, appearances are not merely subjective entities or states in our minds; they do have an existence in themselves. The force of transcendental idealism is not to demote them, so to speak, from reality to ideality, but rather to limit our cognition of real entities to those features of them that stand in determinate relations to our cognitive faculties.

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-26T15:52:17.254Z · score: 2 (1 votes) · LW · GW

I'd say: definitely nuanced. Definitely very inconsistent on this point. Not consistently asserting an extreme metaphysical view like "the true, mind-independent world is incomprehensibly different from the world we experience", though seeming to flirt with this view (or framing?) constantly, to the extent that all his contemporaries did think he had a view at least this weird. Mainly guilty of (a) muddled and poorly-articulated thoughts and (b) approaching epistemology with the wrong goals and methods.

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-20T02:57:39.561Z · score: 2 (1 votes) · LW · GW
When Chalmers claims to have "direct" epistemic access to certain facts, the proper response is to provide the arguments for doubting that claim, not to play a verbal sleight-of-hand like Dennett's (1991, emphasis added):

Chalmers' The Conscious Mind was written in 1996, so this is wrong. The wrongness doesn't seem important to me. (Jackson and Nagel were 1979/1982, and Dennett re-endorsed this passage in 2003.)

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-19T23:22:04.852Z · score: 2 (1 votes) · LW · GW
It is indisputably the case that Chalmers, for instance, makes arguments along the lines of “there are further facts revealed by introspection that can’t be translated into words”. But it is not only not indisputably the case

What does "indisputably" mean here in Bayesian terms? A Bayesian's epistemology is grounded in what evidence that individual has access to, not in what disputes they can win. When Chalmers claims to have "direct" epistemic access to certain facts, the proper response is to provide the arguments for doubting that claim, not to play a verbal sleight-of-hand like Dennett's (1991, emphasis added):

You are not authoritative about what is happening in you, but only about what seems to be happening in you, and we are giving you total, dictatorial authority over the account of how it seems to you, about what it is like to be you. And if you complain that some parts of how it seems to you are ineffable, we heterophenomenologists will grant that too. What better grounds could we have for believing that you are unable to describe something than that (1) you don’t describe it, and (2) confess that you cannot? Of course you might be lying, but we’ll give you the benefit of the doubt.

It's intellectually dishonest of Dennett to use the word "ineffable" here to slide between the propositions "I'm unable to describe my experience" and "my experience isn't translatable in principle", as it is to slide between Nagel's term of art "what it's like to be you" and "how it seems to you".

Again, I agree with Dennett that Chalmers is factually wrong about his experience (and therefore lacks a certain degree of epistemic "authority" with me, though that's such a terrible way of phrasing it!). There are good Bayesian arguments against trusting autophenomenology enough for Chalmers' view to win the day (though Dennett isn't describing any of them here), and it obviously is possible to take philosophers' verbal propositions as data to study (cf. also the meta-problem of consciousness), but it's logically rude to conceal your cruxes, pretend that your method is perfectly neutral and ecumenical, and let the "scientificness" of your proposed methodology do the rhetorical pushing and pulling.

but indeed can’t ever (without telepathy etc., or maybe not even then) be shown to another person, or perceived by another person, to be the case, that there are further facts revealed by introspection that can’t be translated into words.

There's a version of this claim I agree with (since I'm a physicalist), but the version here is too strong. First, I want to note again that this is equating group epistemology with individual epistemology. But even from a group's perspective, it's perfectly possible for "facts revealed by introspection that can't be translated into words" to be transmitted between people; just provide someone with the verbal prompts (or other environmental stimuli) that will cause them to experience and notice the same introspective data in their own brains.

If that's too vague, consider this scenario as an analogy: Our universe is a (computable) simulation, running in a larger universe that's uncomputable. Humans are "dualistic" in the sense that they're Cartesian agents outside the simulation whose brains contain uncomputable subprocesses, but their sensory experiences and communication with other agents are all via the computable simulation. We could then imagine scenarios where the agents have introspective access to evidence that they're performing computations too powerful to run in the laws of physics (as they know them), but don't have output channels expressive enough to demonstrate this fact to others in-simulation; instead, they prompt the other agents to perform the relevant introspective feat themselves.

The other agents can then infer that their minds are plausibly all running on physics that's stronger than the simulated world's physics, even though they haven't found a way to directly demonstrate this (e.g., via neurosurgery on the in-simulation pseudo-brain).

Indeed it’s not even clear how you’d demonstrate to yourself that what your introspection reveals is real.

You can update upward or downward about the reliability of your introspection (either in general, or in particular respects), in the same way you can update upward or downward about the reliability of your sensory perception. E.g., different introspective experiences or faculties can contradict each other, suggest their own unreliability ("I'm introspecting that this all feels like bullshit..."), or contradict other evidence sources.
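To make this concrete, here's a minimal numerical sketch; the probabilities below are made up purely for illustration. Treat "my introspection is broadly reliable" as a hypothesis, and a noticed contradiction between two introspective reports as evidence bearing on it:

```python
# Illustrative only: made-up numbers for a Bayesian update on the
# reliability of one's own introspection, after noticing that two
# introspective reports contradict each other.

prior_reliable = 0.7             # assumed prior that introspection is broadly reliable
p_contra_if_reliable = 0.1       # assumed: contradictions are rare if it's reliable
p_contra_if_unreliable = 0.5     # assumed: contradictions are common if it's not

# Bayes' rule: P(reliable | contradiction observed)
joint_reliable = prior_reliable * p_contra_if_reliable
joint_unreliable = (1 - prior_reliable) * p_contra_if_unreliable
posterior_reliable = joint_reliable / (joint_reliable + joint_unreliable)

print(f"P(reliable | contradiction) = {posterior_reliable:.2f}")  # ~0.32
# The same machinery runs in reverse: introspective reports that keep
# agreeing with each other and with outside evidence push the estimate up,
# just as they would for an external sense like vision.
```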

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-19T22:13:07.441Z · score: 4 (2 votes) · LW · GW

A simple toy example would be: "You have perfect introspective access to everything about how your brain works, including how your sensory organs work. This allows you to deduce that your external sensory organs provide noise data most of the time, but provide accurate data about the environment anytime you wear blue sunglasses at night."

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-19T20:23:36.313Z · score: 4 (2 votes) · LW · GW

"Heterophenomenology" might be fine as a meme for encouraging certain kinds of interesting research projects, but there are several things I dislike about how Dennett uses the idea.

Mainly, it's leaning on the social standards of scientific practice, and on a definition of what "real science" or "good science" is, to argue against propositions like "any given scientist studying consciousness should take into account their own introspective data -- e.g., the apparent character of their own visual field -- in addition to verbal descriptions, as an additional fact to explain." This is meant to serve as a cudgel and bulwark against philosophers like David Chalmers, who claim that introspection reveals further facts (/data/explananda) not strictly translatable into verbal reports.

This is framing the issue as one of social-acceptability-to-the-norms-of-scientists or conformity-with-a-definition-of-"science", whereas correct versions of the argument are Bayesian. (And it's logically rude to not make the Bayesianness super explicit and clear, given the opportunity; it obscures your premises while making your argument feel more authoritative via its association with "science".)

We can imagine a weird alien race (or alien AI) that has extremely flawed sensory faculties, and very good introspection. A race like that might be able to bootstrap to good science, via leveraging their introspection to spot systematic ways in which their sensory faculties fail, and sift out the few bits of reliable information about their environments.

Humans are plausibly the opposite: as an accident of evolution, we have much more reliable sensory faculties than introspective faculties. This is a generalization from the history of science and philosophy, and from the psychology literature. Moreover, humans have a track record of being bad at distinguishing cases where their introspection is reliable from cases where it's unreliable; so it's hard to be confident of any lines we could draw between the "good introspection" and the "bad introspection". All of this is good reason to require extra standards of evidence before humanity "takes introspection at face value" and admits it into its canon of Established Knowledge.

Personally, I think consciousness is (in a certain not-clarified-here sense) an illusion, and I'm happy to express confidence that Chalmers' view is wrong. But I think Dennett has been uniquely bad at articulating the reasons Chalmers is probably wrong, often defaulting to dismissing them or trying to emphasize their social illegitimacy (as "unscientific").

The "heterophenomenology" meme strikes me as part of that project, whereas a more honest approach would say "yeah, in principle introspective arguments are totally admissible, they just have to do a bit more work than usual because we're giving them a lower prior (for reasons X, Y, Z)" and "here are specific reasons A, B, C that Chalmers' arguments don't meet the evidential bar that's required for us to take the 'autophenomenological' data at face value in this particular case".

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-19T19:07:08.136Z · score: 2 (1 votes) · LW · GW

Also interesting: "insistence that we be immune to skeptical arguments" and "fascination with the idea of representation/intentionality/'aboutness'" seems to have led the continental philosophers in similar directions, as in Sartre's "Intentionality: A Fundamental Idea of Husserl’s Phenomenology." But that intellectual tradition had less realism, instrumentalism, and love-of-science in its DNA, so there was less resistance to sliding toward an "everything is sort of subjective" position.

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-19T17:41:40.282Z · score: 16 (4 votes) · LW · GW

Upvoted! My discussion of a bunch of these things above is very breezy, and I approve of replacing the vague claims with more specific historical ones. To clarify, here are four things I'm not criticizing:

  • 1. Eliminativism about particular mental states, of the form 'we used to think that this psychological term (e.g., "belief") mapped reasonably well onto reality, but now we understand the brain well enough to see it's really doing [description] instead, and our previous term is a misleading way of gesturing at this (or any other) mental process.'

I'm an eliminativist (or better, an illusionist) about subjectivity and phenomenal consciousness myself. (Though I think the arguments favoring that view are complicated and non-obvious, and there's no remotely intellectually satisfying illusionist account of what the things we call "conscious" really consist in.)

  • 2. In cases where the evidence for an eliminativist hypothesis isn't strong, the practice of having some research communities evaluate eliminativism or try eliminativism out and see if it leads in any productive directions. Importantly, a community doing this should treat the eliminativist view as an interesting hypothesis or an exploratory research program, not in any way as settled science (or pre-scientific axiom!).
  • 3. Demanding evidence for claims, and being relatively skeptical of varieties of evidence that have a poor track record, even if they "feel compelling".
  • 4. Demanding that high-level terms be in principle reducible to lower-level physical terms (given our justified confidence in physicalism and reductionism).

In the case of psychology, I am criticizing (and claiming really happened, though I agree that these views weren't as universal, unquestioned, and extreme as is sometimes suggested):

  • Skinner's and other behaviorists' greedy reductionism; i.e., their tendency to act like they'd reduced or explained more than they actually had. Scientists should go out of their way to emphasize the limitations and holes in their current models, and be very careful (and fully explicit about why they believe this) when it comes to claims of the form 'we can explain literally everything in [domain] using only [method].'
  • Rushing to achieve closure, dismiss open questions, forbid any expressions of confusion or uncertainty, and treat blank parts of your map as though they must correspond to a blank (or unimportant) territory. Quoting Watson (1928):
With the advent of behaviorism in 1913 the mind-body problem disappeared — not because ostrich-like its devotees hid their heads in the sand but because they would take no account of phenomena which they could not observe. The behaviorist finds no mind in his laboratory — sees it nowhere in his subjects. Would he not be unscientific if he lingered by the wayside and idly speculated upon it; just as unscientific as the biologists would be if they lingered over the contemplation of entelechies, engrams and the like. Their world and the world of the behaviorist are filled with facts — with data which can be accumulated and verified by observation — with phenomena which can be predicted and controlled.
If the behaviorists are right in their contention that there is no observable mind-body problem and no observable separate entity called mind — then there can be no such thing as consciousness and its subdivision. Freud's concept borrowed from somatic pathology breaks down. There can be no festering spot in the substratum of the mind — in the unconscious — because there is no mind.
  • More generally: overconfidence in cool new ideas, and exaggeration of what they can do.
  • Over-centralizing around an eliminativist hypothesis or research program in a way that pushes out brainstorming, hypothesis-generation, etc. that isn't easy to fit into that frame. I quote Hempel (1935) here:
[Behaviorism's] principal methodological postulate is that a scientific psychology should limit itself to the study of the bodily behavior with which man and the animals respond to changes in their physical environment, and should proscribe as nonscientific any descriptive or explanatory step which makes use of terms from introspective or 'understanding' psychology, such as 'feeling', 'lived experience', 'idea', 'will', 'intention', 'goal', 'disposition', 'repression'. We find in behaviorism, consequently, an attempt to construct a scientific psychology[.]
  • Simply put: getting the wrong answer. Some errors are more excusable than others, but even if my narrative about why they got it wrong is itself wrong, it would still be important to emphasize that they got it wrong, and could have done much better.
  • The general idea that introspection is never admissible as evidence. It's fine if you want to verbally categorize introspective evidence as 'unscientific' in order to distinguish it from other kinds of evidence, and there are some reasonable grounds for skepticism about how strong many kinds of introspective evidence are. But evidence is still evidence; a Bayesian shouldn't discard evidence just because it's hard to share with other agents.
  • The rejection of folk-psychology language, introspective evidence, or anything else for science-as-attire reasons.

Idealism emphasized some useful truths (like 'our perceptions and thoughts are all shaped by our mind's contingent architecture') but ended up in a 'wow it feels great to make minds more and more important' death spiral.

Behaviorism too emphasized some useful truths (like 'folk psychology presupposes a bunch of falsifiable things about minds that haven't all been demonstrated very well', 'it's possible for introspection to radically mislead us in lots of ways', and 'it might benefit psychology to import and emphasize methods from other scientific fields that have a better track record') but seems to me to have fallen into a 'wow it feels great to more and more fully feel like I'm playing the role of a True Scientist and being properly skeptical and cynical and unromantic about humans' trap.

Comment by robbbb on A simple sketch of how realism became unpopular · 2019-10-18T15:56:48.174Z · score: 4 (2 votes) · LW · GW
The same "that sounds silly" heuristic that helps you reject Berkeley's argument (when it's fringe and 'wears its absurdity on its sleeve') helps you accept 19th-century idealists' versions of the argument (when it's respectable and framed as the modern/scientific/practical/educated/consensus view on the issue).

I should also emphasize that Berkeley's idealism is very different from (e.g.) Hegel's idealism. "Idealism" comes in enough different forms that it's probably more useful for referring to a historical phenomenon than a particular ideology. (Fortunately, the former is the topic I'm interested in here.)