Posts
Comments
Computer science & ML will become lower in relevance/restricted in scope for the purposes of working with silicon-based minds, just as human-neurosurgery specifics are largely but not entirely irrelevant for most civilization-scale questions like economic policy, international relations, foundational research, etc.
Or IOW: Model neuroscience (and to some extent, model psychology) requires more in-depth CS/ML expertise than will the smorgasbord of incoming subfields of model sociology, model macroeconomics, model corporate law, etc.
EA has gotten a little more sympathetic to vibes-based reasoning recently, and will continue to incorporate more of it.
The mind (ie. your mind), and how it is experienced from the inside, is potentially a very rich source of insights for keeping AI minds aligned on the inside.
The virtue of the void is indeed the virtue above all others (in rationality), and fundamentally unformalizable.
There is likely a deep compositional structure to be found for alignment, possibly to the extent that AGI alignment could come from "merely" stacking together "microalignment", even if in non-trivial ways.
I haven't read this post super deeply yet, but obviously this is one of those excellent posts that's going to become a Schelling point for various semi-related gripes after a mere token skim, even though most of them have been anticipated already in the post!
Some of those gripes are:
- Near enemies: Once a term for a phenomenon is entrenched in a community, it's a lot a lot a lot of work to name anything that's close to it but not quite it. (See, for example, "goodhart" for what is IMO a very diverse and subtle cluster of clumsiness in holding onto intentionality.) Escaping near enemies is difficult in any community, but especially difficult in this community because there tend to be mathematical/formalist intuitions of "well I have Abstraction that captures The Other Thing as a special case", which is often true. But it is also often "false" in the sense of...
- Sneaking in Connotations: I'm sad ADBOC hasn't become as CK as it could, especially for checking the unconscious tools that receive salience. Usually, if you don't believe in "vibes" and the relevance of connotations, you tend to only do some denotational pattern-matching to check whether a concept applies. But then it sneaks in burdensome frames and stances to bring to bear anyway. Sure, basketball ultimately, technically, happens in physics. I'm not sure that looking for the relevant differential equations (which is what you often do in physics) is the best frame to bring to improve your shot! Further, connotations also very often carry information that just don't make it into your cleaned up abstraction...
- Stickiness: ...and then it sticks. This is a feature, of course, as you name in point five: reification. You've made plenty of remarks about how the kata is useful even though it is not The Thing. That's great. But the fact of stickiness also implies that the underlying frame needs to be the relevant one for contexts where it is cued in people's minds, because it's now in some kind of vaguely zero-sum game with other frames. Now there's some transaction costs in switching to the other frame, taxing other perspectives. This is not even beginning to touch though, the real taxing involved in...
- Reification bias: There's a spectrum of errors from taking "you only understand it if you can program it" too literally, forcefully narrowing all the kinds of clarity available to human brains and bodies. It's a short step from there to "it's only real if you can formalize it" and "it's only real if it can be made explicit" and "it's only real if it can be reified/sustained in some way", which last thing a lot of non-dual practices attempt to disabuse humans of. It's not like this is extremely difficult to recognize (although it can be for certain dispositions), but it is especially immune to being noticed if you believe in...
- Solving everything with more explicitness: I love how a lot of double binds are dissolved in this community by kind and skillful naming of things. This is greatly underpumped nearly everywhere else. But I think it's silly to automatically assume it is a safe defense for any issue. It isn't impossible that certain situations might benefit from limiting explicitness rather than unboundedly including ever more meta accounting.
In sum: Detailed models that frames and handles come with are great, but burdensome. It's probably an oversight to not account for transaction costs in switching to (or even instantiating to, if you think your handle is general) other frames when you camp around one frame though, roughly multiplied with how burdensome the current frame might be.
Even if SEP was right about getting around the infinity problem and CK was easy to obtain before, it certainly isn't now! (Because there is some chance that whoever you're talking to has read this post, and whoever reads this post will have some doubt about whether the other believes that...)
Love this post overall! It's hard to overstate the importance of (what is believed to be) common knowledge. Legitimacy is, as Vitalik notes[1], the most important scarce resource (not just in crypto) and is likely closer to whatever we usually intend to name when we say "common knowledge", which this post argues (successfully IMO) is not actually common knowledge.
It does seem like legitimacy is possible to model with p-CK, but I'm not convinced.[2] Nor do I know how substitutable p-CK is with my old notion of CK, what it's good for! Which theorems can be rescued with p-CK where they depended on CK? Does Aumann's still hold with probability p upon p-CK of priors, or does the entire reasoning collapse? How careful do I have to be?
- ^
He seems to only talk about finitely many layers of higher order knowledge, as does Duncan (for explicitly pedagogical reasons) in his post on CK and Miasma. I think this can be right, but if so, for complicated reasons. And it still leaves some seemingly self-undermining "rationality" in our frameworks.
- ^
Mainly, I expect self-reference in legitimacy to be troublesome. A lot of things are legitimate because people think other people think they are legitimate, which seems enough like Lob's theorem that I worry about the lobstacle.
I've been a longtime CK atheist (and have been an influence on Abram's post), and your comment is in the shape of my current preferred approach. Unfortunately, rational ignorance seems to require CK that agents will engage in bounded thinking, and not be too rational!
(CK-regress like the above is very common and often non-obvious. It seems plausible that we must accept this regress and in fact humans need to be Created Already in Coordination, in analogy with Created Already in Motion)
I think it is at least possible to attain p-CK in the case that there are enough people who aren't "inductively inclined". This sort of friction from people who aren't thinking too hard causes unbounded neuroticism to stop and allow coordination. I'm not yet sure if such friction is necessary for any agent or merely typical.
What are the standard doomy "lol no" responses to "Any AGI will have a smart enough decision theory to not destroy the intelligence that created it (ie. us), because we're only willing to build AGI that won't kill us"?
(I suppose it isn't necessary to give a strong reason why acausality will show up in AGI decision theory, but one good one is that it has to be smart enough to cooperate with itself.)
Some responses that I can think of (but I can also counter, with varying success):
A. Humanity is racing to build an AGI anyway, this "decision" is not really enough in our control to exert substantial acausal influence
B. It might not destroy us, but it will likely permanently lock away our astronomical endowment and this is basically just as bad and therefore the argument is mostly irrelevant
C. We don't particularly care to preserve what our genes may "wish" to preserve, not even the acausal-pilled among us
D. Runaway, reckless consequntialism is likely to emerge long before a sophisticated decision theory that incorporates human values/agency, and so if there is such a trade to be had, it will likely be already too late
E. There is nothing natural about the categories carved up for this "trade" and so we wouldn't expect it to take place. If we can't even tell it what a diamond is, we certainly wouldn't share enough context for this particular acausal trade to snap into place
F. The correct decision theory will actually turn out to only one-box in Newcomb's and not in Transparent Newcomb's, and this is Transparent Newcomb's
G.There will be no "agent" or "decision theory" to speak of, we just go out with a whimper via increasingly lowered fidelity to values in the machines we end up designing
This is from ten minutes of brainstorming, I'm sure it misses out some important ones. Obviously, if there don't exist any good ones (ones without counters), that gives us reason to beieve in alignment by default!
Keen to hear your responses.
When there's not a "right" operationalization, that usually means that the concepts involved were fundamentally confused in the first place.
Curious about the scope of the conceptual space where this belief was calibrated. It seems to me to tacitly say something like "everything that's important is finitely characterizable".
Maybe the "fundamentally confused" in your phrasing already includes the case of "stupidly tried to grab something that wasn't humanly possible, even if in principle" as a confused way for a human, without making any claim of reality being conveniently compressible at all levels. (Note that this link explicitly disavows beauty at "all levels" too.)
I suppose you might also say "I didn't make any claim of finiteness" but I do think something like "at least some humans are only a finite string away from grokking anything" is implicit if you expect there to be blogposts/textbooks that can operationalize everything relevant. It would be an even stronger claim than "finiteness", it would be "human-typical length strings"
I believe Adam is pointing at something quite important, akin to a McNamara fallacy for formalization. To paraphrase:
The first step is to formalize whatever can be easily formalized. This is OK as far as it goes. The second step is to disregard that which can't be easily formalized or to make overly simplifying assumptions. This is artificial and misleading. The third step is to presume that what can't be formalized easily really isn't important. This is blindness. The fourth step is to say that what can't be easily formalized really doesn't exist. This is suicide.
In the case of something that has already been engineered (human brains with agency), we probably should grant that it is possible to operationalize everything relevant. But I want to pushback on the general version and would want "why do you believe simple-formalization is possible here, in this domain?" to be allowed to be asked.
[PS. am not a native speaker]
It seems like one of the most useful features of having agreement separate from karma is that it lets you vote up the joke and vote down the meaning :)
Thanks for clarifying! And for the excellent post :)
Finally, when steam flows out to the world, and the task passes out of our attention, the consequences (the things we were trying to achieve) become background assumptions.
To the extent that Steam-in-use is a kind of useful certainty about the future, I'd expect "background assumptions" to become an important primitive that interacts in this arena as well, given that it's a useful certainty about the present. I realize that's possibly already implicit in your writing when you say figure/ground.
I think some equivalent of Steam pops out as an important concept in enabling-agency-via-determinism (or requiredism, as Eliezer calls it), when you have in your universe both:
- iron causal laws coming from deterministic physics and
- almost iron "telic laws" coming from regulation by intelligent agents with something to protect.
The latter is something that can also become a very solid (full of Steam) thing to lean on for your choice-making, and that's an especially useful model to apply to your selves across time or to a community trying to self-organize. It seems very neglected, formally speaking. Economically-minded thinking tends to somewhat respect it as a static assumption, but not so much the dynamics of formation AFAIK (and so dynamic Steam is a pretty good metaphor).
However, shouldn't "things that have faded into the background" be the other kind of trivial, ie. have "maximal Steam" rather than have "no Steam"? It's like an action that will definitely take place. Something that will be in full force. Trivially common knowledge. You yourself seem to point at it with "Something with a ton of steam feels inevitable", but I suppose that's more like the converse.
(EDIT: Or at least something like that. If a post on the forum has become internalized by the community, a new comment on it won't get a lot of engagement, which fits with "losing steam" after it becomes "solid". But even if we want to distinguish where the action is currently, it makes sense to have a separate notion of what's finished and can easily re-enter attention compared to what was never started.)
Also when you say, in your sunk costs example, "no steam to spend time thinking", I'd say a better interpretation than "time thinking" would be "not enough self-trust to repledge solidity in a new direction". Time to think sounds to me more like Slack, but maybe I'm confused.
I'm unsure if open sets (or whatever generalization) are a good formal underpinning of what we call concepts, but I'm in agreement that there seems needed at least a careful reconsideration of intuitions one takes for granted when working with a concept, when you're actually working with a negation-of-concept. And "believing in" might be one of those things that you can't really do with negation-of-concepts.
Also, I think a typo: you said "logical complement", I'm imagining you meant "set-theoretic complement". (This seems important to point out since in topological semantics for intuitionistic logic, the "logical complement" is in fact defined to be the interior of the set-theoretic complement, which guarantees an open.)
I began reading this charitably (unaware of whatever inside baseball is potentially going on, and seems to be alluded to), but to be honest struggled after "X" seemed to really want someone (Eliezer) to admit they're "not smart"? I'm not sure why that would be relevant.
I think I found these lines especially confusing, if you want to explain:
- "I just hope that people can generalize from "alignment is hard" to "generalized AI capabilities are hard".
Is capability supposed to be hard for similar reasons as alignment? Can you expand/link? The only argument I can think of relating the two (which I think is a bad one) is "machines will have to solve their own alignment problem to become capable." - Eliezer is invalidating the second part of this but not the first.
This would be a pretty useless machiavellian strategy, so I'm assuming you're saying it's happening for other reasons? Maybe self-deception? Can you explain? - Eliezer thinks that OpenAI will try to make things go faster rather than slower, but this is plainly inconsistent with things like the state of vitamin D research
This just made me go "wha" at first but my guess now is that this and the bits above it around speech recognition seem to be pointing at some AI winter-esque (or even tech stagnation) beliefs? Is this right?
There's probably a radical constructivist argument for not really believing in open/noncompact categories like . I don't know how to make that argument, but this post too updates me slightly towards such a Tao of conceptualization.
(To not commit this same error at the meta level: Specifically, I update away from thinking of general negations as "real" concepts, disallowing statements like "Consider a non-chair, ...").
But this is maybe a tangent, since just adopting this rule doesn't resolve the care required in aggregation with even compact categories.
(A suggestion for the forum)
You know that old post on r/ShowerThoughts which went something like "People who speak somewhat broken english as their second language sound stupid, but they're actually smarter than average because they know at least one other language"?
I was thinking about this. I don't struggle with my grasp of English the language so much, but I certainly do with what might be called an American/Western cadence. I'm sure it's noticeable occasionally, inducing just the slightest bit of microcringe in the typical person that hangs around here. Things like strange sentence structure, or weird use of italics, or overuse of a word, or over/under hedging... all the writing skills you already mastered in grade school. And you probably grew up seeing that the ones who continued to struggle with it often didn't get other things quickly either.
Maybe you notice some of these already in what you're reading right now (despite my painstaking efforts otherwise). It's likely to look "wannabe" or "amateurish" because it isone learns language and rhythm by imitating. But this imitation game is confined to language & rhythm, and it would be a mistake to also infer from this that the ideas behind them would be unoriginal or amateurish.
I'd like to think it wouldn't bother anyone on LW because people here believe that linguistic faux pas, as much as social ones, ought to be screened off by the content.
But it probably still happens. You might believe it but not alieve it. Imagine someone saying profound things but using "u" and "ur" everywhere, even for "you're". You could actually try this (even though it would be a somewhat shallow experiment, because what I'm pointing at with "cadence" is deeper than spelling mistakes) to get a flavor for it.
A solution I can think of: make a [Non-Native Speaker] tag and allow people to self-tag. Readers could see it and shoot for a little bit more charity across anything linguistically-aesthetically displeasing. The other option is to take advantage of customizable display names here, but I wonder if that'd be distracting if mass-adopted, like twitter handles that say "[Name] ...is in New York".
I would (maybe, at some point) even generalize it to [English Writing Beginner] or some such, which you can self-assign even if you speak natively but are working on your writing skills. This one is more likely to be diluted though.
I like this question. I imagine the deeper motivation is to think harder about credit assignment.
I wrote about something similar a few years ago, but with the question of "who gets moral patienthood" rather than "who gets fined for violating copyright law". In the language of that comment, "you publishing random data" is just being an insignificant Seed.
Yeah, this can be really difficult to bring out. The word "just" is a good noticer for this creeping in.
It's like a deliberate fallacy of compression: sure you can tilt your view so they look the same and call it "abstraction", but maybe that view is too lossy for what we're trying to do! You're not distilling, you're corrupting!
I don't think the usual corrections for fallacies of compression can help either (eg. Taboo) because we're operating at the subverbal layer here. It's much harder to taboo cleverness at that layer. Better off meditating on the virtue of The Void instead.
But it is indeed a good habit to try to unify things, for efficiency reasons. Just don't get caught up on those gains.
The "shut up"s and "please stop"s are jarring.
Definitely not, for example, norms to espouse in argumentation (and tbf nowhere does this post claim to be a model for argument, except maybe implicitly under some circumstances).
Yet there's something to it.
There's a game of Chicken arising out of the shared responsibility to generate (counter)arguments. If Eliezer commits to Straight, ie. refuses to instantiate the core argument over and over again (either explicitly, by saying "you need to come up with the generator" or implicitly, by refusing to engage with a "please stop."), then the other will be incentivized to Swerve, ie. put some effort into coming up with their own arguments and thereby stumble upon the generator.
This isn't my preferred way of coordinating on games of Chicken, since it is somewhat violent and not really coordination. My preferred way is to proportionately share the price of anarchy, which can be loosely estimated with some honest explicitness. But that's what (part of) this post is, a very explicit presentation of the consequences!
So I recoil less. It feels inviting instead, about a real human issue in reasoning. And bold, given all the possible ways to mischaracterize it as "Eliezer says 'shut up' to quantitative models because he has a pet theory about AGI doom".
But is this an important caveat to the fifth virtue, at least in simulated dialogue? That remains open for me.
It occurred to me while reading your comment that I could respond entirely with excerpts from Minding our way. Here's a go (it's just fun, if you also find it useful, great!):
You will spend your entire life pulling people out from underneath machinery, and every time you do so there will be another person right next to them who needs the same kind of help, and it goes on and on forever
This is a grave error, in a world where the work is never finished, where the tasks are neverending.
Rest isn't something you do when everything else is finished. Everything else doesn't get finished. Rather, there are lots of activities that you do, some which are more fun than others, and rest is an important one to do in appropriate proportions.
Rest isn't a reward for good behavior! It's not something you get to do when all the work is finished! That's finite task thinking. Rather, rest and health are just two of the unending streams that you move through. [...]
the scope of the problem, at least relative to your contribution, is infinite
This behavior won't do, for someone living in a dark world. If you're going to live in a dark world, then it's very important to learn how to choose the best action available to you without any concern for how good it is in an absolute sense. [...]
You will beg for a day in which you go outside and don't find another idiot stuck under his fucking car
I surely don't lack the capacity to feel frustration with fools, but I also have a quiet sense of aesthetics and fairness which does not approve of this frustration. There is a tension there.
I choose to resolve the tension in favor of the people rather than the feelings. [...]
somebody else is going to die, you monster
We aren't yet gods. We're still fragile. If you have something urgent to do, then work as hard as you can — but work as hard as you can over a long period of time, not in the moment. [...]
You can look at the bad things in this world, and let cold resolve fill you — and then go on a picnic, and have a very pleasant afternoon. That would be a little weird, but you could do it! [...]
So eventually you either give up, or you put earplugs in your ears and go enjoy some time in the woods, completely unable to hear the people yelling for help.
many people seem to think that there is a privileged "don't do anything" action, that consists of something like curling up into a ball, staying in bed, and refusing to answer emails. It's much easier to adopt the "buckle down" demeanor when, instead, curling up in a ball and staying in bed feels like just another action. It's just another way to respond to the situation, which has some merits and some flaws.
(That's not to say that it's bad to curl up in a ball on your bed and ignore the world for a while. Sometimes this is exactly what you need to recover. Sometimes it's what the monkey is going to do regardless of what you decide. [...])
So see the dark world. See everything intolerable. Let the urge to tolerify it build, but don't relent. Just live there in the intolerable world, refusing to tolerate it. See whether you feel that growing, burning desire to make the world be different. Let parts of yourself harden. Let your resolve grow. It is here, in the face of the intolerable, that you will be able to tap into intrinsic motivation. [...]
You draw boundaries towards questions.
As the links I've posted above indicate, no, lists don't necessarily require questions to begin noticing joints and carving around them.
Questions are helpful however, to convey the guess I might already have and to point at the intension that others might build on/refute. And so...
Your list doesn't have any questions like that
...I have had some candidate questions in the post since the beginning, and later even added some indication of the goal at the end.
EDIT: You also haven't acknowledged/objected to my response to your "any attempt to analyse the meaning independent of the goals is confused", so I'm not sure if that's still an undercurrent here.
In Where to Draw the Boundaries, Zack points out (emphasis mine):
The one replies:
But reality doesn't come with its joints pre-labeled. Questions about how to draw category boundaries are best understood as questions about values or priorities rather than about the actual content of the actual world. I can call dolphins "fish" and go on to make just as accurate predictions about dolphins as you can. Everything we identify as a joint is only a joint because we care about it.
No. Everything we identify as a joint is a joint not "because we care about it", but because it helps us think about the things we care about.
There are more relevant things in there, which I don't know if you have disagreements with. So maybe it's more useful to crux with Zack's main source. In Where to Draw the Boundary, Eliezer gives an example:
And you say to me: "It feels intuitive to me to draw this boundary, but I don't know why—can you find me an intension that matches this extension? Can you give me a simple description of this boundary?"
I take it this game does not work for you without a goal more explicit than the one I have in the postscript to the question?
(Notice that inferring some aspects of the goal is part of the game; in the specific example Eliezer gave, they're trying to define Artwhich is as nebulous an example as it could be. Self-deception is surely less nebulous than Art.)
I was looking for this kind of engagement, which asserts/challenges either intension or extension:
You come up with a list of things that feel similar, and take a guess at why this is so. But when you finally discover what they really have in common, it may turn out that your guess was wrong. It may even turn out that your list was wrong.
It seemed to me that avoiding fallacies of compression was always a useful thing (independent of your goal, so long as you have the time for computation), even if negligibly. Yet these questions seem to be a bit of a counterexample in mind, namely that I have to be careful when what looks like decoupling might be decontextualizing.
Importantly, I can't seem to figure out a sharp line between the two. The examples were a useful meditation for me, so I shared them. Maybe I should rename the title to reflect this?
(I'm quite confused by my failure of conveying the point of the meditation, might try redoing the whole post.)
Yes, this is the interpretation.
If I'm doing X wrong (in some way), it's helpful for me to notice it. But then I notice I'm confused about when decoupling context is the "correct" thing to do, as exemplified in the post.
Rationalists tend to take great pride in decoupling and seeing through narratives (myself included), but I sense there might be some times when you "shouldn't", and they seem strangely caught up with embeddedness in a way.
I think I might have made a mistake in putting in too many of these at once. The whole point is to figure out which forms of accusations are useful feedback (for whatever), and which ones are not, by putting them very close to questions we think we've dissolved.
Take three of these, for example. I think it might be helpful to figure out whether I'm "actually" enjoying the wine, or if it's a sort of a crony belief. Disentangling those is useful to make better decisions for myself, in say, deciding to go to a wine-tasting if status-boost with those people wouldn't help.
Perhaps similarly, I'm better off knowing if my knowledge of whether this food item is organic is interfering with my taste experience.
But then in the movie example, no one would dispute the knowledge is relevant to the experience! Going back to our earlier ones, maybe just the knowledge there was relevant, and "genuinely" making it a better experience?
Maybe my degree of liking is a function of both "knowledge of organic origin" and "chemical interactions with tongue receptors" just like my degree of liking of a movie is a function of both "contextual buildup from the narrative" and "the currently unfolding scene"?
How about when you apply this to "you only upvoted that because of who wrote it"? Maybe that's a little closer home.
[ETA: posted a Question instead]
Question: What's the difference, conceptually, between each of the following if any?
"You're only enjoying that food because you believe it's organic"
"You're only enjoying that movie scene because you know what happened before it"
"You're only enjoying that wine because of what it signals"
"You only care about your son because of how it makes you feel"
"You only had a moving experience because of the alcohol and hormones in your bloodstream"
"You only moved your hand because you moved your fingers"
"You're only showing courage because you've convinced yourself you'll scare away your opponent"
For example:
- Do some of these point legitimately or illegitimately at self-deception?
- Are some of these a confusion of levels and others less so?
- Are some of these instances of working wishful thinking?
- Are some of these better seen as actions rather than rationalizations?
So... it looks like the second AI-Box experiment was technically a loss.
Not sure what to make of it, since it certainly imparts the intended lesson anyway. Was it a little misleading that this detail wasn't mentioned? Possibly. Although the bet was likely conceded, a little disclaimer of "overtime" would have been nice when Eliezer discussed it.
I was also surprised. Having spoken to a few people with crippling impostor syndrome, the summary seemed to be "people think I'm smart/skilled, but it's not Actually True."
I think the claim in the article is they're still in the game when saying that, just another round of downplaying themselves? This becomes really hard to falsify (like internalized misogyny) even if true, so I appreciate the predictions at the end.
I like the idea of it being closer to noise, but there are also reasons to consider the act of advertising theft, or worse:
- It feels like the integrity of my will is attacked, when ads work and I know somewhere that I don't want it to; a divide and conquer attack on my brain, Moloch in my head.
- If they get the most out of marketing it to parts of my brain rather than to me as a whole, there is optimization pressure to keep my brain divided, to lower the sanity waterline.
- Whenever I'm told to "turn off adblocker", for that to work for them, it's premised on me being unable to have an adblocker inside my brain, preying on what's uncontrollably automatic for me. As if to say: "we both know how this works". It makes me think of an abuser saying to their victim: "go fetch my belt".
There's a game of chicken in "who has to connect potential buyers to sellers, the buyers or the sellers?" and depending on who's paying to make the transaction happen, we call it "advertisement" or "consultancy".
(You might say "no, that distinction comes from the signal-to-noise ratio", so question: if increasing that ratio is what works, how come advertisements are so rarely informative?)
As a meta-example, even to this I want to add:
- There's this other economy to keep in mind of readers scrolling past walls of text. Often, I can and want to make what I'm saying cater to multiple attention spans (a la arbital?), and collapsed-by-default comments allow the reader to explore at will.
- A strange worry (that may not be true for other people) is attempting to contribute to someone else's long thread or list feels a little uncomfortable/rude without reading it all/carefully. With collapsed-by-default, you could set up norms that it's okay to reply without engaging deeply.
- It would be nice to have collapsing as part of the formatting
- With this I already feel like I'm setting up a large-ish personal garden that would inhibit people from engaging in this conversation even if they want to, because there's so much going on.
- And I can't edit this into my previous comment without cluttering it.
- There's obviously no need for having norms of "talking too much" when it's decoupled from the rest of the control system
- I do remember Eliezer saying in a small comment somewhere long ago that "the thumb rule is to not occupy more than three places in the Recent Comments page" (paraphrased).
- I noticed a thing that might hinder the goals of longevity as described here ("build on what was already said previously"): it feels like a huge cost to add a tiny/incremental comment to something because of all the zero-sum attention games it participates in.
It would be nice to do a silent comment, which:- Doesn't show up in Recent Comments
- Collapsed by default
- (less confident) Doesn't show up in author's notifications (unless "Notify on Silent" is enabled in personal settings)
- (kinda weird) Comment gets appended automatically to previous comment (if yours) in a nice, standard format.
- The operating metaphor is to allow the equivalent of bulleted lists to span across time, which I suppose would mostly be replies to yourself.
- It feels strange to keep editing one comment, and too silent. Also disrupts flow for readers.
- I don't see often that people have added several comments (via edit or otherwise) across months, or even days. Yet people seem to use a lot of nested lists here. Hard to believe that those list-erious ways go away if spread out in time.
Often, people like that will respond well to criticism about X and Y but not about Z.
One (dark-artsy) aspect to add here is that the first time you ask somebody for criticism, you're managing more than your general identity, you're also managing your interaction norms with that person. You're giving them permission to criticize you (or sometimes, even think critically about you for the first time), creating common knowledge that there does exist a perspective from which it's okay/expected for them to do that. This is playing with the charity they normally extend to you, which might mean that your words and plans will be given less attention than before, even though there might not be any specific criticism in their head. This is especially relevant for low-legibility/fluid hierarchies, which might collapse and impede functioning from the resulting misalignment, perhaps not unlike your own fears of being "crushed", but at the org level.
Although it's usually clear that you'd want to get feedback rather than manage this (at least, I think so), it's important to notice as one kind of anxiety surrounding criticism. This is separate from any narcissistic worries about status, it can be a real systemic worry when you're acting prosocially.
Incidentally Eliezer, is this really worth your time?
This comment might have caused a tremendous loss of value, if Eliezer took Marcello's words seriously here and so stopped talking about his metaethics. As Luke points out here, despite all the ink spilled, very few seemed to have gotten the point (at least, from only reading him).
I've personally had to re-read it many times over, years apart even, and I'm still not sure I fully understand it. It's also been the most personally valuable sequence, the sole cause of significant fundamental updates. (The other sequences seemed mostly obvious --- which made them more suitable as just incredibly clear references, sometimes if only to send to others.)
I'm sad that there isn't more.
Ping!
I've read/heard a lot about double crux but never had the opportunity to witness it.
EDIT: I did find one extensive example, but this would still be valuable since it was a live debate.
This one? From the CT-thesis section in A first lesson in meta-rationality.
the objection turns partly on the ambiguity of the terms “system” and “rationality.” These are necessarily vague, and I am not going to give precise definitions. However, by “system” I mean, roughly, a set of rules that can be printed in a book weighing less than ten kilograms, and which a person can consciously follow.11 If a person is an algorithm, it is probably an incomprehensibly vast one, which could not written concisely. It is probably also an incomprehensibly weird one, which one could not consciously follow accurately. I say “probably” because we don’t know much about how minds work, so we can’t be certain.
What we can be certain is that, because we don’t know how minds work, we can’t treat them as systems now. That is the case even if, when neuroscience progresses sufficiently, they might eventually be described that way. Even if God told us that “a human, reasoning meta-systematically, is just a system,” it would be useless in practice. Since we can’t now write out rules for meta-systematic reasoning in less than ten kilograms, we have to act, for now, as if meta-systematic reasoning is non-systematic.
Ideally, I'd make another ninja-edit that would retain the content in my post and the joke in your comment in a reflexive manner, but I am crap at strange loops.
Cold Hands Fallacy/Fake Momentum/Null-Affective Death Stall
Although Hot Hands has been the subject of enough controversy to perhaps no longer be termed a fallacy, there is a sense in which I've fooled myself before with a fake momentum. I mean when you change your strategy using a faulty bottomline: incorrectly updating on your current dynamic.
As a somewhat extreme but actual example from my own life: when filling out answersheets to multiple-choice questions (with negative marks for incorrect responses) as a kid, I'd sometimes get excited about having marked almost all of the questions near the end, and then completely, obviously, irrationally decide to mark them all. This was out of some completion urge, and the positive affect around having filled in most of them. This involved a fair bit of self-deception to carry out, since I was aware at some level that I left some of them previously unanswered because I was in fact unsure, and to mark them I had to feel sure.
Now, for sure you could make the case that maybe there are times when you're thinking clearer and when you know the subject or whatever, where you can additionally infer this about yourself correctly and then rationally ramp up the confidence (even if slight) in yourself. But this wasn't one of those cases, it was the simple fact that I felt great about myself.
Anyway the real point of this post is that there's a flipside (or straightforward generalization) of this: we can talk about this fake inertia for subjects at rest or at motion. What I mean is there's this similar tendency to not feel like doing something because you don't have that dynamic right now, hence all the clichés of the form "first blow is half the battle". In a sense, that's all I'm communicating here, but seeing it as a simple irrational mistake (as in the example above) really helped me get over this without drama: just remind yourself of the bottomline and start moving in the correct flow, ignoring the uncalibrated halo (or lack thereof) of emotion.
There's a whole section on voting in the LDT For Economists page on Arbital. Also see the one for analytic philosophers, which has a few other angles on voting.
From what I can tell from your other comments on this page, you might already have internalized all the relevant intuitions, but it might be useful anyway. Superrationality is also discussed.
Sidenote: I'm a little surprised no one else mentioned it already. Somehow arbital posts by Eliezer aren't considered as canon as the sequences, maybe it's the structure (rather than just the content)?
I usually call this lampshading, and I'll link this comment to explain what I mean. Thanks!
Thank you for this comment. I went through almost exactly the same thing, and might have possibly tabled it at the "I am really confused by this post" stage had I not seen someone well-known in the community struggle with and get through it.
My brain especially refused to read past the line that said "pushing it to 50% is like throwing away information": Why would throwing away information correspond to the magic number 50%?! Throwing away information brings you closer to maxent, so if true, what is it about the setup that makes 50% the unique solution, independent of the baseline and your estimate? That is, what is the question?
I think it's this: in a world where people can report the probability for a claim or the negation of it, what is the distribution of probability-reports you'd see?
By banning one side of it as Rafael does, you get it to tend informative. Anyway, this kind of thinking makes it seem like it's a fact about this flipping trick and not fundamental to probability theory. I wonder if there are more such tricks/actual psychology to adjust for to get a different answer.
While you're technically correct, I'd say it's still a little unfair (in the sense of connoting "haha you call yourself a rationalist how come you're failing at akrasia").
Two assumptions that can, I think you'll agree, take away from the force of "akrasia is epistemic failure":
- if modeling and solving akrasia is, like diet, a hard problem that even "experts" barely have an edge on, and importantly, things that do work seem to be very individual-specific making it quite hard to stand on the shoulders of giants
- if a large percentage of people who've found and read through the sequences etc have done so only because they had very important deadlines to procrastinate
...then on average you'd see akrasia over-represented in rationalists. Add to this the fact that akrasia itself makes manually aiming your rationality skills at what you want harder. That can leave it stable even under very persistent efforts.
I'm interested in this. The problem is that if people consider the value provided by the different currencies at all fungible, side markets will pop up that allow their exchange.
An idea I haven't thought about enough (mainly because I lack expertise) is to mark a token as Contaminated if its history indicates that it has passed through "illegal" channels, ie has benefited someone in an exchange not considered a true exchange of value, and so purists can refuse to accept those. Purist communities, if large, would allow stability of such non-contaminated tokens.
Maybe a better question to ask is "do we have utility functions that are partial orders and thus would benefit from many isolated markets?", because if so, you wouldn't have to worry about enforcing anything, many different currencies will automatically come into existence and be stable.
Of course, more generally, you wouldn't quite have complete isolation, but different valuations of goods in different currencies, without "true" fungibility. I think it is quite possibe that our preference orderings are in fact partial and the current one-currency valuation of everything might be improved.
The expectations you do not know you have control your happiness more than you know. High expectations that you currently have don't look like high expectations from the inside, they just look like how the world is/would be.
But "lower your expectations" can often be almost useless advice, kind of like "do the right thing".
Trying to incorporate "lower expectations" often amounts to "be sad". How low should you go? It's not clear at all if you're using territory-free un-asymmetric simple rules like "lower". Like any other attempt at truth-finding, it is not magic. It requires thermodynamic work.
The thing is, the payoff is rather amazing. You can just get down to work. As soon as you're free of a constant stream of abuse from beliefs previously housed in your head, you can Choose without Suffering.
The problem is, I'm not sure how to strategically go about doing this, other than using my full brain with Constant Vigilance.
Coda: A large portion of the LW project (or at least, more than a few offshoots) is about noticing you have beliefs that respond to incentives other than pure epistemic ones, and trying not to reload when shooting your foot off with those. So unsurprisingly, there's a failure mode here: when you publicly declare really low expectations (eg "everyone's an asshole"), it works to challenge people, urges them to prove you wrong. It's a cool trick to win games of Chicken but as usual, it works by handicapping you. So make sure you at least understand the costs and the contexts it works in.
I think a counterexample to "you should not devote cognition to achieving things that have already happened" is being angry at someone who has revealed they've betrayed you, which might acause them to not have betrayed you.
Is metarationality about (really tearing open) the twelfth virtue?
It seems like it says "the map you have of map-making is not the territory of map-making", and gets into how to respond to it fluidly, with a necessarily nebulous strategy of applying the virtue of the Void.
(this is also why it always felt like metarationality seems to only provide comments where Eliezer would've just given you the code)
The parts that don't quite seem to follow is where meaning-making and epistemology collide. I can try to see it as a "all models are false, some models are useful" but I'm not sure if that's the right perspective.
I want to ask this because I think I missed it the first few times I read Living in Many Worlds: Are you similarly unsatisfied with our response to suffering that's already happened, like how Eliezer asks, about the twelfth century? It's boldface "just as real" too. Do you feel the same "deflation" and "incongruity"?
I expect that you might think (as I once did) that the notion of "generalized past" is a contrived but well-intentioned analogy to manage your feelings.
But that's not so at all: once you've redone your ontology, where the naive idea of time isn't necessarily a fundamental thing and thinking in terms of causal links comes a lot closer to how reality is arranged, it's not a stretch at all. If anything, it follows that you must try and think and feel correctly about the generalized past after being given this information.
Of course, you might modus tollens here.
Soares also did a good job of impressing this in Dive In:
In my experience, the way you end up doing good in the world has very little to do with how good your initial plan was. Most of your outcome will depend on luck, timing, and your ability to actually get out of your own way and start somewhere. The way to end up with a good plan is not to start with a good plan, it's to start with some plan, and then slam that plan against reality until reality hands you a better plan.
The idea doesn't have to be good, and it doesn't have to be feasible, it just needs to be the best incredibly concrete plan that you can come up with at the moment. Don't worry, it will change rapidly when you start slamming it into reality. The important thing is to come up with a concrete plan, and then start executing it as hard as you can — while retaining a reflective state of mind updating in the face of evidence.
I don't think the "idea of scientific thinking and evidence" has so much to do with throwing away information as adding reflection, post which you might excise the cruft.
Being able to describe what you're doing, ie usefully compress existing strategies-in-use, is probably going to be helpful regardless of level of intelligence because it allows you to cheaply tweak your strategies when either the situation or the goal is perturbed.