Comments
I really worry about this, and it has become quite a block. I want to support fragile baby ontologies emerging in me amidst a cacophony of terms like "objective", "reward", etc. taken for granted.
Unfortunately, going off and trying to deconfuse the concepts on my own is slow and feedback-impoverished and makes it harder to keep up with current developments.
I think repurposing "roleplay" could work somewhat, with clearly marked entry into and exit from a framing. But ontological assumptions are absorbed so illegibly that deliberately unseeing them is extremely hard, at least without being constantly on guard.
Are there other ways that you recommend (from Framestorming or otherwise)?
I wrote a quick draft about this a month ago, working through the simple math of this kind of update. I've just put it up.
Another (very weird) counterpoint: you might not see the "swarm coming" because the annexing of our cosmic endowment might look way stranger than the best strategy human minds can come up with.
I remember a safety researcher once mentioned to me that they didn't necessarily expect us to be killed, just contained, while a superintelligence takes over the universe. The argument is that it might want to preserve its history (i.e. us) to study, instead of destroying it permanently. This is basically as bad as killing everyone outright, because we'd still be imprisoned away from our largest possible impact. Similar to the serious component in "global poverty is just a rounding error".
Now I think if you add that our "imprisonment" might be made barely comfortable (which is quite unlikely, but maybe plausible in some almost-aligned-but-ultimately-terrible uncanny value scenarios), then it's possible that there's never a discontinuous horror that we would see striking us; instead we will quietly be blocked from our cosmic endowment without our knowledge. Things will seem to be going essentially on track. But we never quite seem to get to the future we've been waiting for.
It would be a steganographic takeoff.
Here's an (only slightly) more fleshed-out argument:
If
- deception is something that "blocks induction on it" (e.g. you can't run a million tests on deceptive optimizers and hope for the pattern on the tests to continue), and if
- all our "deductions" are really just an assertion of induction at higher levels of abstraction (e.g. asserting that Logic will continue to hold),
then deception could look "steganographic" when it's done at really meta levels, exploiting our more basic metaphysical mistakes.
Currently (though less so than, say, ten years ago), getting LW-ish ideas out of a human mind and into a broken, conformist world requires a conjunction of the idea and rugged individualism. And so we see individualism over-represented. What community-flavored/holism-flavored values might come out of generations growing up with these ideas, I wonder?
Objections
Reading Objections
It's hard to skim
I don't think these are necessarily mutually exclusive, but we might have to work with alternative formats, which is a good idea anyway. Here's an example for Bostrom's Astronomical Waste.
It's hard to distinguish sincere claims from hyperbolic ones
This is not really a problem if they're being honest about it. I'm recommending being serious about your silliness, not letting go of your seriousness.
All of your favorite writers probably already use it to some degree. Scott Alexander, for instance, has explicitly talked about using "microhumor", which can be a combination of hyperbole and hedging (or hypobole? Maybe that's what this skill should be called).
This can be more challenging for non-native speakers, in both the cultural and linguistic sense. But that's also a tradeoff with, for example, jargon usage.
People might get away with a response of "I was clearly joking" when you provide critique
Yes, they might. There is no language in which it is impossible to write bad programs. If there is such a confusion, they can be transparent about the point of the tone.
It's hard to reference
There are two reasons for this. One is the "can't skim" objection addressed earlier. The other is optics. I don't have much to say if you're balancing optics. Damn Moloch.
I find this style annoying/insulting
Check whether it might be worth updating this aesthetic, given the benefits. Also, as noted earlier, a lot of your favorite writers might already be using it, so well that you don't notice it. Which requires practice.
If you're reasonably practiced at it and still think it's net negative, I'm interested in hearing your objection!
Writing Objections
It's hard to structure
I think this is true, but not incredibly so. The costs diminish with practice. It might even start to come easily to you.
It looks weird/silly
I'd say in a good way, when done right. That's a feature.
Also, formalese is the weird and unnaturally stiff one.
I'd rather just state my credences seriously instead of these acrobatics
Nate has a post called Confidence All The Way Up. It's a pretty great post, about covering each layer of uncertainty with meta-certainty, all the way up. But it doesn't have many pointers for communicating a summary of many layers succinctly. A facetious tone is a stickier and more concise way of communicating your imprecision than writing out several layers of uncertainty by hand. It's (arguably) less distracting. Importantly, it's less painful, and so you're more likely to actually do it. Additionally, it's pretty well-known that precision of content alone does not translate to precision in the reader's world-model. So taking some responsibility for the whole thing makes sense.
In sum: solemnly stating your (meta)uncertainty might give people the sense that you're pretty certain about your (meta)uncertainty. If that is indeed what you want to communicate, great. Either way, learning how to modulate tone can add to your repertoire.
I'm not funny
You don't have to be! But you can work on having more choice in the style you use.
***
Here's a quote from Eliezer to round it off, from Against Maturity (emphasis mine):
Robin is willing to tolerate formality in journals, viewing it as possibly even a useful path to filtering out certain kinds of noise; to me formality seems like a strong net danger to rationality that filters nothing, just making it more difficult to read.
Robin seems relatively more annoyed by the errors in the style of youth, where I seem relatively more annoyed by errors in the style of maturity.
And so I take a certain dark delight in quoting anime fanfiction at people who expect me to behave like a wise sage of rationality. Why should I pretend to be mature when practically every star in the night sky is older than I am?
It's true that Open Philanthropy's public communication tends toward a cautious, serious tone (and I think there are good reasons for this); but beyond that, I don't think we do much to convey the sort of attitude implied above. [...] We never did any sort of push to have it treated as a fancy report.
The ability to write in a facetious tone is a wonderful addition to one's writing toolset, equivalent to the ability to use fewer significant digits. This is a separate feature from being "fun to read" or "irreverent". People routinely mistake formalese for some kind of completeness/rigor, and the ability to counter that incorrect inference in them is very useful.
This routine mistake is common knowledge (at least after this comment ;). So "We never did any sort of push to have it treated as fancy" becomes about as defensible as "We never pushed for the last digits in the point estimate '45,321 hairs on my head' to be treated as exact", which... is admittedly kinda unfair. Mainly because it's a much harder execution than replacing some digits with zeros.
Eliezer is facetiously pointing out the non-facetiousness in your report (and therefore doing some of the facetiousness-work for you), and here you solemnly point out the hedged-by-default status of it, which... is admittedly kinda fair. But I'm glad this whole exchange (enabled by facetiousness) happened, because it did cause the hedged-by-default-ness to be emphasized.
(I know you said you have good reasons for a serious tone. I do believe there are good reasons for it, such as having to get through to people who wouldn't take it seriously otherwise, or some other silly Keynesian beauty contest. But my guess is that those constraints would apply less on what you post to this forum, which is evidence that this is more about labor and skill than a carefully-arrived-at choice.)
Yes, and if that's bottlenecked by too few people being good filters, why not teach that?
I would guess that a number of smart people would be able to pick up the ability to spot doomed "perpetual motion alignment strategies" if you paid them a good amount to hang around you for a while.