I haven't played this, but I've watched a video of Japanese comedians playing it, which actually does give a sense of how it works.
There's an (IMO very obvious) algorithm for winning this with literally zero communication: play card N after N seconds have elapsed. I don't know how easy it is to precisely count double-digit-second intervals, but it doesn't seem that interesting to find out. It seems pretty clear that steelmanning the rules means not counting seconds.
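A minimal sketch of that strategy, just to pin it down (toy Python; the `play_card` callback is hypothetical, not part of any real implementation of the game):

```python
import time

def play_hand(hand, play_card):
    """Zero-communication strategy: play card N once N seconds have
    elapsed since the round started. `hand` is this player's card
    values; `play_card` is a hypothetical callback that puts a card
    on the shared pile."""
    start = time.time()
    for value in sorted(hand):
        # Wait until `value` seconds have passed since the round began.
        time.sleep(max(0.0, start + value - time.time()))
        play_card(value)
```

If every player runs this, the cards come out in globally sorted order without anyone exchanging any information--which is exactly why it feels like it violates the spirit of the rules.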
So what you end up with is a game of taking precise system-2 information (numbers), translating it into nebulous system-1 body language, and having the other players process it back into a precise number.
Re: cells defecting by becoming gametes, I think you were maybe a bit too terse. I believe I've figured out what's going on, but let me run it by you:
*Within the organism*, there's no selection pressure for cells to become gametes--mutations are random variations, not strategic actors, so a leaf is no more likely to 'decide' to become a flower than the reverse (which would also be harmful overall). The organism *does* have an incentive to keep the random mutation rate down, but no reason to *specifically* combat cells 'defecting' in this way.
And actually, if flowers are especially costly, the organism might evolve specific "no accidental flowers" adaptations--but for reasons unrelated to coordination problems.
Meanwhile, on a species level, there might be a bias in favor of the flower-instead-of-leaf mutations appearing in the gene pool, since these can show up via gamete mutations or leaf mutations, whereas most mutations can only appear via gamete mutations. Intuitively this seems unlikely to be a big deal, but I do wonder if tweaking the parameters could make it significant enough to make a specific adaptation to fight it worthwhile.
This makes a lot more sense with some background on what a ribozyme is, which I lacked before reading this. AIUI certain sequences of RNA fold up in a way that makes them act as enzymes.
Though the real point isn't about biology, but rather generic coordination mechanisms...
FWIW I first read this post before this comment was written, then happened to think about it again today and had this idea, and came here to post it.
I do think it's a dangerous fallacy to assume mutually-altruistic equilibria are optimal--'I take care of me, you take care of you' is sometimes more efficient than 'you take care of me, I take care of you'.
Maybe someone needs to study whether Western countries ever exhibit "antisocial cooperation," that is, an equilibrium of enforced public contributions in an "inefficient public goods game" where each of four players gets 20% of the central pool. Might be more likely if you structure it as tokens starting out in the center and players have the option to take them? (Call it the 'enclosure game', perhaps)
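To pin down the "inefficient" part, here's a toy payoff computation (my own illustrative parameters--an endowment of 10 tokens per player--not from any actual study):

```python
def payoffs(contributions, endowment=10, share=0.2):
    """Each of four players keeps (endowment - contribution) and
    receives `share` of the total central pool."""
    pool = sum(contributions)
    return [endowment - c + share * pool for c in contributions]

print(payoffs([0, 0, 0, 0]))      # nobody contributes: [10, 10, 10, 10]
print(payoffs([10, 10, 10, 10]))  # enforced full contribution: [8, 8, 8, 8]
```

Each token placed in the pool returns only 4 x 0.2 = 0.8 in total, so an equilibrium that enforces full contribution makes everyone strictly worse off--cooperation here is genuinely antisocial.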
So the big question here is, why are zetetic explanations good? Why do we need or want them when civilization will happily supply us with finished bread, or industrial yeast, or rote instructions for how to make sourdough from scratch? The paragraph beginning "Zetetic explanations are empowering" starts to answer, but a little bit vaguely for my tastes. Here's my list of possible answers:
1) Subjective reasons. They're fun or aesthetically pleasing. This feels like a throwaway reason, and doesn't get listed explicitly in the OP unless 'empowering' unpacks to 'subjectively pleasing', but I wouldn't throw it away so fast--if enough people find them fun, that alone could justify a campaign to put more zetetic explanations in the world.
2) They let you test what you're told. This is one of the reasons given in OP. Unfortunately, not every subject is amenable to zetetic explanation, and as long as I have to make up my mind about lots of science without zetetic understanding, I don't see zetetic explanation being an important part of my fake science filter.
3) They let you discover new things, whereas following rote instructions will only let you do what's been done before. This is true, but I think it usually takes a large base of zetetic understanding to do new useful things. If I tried to create new fermented foods based solely on having read this post, I probably wouldn't achieve anything useful. But if I did want to create novel fermented foods, I'd want to load up on lots more zetetic knowledge.
4) General increased wisdom? Maybe a zetetic understanding of bread ripples through your knowledge, leading you to a slightly better understanding of biology, the process of innovation, nutrition, and a variety of related fields, and if you keep amassing zetetic understandings of things it'll add up and you'll be smarter about everything. It's a nice story, but I'm not convinced it's true.
I think what we need is some notion of mediation. That is, a way to recognize that your liver's effects on your bank account are mediated by its effects on your health, so the liver is better thought of as a health optimizer.
This has to be counteracted by some kind of complexity penalty, though, or else you can only ever call a thing a [its-specific-physical-effects-on-the-world]-maximizer.
I wonder if we might define this complexity penalty relative to our own ontology. That is, to me, a description of what specifically the liver does requires lots of new information, so it makes sense to just think of it as a health optimizer. But to a medical scientist, the "detoxifies..." description is still pretty simple and obviously superior to my crude 'health optimizer' designation.
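One way to make this precise (my own formalization, nothing from the OP): score a candidate description \(M\) of the system by

\[
\mathrm{score}(M) = \log P(\text{observations} \mid M) - \lambda \, K(M \mid \text{observer's ontology}),
\]

where \(K(\cdot \mid \cdot)\) is a conditional description length and \(\lambda\) trades accuracy against complexity. On this scoring, 'health optimizer' wins for me because its conditional description length is tiny, while the mechanistic "detoxifies..." account wins for the medical scientist because it's also cheap for them to state and strictly more accurate.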
> if there's a sufficiently large amount of sufficiently precise data, then the physically-correct model's high accuracy is going to swamp the complexity penalty
I don't think that's necessarily true?
Probutility of winning = 1 USD
Perceived chemical-ness is a very rough heuristic for the degree of optimization a food has undergone for being sold in a modern economy (see http://slatestarcodex.com/2017/04/25/book-review-the-hungry-brain/ for why this might be something you want to avoid). Very, very rough--you could no doubt list examples of 'non-chemicals' that are more optimized than 'chemicals' all day, as well as optimizations that are almost certainly not harmful. And yet I'd wager the correlation is there.
Okay, I think I see where you're coming from. I've definitely updated towards considering the OP proposal scarier. Thanks for spelling things out.
Why assume that there is such a thing?
I took Benquo to be saying there was such a qualitative difference. I already agree there are lots of reasons Duncan's proposal would likely do more harm than good.
Unilateral imposition of rules.
What Duncan is proposing is a general societal agreement to allow the Punch Bug game, on a dubious but IMO sincerely-held theory that this would be to the general benefit. It's no more a unilateral imposition than a law you voted against.
So I definitely will join you in condemning the no-opt-out rule. The ghettoization proposal... honestly, I think it was too absurd to me to even generate a coherent image, but if I try to force my imagination to produce one it's pretty horrible.
I'm not sure I see the folding-in problem as keenly as you do. I read Duncan as saying "there's a problem in that we freak out too much about accidental micro harms. My proposed solution is a framework of intentional micro-harms". The first part is on firmer ground than the second, but I don't think it's illegitimate to pair them.
And it's the deep creepiness of the no-punchback rule that I mainly don't get. Like, if the puncher only said "Punch Bug", and the possibility of a punch back were not discussed, I think the default assumption would be that a punch back is forbidden. That's pretty much what it means for the original punch to be socially sanctioned. Making the "no punch back" part explicit is, I guess, rubbing the punchee's face in that fact? Is the face-rubbing the problem?
Wait, maybe I get it? Is the terrifying scenario being envisioned essentially that of a bully saying, "I'm hurting you. For fun. And I've found a socially-sanctioned way to do it, so you're beyond the reach of the forces you normally count on to prevent that!"?
Perhaps thinking of 'bullies' as a group is the key insight here? I don't believe Punch Bug is primarily a form of bullying, but the *marginal* impact of banning opt-out *is* mostly to facilitate bullying. That, I could get being deeply creeped out by.
Since I'm in another thread doing a thing that's sort of weirdly adjacent to supporting Duncan's post, let me say what I think of it overall.
I played a bit of Punch Bug as a kid (actually, it was 'Punch Buggy' where I grew up). In my social circles the punches weren't hard; it was basically a token gesture to make spotting a Beetle first into a form of 'winning'. I'd compare it to nonviolent games such as jinx or 'five minutes to get rid of that word'. Personally I found all of these a little fun, a little annoying, no big deal either way.
I don't think my opinions of these games have changed much since then. I wouldn't particularly mind getting randomly hit lightly in the arm, and I might enjoy the little victory of being the one to make a spot. Duncan has my permission to play Punch Bug with me if we ever meet IRL (not that he cares).
**But**, what I personally think of the game does not dictate my opinion. People differ a lot, and it strikes me as an egregious and reckless typical mind fallacy for Duncan to assume that everyone would experience the game the way he does if they gave it a chance, no matter how much they say otherwise. I'm guessing that his evidence for his beliefs is an observation that people who play Punch Bug are more resilient overall. This is probably true, but can be fully explained by causation in the other direction. And not only would many people experience the game as a net negative, but also a significant tail would experience *outsized* negative impact, often to the point of not being able to travel a public road in company at all.
Moreover, it's quite easy for this to be an opt-in game. If Duncan were proposing that people consider for themselves whether they want to train their micro-resilience by opting in, I'd have no problem with his post. But instead he wants to overrule the judgment of people who think the Punch Bug game would be bad for them, in the hopes of building a world more to his liking. This is an indefinite linear combination of dangerous epistemic arrogance and solipsistic disregard for others' well-being.
What I'm trying to figure out is what important qualitative trait Punch Bug shares with a day of pogroms, that an absence of noise ordinances doesn't also share. (All three of these things share the traits of being bad policy, and of hurting some more than others)
'Involvement of physical violence' is one such trait, and you could build a colorable argument that we shouldn't encourage even small amounts of physical violence, but I didn't think that was Benquo's whole argument.
Other than that, there's the no-punch-back thing. I guess I just don't get the significance of the distinction between a punch back on the spot (which the game forbids), and a punch back later when you see a Beetle (which the game encourages). The latter is more annoying to use as a form of deterrence, sure, but not impossible.
Let me clarify: I believe that if you took all of the people who currently want to play Punch Bug, and put them all in one community, they would continue to play Punch Bug. They would *not* find that the absence of unwilling victims spoiled the fun, because unwilling victims were never the source of the fun.
"A punch" and "a punch in the arm" are quite different, largely in that the latter is unlikely to cause brain injury.
(Posted early by accident, ETA:)
That said, I get the argument about training people to ignore street violence. I'm a bit doubtful of the effect size here, given that I think there are clear markers of a friendly hit, but I could be persuaded otherwise.
As for no loudback: suppose a neighborhood had a policy against loud noise unless you register a party. Only one party can be registered per night. Registration is first come first served. Tell me how this "no loudback" rule changes anything?
Alternatively, would you withdraw your objection if the game were "punch bug maybe punch back", where the punched party is allowed to return the punch if they wish?
Probably best to taboo 'asymmetric' at this point. Based on your example I thought it meant "explicitly discriminatory" and not just "disparately impactful".
I get that part. Yes, the Punch Bug game is disparately impactful against those who value not-being-punched more than they value getting-to-punch, especially if they value getting-to-punch at zero. You could say the same about many things, such as throwing loud parties.
That said, I think there's an important difference between a policy chosen in spite of the fact that it harms some people, and one chosen because of that fact. Yes, the latter has been known to masquerade as the former, but I don't think that's what's going on here (this is what I proposed as a crux). I also think that policies that tend to harm a preexisting group are suspect in a way that ones that harm an essentially-random set of people aren't. "People who don't want to punch and be punched" isn't a random group, but it's also nowhere near as suspect a group as "Jews" (maybe this is our crux?).
With those mitigating factors in place, allowing Punch Bug seems to me more like allowing loud parties and less like declaring a day of pogroms. The only thing that aligns it with the pogroms is the involvement of physical violence--and even then, I'd suspect most people would plot 'punch in the arm' closer to 'annoyingly loud music' than to 'mass murder' on the scale of harms. It's only because we as a society draw a line in the sand at nonconsensual physical violence that the punch is in any sense closer to the murder. But this line in the sand is exactly what Duncan is asking us to reconsider, and I don't think you intend to say there's no way to reconsider that line without setting off the mass-murder alarms (unless you do... a third possible crux).
(And to be clear, none of the above conflicts with doing a cost-benefit analysis and saying Punch Bug is a bad idea overall. IMO playing the game by default is dubious at best, and making opt-out onerous or impossible is a terrible idea. Duncan seems to have missed the fact that the vast majority of people age out of the game for reasons unrelated to his thesis. I could go on...)
Let me back up. Zvi convinced me there was a big important click to be had here, and I'm bothered that I haven't had the click. Your argument, as I currently understand it, is unpersuasive. That probably means my understanding is incorrect.
Maybe our crux is that I don't think the Punch Bug game was ever significantly about hurting people who don't want to play it?
If after reading this thread you don't think that, I worry that you haven't grokked the thing Benquo is trying to point at.
I definitely haven't grokked the thing Benquo is trying to point at, at all. (I'm plenty Jewish by any anti-semite's definition, fwiw). I don't see what's asymmetric about the 'no punch back' rule at all--the punchee is free to spot the next bug, in which case they will become the beneficiary of the 'no punch back' rule.
Here's something puzzling me: in terms of abstract description, enlightenment sounds a lot like dissociation. Yet I'm under the impression that those who experience the former tend to find it Very Good, while those who experience the latter tend to find it Very Bad.
Two spins only works for two possible answers. Do you need N spins for N answers?
Many norm violations have specific victims.
I don't think it's just a matter of long vs. short term that makes or breaks backwards chaining--it's more a matter of the backwards branching factor.
For chess, this is enormous--you can't disjunctively consider every possible mate, and for each possible mate there are too many immediate predecessors to extract useful information. You can try to break the mates into categories and reason about those instead, but the details are so important here that you're unlikely to get any insight more useful than "removing the opponent's pieces while keeping mine is a good idea".
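A toy sketch of the branching-factor point (my own illustration): backward chaining is only informative when each state's predecessor set is small enough to enumerate.

```python
def backward_chain(goal, predecessors, max_branching, depth):
    """Enumerate paths ending at `goal` by walking predecessor states
    backwards for `depth` steps. Gives up (returns None) if any state's
    backwards branching factor exceeds `max_branching`--the chess-like
    regime where backward chaining stops paying its way."""
    if depth == 0:
        return [[goal]]
    preds = predecessors(goal)
    if len(preds) > max_branching:
        return None
    paths = []
    for p in preds:
        sub = backward_chain(p, predecessors, max_branching, depth - 1)
        if sub is None:
            return None
        paths.extend(path + [goal] for path in sub)
    return paths

# A race-to-21 game (predecessors of n are n-1 and n-2) has a backwards
# branching factor of 2, so chaining back from the winning state works;
# a chess mate has astronomically many predecessors, so it doesn't.
print(backward_chain(21, lambda n: [n - 1, n - 2], max_branching=2, depth=3))
```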
Fighting a war is a bit better--since you mention Imperial Japan in another comment, let's sketch their thought process. (I might garble some details, but I think it'll work for our purposes.) Their end goal was roughly that Western powers not break up the Japanese Empire. Ways this might happen: a) Western powers are diplomatically convinced not to intervene. b) Japan uses some sort of deterrent threat to convince Western powers not to intervene. c) Japan's land forces can fight off any attempted attack on their empire. d) Japan controls the seas, so foreign powers can't deliver strong attacks. This is a short enough list that you can consider the options one by one, and close enough to exhaustive to make the exercise have some value. Choosing d) pretty much means abandoning a clean backward chain, which you should be willing to do, but the backwards chain has already done a lot for you! And it's possible that with the US's various advantages, a decisive battle was the only way to get even a decent chance at a war win, in which case the paths to victory do converge there and Japan was right to backwards chain from that, even if it didn't work out in the end.
As for defense budgets, you might consider that we're backwards chaining on the question "How to make the world better on a grand scale?" You might get a few options: a) Reduce poverty, b) cure diseases, c) prevent wars, d) mitigate existential risk. Probably not exhaustive, but again, this short list contains enough of the solution space to make the exercise worthwhile. Looking into c), you might group wars into categories and decide that "US-initiated invasions" is a large category that could be solved all at once, much more easily than, say, "religious civil wars". And from there, you could very well end up thinking about the defense budget.
Datum: The existence of this prize has spurred me to put some actual effort into AI alignment, for reasons I don't fully understand--I'm confident it's not about the money, and even the offer of feedback isn't that strong an incentive, since I think anything worthwhile I posted on LW would get feedback anyway.
My guess is that it sends the message that the Serious Real Researchers actually want input from random amateur LW readers like me.
Also, the first announcement of the prize rules went in one ear and out the other for me. Reading this announcement of the winners is what made it click that this is something I should actually do. Possibly because I had previously argued on LW with one of the winners in a way that made my brain file them as my equal (admittedly, the topic of that was kinda bike-sheddy, but system 1 gonna system 1).
This. I've decided that I'm done with organizing paper. Anything I'll ever need to read again, I make digital from the start. But I still use paper routinely, in essentially write-only fashion.
This is also a great thing about whiteboards--they foreclose even the option of creating management burden for yourself.
Honestly I'm not sure Oracles are the best approach either, but I'll push the Pareto frontier of safe AI design wherever I can.
Though I'm less worried about the epistemic flaws exacerbating a box-break--an epistemically healthy AI breaking its box would already be maximally bad--and more about the epistemic flaws being prone to self-correction. For instance, if the AI constructs a subagent of the 'try random stuff, repeat whatever works' flavor.
The practical difference is that the counterfactual oracle design doesn't address side-channel attacks, only unsafe answers.
Internally, the counterfactual oracle is implemented via the utility function: it wants to give an answer that would be accurate if it were unread. This puts no constraints on how it gets that answer, and I don't see any way to extend the technique to cover the reasoning process.
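A toy sketch of that reward structure (my reading of the counterfactual-oracle idea, not Stuart's actual formalism):

```python
import random

def oracle_episode(answer, realized_outcome, erasure_prob=0.01):
    """With small probability the answer is erased before any human
    reads it, and the oracle is rewarded for accuracy only on those
    erased episodes--so it optimizes 'what would be true if unread'.
    Note the reward says nothing about HOW the answer was computed;
    side channels in the reasoning process are left unconstrained."""
    erased = random.random() < erasure_prob
    if erased:
        return -abs(answer - realized_outcome)  # accuracy reward
    return 0.0  # read episodes carry no incentive either way
```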
My proposal is implemented via a constraint on the AI's model of the world. Whether this is actually possible depends on the details of the AI; anything of a "try random stuff, repeat whatever gets results" nature would make it impossible, but an explicitly Bayesian thing like the AIXI family would be amenable. I think this is why Stuart works with the utility function lately, but I don't think you can get a safe Oracle this way without either creating an agent-grade safe utility function or constructing a superintelligence-proof traditional box.
I'm not sure your refutation of the leverage penalty works. If there really are 3 ↑↑↑ 3 copies of you, your decision conditioned on that may still not be to pay. You have to compare
P(A real mugging will happen) x U(all your copies die)
against
P(fake muggings happen) x U(lose five dollars) x (expected number of copies getting fake-mugged)
where that last term will in fact be proportional to 3 ↑↑↑ 3. Even if there is an incomprehensibly vast matrix, its Dark Lords are pretty unlikely to mug you for petty cash. And this plausibly does make you pay in the Muggle case, since P(fake muggings happen) is way down if 'mugging' involves tearing a hole in the sky.
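Spelling out the cancellation (a sketch, under the simplifying assumptions that every copy faces the same fake-mugging rate and that \(U(\text{all your copies die}) \approx (3\uparrow\uparrow\uparrow 3) \cdot U(\text{one death})\)):

\[
\underbrace{P(\text{real}) \cdot (3\uparrow\uparrow\uparrow 3) \cdot U(\text{one death})}_{\text{expected cost of refusing the real mugging}}
\quad \text{vs.} \quad
\underbrace{P(\text{fake}) \cdot (3\uparrow\uparrow\uparrow 3) \cdot U(\$5)}_{\text{expected cost of paying every fake mugging}}
\]

The \(3\uparrow\uparrow\uparrow 3\) appears on both sides and cancels, so the decision comes down to ordinary-sized quantities--which is why the Dark Lords' disinterest in petty cash can still dominate.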
I think I disagree with your approach here.
I, and I think most people in practice, use reflective equilibrium to decide what our ethics are. This means that we can notice that our ethical intuitions are insensitive to scope, but also that upon reflection it seems like this is wrong, and thus adopt an ethics different from that given by our naive intuition.
When we're trying to use logic to decide whether to accept an ethical conclusion counter to our intuition, it's no good to document what our intuition currently says as if that settles the matter.
Intuitively, 1,000 lives at risk may seem just as urgent as 10,000. But we think about it, and we do our best to override that intuition.
And in fact, I fail pretty hard at it. I'm pretty sure the amount I give to charity wouldn't be different in a world where the effectiveness of the best causes were an order of magnitude different. I suspect this is true of many; certainly anyone following the Giving What We Can pledge is using an ancient Schelling Point rather than any kind of calculation. But that doesn't mean you can convince me that my "real" ethics doesn't care how many lives are saved.
When we talk about weird hypotheticals like Pascalian deals, we aren't trying to figure out what our intuition says; we're trying to figure out whether we should overrule it.
I get that old formalism isn't viable, but I don't see how that obviates the completeness question. "Is it possible that (e.g.) Goldbach's Conjecture has no counterexamples but cannot be proven using any intuitively satisfying set of axioms?" seems like an interesting* question, and seems to be about the completeness of mathematics-the-social-activity. I can't cash this out in the politics metaphor because there's no real political equivalent to theorem proving.
*Interesting if you don't consider it resolved by Gödel, anyway.
>If you don't assume that mathematics is a formal logic, then worrying about mathematics does not lead one to consider completeness of mathematics in the first place.
To make sure I understand this right: This is because there are definitely computationally-intractable problems (e.g. 3^^^^^3-digit multiplication), so mathematics-as-a-social-activity is obviously incomplete?
Okay, I was kinda bored while reading this, but after reading it I asked myself how much modest epistemology I used in my life. I realized I wasn't even at the level of ignoring my immodest inside-view estimates--I wasn't generating them!
I'm now in the process of seriously evaluating the success chances of the creative ideas I've had over the years, which I'm realizing I never actually did. I put real (though hobby-level) work into one once, and I've long regarded quitting my day job someday as "a serious possibility", but I just felt not allowed to generate an honest answer to "how likely would this be to succeed".
And guess what, this evaluation shows I'm an idiot for keeping my ideas on the back burner as much as I have.
Agreed with Raemon that this was kinda boring. Chapters/sections weren't part of it for me, either. Just seemed to beat a dead horse a bit, especially after the rest of InEq.
I wouldn't have bothered with this criticism, except that I find the divided reaction interesting.
Anyone else hearing "Ride of the Valkyries" in their head?
Upvoted because I enjoyed reading it, and therefore personally want more stuff like it. Its shortcomings are real, in particular the concept of "not enough money to facilitate transactions" needs to be fleshed out. I only want more like it on the assumption that this doesn't funge against other Yudkowsky posts.
I think the Gaffe Theory is approximately correct. My sense is that there are two Overton Windows, one for what serious candidates can say, and one for what a mainstream publication can print an op-ed about.
I think I have a similar problem. I sometimes just fake the signal. Partly I worry that my insincerity shows, but I also suspect that guilt/shame displays are just becoming devalued in general.
My best solution is to display a (genuine) determination to do better in the future--in fact, I've basically made that my personal definition of an apology. The only trouble is that I can't do this when I don't actually feel I've acted wrongly, which is especially a problem insofar as guilt for things that aren't your fault is sometimes expected. Cf. some theories about survivors' guilt.
Why speak in riddles? Because sometimes solving a puzzle teaches you more than being told the solution.
As an observation about coffee, Zizek's statement is true in its way but not especially useful. His broader point is "you should think about history and context more." So he presents you with two physically identical items, coffee without milk and coffee without cream, so that you can be surprised by noticing that there's potentially an important difference, and that surprise will make you update towards considering context and history as well as present physical makeup.
Interestingly, this is actually ameliorated by culture being cut along socioeconomic lines. So the people who try to wear a given style mostly have similar wealth, and therefore most of the variation in their stylistic quality is not caused by wealth variation.
One point you neglect that would be especially relevant in the AGI scenario is leakiness of accumulated advantage. When the advantage is tech, the leaks take the pretty concrete form of copying the tech. But there's also a sense that in a globalized world, undeveloped nations will often grow faster, catching up to the more prosperous nations.
Leakiness probably explains why Britain was never strong enough to conquer Europe despite having the Industrial Revolution first.
I thought you were suggesting I shouldn't have posted this on frontpage, in which case we'd obviously disagree. If not, then we agree.
I don't consider the second point a disagreement, since we're both sort of ambivalent. I'm pretty sure there are people who would think I'm unambiguously wrong not to be signed up, and they're who I was looking for.
On the first point--this actually seems substantial, maybe worth pursuing. I think initial-distribution measures carry a substantial risk of backfiring and making the poor poorer, while redistribution does not--seems hard to expect the same results if this is the case. This isn't necessarily a crux for me, but I'll hear more about your position before I try to find a proper DC.
I agree that on LW 1.0, this would belong under Discussion rather than Main. But as far as I can tell, LW 2.0 non-frontpage posts have much less visibility than old Discussion posts, to the point that this type of thread would not be viable.
Perhaps our double crux is "Non-frontpage LW 2.0 posts are a viable platform for open-type threads"? Or maybe it's "It's better to be unable to have open-type threads than to crowd the front page with them"?
In economic policy, redistribution measures (e.g. UBI) are a better idea than trying to change the initial distribution (e.g. minimum wage).
It is not especially irrational to forego cryonics.
So I was actually considering in-thread discussion to be a valid option--'one-on-one' meaning, in that case, that only two people would participate in a given subthread. If you think that's too optimistic, I might reconsider it. But I will definitely try to make the top point clearer, maybe:

> Discussions are to be one-on-one. Do not jump into anyone else's thread.
I find this easier to parse from a non-neutral perspective: If all bad comments are (currently) overtly bad, you might think we could ban overt bad comments and win at moderation. But in fact, once the ban is in effect, the bad commenters might switch to covert bad comments instead.
The ban isn't necessarily wrong, but this effect has to be considered in the cost-benefit analysis.
That's the correct solution for food weights, but it's sort of beside philh's point, which is just that those you govern will adapt their behavior to the rules you put in place.
> These differences are so profound and far-reaching—and so especially relevant for people with “our sort” of minds—that I hesitate to even begin enumerating them (though I’ll attempt to, upon request; but they should be obvious, I think!)
I request this enumeration, if your offer extends to interlopers and not just Duncan.
(The differences I can think of are instant vs asynchronous communication, nonverbal+verbal vs. verbal only, and speaking only to one another vs. having an audience. But I don't see why these are *inevitably* so profound and far-reaching.)
This makes me want to try it :)
Would anyone else be interested in a (probably recurring if successful) "Productive disagreement practice thread"? Having a wider audience than one meetup's attendance should make it easier to find good disagreements, while being within LW would hopefully secure good faith.
I imagine a format where participants make top-level comments listing beliefs they think likely to generate productive disagreement, then others can pick a belief to debate one-on-one.