Pattern's Shortform Feed


The Echo Fallacy

Hall of Mirrors or Plato's Cave

(Currently complete, but subject to change with post - unless different versions are separate posts?)




  • Open
  • Addressed by document's later contents


(Numbering is within sections and only reflects order within text (appearance), but doesn't cross categories.)



In a statistical sense, that would mean the loss function would be something like: the expected squared difference between the percent a proposal gets in the legislature, and the percent it would have gotten in a direct referendum, given that the proposal is drawn from some predefined distribution over possible proposals.

I'm curious how this loss function differs from the results being the same, rather than the ratios. I'd guess that it would get smoother results, but the main issue would probably be interactions. (Systematic rather than random variance.)
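
A rough way to poke at the quoted loss function is a small Monte Carlo sketch. Everything concrete here is an assumption for illustration, not something from the post: voters have 1-D ideal points, a person supports a proposal when it is closer to their ideal point than a status quo at 0, proposals are drawn from a standard normal, and the names `support_share` and `representation_loss` are made up.

```python
import random

def support_share(ideal_points, proposal, status_quo=0.0):
    """Fraction of the group that prefers the proposal to the status quo."""
    yes = sum(1 for x in ideal_points
              if abs(x - proposal) < abs(x - status_quo))
    return yes / len(ideal_points)

def representation_loss(voters, legislature, n_proposals=500, seed=0):
    """Monte Carlo estimate of the expected squared difference between the
    legislature's support share and the referendum's support share."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_proposals):
        proposal = rng.gauss(0.0, 1.0)  # assumed distribution over proposals
        diff = (support_share(legislature, proposal)
                - support_share(voters, proposal))
        total += diff * diff
    return total / n_proposals

rng = random.Random(1)
voters = [rng.gauss(0.0, 1.0) for _ in range(1000)]
sampled = rng.sample(voters, 100)   # legislature drawn at random from voters
skewed = sorted(voters)[-100:]      # legislature of the 100 right-most voters

print("random legislature:", representation_loss(voters, sampled))
print("skewed legislature:", representation_loss(voters, skewed))
```

Swapping the squared share difference for a 0/1 "did the result match" loss would only need a change to the line that computes `diff`, which makes it easy to compare the two.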

Which brings us to the third way better outcomes might be possible: by making representatives accountable primarily not on a moment-by-moment basis, but on a once-per-election time cycle, we might nudge them to think from a slightly more future-oriented perspective.

The question of how to get a future oriented perspective seems an empirical one.


For example, imagine the fictional planet mentioned in The Hitchhikers' Guide to the Galaxy, where people vote to elect lizards. Though the complete lack of representativity makes this a poor democracy, it would clearly be even less democratic if the losing lizard committed electoral fraud.

Obligatory mention that gerrymandering might be detectable by measuring variance in outcomes (of elections).

If the "effective dimension" is low, then a relatively small legislature can do quite a good job at representing even an infinite population.

But the election would take forever.

If the effective dimension is high, though, then the ideological distance between a randomly-chosen voter and the closest member of the legislature, will tend to be almost as high as the distance between any two randomly-chosen voters. In other words, there will be so many ways for any two people to differ, that the very idea that one person could "represent" another well begins to break down (unless the size of the legislature grows exponentially in the number of effective dimensions)

Unless people are organized into groups based around a group that literally negotiates over the tradeoffs, possibly reversing the 'usual' direction of voters -> legislature, with the last step being the one that makes the Laws.
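
The quoted claim about high effective dimension can be checked numerically. This is only a sketch under assumptions that are mine rather than the post's: ideology is an independent standard normal in each dimension, legislators are drawn from the same distribution as voters, and `distance_ratio` is a hypothetical helper name.

```python
import math
import random

def distance_ratio(dim, n_voters=200, n_legislators=50, seed=0):
    """Mean distance from a voter to the nearest legislator, divided by
    the mean distance between two randomly paired voters."""
    rng = random.Random(seed)

    def point():
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]

    legislators = [point() for _ in range(n_legislators)]
    voters = [point() for _ in range(n_voters)]

    nearest = sum(min(math.dist(v, leg) for leg in legislators)
                  for v in voters) / n_voters
    # Pair voters up (0,1), (2,3), ... as a cheap sample of random pairs.
    pairs = range(0, n_voters - 1, 2)
    pairwise = sum(math.dist(voters[i], voters[i + 1])
                   for i in pairs) / len(pairs)
    return nearest / pairwise

for dim in (1, 2, 5, 20, 50):
    print(dim, round(distance_ratio(dim), 3))
```

A ratio near 0 means the closest legislator is much closer than a random fellow voter; a ratio near 1 means "closest representative" is barely better than picking someone at random.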


What is the effective dimension of political ideology in reality? It depends how you measure it. If you look at US incumbent politicians' voting records using a methodology like DW-NOMINATE

The choice of dimensions seems like a bigger deal than the number.

But if I'm wrong, and the variation in political opinions/interests that are politically salient in an ideal world is much higher, then the very "republican idea" of representative democracy is problematic.

And the question of whether things (like the factors captured by the dimensions, as well as 'representability') interact with each other to greatly affect outcomes (laws or whatever the means in question is that itself leads to outcomes).


Having individual representation, where each voter can point to the representative they helped elect, and each representative was elected by an equal number of voters, is good from several perspectives.

Why? Getting a more diverse set of perspectives seems doable via:

Electing one or more officials with different numbers of votes, but giving them power based on the number of votes they received. (Officials with less power may have less of an impact most of the time.* But if they're able to bring new ideas to the table that work...)

*Ignoring the possibility of officials with powers based on circumstance, or within certain domains.

For instance, in this case, we could optimize for piety even while also ensuring unbiasedness and minimizing variance.

Variance seems like it might be a red herring, given a focus on outcomes/exemplariness.


3a from the next section seems like it might be relevant.

In practice, it's impossible to create a voting method that elects S equally-weighted candidates without wasting some votes; typically at least somewhere between 1/2S and 1/(S+1) of them.

Imprecision: instead of dividing voters into S separate groups with one representative per group, the method might divide them into fewer groups with more than one representative each. This almost certainly would lead to higher bias and/or variance at the next step.

Weird methods:

Representatives are chosen from Group A, but chosen by Group B voters. (There's also changing power amounts as mentioned previously.)


Note that in rating a voting method, we're doing this "backwards": we constructed the "set of responsible voters" from the candidate, not vice versa.

In some ways there seemed to be a similar circularity around the use of "utilitarianism", which seemed to mean 'representativeness of democracy' in context.

.... (I'm making good progress writing this but I've gotta go now; to be continued)

I look forward to seeing more in this section.

Addressed by document's later contents:


When you're choosing between a finite number of options that are known at the time of voting, you need a single-winner voting method. It's pretty easy to define a way to measure "how good" such a method is: we can just use utilitarianism.


Based on the rest of this document, you mean a utility function*, or something else very different from utilitarianism.

This (later) part makes it more clear:

The legislative outcome is an approximation of the popular outcome which is an approximation of the ideal outcome; if the first step of approximating goes wrong, there's no reason to believe the second step will fix it.


But electing a legislature is something different. How do you define the utility of a legislature that has a 60/40 split between two parties, versus one that has 70/30 split?

Depends on the system - The utility is based on the actions of the legislature (and also its cost and thus size, especially if the people/voters have to pay for it).

(Later) Addressed here:

The legislative outcome is an approximation of the popular outcome which is an approximation of the ideal outcome; if the first step of approximating goes wrong, there's no reason to believe the second step will fix it.


In practice, "all" tends to be limited by some eligibility criteria, but I'd say that most such criteria make the system less democratic, other than maybe age and residency.
The normative case that this kind of democracy is a good idea tends to rest on the Condorcet jury theorem: that is, if the average voter is more likely to be "right" than "wrong" (for instance, more likely to choose the option which will maximize global utility), then the chances the referendum outcome will be "right" quickly converge towards 100% as the number of voters increases.

a) Voters could also be selected to 'create/represent' conflict, and see which side wins. (If the remaining population that hasn't voted isn't big enough to overturn the current win/result, the voting stops, or the result is passed. (An error rate/margin could also be included.))

b) This could be reversed; the voters are selected in order to create a body in which

the average voter is more likely to be "right" than "wrong"

and where additional voters improve this probability.
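
For reference, the convergence in the Condorcet jury theorem quoted above can be computed exactly. This sketch assumes independent voters who are each right with the same probability and an odd number of voters (to avoid ties); the function name is made up.

```python
from math import comb

def majority_correct_probability(n_voters, p_correct):
    """Probability that a strict majority of n independent voters is right,
    when each voter is right with probability p_correct."""
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    needed = n_voters // 2 + 1
    return sum(comb(n_voters, k)
               * p_correct ** k
               * (1 - p_correct) ** (n_voters - k)
               for k in range(needed, n_voters + 1))

for n in (1, 11, 101, 501):
    print(n,
          round(majority_correct_probability(n, 0.6), 4),
          round(majority_correct_probability(n, 0.4), 4))
```

The second column is the flip side of the theorem: if the average voter is more likely wrong than right, adding voters makes the result worse, which is why selecting the body (as in b) matters.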


which will maximize global utility

The question of cost is also a matter of global utility.


If, instead of voters voting, a body deliberates on the question/s, this can allow for more options being evaluated, and perhaps more "right" being found - though the bigger the body (and the search space), the longer this might take, and costs probably should be taken into account.

Later addressed here:

First, it may even be that there is some optimum size for reasoned debate and

(And it may be different for "exemplary representatives".)


on a fooding

on a footing

((OK, "effective dimension" isn't exactly that; it measures not only how many "relatively big" dimensions there are, but also how "relatively small" the rest are. I'm being deliberately vague about how precisely I'd define "effective dimension" because I suspect that unless you ignore variation below a certain noise threshold the ED is actually infinite in the limit of infinite voters.))

The multiple parentheses are an interesting style choice.

The "ongoing" nature of this post is interesting and I like it. Having read it as it is, I think it worked really well, and some of the empty sections could fit together in a different post. Though since they don't exist I don't know how important the connections between them in one document are. (Maybe having this version as is might be useful.)

Generally: Resting/Taking a break. (There might be more information on this in neuroscience, I'm not deeply familiar with that.)

Specifically: Starting over.

Normally, if you stick with it, there might be the benefits of:

1. if you're productive you can continue to be

2. Not having to spend time re-loading context, figuring out what you were thinking/doing before

Sometimes 2 is an advantage. (Particularly if you stop doing something that isn't productive.) If your approach isn't working, 'starting over' can be very beneficial without thinking about the problem in the meantime. For an extreme example, if you stop doing something for not just days but weeks or months, and don't think about it, and then you do it again, you have to figure things out again (knowledge, approach, etc.) to a fair extent (and restart context).

This can help in moving from 'this isn't working' equilibria to 'things working great'.

(Having a different perspective can also be related to 'unconscious' thought or making (random) connections. Over a larger time frame there's more things that connections can be made between. Above I emphasized losing context and starting afresh, rather than connecting/borrowing contexts.)

Are there any other posts by your co-author?

Message received. To avoid confusion, it might help if your comment (above) is deleted within 29 days.

The problem is solved if the limit as x approaches infinity, of p(x)*x, where x is the utility the mugger offers, is 0. (This is the case if p(x) <= 1/x^2. If that's an upper bound, however loose, then the problem is solved.)
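
Spelling the bound out, assuming p(x) is a probability (so p(x) >= 0) and that it is bounded above by 1/x^2 for large x:

```latex
0 \;\le\; p(x)\,x \;\le\; \frac{1}{x^{2}}\cdot x \;=\; \frac{1}{x}
\;\xrightarrow[x\to\infty]{}\; 0
```

By the same squeeze argument, any bound of the form p(x) <= C / x^(1+ε) with ε > 0 would do. A bound of C/x alone only keeps p(x)*x bounded; it doesn't force the limit to 0.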

This seems like a better idea.

It could just be the effects of taking a break; i.e., the benefits of not consciously thinking about the problem, instead of a benefit stemming from unconscious thought.

Are all statements improvable?

Why deploy the method against your side?

Thinking that some things aren't all right to acknowledge might be more fundamental.

I was guessing that all of the shadow stuff is about how people think of themselves (i.e. identity. I am _, I am not _.) because it's something people get tied up in, and it's a reason someone might want to deny something.

I also think of Perfectionism (and its opposites, not trying (if the standard is unobtainable*)) as being (related to) fear of failure.

*This might cash out as:

"I'm good at X" -> does well, puts in a lot of effort (Maybe judges people for having low standards, or has different personal standards, whether high, nonjudgemental, distributed, etc.), may seek it out + challenges in domain

"I'm bad at Y" -> doesn't try, scrapes by, avoids/ugh field/procrastinates, says 'it doesn't matter'/'i don't care', judges self, maybe dirty pain

(It's not super easy to delineate 'enjoys/seeks out thing' from (consistently) 'works to get better at it'.)

The authors prove that EPIC is a pseudometric, that is, it behaves like a distance function, except that it is possible for EPIC(R1, R2) to be zero even if R1 and R2 are different. This is desirable, since if R1 and R2 differ by a potential shaping function, then their optimal policies are guaranteed to be the same regardless of transition dynamics, and so we should report the “distance” between them to be zero.

If EPIC(R1, R2) is thought of as two functions f(g(R1), g(R2)), where g returns the optimal policy of its input, and f is a distance function for optimal policies, then f(OptimalPolicy1, OptimalPolicy2) is a metric?

One nice thing is that, roughly speaking, rewards are judged to be equivalent if they would generalize to any possible transition function that is consistent with DT. This means that by designing DT appropriately, we can capture how much generalization we want to evaluate.

Can more than one DT be used, so there's more than one measure?

This is a useful knob to have: if we used the maximally large DT, the task would be far too difficult, as it would be expected to generalize far more than even humans can.

There's a maximum?

\>\! followed by a space, without the \s, on a new line:

Though you might have to write something on that line for it to appear.

Hitting enter at the end of that spoiler tag just makes it longer.

Backspace at the beginning of a new line can keep the tag from going longer when you want it to end.

I wonder what Serious Black is going to do after he breaks out of Azkaban, being more "rational" and all - or would the fic depict less rationality, and more negative impacts from being stuck in Azkaban for what, a decade?

And do p-zombies behave differently? Or are the Dementor kissed noticeably different in some other way?

Drop generic Bayes' rule recommendations in favor of applications:

  • Results oriented - what works works, theory or no theory. Example: Making a good first impression formally, it may be obvious that there are better and worse things to say/methods of delivery than others. Past a certain point, this might not be super useful - but the basics matter.
  • Types, and balance. Some people like talking. Some people like being around lots of people. If someone likes talking, maybe listening more helps in that situation. But tendencies aren't the be all, end all, any more than moods are. Things that are generally the case about a person matter, but current circumstances (in the moment/thinking fast rather than working out ahead of time) can be a big deal as well - especially when they differ.
  • Noticing mistakes, or benefits just from thinking about things more.

Your friend might disagree because the idea of general methods is counter-intuitive. (Different types of people, different things that appeal, etc.) People do generally exhibit some patterns (like their interests) which are important, and important to them (shared interests can bring people together).

Emotional logic is orthogonal to formal logic,

Not completely.

we can’t always use one type of intelligence to make decisions related to another type of intelligence.

Indeed. Here the relation might be visible (and useful) around basic stuff. (Being nicer is more effective, etc.)

While there might not be a lot of overlap, maybe someday a computer will be able to infer 'frustration' from someone punching it - without otherwise being filled to the brim with emotional intelligence. (If only as a result of hardcoding.)

I recommend creating an anonymous survey. (If you're only looking for data.)

Is the shadow value always identity related? (You are good/[identity X which is good]/not? Perception/model of self worth?)

I was suggesting crypto for a different reason: as a trivial inconvenience/barrier to entry.

The Morituri Nolumus Mori effect, as a reminder, is the thesis that governments and individuals have a consistent, short-term reaction to danger which is stronger than many of us suspected, though not sustainable in the absence of an imminent threat. This effect is just such a hard limit - it can’t do very much except work as a stronger than expected brake. And something like it has been proposed as an explanation, not just by me two months ago but by Will MacAskill and Toby Ord, for why we have already avoided the worst disasters. Here’s Toby’s recent interview:
Following this technique, several advantages spring up: there's no room for bloat and useless features[5], engineers have an end goal they're motivated to achieve: finish work and automate the job away.

In theory open source also seems to have those advantages. (Though it doesn't necessarily cover hardware.)

is all really inefficient and roundabout compared to what's possible.

This seems accurate - but just observation itself is valuable.

Might have been mentioned in HPMOR.

For simpler ideas:

  • How about just a website where all posts are encrypted/using it requires basic knowledge of, say, PGP?
  • Or using some other existing platform that's supposed to prevent censorship?
Some parts of the universe are more amenable to abstraction than others - inside of a storm, the scope of it may be difficult to determine. From space, it may be trivial, up to a point. (Clarity from distance.)

Mostly, interconnection is limited by empty space everywhere.

Noisy intermediates Z

The image this is a caption for isn't showing up.

f(X), not all of X

The low level details may be important as a means to an end.

“an agent’s accessible action space ‘far away’ from now (i.e. far in the future) depends mainly on what resources it acquires, and is otherwise mostly independent of specific choices made right now”

Only given said resources, which is (presumably) the focus of said choices.

In short: modularity of the organism evolves to match modularity of the environment.

It seems this is more likely a result of how evolution works, i.e. modularity is good for refactoring.

I'm wondering what the thesis of this post is.

Artwork doesn't have to be about reality?

So, it's a lack of balance, and basically this:?

Ultimate understanding requires a constructive theory, but often, says Einstein, progress in theory is impeded by premature attempts at developing constructive theories in the absence of sufficient constraints by means of which to narrow the range of possible constructive theories. It is the function of principle theories to provide such constraint, and progress is often best achieved by focusing first on the establishment of such principles.
In between, there's people who appreciate the "it's about degrees of freedom, not dimensions" 'insight'.

What this means is that the framework itself is likely the least interesting part of what you have to say! If I'm reading your self-help guide, I really just want to know about the places where it can excel for me locally, or what sorts of effects you saw in yourself. All of that information is in the insights, not in the framework itself, which could be nonsense for all I know.

The piece as a whole seems to assume insights are tiny and specific, without generality.

Attempted aphorism: A theory is useful if it tells you what experiments to perform.

the exact same answer it would have output without the perturbation.

It always gives the same answer for the last digit?

Are you only looking for students?

Comment by pattern on What should I teach to my future daughter? · 2020-06-19T17:39:24.170Z · score: 2 (1 votes) · LW · GW

A list of needed practical skills includes:

How to read and write.

"This civilization" isn't some kind of apocalyptic dystopia

Cooking is useful to know, even if there isn't a quarantine on.

what very specific skills she's going to need.

General skill/set:


Sparsity and interpretability? (Stanislav Böhm et al) (summarized by Rohin): If you want to visualize exactly what a neural network is doing, one approach is to visualize the entire computation graph of multiplies, additions, and nonlinearities. While this is extremely complex even on MNIST, we can make it much simpler by making the networks sparse, since any zero weights can be removed from the computation graph. Previous work has shown that we can remove well over 95% of weights from a model without degrading accuracy too much, so the authors do this to make the computation graph easier to understand.

Are models that are trained as sparse, rather than pruned to be sparse, different from each other? (Especially in terms of interpretability.)

Comment by pattern on Short essays on various things I've watched · 2020-06-15T19:20:05.244Z · score: 2 (1 votes) · LW · GW

In the end, I thought her perspective was that evil (and war) is something in people. Gods or no gods, that fight is something everyone has to face. The work she does after that is about that fight/making the world a better place (not in a particularly 'demigod' way).

This tries to solve the problem of 'bad papers getting published', but doesn't seem to touch 'good papers not getting published'.

Comment by pattern on What are some Civilizational Sanity Interventions? · 2020-06-14T23:41:41.942Z · score: 3 (2 votes) · LW · GW
Similarly, the first past the post system used in the United States gives rise to the spoiler effect, which penalizes third parties by increasing the odds that that


Wonder Woman has the overt message that ordinary people can't do very much (though not literally nothing). You need someone with super powers to beat up the villain and then everything will be fine.

Until the end.

What do you mean by "programmable"?

Is it possible to add new features, you hadn't previously thought of? How easy?

Comment by pattern on Cartesian Boundary as Abstraction Boundary · 2020-06-13T07:13:33.415Z · score: 2 (1 votes) · LW · GW

I'm interested in fixed*, but with the ability to write programs that write programs.

(For a particular purpose.)

*and maybe the other things you grouped with it?

When people ask me why I think atheism isn't a religion

First of all, "it" is.

Secondly, theism isn't a religion.

but I suspect some posts are not nearly as useful or fun to readers in 2020 as their karma makes them appear.

Maybe it's because they're old. (You also might want to treat karma as a noisy signal.)

or a general pattern (e.g. posts from LW 1.0 often have a higher score than they deserve because [reason]).

The older a post is, the more time it's had for people to read and upvote it. (Admittedly, this seems reasonable in that if karma reflects the (positive) impact it's had on people's lives, old posts would still have this advantage.)


As for a specific example, I'd say The Sequences, because it doesn't seem like there's been a sequel.

I also think there's less engagement on LW.* While it might depend on the part of twitter, there's a lot more replies going on. Sometimes it seems like there's 100 replies to a tweet, in contrast to posts with zero comments. This necessarily means replies will overlap a lot more than they do on LW. Imagine getting 3 distinct comments to a short post on LW, versus a thread of tweets, with 30 responses that mostly boil down to the same 3 responses that are being sent because people are responding without seeing other responses. (And if there's hundreds of very similar responses, asking people to read responses is asking people to read a very boring epic.)

And getting one critical reply, versus the same critical reply from 10 people, even when it's the same fraction of responses, probably affects people differently - if only because it's annoying to see the same message over and over again.

*This could be the case (the medium probably helps) even if that engagement was all positive.

The idea goes that even if you explained how the brain does everything it does, you haven't explained why this doing is accompanied by subjective experience. It's rooted in this idea that subjective experience is somehow completely isolated from, and separate to behavior.
This supposed isolation is already a little suspicious to me.

I'm not sure that this "isolation" exists, either.

I think this idea is useful because it might help you explain to a computer what makes ants different from people, or computers different from ants and people, or how anesthetic works? What is joy?

Or how could we tell if robots/aliens have subjective experience?

What if when we looked at the sky, and the ocean, we both used the English word "Blue", but you experienced what I experience when I look at what we both agree is called "Green"?

Related question: Do variations between individuals in terms of eyes have effects, beyond colorblind versus not?

This is a good answer.

(Or maybe not: maybe even if everyone bought into the realpolitik analysis, they would still think that democratic institutions were in their personal best interest, and would oppose disruption no less fervently?)
I happen to think that the realpolitik analysis is basically correct, but propagating that knowledge may represent a negative externality.

Why do people oppose disruption?


  • Personal cost. (Establishing a dictatorship may sound like it's in a general's "interest". But maybe generals value being able to not get murdered in their sleep over the "benefits".)
  • Legacy/how history remembers you, or Parenthood*.
  • No one else is doing it.
  • The authority of generals comes from somewhere else.

*If a general has kids, their kids getting to live somewhere other than a military dictatorship may have "warm fuzzies".

I can barely imagine a cabal of the majority of high ranking military officials agreeing to back a candidate that lost an election....
Suppose that ... election is widely believed to have been fraudulent[.]

I can't quite imagine this scenario, but it is interesting to think about a military revolting and demanding non-fraudulent elections. (During quarantine, a zero voter fraud plan seems more feasible.)

The links to and from the footnotes are broken.

You mean we can fiddle with the explicit assumptions we use with synthesis mindset?

I haven't fully digested your framework yet. (Connecting 'synthesis', and these other mindsets, to experience.) I mean that, if you have explicit assumptions then:

  • They're easier to examine
  • They can be messed with (as a way of generating hypotheses/ideas). It's 'random' but constrained enough that it has better odds of hitting something useful, or figuring out why something is wrong improves your understanding.

The way to use synthesis mindset is to allow your mind to wander through free association while suspending judgment of the accuracy or relevance of the hypotheses you encounter. It also helps to deliberately remove assumptions about certain ideas, and to toy with mixing multiple ideas together.

(Emphasis added.)

I might just be describing your 'synthesis' concept. (Or something similar, with a more systematic focus.*)

Math examples:

  • If multiplication is repeated addition, then what's repeated multiplication? Repeated powers?
  • What kind of space doesn't obey the triangle inequality?
  • If the sum of the interior angles of a shape is always the same, then squash the shape flat to find the sum.
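
The first example can be played with directly. This is a toy sketch for positive integers only, and the helper names are made up: each operation is defined as "repeat the previous one", which gives exponentiation for repeated multiplication and tetration (a power tower) for repeated powers.

```python
def repeat(op, base, times):
    """Combine `times` copies of `base` with `op`, grouping from the right:
    op(base, op(base, ... base))."""
    result = base
    for _ in range(times - 1):
        result = op(base, result)
    return result

def add(a, b):
    return a + b

def multiply(a, n):   # repeated addition
    return repeat(add, a, n)

def power(a, n):      # repeated multiplication
    return repeat(multiply, a, n)

def tetrate(a, n):    # repeated powers: a ** (a ** (... ** a)), n copies
    return repeat(power, a, n)

print(multiply(3, 4), power(2, 5), tetrate(2, 3))
```

The grouping direction only starts to matter at the last step: addition and multiplication are associative, powers aren't, so "repeated powers" needs a convention (here, a tower evaluated from the top down).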

The obvious disadvantage of synthesis mindset is that it has little power to screen the hypotheses it finds for accuracy or relevance. Synthesis cannot learn to smoothly navigate situations, nor to compare various options, nor to judge its hypotheses against the territory. For many serious problems, synthesis is necessary but insufficient to identify an effective solution.

*These don't seem like problems in math. (Except with untranslated or high level hypotheses. And smooth navigation takes time to build up.)


replacing the goal of 'utility' with information

I read this:

"Conversely, compared to analysis, organization mindset may miss some points of mismatch between its maps and reality, and can fail to apply enough distinct checking to catch flaws in its plans."

and thought if you focus on gaining information instead of some other goal, that downside might go away. Information is a funny resource, but it can be accumulated over time. And before trying to do XYZ where X, Y, and Z are simple and therefore XYZ is simple, sometimes there's the option of first doing them individually (which should be easy because they are simple in theory).

Comment by pattern on Your best future self · 2020-06-07T01:19:47.273Z · score: 2 (1 votes) · LW · GW

Comment by pattern on Consequentialism and Accidents · 2020-06-07T01:13:11.628Z · score: 2 (1 votes) · LW · GW
an act of someone who unwillingly saves 100 lives he was trying to kill

Has anyone ever done this?

The obvious disadvantage of synthesis mindset is that it has little power to screen the hypotheses it finds for accuracy or relevance. Synthesis cannot learn to smoothly navigate situations, nor to compare various options, nor to judge its hypotheses against the territory. For many serious problems, synthesis is necessary but insufficient to identify an effective solution.

An explicit set of assumptions can be fiddled with more directly. (A triangle may, by definition, have non-zero area, or it may not.)

Conversely, compared to analysis, organization mindset may miss some points of mismatch between its maps and reality, and can fail to apply enough distinct checking to catch flaws in its plans.

The goal of 'utility' can be replaced with information (expected or otherwise).



