In what sense are you using the word "trilemma"? I'm either not familiar with the usage or missing a big message of the post.
(The common definition of "trilemma" I'm most familiar with presents three desiderata, of which it's possible to achieve at most two.)
I, too, am excited.
However, the post assumes that 1) there is (or should be) one correct answer, 2) which is of the form: (1, 0, 0, 0) or a permutation thereof, and 3) the material is independent of the system (does not include probability, for example).
These are assumed for the sake of explanation, but none are necessary; in fact, the scoring rule and analysis go through verbatim if you have questions with multiple answers in the form of arbitrary vectors of numbers, even if they have randomness. The correct choice is still to guess, for each potential answer, your expectation of that answer's realized result.
just because "I don't want to see more of this" doesn't mean it's up to me to influence whether anyone else can see it.
I feel like this proves more than you want. For example, is it up to you to influence whether someone sees more of something, just because you want to see more of it?
Similarly, it's also helpful to get a reason for up votes, but enforcing that a reason be given can reduce the amount of information aggregation that will occur, on some margins. What justifies an asymmetry between how we aggregate positive information and how we aggregate negative information? Or would you also argue that up votes should come with reasons?
I mean a weighted sum where weights add to unity.
You need an exponentially increasing reward for your argument to go through. In particular, this doesn't prove enough:
Since at each moment in time, you face the exact same problem (linearly increasing reward, exponentially decaying survival rate)
The problem isn't exactly the same, because the ratio of (linear) growth rate to current value is decreasing over time. At some point, the value equals (is the right expression, I think?), and your marginal value of waiting is 0 (and decreasing), and you sell.
If the ratio of growth rate to current value is constant over time, then you're in the same position at each step, but then it's either the St. Petersburg paradox or worthless.
Sorry, I'm writing pretty informally here. I'm pretty sure that there are senses in which these arguments can be made formal, though I'm not really interested in going through that here, mostly because I don't think formality wins us anything interesting here.
Some notes, though: (still in a fairly informal mode)
My intuition that the only way to combine the two estimates without introducing a bias or assumed prior is by a mixture comes from treating each estimate (treated as a random variable) as a true estimate plus some idiosyncratic noise. Then any function of them yields an expression in terms of true estimate, each respective estimator's noise, and maybe other constants. But "unbiased" implies that setting the noise terms to 0 should set the expression equal to the true estimate (in expectation). Without making assumptions about the actual distribution of true values, this needs to just be 1 times the true estimate (plus maybe some other noise you don't want, which I think you can get rid of). And the only way you get there from the noisy estimates is a mixture.
By "assembly", I'm proposing to treat each estimate as a larger number of estimates with the same mean and larger variance, such that they form equivalent evidence. Intuitively, this works out if the count goes as the square of the variance ratio. Then I claim that the natural thing to do with many estimates each of the same variance is to take a straight average.
But they're distributions, not observations.
Sure, formally each observer's posterior is a distribution. But if you treat "observer 1's posterior is Normally distributed, with mean and standard deviation " as an observation you make as a Bayesian (who trusts observer 1's estimation and calibration), it gets you there.
Ah, okay. In that case, here are a few attempts to ground the idea philosophically:

It's the "prior-free" estimate with the least error. See that unbiased "prior-free" estimates must be mixtures of the (unbiased) estimates, and that biased estimates are dominated by being scaled to fit. So the best you can do is to pick the mixture that minimizes variance, which this is.

It actually is the point that maximizes the product of likelihoods (equivalently, the joint likelihood, since the estimate errors are assumed to be independent). You can see this by remembering that the Normal pdf is the inverse exponential quadratic, so you maximize the product of likelihoods by maximizing the sum of log-likelihoods, which happens where the log-likelihood slopes are each the negative of the other, which happens when distances are inversely proportional to the x^2 coefficients (or the weights are inversely proportional to the variances).

There's a pseudo-frequentist(?) version of this, where you treat each estimate as an assembly of (higher-variance) estimates at the same point, notice that the count is inversely proportional to the variance, and take the total population mean as your estimator. (You might like the mean for its L2-minimizing properties.)

A Bayesian interpretation is that, given the improper prior uniformly distributed over numbers and treating the two as independent pieces of evidence, the given formula gives the mode of the posterior (and, since the posterior is Normal, gives its mean and median as well).
Are any of those compelling?
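For anyone who wants to poke at the second interpretation numerically, here is a minimal sketch. The function names and the example numbers are made up for illustration; it assumes two independent Normal estimates and checks that the inverse-variance weighted mean is where the joint log-likelihood peaks.

```python
import math

def blend(mu1, sd1, mu2, sd2):
    """Inverse-variance weighted mean of two independent Normal estimates."""
    w1, w2 = 1 / sd1**2, 1 / sd2**2
    mean = (w1 * mu1 + w2 * mu2) / (w1 + w2)
    sd = math.sqrt(1 / (w1 + w2))
    return mean, sd

def joint_log_likelihood(x, mu1, sd1, mu2, sd2):
    """Sum of the two Normal log-densities at x, dropping additive constants."""
    return -((x - mu1) ** 2) / (2 * sd1**2) - ((x - mu2) ** 2) / (2 * sd2**2)

# Hypothetical inputs: estimate 1 is 10 +/- 1, estimate 2 is 14 +/- 2.
mu1, sd1, mu2, sd2 = 10.0, 1.0, 14.0, 2.0
mean, sd = blend(mu1, sd1, mu2, sd2)

# Brute-force the maximizer of the joint log-likelihood on a fine grid
# between the two estimates, to compare against the closed form.
grid = [mu1 + i * 0.001 for i in range(4001)]
best = max(grid, key=lambda x: joint_log_likelihood(x, mu1, sd1, mu2, sd2))

print(f"blended mean {mean:.3f}, blended sd {sd:.3f}, grid argmax {best:.3f}")
```

The blended standard deviation comes out smaller than either input, which is the "lower error" claim in the post.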
Are you asking for a justification for averaging independent estimates to achieve an estimate with lower errors? "Blended estimate" isn't a specific term of art, but the general idea here is so common that I'm not sure _what_ the most common term for it is.
And the theoretical justification - under assumptions of independent and Normal errors - is in the post, where the author demonstrates that there's a lower error from the weighted average (and that their choice of weights minimizes the error). Am I missing something here?
Arimaa is the(?) classic example of a chess-like board game that was designed to be hard for AI (albeit from an age before "AI" mostly meant ML).
From David Wu's paper on the bot that finally beat top humans in 2015:
Why is Arimaa computer-resistant? We can identify two major obstacles.
The first is that in Arimaa, the per-turn branching factor is extremely large due to the combinatorial possibilities produced by having four steps per turn. Even after identifying equivalent permutations of steps as the same move, on average there are about 17000 legal moves per turn (Haskin, 2006). This is a serious impediment to search.
Obviously, a high branching factor alone doesn’t imply computer-resistance, particularly if the standard of comparison is with human play: high branching factors affect humans as well. However, Arimaa has a property common to many computer-resistant games: that “per amount of branching” the board changes slowly. Indeed, pieces move only one orthogonal step at a time. This makes it possible to effectively plan ahead, cache evaluations of local positions, and visualize patterns of good moves, all things that usually favor human players.
The second obstacle is that Arimaa is frequently quite positional or strategic, as opposed to tactical. Capturing or trading pieces is somewhat more difficult in Arimaa than in, for example, Chess. Moreover, since the elephant cannot be pushed or pulled and can defend any trap, deadlocks between defending elephants are common, giving rise to positions sparse in easy tactical landmarks. Progress in such positions requires good long-term judgement and strategic understanding to guide the gradual maneuvering of pieces, posing a challenge for positional evaluation.
It's easy to play armchair statistician and contribute little, but I want to point out that the empirics cited here are effectively just anecdotes. The paper studies 13 pairs and 13 individuals in three assignments in one class at UUtah. Its estimate of relative time costs is only significant to ~ because development time has variance of (if I backsolved correctly) 65%, which...seems about right. Still, it seems like borderline abuse of frequentist statistics to argue that a two-tailed p<0.05 should be required to reject the hypothesis that pairs finish projects in half the wall-clock time of individuals (which is the null the analysis assumes).
That said, the author correctly identifies that quality matters significantly more than speed. The quality metric, however, is "assignment tests passed" in throwaway academic projects, eliding the questions of what quality failures would or wouldn't be caught by the review / CI workflows that an industrial project would be going through anyway.
So, finger to the wind, this study feels like it suggests that a pair spends 15% more person-hours (once they get used to each other) before turning their schoolwork in, and does 15% more of the work of the assignment than a student working alone. Consistent with the higher reported work-enjoyment numbers! Definitely a stronger showing than I would have guessed! But definitely not well-abstracted by "no significant result for time; significant improvement for quality".
What am I missing here?
(continued, to address a different point)
B and C seem like arguments against "simple" (i.e., even-odds) bets as well as weird (e.g., "70% probability") bets, except for C's "like bets where I'm surer...about what's going on", which is addressed by A (sibling comment).
Your point about differences in wealth causing different people to have different thresholds for meaningfulness is valid, though I've found that it matters much less than you'd expect in practice. It turns out that people making upwards of $100k/yr still do not feel good about opening up their wallet to give you $3. In fact, it feels so bad that if you do it more than a few times in a row, you really feel the need to examine your own calibration, which is exactly the success condition.
I've found that the small ritual of exchanging pieces of paper just carries significantly more weight than would be implied by their relation to my total savings. (For this, it's surprisingly important to exchange actual pieces of paper; electronic payments make the whole thing less real, ruining the whole point.)
Finally, it's hard to argue with someone's utility function, but I think that some rationalists get this one badly wrong by failing to actually multiply real numbers. For example, if you make a $10 bet (as defined in my sibling comment) every day for a year at the true probabilities, the standard deviation of your profit/loss on the year is <$200, or $200/365 per day, which seems like a very small annual cost to practice being better calibrated and evaluate just how well-calibrated you are.
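To actually multiply the numbers, here's a rough sketch of that bound. It assumes (my assumption, for illustration) that the daily bets are independent and each is a "$10 at p" bet in the buyer-pays-$X*P% convention, so a single bet's profit/loss has standard deviation 10*sqrt(p*(1-p)); the function name is made up.

```python
import math

def yearly_pnl_sd(size, p, n_bets=365):
    """SD of total P&L from n independent bets of `size`, each at true probability p.

    At the true probability every bet has zero expected value, so the
    only thing accumulating over the year is variance.
    """
    per_bet_sd = size * math.sqrt(p * (1 - p))
    return per_bet_sd * math.sqrt(n_bets)

# p = 0.5 maximizes p * (1 - p), so it is the worst case for a fixed bet size.
worst_case = yearly_pnl_sd(10, 0.5)
lopsided = yearly_pnl_sd(10, 0.9)

print(f"worst case: ${worst_case:.2f}; at 90%: ${lopsided:.2f}")
```

Under these assumptions the worst case lands under $100, comfortably inside the <$200 figure.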
Hi! I've done a fair amount of betting beliefs for fun and calibration over the years; I think most of these issues are solvable.
A is a solved problem. The formulation that I (and my local social group) prefer goes like "The buyer pays $X*P% to the seller. The seller pays $X to the buyer if the event comes true."
The precise payoffs aren't the important part, so long as they correspond to quoted probabilities in the correct way (and agreed sizes in a reasonable way). So this convention makes the probability you're discussing an explicit part of the bet terms, so people can discuss probabilities instead of confusing themselves with payoffs (and gives a clear upper bound for possible losses). Then you can work out exact payoffs later, after the bet resolves.
(As a worked example, if you thought a probability was less than 70% and wanted to bet about $20 with me, if you "sold $20 at 70%" in the above convention, you'd either win $20*70%=$14 or lose $20-($20*70%)=$6. But it's even easier to see that you selling a liability of $20*p(happens) for $20*70% is good for you if you think p(happens)<70%.)
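If it helps, the convention can be sketched in a few lines. The function names are mine, not any standard; the payoffs just restate "the buyer pays $X*P% to the seller, and the seller pays $X to the buyer if the event comes true".

```python
def seller_pnl(size, quoted_prob, happened):
    """Seller's profit/loss: collects size * quoted_prob up front,
    pays out `size` if the event happens."""
    premium = size * quoted_prob
    payout = size if happened else 0.0
    return premium - payout

def buyer_pnl(size, quoted_prob, happened):
    """The buyer is the other side of the same trade (zero-sum)."""
    return -seller_pnl(size, quoted_prob, happened)

def seller_expected_pnl(size, quoted_prob, believed_prob):
    """Seller's expected P&L under their own probability for the event."""
    return size * (quoted_prob - believed_prob)

# "Sold $20 at 70%": win about $14 if it doesn't happen, lose about $6 if it does.
print(seller_pnl(20, 0.70, happened=False))
print(seller_pnl(20, 0.70, happened=True))
# Selling looks good exactly when you believe p(happens) < 70%.
print(seller_expected_pnl(20, 0.70, believed_prob=0.60))
```

Note that the seller's worst case is bounded by the size minus the premium, which is the "clear upper bound for possible losses" part.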
You're right that odds are a terrible convention for betting on probabilities unless you're trying to hide the actual numbers from your counterparties (which is the norm in retail sports betting).
I also think that if the "sixth friend" donates $10k in line with each other friend's values and beliefs (as a result of social expectation, not contract), then there's no particular benefit to being the one who has to handle the money, and you don't need to trust in multi-year commitments.
Your suggestion is correct, though it seemed too messy (and nonessential) to explain for the sake of an off-the-cuff proposal. I added a footnote to clarify this above, though.
Proposal: Five friends in this situation write $10k checks[1] to a sixth. They all have a long chat about their altruist values and beliefs. The sixth donates $60k to a variety of EA causes.
Question: Just how likely / unpleasant would the ensuing IRS audit be?
(There's also a micro-donor-lottery version of this, except the individual contributions are personal gifts and the full $60k is a charitable donation.)
[1] Actually, you want this to be something like $7k, since the tax deduction from donating is worth [your marginal income tax rate] on the amount, roughly 30%. Formally, $10k less the tax benefits from donating $10k.
That's exactly correct. It's a standard taxation-begets-misallocation scenario.
For reference, PI's current rules have this effect to roughly 0-3% per contract, potentially adding across multiple contracts in a bundle. Prices closer to 50% are worse (though prices further away have their own biases, as Zvi explains).
Yeah, Zvi is (unsurprisingly) right; the change in margining rules (after I wrote that post) makes it much better to sell the low-value contracts, and the withdrawal fees amortize if you're in for the longer term.
Under the new rules, and on the back of my envelope, Zvi's 12% "arbitrage" is something like a few percent good: maybe it covers withdrawal fees on its own, and likely will do so after a few rounds. The opportunity cost of capital is a whole 'nother issue...
I also strongly endorse the punchline that trading (even on the margins of trading costs) is some of the best rationalist training you can find.
Huh, I hadn't noticed that they didn't tie up the potential fees on your winnings. Hypotheses:
 bug introduced when they moved from gross margining to net margining years ago and didn't reconsider fees withholding
 doesn't actually matter; they don't give up ~anything by letting some people carrying small balances make free trades
 it's really hard to abuse this into free trades repeatedly
 the withholding here is too complicated and feel-bad to explain
 other
Ah, that makes sense.
Separately, I'm not entirely convinced by that second bullet point - it seems like a non-omniscient state planner in a nonstationary environment would benefit from being able to determine the desired level of redistribution after the wealthy have accrued their income as wealth, rather than needing to get it right as they earned it.
(I'm assuming away the confiscatory impulse here, naturally; in practice, the political economy of confiscation causes serious issues for deferred decisions about distribution like this.)
Can you explain more why the tax rate on the risk-free-rate portion of investment income should be 0? A positive rate here implements a proxy wealth tax (without raising the reporting problems of a direct wealth tax), and a nonzero wealth tax might be part of an optimal tax policy (e.g., for the lots-of-small-taxes argument, if no other reason).
(I'm not sure that this is right, and am mostly asking this question from a stance of exploratory uncertainty.)
To be clear, this is the low-order approximation around 0; as explained in Paul's link (sibling to this) the effect away from zero involves the shape of the supply and demand curves through the relevant region of prices (and the stated claim holds when they're linear).
My guess is that one gets a reasonable start by framing more tasks as self-delegation (adding the middle step of "decide to do" / "describe task on todo list" / "do task"), then periodically reviewing the tasks-completed list and pondering the benefits and feasibility of outsourcing a chunk of the "do task"s.
Creating a record of task-intentions has a few benefits in making self-reflection possible; reflecting on delegation opportunities is a special case.
Oh, you're right. The net incentive to catch cheaters is actually... 1/(k(k-1)^2), then? The relative incentive story is worse, though still better in total than the positive version, and better still if you assume a constant-size disincentive to be caught cheating.
Another (related?) advantage is that the incentives to manipulate and catch manipulation are much better balanced with the negative ("you're out") version. Consider:
 Perfectly cheating in the positive version improves your chances of winning by (n-1)/n, and stopping you from doing so improves each other person's chances by 1/n.
 Perfectly cheating in a round of the negative version improves your chances of winning by 1/(k(k-1)), where k is the number of people in to start the round. Stopping you from doing so improves each other person's chances by the same amount.
 The total (summed) incentive to manipulate in the negative version is (n-1)/n, the same as in the positive case.
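A quick check of the arithmetic in those bullets, as a sketch (the function names are mine). It assumes "perfectly cheating" a round means guaranteeing you survive it, which lifts your win chance from 1/k to 1/(k-1); the per-round gains then telescope to (n-1)/n. It only covers the cheater's side, not the incentive to catch them.

```python
from fractions import Fraction

def negative_round_gain(k):
    """Gain in win probability from surviving a k-player elimination round
    for certain: 1/(k-1) - 1/k = 1/(k*(k-1))."""
    return Fraction(1, k - 1) - Fraction(1, k)

def total_negative_incentive(n):
    """Sum of the per-round gains as the field shrinks from n players to 2."""
    return sum(negative_round_gain(k) for k in range(2, n + 1))

def positive_gain(n):
    """Gain from perfectly cheating a single pick-the-winner draw among n."""
    return 1 - Fraction(1, n)

for n in (2, 5, 10, 100):
    print(n, total_negative_incentive(n), positive_gain(n))
```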
(Disclaimer: I'm a financial professional, but I'm not anyone's investment advisor, much less yours.)
You mention diversity as an advantage because it reduces your risk, but this framing is missing the crucial point that you can transmute a portfolio that's 0.5x as risky as your baseline to one that returns twice as much as your baseline. (Wei_Dai mentions this, but obliquely.) The trick is to use leverage, which is not as hard to get (or as expensive, or as complicated) as you think.
To be explicit about it: if you increase the per-dollar riskiness of your portfolio without increasing the per-dollar expected value, then after you leverage it down until it's at the optimal level of risk for your utility function (which you were going to do, right?), you will have lower expected returns.
The relevant question is "how much lower?" (which is precisely to say "how much does it increase your per-dollar risk?"), which I answer in my response to Wei_Dai (nephew to this). The answer turns out to be "very little", but in order to get there, you have to be asking the right question first.
I don't have a good intuition about how costly this actually is in practice, if you only play with 10% of your portfolio.
tl;dr extremely little.
Here's some numbers I made up:
 Let the market's single common factor explain 90% of the variance of each stock.
 Let the remaining 10%s be idiosyncratic and independent.
 Let stocks have equal volatility (and let all risk be described by volatility).
Now compare a portfolio that's $100 of each of a hundred stocks with one that's $90 of each of a hundred plus $1k of another stock. (I'll model each stock as 0.75 times the market factor plus 0.25 times a same-variance idiosyncratic factor.) Compared to a $10k single-stock portfolio...
 the equal-weighted portfolio has like
 the shot-caller's portfolio has like
...for an increase in risk of 4.5 basis points. So, pretty negligible.
Even if the market's single factor explains only half of the variance of each stock, the increased risk of the shot-caller's portfolio is just 40 basis points ( vs ). In the extreme case where stocks are uncorrelated, the increased risk is +34.5%, though I think that that's unrealistically generous to the diversification strategy.
Since an increase in volatility-per-dollar of x basis points means that you give up x basis points of your expected returns, I'm going to say that this effect is negligible in the "10% of portfolio" setting.
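Here's a small sketch that reproduces the made-up numbers above, so anyone can swap in their own. It assumes the toy model as stated: each stock is `market_loading` times one shared unit-variance factor plus `idio_loading` times its own independent unit-variance factor, and risk means dollar standard deviation. The function and variable names are mine.

```python
import math

def portfolio_sd(dollars, market_loading, idio_loading):
    """Dollar standard deviation of a portfolio of distinct stocks.

    Market exposures add linearly across positions; idiosyncratic
    exposures are independent, so they add in variance.
    """
    market = market_loading * sum(dollars)
    idio_var = idio_loading**2 * sum(d * d for d in dollars)
    return math.sqrt(market**2 + idio_var)

equal_weighted = [100.0] * 100            # $100 in each of 100 stocks
shot_caller = [90.0] * 100 + [1000.0]     # $90 in each, plus $1k in one more

def extra_risk(market_loading, idio_loading):
    """Fractional increase in risk from calling the one $1k shot."""
    base = portfolio_sd(equal_weighted, market_loading, idio_loading)
    tilted = portfolio_sd(shot_caller, market_loading, idio_loading)
    return tilted / base - 1

print(f"90% common factor: {extra_risk(0.75, 0.25) * 1e4:.1f} bp")
print(f"50% common factor: {extra_risk(1.0, 1.0) * 1e4:.1f} bp")
print(f"uncorrelated:      {extra_risk(0.0, 1.0) * 100:.1f} %")
```

The 0.75 and 0.25 loadings give a 9:1 split of variance between the common and idiosyncratic factors, matching the 90%/10% assumption.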
Can you define "rational/consistent"? The terms are a bit overloaded, especially in this community, and making your definition precise is itself most of the answer to your question.
For example, you give some good examples of nontransitive decider-actors, and if some of them are "rational", then nontransitive preferences can be rational, as you point out. Alternatively, one definition of "consistent" is that a decider-actor will always reach the same decision when it has the same information, regardless of what other options it has previously rejected, which requires transitive preferences.
Is the idea something like: the actor themself is uncertain about the value of the project, and the kickstarter also helps them find out whether it's worth doing
Nope! The paper's model for this result assumes that the value conditioned on success is known to the proposer, so that the proposer's only incentive is to maximize their own profits by setting the payout odds and threshold. The (nonobvious to me) result that the paper proves is that this coincidentally minimizes the probability of up-cascades:
A higher decision threshold excludes more DOWN cascades while it is less likely to be reached. We show that the concern about potential DOWN cascades dominates the concern about likelihood to reach the target. To maximize the proceeds, the proponent endogenously sets the target to the smallest number that in equilibrium completely excludes DOWN cascades in the same spirit as Welch (1992), with the caveat that the proponent utilizes both price and target to achieve this. Consequently, with endogenous issuance pricing, there is no DOWN cascade which stops private information aggregation, and good projects are always financed while bad projects are never financed, when the crowd base N becomes very large. In other words, financing efficiency and information aggregation efficiency approaches the first best as N grows bigger, despite the presence of information cascades.
My (uninformed) guess is "humor, of the slightly mindsharpening sort".
Hypothesis: (formed before running the test below) "clearly too high" will be more of an overestimate than "clearly too low" is an underestimate; most midpoints will be high; this will bias estimates high (especially through repeated self-anchoring to high midpoints).
Anecdata: I tried using this method to estimate the number of gates in Hong Kong International Airport. (I'm currently standing in a line in HKIA, and have flown through it at least a dozen times in my life.)
 40 ~ 10,000
 40 ~ 5,020
 40 ~ 2,530
 40 ~ 1285
 40 ~ 662 (maybe 351 is neither obviously too high nor too low, so I should take it? or maybe it's slightly high and so...)
 40 ~ 351 (195 I don't have a strong opinion about, so I'll take it)
So, depending on how I decide to stop, I get estimates around 195 or 351. Apparently the true answer is 90, supporting my hypothesis.
Proposal: Maybe use geometric mean instead of arithmetic mean?
I tried to clear my head and try again with geomean:
 40 ~ 10,000
 40 ~ 632 (ehhh, 159 is more likely too high than too low)
 40 ~ 159 (80 is too low)
 80 ~ 159 (113)
There's one suspect step around 159, but my guess here is that being anchored by one higher number (632) rather than five (5020, 2530, 1285, 662, 351) is enough to actually make me think "probably less than 159" rather than "no strong opinion about 195". Terminating at 159 also is an outperformance of arithmetic mean, but feels a bit more like luck around where the midpoints fell.
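For reference, both runs can be replayed mechanically. This is only a sketch: the function names are mine, and the "high"/"low" verdict lists are just my own judgement calls from the two attempts above, fed in by hand. It carries unrounded midpoints forward, so the later values differ slightly from the hand-rounded ones.

```python
import math

def arithmetic_mid(lo, hi):
    return (lo + hi) / 2

def geometric_mid(lo, hi):
    return math.sqrt(lo * hi)

def midpoints(lo, hi, mid_fn, verdicts):
    """Replay a bisection: each verdict says whether the midpoint felt
    too 'high' (it becomes the new upper bound) or too 'low' (new lower bound)."""
    out = []
    for verdict in verdicts:
        mid = mid_fn(lo, hi)
        out.append(mid)
        if verdict == "high":
            hi = mid
        else:
            lo = mid
    return out

# Start from "clearly too low" = 40 and "clearly too high" = 10,000.
arith = midpoints(40, 10_000, arithmetic_mid, ["high"] * 6)
geo = midpoints(40, 10_000, geometric_mid, ["high", "high", "low", "high"])

print([round(m) for m in arith])
print([round(m) for m in geo])
```

The geometric version gets from 10,000 down to the low hundreds in two steps instead of five, which is most of the anchoring story.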
What's up with "FDIC insurance up to $1,000,000"? Wikipedia claims that FDIC insurance only covers $0.25mln/bank unless I have a joint account, retirement account, or some kind of legal entity which I'm not.
I don't think that there is off-the-shelf insurance for "I lose my job (prospects) or otherwise choose to have lower lifetime earnings, in some way I did not foresee."
It would be confusing to me if most EAs had equity investments they could not afford to bear a crash loss in, especially those involved in an emergency fund scheme. Why add market exposure to the portfolio of donations + scheme membership + liquid cash? (Especially, why as market exposure you expect other scheme participants to share?)
That said, it is possible to buy insurance against a market crash. Probably not as a centralized service.
So shouldn't that inequality apply to 0.AAA... (base eleven) and 0.999... (base ten) as well? (A debatable point maybe).
Not debatable, just false. Formally, the fact that $a_k < b_k$ for all $k$ does not imply that $\lim_{k\to\infty} a_k < \lim_{k\to\infty} b_k$.
If I were to poke a hole in the (proposed) argument that 0.[k 9s]{base 10} < 0.[k As]{base 11} (0.9<0.A; 0.99<0.AA;...), I'd point out that 0.[2*k 9s]{base 10} > 0.[k As]{base 11} (0.99>0.A; 0.9999>0.AA;...), and that this gives the opposite result when you take the limit as $k \to \infty$ (in the standard sense of those terms). I won't demonstrate it rigorously here, but the faulty link here (under the standard meanings of real numbers and infinities) is that carrying the inequality through the limit just doesn't create a necessarily-true statement.
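Not a rigorous demonstration, but the two finite inequalities are easy to check exactly. A sketch using exact rationals (the `repeating` helper is a made-up name): both chains hold for every k tried, and both gaps to 1 shrink toward zero, so neither strict inequality can survive the limit.

```python
from fractions import Fraction

def repeating(digit, base, k):
    """Exact value of 0.dd...d (k copies of `digit`) written in `base`."""
    return sum(Fraction(digit, base**i) for i in range(1, k + 1))

for k in range(1, 8):
    nines_k = repeating(9, 10, k)        # equals 1 - 10**-k
    nines_2k = repeating(9, 10, 2 * k)   # equals 1 - 100**-k
    as_k = repeating(10, 11, k)          # digit A = ten; equals 1 - 11**-k
    # The base-eleven value is sandwiched between two base-ten truncations.
    assert nines_k < as_k < nines_2k
    print(k, float(1 - nines_k), float(1 - as_k), float(1 - nines_2k))
```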
0.111...{binary} is 1, basically for the Dedekind cut reason in the OP, which is not base-dependent (or representation-dependent at all) - you can define and identify real numbers without using Arabic numerals or place value at all, and if you do that, then 0.999...=1 is as clear as not(not(true))=true.
I think your comment is unnecessarily hedged - do you think that you'd find much disagreement among LWers who interact with FHI/GMU-Econ over whether people there sometimes (vs never) fail to do level-one things?
I think I understand the connotation of your statement, but it'd be easier to understand if you strengthened "sometimes" to a stronger statement about academia's inadequacy. Certainly the rationality community also sometimes fails to do even the obvious, level-one intelligent character things to enable them to achieve their goals - what is the actual claim that distinguishes the communities?
I'm confused; it seems like evidence against the claim that you can get arbitrary amounts of value out of learning generic rationality skills, but I don't see it as "devastating" to the claim you can get significant value, unless you're claiming that "spent years learning all that stuff, and now do it as a day job; some of them 16 hours a day" should imply only a less-than-significant improvement. Or am I missing something here?
cf. my comment cousin to this one; I misunderstood what the term was pointing at at first, though I stand by my complaint that that's a problem with the term.
Thanks; I legitimately misunderstood at first read whether "counterspell" was intended to apply to the invocations thrown out by bad arguers or the concise and specific distillations the OP is presenting for use. On reread, I agree that it's supposed to be a set of useful tools.
I remain convinced of the specific claim that "counterspell" is bad jargon (though I don't think it's good practice to cite my own confusion too strongly; the incentives there aren't great). I agree that MtG's paradigm where more general counterspells are more expensive seems like a good fit for thinking about rhetorical (and perhaps epistemic) tactics, though I reiterate that that's not how they work in many other settings, and that ambiguous baggage is worse than no baggage for this sort of thing. The question of whether identifying counterspells with magic is supposed to be a positive or negative association is additional gratuitous confusion - I think your claim that the magic metaphor implies they don't work is wrong, but I'm not 85% sure.
Strong approval of the overall goal of the post, but here's a semantic criticism:
In accordance with the Rationalist tradition that requires everything to have a nerdy sci-fi or fantasy name
I parse this as an in-joke (and appreciate it as such), but I do think that regularly minting new jargon that's likely to carry substantial, conflicting(!) contextual baggage (not all of it appropriate[1]) is...a bad norm for an epistemic community to have.
I also think that the deeper tradition of jargon-forging (as Eliezer practiced it) involved names that sounded nerdy, but _not_ sci-fi or fantasy - _cf._ Knowing About Biases Can Hurt People, which uses "Fully General Counterargument" in much the same way(?) as you're using "Counterspell". "Fully General Counterargument" is slightly more unwieldy, but apart from that is a better piece of jargon - it's less loaded and by leaning even harder into the implicit snark that winks at the fact that the purported counter isn't working at all, makes it even more clear that no, this is not a useful thing.
[1] To belabor this point, MtG counterspells are fully general (edit: okay, not **fully** general, and see Slider below), and a reasonable fit for the term as you're using it, but D&D (3.5e, at least) counterspells are based on negating a spell by casting a copy of it, which is not what you mean at all. I don't actually know what "counterspells" are in WoD/Mage, but the risk that they're even further afield from your intention should be another strike against using the already-loaded handle.
preference "my decisions should be mine" - and many people seems to have it
Fair. I'm not sure how to formalize this, though - to my intuition it seems confused in roughly the same way that the concept of free will is confused. Do you have a way to formalize what this means?
(In the absence of a compelling deconfusion of what this goal means, I'd be wary of hacking epistemics in defense of it.)
There are "friends" who claim to have the same goals as me, but later turns out that they have hidden motives.
Agreed and agreed that there's a benefit to removing their affordance to exploit you. That said, why does this deserve more attention than the inverse case (there are parties you do not trust who later turn out to have benign motives)?
"give the power over my final decisions to small random events around me" seems like a slightly confused concept if your preferences are truly indifferent. Can you say more about why you see that as a problem?
The potential adversary seems like a more straightforward problem, though one exciting possibility is that lightness of decisions lets a potential cooperator manipulate your decisions in favor of your common interests. And presumably you already have some system for filtering acquaintances into adversaries and cooperators. Is the concern that your filtering is faulty, or something else?
[Commitment] eventually often turn to be winning strategy, compared to the flexible strategy of constant updating expected utility.
Some real-world games are reducible to the game of Chicken. Commitment is often a winning strategy in them. Though I'm not certain that it's a commitment to a particular set of beliefs about utility so much as a more-complex decision theory which sits between utility beliefs and actions.
In summary, if the acquaintances whose info you update on are sufficiently unaligned with you and your decision theory always selects the action that your posterior assigns the highest utility, then your actions will be "over-updating on the evidence" if your beliefs are properly Bayesian. But I don't think the best response is to bias yourself towards under-updating.
Do you bite the bullet that this means the set of things you morally value changes discontinuously and predictably as things move out of your light cone? (Or is there some way you value things less as they are less "in" your light cone, in some nonbinary way?)
I think it's from SICP that programs are meant to be read by humans and only incidentally for computers to execute; I've been trying for more than a year now to write a blog post about the fundamental premise that, effort-weighted, we almost never write new programs from scratch, and mostly are engaged in transmuting one working program into another working program. Programs are not only meant to be read by humans, but edited by humans.
I think if you start from the question of how much effort it is to write a new program on a blank page, most languages will come out looking the same, and the differences will look like psychological constructs. If you ask, however, how much effort it is to change an existing piece of a code base to a specific something else, you start to see differences in epistemic structure, where it matters how many of the possible mutations that a human algorithm might try will nonobviously make the resulting program do something unexpected. And that, as you point out, opens the door to at least some notion of universality.
C, the most widespread general-purpose programming language, does things that are extremely difficult or impossible in highly abstract languages like Haskell or LISP
Can you give an example? I'm surprised by this claim, but I only have deep familiarity with C of these three. (My primary functional language includes mutable constructs; I don't know how purely functional languages fare without them.)
I was trying to construct a proof along similar lines, so thank you for beating me to it!
Note that 2 is actually a case of 1, since you can think of the "walls" of the simplex as being bets that the universe offers you (at zero odds).
(This comment isn't an answer to your question.)
If I'm understanding properly, you're trying to use the set of bets offered as evidence to infer the common beliefs of the market that's offering them. Yet from a Bayesian perspective, it seems like you're assigning P( X offers bet B | bet B has positive expectation ) = 0. While that's literally the statement of the Efficient Markets Hypothesis, presumably you, as a Bayesian, don't actually believe the probability to be literally 0.
Getting this right and generalizing a bit (presumably, you think that P( X offers B | B has expectation +epsilon ) < P( X offers B | B has expectation +BIG_E )), should make the market evidence more informative (and cases of arbitrage less divide-by-zero, break-your-math confusing).
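A small sketch of why the literal zero causes trouble. The two-hypothesis setup, the function name, and all numbers here are assumptions made for illustration; the only point is the contrast between a likelihood of exactly 0 and a small positive one.

```python
def posterior_positive(prior_pos, p_offer_given_pos, p_offer_given_nonpos):
    """P(bet has positive expectation | bet was offered), via Bayes' rule.

    Returns None when the observation has probability zero under the
    model, i.e. when the update would be 0/0.
    """
    joint_pos = prior_pos * p_offer_given_pos
    joint_nonpos = (1 - prior_pos) * p_offer_given_nonpos
    evidence = joint_pos + joint_nonpos
    if evidence == 0:
        return None
    return joint_pos / evidence

# Literal EMH: positive-expectation bets are never offered.
strict = posterior_positive(0.1, 0.0, 0.5)

# Literal EMH, but the offered bets form an apparent arbitrage, which is
# also impossible under the "non-positive expectation" hypothesis.
stuck = posterior_positive(0.1, 0.0, 0.0)

# Softened EMH: positive-expectation offers are rare but possible.
soft = posterior_positive(0.1, 0.01, 0.0)
```

With the literal zero, the arbitrage case has no defined posterior at all; with any non-zero likelihood, the same observation simply updates you to "this bet really is positive-expectation".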
I'm confused what the word "fairly" means in this sentence.
Do you mean that they make a zero-expected-value bet, e.g., 1:1 odds for a fair coin? (Then "fairly" is too strong; non-degenerate odds (i.e., not zero on either side) is the actual required condition.)
Do you mean that they bet without fraud, such that one will get a positive payout in one outcome and the other will in the other? (Then I think "fairly" is redundant, because I would say they haven't actually bet on the outcome of the coin if the payouts don't correspond to coin outcomes.)
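For the first reading, a quick sketch of what "zero expected value" means for a coin bet. The function and parameter names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def expected_value(p_win, stake, payout):
    """Expected gain for a bettor who risks `stake` to win `payout`
    with probability `p_win`."""
    return p_win * payout - (1 - p_win) * stake

fair_coin = Fraction(1, 2)

# 1:1 odds on a fair coin: neither side has an edge.
even_odds = expected_value(fair_coin, stake=1, payout=1)

# 2:1 odds on the same coin: still non-degenerate (both sides have
# something at stake), but no longer zero expected value.
skewed_odds = expected_value(fair_coin, stake=1, payout=2)
```

Both bets are genuine bets on the coin's outcome; only the first is "fair" in the zero-expectation sense.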
A related idea in non-punishment of "wrong" reports that have insufficient support (again in the common-prior/private-info setting) comes from this paper [pdf] (presented at the same conference), which suggests collecting reports from all agents and assigning rewards/punishments by assuming that agents' reports represent their private signal, computing their posterior, and scoring this assumed posterior. Under the model assumptions, this makes it an optimal strategy for agents to truly reveal their private signal to the mechanism, while allowing the mechanism to collect non-cascaded base data to make a decision.
In general, I feel like the academic literature on market design / mechanism design has a lot to say about questions of this flavor.
Abstract: Considering information cascades (both upwards and downwards) as a problem of incentives, better incentive design holds some promise. This academic paper suggests a model in which making truth-finding rewards contingent on reaching a certain number of votes prevents down-cascades, and where an informed (self-interested) choice of payout odds and threshold can also prevent up-cascades in the limit of a large population of predictors.
1) cf. avturchin from the question about distribution across fields, pointing out that up-cascades and down-cascades are both relevant concerns, in many contexts.
2) Consider information cascades as related to a problem of incentives: in the comments of the Johnichols post referenced in the formalization question, multiple commentators point out that the model fails if agents seek to express their marginal opinion, rather than their true (posterior) belief. But incentives to be right do need to be built into a system that you're trying to pump energy into, so the question remains of whether a different incentive structure could do better, while still encouraging truth-finding.
3) Up-Cascaded Wisdom of the Crowd (Cong and Xiao, working paper) considers the information-aggregation problem in terms of incentives, specifically the incentives at play in an all-or-nothing crowdfunding model, like venture capital or Kickstarter (assuming that a 'no' vote is irrevocable like a 'yes' vote is): 'yes' voters win if there is a critical mass of other 'yes' voters and the proposition resolves to 'yes'; they lose if there is a critical mass and the proposition resolves to 'no'; they have 0 loss/gain if 'yes' doesn't reach a critical mass; 'no' voters are merely abstaining from voting 'yes'.
Their main result is that if the payment of incentives is conditioned on the proposition gaining a fixed number of 'yes' votes, a population of symmetric, common-prior/private-info agents will avoid down-cascades, as a single 'yes' vote that breaks a down-cascade will not be penalized for being wrong unless some later agent intentionally votes 'yes' to put the vote over the 'yes' threshold. (An agent i with negative private info still should vote no, because if a later agent i' puts the vote over the 'yes' threshold based in part on i's false vote, then i expects to lose on the truth-evaluation, since they've backed 'yes' but believe 'no'.)
A further result from the same paper is that if the actor posing the proposition can set the payout odds and the threshold in response to the common prior and known info-distribution, then a proposition-poser attempting to minimize down-cascades (perhaps because they will cast the first 'yes' vote, and so can only hope to win if the vote resolves to 'yes') will be incentivized to set odds and a threshold that coincidentally minimize the chance of up-cascades. In the large-population limit, the number of cascades under such an incentive design goes to 0.
4) I suspect (but will not here prove) that augmenting Cong and Xiao's all-or-nothing "crowdfunding for 'yes'" design with a parallel "crowdfunding for 'no'" design, i.e., one where 'no' voters win (resp. lose) iff there is a critical mass of 'no' voters and the proposition resolves 'no' (resp. 'yes'), can further strengthen the defenses against up-cascades (by making it possible to cast a more informed 'no' vote conditioned on a later, more-informed agent deciding to put 'no' over the threshold).
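The all-or-nothing payoff rule described in 3) can be sketched as below. This is only my reading of the rule as summarized above, not code or notation from the paper; the function names and the unit win/loss amounts are assumptions.

```python
def yes_voter_payoff(yes_votes, threshold, resolves_yes, win=1.0, loss=1.0):
    """Payoff to a single 'yes' voter under the all-or-nothing rule.

    Nothing is paid or charged unless 'yes' reaches the critical mass;
    once it does, the voter wins or loses on how the proposition resolves.
    The unit win/loss amounts are illustrative assumptions.
    """
    if yes_votes < threshold:
        return 0.0
    return win if resolves_yes else -loss

def no_voter_payoff():
    """'No' voters are merely abstaining, so they never gain or lose."""
    return 0.0

# A lone 'yes' vote that breaks a down-cascade but never reaches the
# threshold costs nothing, even if the proposition resolves 'no'.
lone_dissenter = yes_voter_payoff(yes_votes=1, threshold=5, resolves_yes=False)

# Once the threshold is reached, being wrong is penalized.
wrong_at_threshold = yes_voter_payoff(yes_votes=5, threshold=5, resolves_yes=False)
right_at_threshold = yes_voter_payoff(yes_votes=5, threshold=5, resolves_yes=True)
```

The first case is what protects a cascade-breaking vote; the second is why an agent with negative private info should still not vote 'yes'.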
[this answer was duplicated when I mistakenly copied my comment into an answer and then moved the comment to an answer.]