## Posts

Comment by Idan Arye on The Control Group Is Out Of Control · 2021-06-23T16:38:59.977Z · LW · GW

January 2021 witnessed the GameStop short squeeze, where many small investors, self-organized via Reddit, bought a stock in order to hold it and cause financial damage to several hedge funds that had shorted it. It was all over the news and was eventually defused when the brokerage companies sold their clients' stocks without their consent.

This resolution triggered great outrage. The traders and their supporters claimed that hedge funds had been toying with the economy for a long time, ruining companies and the families who depended on them, and it was considered okay because they played by the rules. Now that the common folk played by the same rules - the rules were changed so that they could not play.

(To be fair - the brokerage companies that sold the stocks did have legal standing to do so. But this is just an anecdote for my main point, so I'd rather not delve into that technicality.)

This post was written years before that, but the sentiment is timeless. Is it really okay to constantly change the rules of science just to deny access to a certain group?

Comment by Idan Arye on Debunked And Well-Refuted · 2021-06-16T23:23:23.975Z · LW · GW

If you've never acknowledged that other study, there is a possibility that you'll consider it objectively once introduced to it.

Comment by Idan Arye on Don't Sell Your Soul · 2021-04-09T01:06:28.147Z · LW · GW

Section IV, clause A:

Buyer and Seller agree that the owner of the Soul may possess, claim, keep, store, offer, transfer, or make use of it in whole or in part in any manner that they see fit to do so, conventional or otherwise, including (but not limited to) the purposes described in this Section (IV). Example uses of the Soul which would be permitted under these terms include (but are not limited to):

• ...
• Long term storage, usage, or preservation of the Soul in a state which would prevent it from taking the course of development, evolution, or relocation it may otherwise take naturally or due to the actions or material status of the Seller.

Am I interpreting it wrong, or is this clause permitting the buyer to kill the seller?

Comment by Idan Arye on Strong Evidence is Common · 2021-03-14T18:58:41.736Z · LW · GW

Isn't that the information density for sentences? With all the conjunctions, and with the limited number of different words that can appear in different places of a sentence, it's not that surprising we only get 1.1 bits per letter. But names should be more information dense - maybe not the full 4.7 (because some names just don't make sense), but at least 2 bits per letter, maybe even 3?

I don't know where to find (or how to handle) a big list of full names, so I'm settling for the (probably partial) lists of first names from https://www.galbithink.org/names/us200.htm (picked because the plaintext format is easy to process). I wrote a small script: https://gist.github.com/idanarye/fb75e5f813ddbff7d664204607c20321

When I run it on the list of female names from the 1990s I get this:

$ ./names_entropy.py https://www.galbithink.org/names/s1990f.txt
Entropy per letter: 1.299113499617074

Any of the 5 rarest name are 1:7676.4534883720935
Bits for rarest name: 12.906224226276189
Rarest name needs to be 10 letters long
Rarest names are between 4 and 7 letters long

#1 Most frequent name is Christin, which is 8 letters long
Christin is worth 5.118397576228959 bits
Christin would needs to be 4 letters long

#2 Most frequent name is Mary, which is 4 letters long
Mary is worth 5.380839995073667 bits
Mary would needs to be 5 letters long

#3 Most frequent name is Ashley, which is 6 letters long
Ashley is worth 5.420441711983749 bits
Ashley would needs to be 5 letters long

#4 Most frequent name is Jesse, which is 5 letters long
Jesse is worth 5.4899422055346445 bits
Jesse would needs to be 5 letters long

#5 Most frequent name is Alice, which is 5 letters long
Alice is worth 5.590706018293878 bits
Alice would needs to be 5 letters long

And when I run it on the list of male names from the 1990s I get this:

$ ./names_entropy.py https://www.galbithink.org/names/s1990m.txt
Entropy per letter: 1.3429318549784128

Any of the 11 rarest name are 1:14261.4
Bits for rarest name: 13.799827993443198
Rarest name needs to be 11 letters long
Rarest names are between 4 and 8 letters long

#1 Most frequent name is John, which is 4 letters long
John is worth 5.004526222833823 bits
John would needs to be 4 letters long

#2 Most frequent name is Michael, which is 7 letters long
Michael is worth 5.1584658860672485 bits
Michael would needs to be 4 letters long

#3 Most frequent name is Joseph, which is 6 letters long
Joseph is worth 5.4305677416620135 bits
Joseph would needs to be 5 letters long

#4 Most frequent name is Christop, which is 8 letters long
Christop is worth 5.549228103371756 bits
Christop would needs to be 5 letters long

#5 Most frequent name is Matthew, which is 7 letters long
Matthew is worth 5.563161441124633 bits
Matthew would needs to be 5 letters long

So the information density is about 1.3 bits per letter. Higher than 1.1, but not nearly as high as I expected. But - the rarest names in these lists are about 1:14k - not 1:1m like the OP's estimation. Then again - I'm only looking at given names - surnames tend to be more diverse. But that would also give them higher entropy, so instead of figuring out how to scale everything, let's just go with the given names, which I have numbers for (for simplicity, assume these lists I found are complete).

So - the rare names are about half as long as the number of letters required to represent them. The frequent names are anywhere between the number of letters required to represent them and twice that amount. I guess that is to be expected - names are not optimized to be an ideal representation, after all. But my point is that the amount of evidence needed here is not orders of magnitude bigger than the amount of information you gain from hearing the name.

Actually, due to what entropy is supposed to represent, on average the amount of information needed is exactly the amount of information contained in the name.
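For reference, a minimal sketch of the two quantities the script reports - entropy per letter and a name's "worth" in bits. The frequencies here are toy numbers made up for illustration, not the real galbithink.org lists, and the weighting scheme is only my assumption about how such a script would do it:

```python
from collections import Counter
from math import log2

def letter_entropy(name_counts):
    # Pool every letter of every name, weighted by the name's frequency,
    # and compute the Shannon entropy of the resulting letter distribution.
    letters = Counter()
    for name, count in name_counts.items():
        for letter in name.lower():
            letters[letter] += count
    total = sum(letters.values())
    return -sum(c / total * log2(c / total) for c in letters.values())

def name_bits(name_counts, name):
    # A name's "worth" in bits: the self-information -log2(frequency).
    total = sum(name_counts.values())
    return -log2(name_counts[name] / total)

# Toy, made-up frequencies - NOT the real lists:
counts = {"mary": 500, "ashley": 300, "alice": 150, "zelda": 50}
print(letter_entropy(counts))        # entropy per letter, in bits
print(name_bits(counts, "zelda"))    # -log2(50/1000), about 4.32 bits
```

The "would need to be N letters long" figures are then just a name's bits divided by the entropy per letter.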

Comment by Idan Arye on Strong Evidence is Common · 2021-03-14T12:49:05.898Z · LW · GW

The prior odds that someone’s name is “Mark Xu” are generously 1:1,000,000. Posterior odds of 20:1 implies that the odds ratio of me saying “Mark Xu” is 20,000,000:1, or roughly 24 bits of evidence. That’s a lot of evidence.

There are 26 letters in the English alphabet. Even if, for simplicity, our encoding ignores word boundaries and message ending, that's log2(26) ≈ 4.7 bits per letter, so hearing you say "Mark Xu" (6 letters) is about 28.2 bits of evidence total - more than the 24 bits required.
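The arithmetic, under this deliberately naive uniform-letter encoding:

```python
from math import log2

bits_per_letter = log2(26)      # about 4.70, if all 26 letters were equally likely
name = "MarkXu"                 # 6 letters, ignoring the space
total_bits = bits_per_letter * len(name)
print(round(total_bits, 1))     # 28.2 - more than the 24 bits needed
```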

Of course - my encoding is flawed. An optimal encoding should assign "Mark Xu" with less bits than, say, "Rqex Gh" - even though they both have the same amount of letters. And "Maria Rodriguez" should be assigned an even shorter message even though it has more than twice the letters of "Mark Xu".

Measuring the amount of information given in messages is not as easy to do on actual real life cases as it is in theory...

Comment by Idan Arye on Defending the non-central fallacy · 2021-03-12T23:44:00.922Z · LW · GW

Realistically, how high would the tax burden have to be for you to accept those costs of secession?

France's 2015 taxes of 75% made rich people secede, so we can take that as a supremum on the minimal tax burden that can make people secede. Of course - France's rich didn't have to go live in the woods - they had the option to go to other countries. Also, they did not have the option to not go to any country, because all the land on earth is divided between the countries.

I agree that the main benefit for the rich in remaining under the state's rule and paying taxes is being able to do business with its citizens. And of course - being able to pass through the land - otherwise they won't be able to physically do said business. So the core question is:

Does the state have the right to prevent its citizens from doing business with whoever they want?

They exercise that power - that's a fact. They send the police to stop business that's not licensed by the state. But should this be considered an act of violence, or an act of protecting their property?

Comment by Idan Arye on Defending the non-central fallacy · 2021-03-12T00:53:11.322Z · LW · GW

I think there is some academic merit in taking this example to the extreme and assuming that the rich person is responsible for 100% of the community's resources, that they alone can fund its entire activity, and that if they secede alone the community is left with nothing. They can't protect people in their streets because they can't afford a police force. They can't punish criminals because they can't afford a prison. They may be left with their old roads, but without maintenance those quickly wear out, while the rich person can build new ones. Their permission to do business means nothing because they have no means to enforce it (no police) - they can't even make a credible embargo, because the rich person is the only one who can offer jobs and the only one who has goods to sell, so the incentive to break the embargo is huge. The rich person has all the power and zero incentive to give in to the community, which would take it away and give their "fair share" of it in return.

Of course - this extreme scenario never happens in real life, because in real life there are always alternatives. There are more rich people, to begin with, so no single rich person can hold all the power. People can start their own business, breaking the 100% dependency on the rich class from our example. And - maybe most importantly - modern society has a huge middle class that holds (as a socioeconomic class) a considerable share of the power.

So, a real life rich person cannot have a full Shapley value like our hypothetical rich person, and the poor people's Shapley value is more than zero. Still - a rich person's Shapley value is much much higher than a poor person's, and therefore there is a point where taxation is heavy enough to make it worthwhile for them to secede.

Comment by Idan Arye on Defending the non-central fallacy · 2021-03-11T22:55:39.722Z · LW · GW

I was replying to ShemTealeaf's claim that the rich person still has an incentive to stay - remaining under the protection of the community's court system. I was arguing that what the rich person needs from the community's court system is not its resources (which the rich person was providing anyway, and which would dry up once they secede) but its social norms - the people's agreement to respect its laws, which means they would not attack the rich person. My point is that if the rich person's incentive to stay is to not get robbed and killed by the community - then we can't really say that they are allowed to opt out.

Of course - if the poor people that remain in the community will not attack the rich person once they leave - then they are indeed allowed to opt out, but in that case their incentive to stay is gone.

Comment by Idan Arye on Defending the non-central fallacy · 2021-03-11T20:15:46.582Z · LW · GW

In this hypothetical scenario, the rich person was the sole source of funding for the community's services. Once they opt out, the community will no longer be able to pay the police, and since all the police salaries came from the rich person's pockets - the rich person will be able to use the same amount of money previously used to pay the police force to finance their own private security.

Same for all the other services the community was providing.

Of course, the community will still have all the infrastructure and equipment that was purchased with the rich person's taxes in the past, and the rich person will start with nothing - but this is just a temporary setback. In a few years the rich person will build new infrastructure and the community's infrastructure will not hold for long if they keep using it without being able to afford its maintenance.

This leaves us with the core community service the rich person was enjoying. The only service that does not (directly) cost money to provide. Social norms.

As you said - once the rich person opts out of the community, the members of the community are no longer obliged to refrain from robbing or killing them. And they have an incentive to do so. They may no longer be able to pay their police in the long run, but it'll take some time for all the cops to quit and it'll take some time for the rich person to build their own security force (unless they have prepared it in advance? They probably have), so if they act quickly enough they can launch an attack and have a good chance at winning. And even if they get delayed and the balance of armed forces shifts - large enough masses of poor people can take down the rich with their armed guards.

So this is what's going to stop the rich person from opting out. The threat of violence if they do so. In that light - can we still say they are allowed to opt out?

Comment by Idan Arye on Defending the non-central fallacy · 2021-03-10T16:43:27.967Z · LW · GW

Most[1] logical fallacies are obvious when arranged in their pattern, but when you encounter them in the wild they are usually transformed by rhetoric to mask that pattern. The "lack of rhetorical skills", then, may not be bad argumentation by itself - but it does help expose it. If a pickpocket is caught in the act, it won't help them to claim that they were only caught because they were not dexterous enough and that it's unfair to put someone in jail for a lack of skill. The fact remains that they tried to steal, and it would still be a crime if they were proficient enough to succeed. Similarly, just because one's rhetorical skills are not good enough to mask a bad argument does not make it a good argument.

A more important implication of my take on the nature of logical fallacies is that it is not enough to show that an argument fits the fallacy's pattern - the important part of countering it is showing how, when rearranged in that pattern, the argument loses its power to convince. If it still makes sense even in that form, the charge of fallacy does not hold.

Note that in all of Scott's examples, he never just said "X is a noncentral member of Y" and left it at that. He always said "we usually hate Y because most of its members share the trait Z, but X is not Z and only happens to be in Y because of some other trait W, which we don't have such strong feelings about".

So, if we take your first example (the one about eating meat) and fully rearrange it by the noncentral fallacy not only with X and Y but also with Z and W, the counter-argument would look something like that:

It's true that animal farming (X) is technically cruelty (Y), but the central members of cruelty are things like torture and child abuse. What these things have in common is that they hurt humans (Z), and this is the reason why we should frown upon cruelty. Animal farming does not share that trait. Animal farming is only included in the cruelty category because it involves involuntary suffering (W) - a trait that we don't really care about.

Does this breakdown make the original argument lose its punch? Not really. Certainly not as much as breaking down the "MLK was a criminal" argument to the noncentral fallacy pattern makes that argument lose its punch. Here, at most, the breakdown exposes the underlying reasoning, and shifts the discussion from "whether or not meat is technically cruelty" to "to what extent do animals deserve to be protected from involuntary suffering".

Which is a good thing. I believe the goal of noticing logical fallacies is not to directly disprove claims, but to strip them of their rhetorical dressing and expose the actual argument underneath. That underlying argument can be bad, or it can be good - but it needs to be exposed before it can be properly discussed.

1. I say "most", but the only exception I can think of is the proving too much fallacy. And even then - that's only because there is no common template like other fallacies have. But that doesn't mean that arguments that exhibit that fallacy cannot be transformed to expose it - in this case, to normalize the fallacy one has to reshape the argument to a form where the claim, instead of being a critical part of its logic, is just a placeholder that can contain anything and still make the same amount of sense.

So, there is still a normal form involved. But instead of a normal form for the fallacy, the proving too much fallacy is about finding the normal form of the specific argument you are trying to expose the fallacy in, and showing how that form can be used for proving too much. I guess this makes the proving too much fallacy a meta-fallacy? ↩︎

Comment by Idan Arye on Privacy vs proof of character · 2021-02-28T22:46:23.371Z · LW · GW

If Alice can sacrifice her privacy to prove her loyalty, she'll be forced to do so to avoid losing to Bob - who already sacrificed his privacy to prove his loyalty and not lose to Alice. They both sacrificed their privacy to get an advantage over each other, and ended up without any relative advantage gained. Moloch wins.

Comment by Idan Arye on Coincidences are Improbable · 2021-02-24T19:53:57.048Z · LW · GW

Coincidences can be evidence for correlation and therefore evidence for causation, as long as one remembers that evidence - like more things than most people feel comfortable with - is quantitative, not qualitative. A single coincidence, or even multiple coincidences, can make a causation less improbable - but it can still be considered very improbable until we get much more evidence.

Comment by Idan Arye on Oliver Sipple · 2021-02-20T21:53:19.671Z · LW · GW

Manslaughter? Probably not - you did not contribute to that person's death. You are, however, guilty of:

1. Desecration of the corpse.
2. Obstructing the work of the sanitation workers (it's too late for paramedics) that can't remove the body from the road because of the endless stream of cars running over it.
3. You probably didn't count 100k vehicles running over that body. A bystander who stayed there for a couple of days could have, but since you are one of the drivers you probably only witnessed a few cars running over that person - so as far as you know there is a slim chance they are still alive.

I may be taking the allegory too far here, but I feel these offenses can map quite well. Starting from the last - being able to know that all the damage is done. In Sipple's case, this is history so it's easy to know that all the damage was already done. He can't be outed again. His family will not be harassed again by their community, and will not estrange him again. His life will not be ruined again, and he will not die again.

Up next - interfering with the efforts to make things better. Does this really happen here? I don't think so. On the contrary - talking about this, establishing that this is wrong, can help prevent this from happening to other people. And it's better to talk about cases from the past, where all the damage is already done, than about current cases that still have damage potential.

This leaves us with the final issue - respecting the dead. Which is probably the main issue, so I could have just skipped the other two points, but I took the trouble of writing them so I might as well impose on you the trouble of reading them. Are we really disrespecting Oliver Sipple by talking about him?

Given all that - I don't think talking about this case should be considered as a violation of Sipple's wish to not be outed.

Comment by Idan Arye on Oliver Sipple · 2021-02-20T12:26:35.310Z · LW · GW

Is pulling the lever after the trolley has passed still murder?

Comment by Idan Arye on Luna Lovegood and the Chamber of Secrets - Part 11 · 2020-12-28T14:50:59.653Z · LW · GW

Even if you could tell - Voldemort was Obliviated while knocked out and then transfigured before having the chance to wake up, so there never was an opportunity to verify that the Obliviation worked.

Comment by Idan Arye on Luna Lovegood and the Chamber of Secrets - Part 6 · 2020-12-10T16:25:45.232Z · LW · GW

I don't think so - the Vow is not an electric collar that shocks Harry every time he tries to destroy the world. This would invite ways to try and outsmart the Vow. Remember - the allegory here is to AI alignment. The Vow is not just giving Harry deterrents - it modifies his internal reasoning and values so that he would avoid world destruction.

Comment by Idan Arye on The Incomprehensibility Bluff · 2020-12-07T17:35:50.794Z · LW · GW

One thing to keep in mind is that even if it does seem likely that the suspected bluffer is smarter and more knowledgeable than you, the bar for actually working on the subject is higher than the bar for understanding a discussion about it. So even if you are not qualified enough to be an X researcher or an X lecturer, you should still be able to understand a lecture about X.

Even if the gap between you two is so great that they can publish papers on the subject and you can't even understand a simple lecture, you should still be able to understand some of that lecture. Maybe you can't follow the entire derivation of an equation but you can understand the intuition behind it. Maybe you get lost in some explanation but can understand an alternative example.

Yes - it is possible that you are so stupid and so ignorant, and that the other person is such a brilliant expert, that even with your sincere effort to understand and their sincere effort to explain as simply as possible you still can't understand even a single bit of it, because the subject really is that complicated. But at this point the likelihood of this scenario, with all these conditions, is low enough that you should seriously consider the option that they are just bluffing.

Comment by Idan Arye on Luna Lovegood and the Chamber of Secrets - Part 5 · 2020-12-06T01:55:02.484Z · LW · GW

By the way, I wouldn't be surprised if "the end of the world" is Moody's stock response to "what's the worst that could happen?" in any context.

(this is no longer spoiler so we no longer need to hide it)

I'm not sure about that. That could be Harry's stock response - "there was always a slight probability for the end of the world and this suggestion will not completely eliminate that probability". But Moody's? I would expect him to quickly make a list of all the things that could go wrong for each suggested course of action.

Comment by Idan Arye on Luna Lovegood and the Chamber of Secrets - Part 5 · 2020-12-05T16:21:37.540Z · LW · GW

Are potential HPMOR spoilers acceptable in the comments here? I'm not really sure - the default is to assume they aren't, but the fanfic itself contains some, so to be sure I'll hide it just in case:

Can Harry really discuss the idea of destroying the world so casually? Shouldn't his unbreakable oath compel him to avoid anything that can contribute to it, and abandon the idea of building the hospital without permit as soon as Moody jokes (is that the correct term when talking about Moody?) about it causing the end of the world?

Comment by Idan Arye on Luna Lovegood and the Chamber of Secrets - Part 4 · 2020-12-04T21:56:43.016Z · LW · GW

I notice we are seeing Luna getting ridiculed for her reputation rather than directly for her actions. Even when it's clear how her reputation is a result of her actions - for example, they laugh at her for having an imaginary pet, but never once have we seen other students looking at her weird when she interacts with Wanda.

Is this intentional? Because we are getting this story from Luna's PoV? Does she consider her reputation unjustified because her behavior does not seem weird to her?

Comment by Idan Arye on Luna Lovegood and the Chamber of Secrets - Part 3 · 2020-12-01T21:13:13.284Z · LW · GW

I'm a bit surprised the twins had the patience and concentration to sit with Luna and help her go over the map over and over.

Comment by Idan Arye on Extortion beats brinksmanship, but the audience matters · 2020-11-17T15:54:30.030Z · LW · GW

Wouldn't increasing the number of offenders improve the effectiveness of brinkmanship compared to extortion? Since the victim is only bound by a deal with the offender, they can surrender and reject future deals from the other potential offenders. This makes surrendering safer and therefore more attractive compared to extortion, where surrendering to one extorter would invite more extortions.

Comment by Idan Arye on Bayesians vs. Barbarians · 2020-11-08T14:51:41.666Z · LW · GW

The moral of Ends Don't Justify Means (Among Humans) was that even if philosophical thought experiments demonstrate scenarios where ethical rules should be abandoned for the greater good, real life cases are not as clear cut and we should still obey these moral rules, because humans cannot be trusted when they claim that <unethical plan> really does maximize the expected utility - we cannot be trusted when we say "this is the only way" and we cannot be trusted when we say "this is better than the alternative".

I think this may be the source of the repulsion we all feel toward the idea of selecting soldiers in a lottery and forcing them to fight with drugs and threats of execution. Yes, dying in a war is better than being conquered by the barbarians - I'd rather fight and risk death if the alternative is to get slaughtered anyway together with my loved ones after being tortured, and if the only way to avoid that is to abandon all ethics then so be it.

But...

Even in a society of rationalists, the leaders are still humans. Not benevolent ("friendly" is not enough here) superintelligent perfect Bayesian AIs. Can we really trust them that this is the only way to win? Can we really trust them to relinquish that power once the war is over? Will living under the barbarians' rule be worse than living in a (formerly?) rationalist society that resorted to totalitarianism? Are the barbarians really going to invade us in the first place?

Governments lie about such things in order to grab more power. We have ethics for a reason - it is far too dangerous to rationalize that we are too rational to be bound by these ethics.

Comment by Idan Arye on Purchase Fuzzies and Utilons Separately · 2020-11-04T17:00:47.309Z · LW · GW

I may be straying from your main point here, but...

Could you really utilize these 60 seconds in a better, more specialized way? Not any block of 60 seconds - these specific 60 seconds, that happened during your walk.

Had you not encountered that open trunk, would you have opened your laptop in the middle of that walk and started working on a world-changing idea or an important charity plan? Unlikely - if that were the case, you would already have been sitting somewhere working on it. You went out for a walk, not for work.

Would you, had you not encountered that open trunk, have finished your walk 60 seconds earlier, gone to sleep 60 seconds earlier, woken up 60 seconds earlier, started your workday 60 seconds earlier, and by doing all that moved those 60 seconds to connect with your regular productivity time? This is probably not the case either - if it was, that would mean you intentionally used those hard-earned fuzzies as an excuse to deliberately take one minute off your workday, and that would take a small-mindedness you do not seem to possess.

No - that act was an Action of Opportunity. Humans don't usually have a schedule so tight and so accurate that every lost minute messes it up. There is room for leeway, where you can push such gestures without compromising your specialized work.

Comment by Idan Arye on Why Our Kind Can't Cooperate · 2020-11-03T09:00:43.540Z · LW · GW

Should arguers be encouraged, then, not to write all the arguments in favor of their claim, in order to leave more room for those who agree with them to add their own supporting arguments?

This requires either refraining from fully exploring the subject (so that you don't think of all the arguments you can) or straight out omitting arguments you thought of. Not exactly Dark Side, but not fully Light Side either...

Comment by Idan Arye on What is the right phrase for "theoretical evidence"? · 2020-11-02T21:01:14.130Z · LW · GW

The difference can be quite large. If we get the results first, we can come up with Fake Explanations why the masks were only 20% effective in the experiments where in reality they are 75% effective. If we do the prediction first, we wouldn't predict 20% effectiveness. We wouldn't predict that our experiment will "fail". Our theory says masks are effective so we would predict 75% to begin with, and when we get the results it'll put a big dent in our theory. As it should.

Comment by Idan Arye on What is the right phrase for "theoretical evidence"? · 2020-11-02T16:24:15.554Z · LW · GW

Maybe "destroying the theory" was not a good choice of words - the theory will more likely be "demoted" to the status of "very good approximation". Like gravity. But the distinction I'm trying to make here is between super-accurate sciences like physics that give exact predictions, and fields that are still accurate, but not as accurate as physics. If medicine says masks are 99% effective, and they were not effective for 100 out of 100 patients, the theory still assigned a probability of 0.01^100 = 10^-200 that this would happen. You need to update it, but you don't have to "throw it out". But if physics says a photon should fire and it didn't fire - then the theory is wrong. Your model did not assign any probability at all to the possibility of the photon not firing.
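To see just how small that assigned probability is - a quick sanity check, using the hypothetical 99%/100-patients numbers from above:

```python
# If masks are 99% effective, each patient independently has a 0.01 chance
# of not being helped, so 100 failures out of 100 has probability:
p = 0.01 ** 100   # on the order of 1e-200: tiny, but strictly positive
print(p > 0)      # True - the theory did assign *some* probability to this
```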

And before anyone brings 0 And 1 Are Not Probabilities, remember that in the real world:

• There is a probability photon could have fired and our instruments have missed it.
• There is a probability that we unknowingly failed to set up or confirm the conditions that our theory required in order for the photon to fire.
• We do not assign 100% probability to our theory being correct, so we can just throw it out without Laplace throwing us to hell for a negative infinite score.

This means that the falsifying evidence, on its own, does not destroy the theory. But it can still weaken it severely. And my point (which I've detoured too far from) is that the perfect Bayesian should achieve the same final posterior no matter at which stage they apply it.

Comment by Idan Arye on What is the right phrase for "theoretical evidence"? · 2020-11-02T14:38:12.927Z · LW · GW

I think you may be underestimating the impact of falsifying evidence. A single observation that violates general relativity - assuming we can perfectly trust its accuracy and rule out any interference from unknown unknowns - would shake our understanding of physics if it comes tomorrow, but had we encountered the very same evidence a century ago, our understanding of physics would have already been shaken (assuming the falsified theory wouldn't be replaced with a better one). To a perfect Bayesian, the confidence in general relativity in both cases should be equal - and very low. Because physics is lawful - it doesn't make "mistakes" - we are the ones who are mistaken in understanding it, so a single violation is enough to make a huge dent no matter how much confirming evidence we have managed to pile up.

Of course, in real life we can't just say "assuming we can perfectly trust its accuracy and rule out any interference from unknown unknowns". The accuracy of our observations is not perfect, and we can't rule out unknown unknowns, so we must assign some probability to our observation being wrong. Because of that, a single piece of violating evidence is not enough to completely destroy the theory. And because of that, newer evidence should have more weight - our instruments keep getting better, so our observations today are more accurate. And if you go far enough back you can also question the credibility of the observations.

Another issue, which may not apply to physics but applies to many other fields, is that the world does change. A sociology experiment from 200 years ago is evidence on society from 200 years ago, so the results of an otherwise identical experiment from recent years should have more weight when forming a theory of modern society, because society does change - certainly much more than physics changes.

But to the hypothetical perfect Bayesian, the chronology itself shouldn't matter - all they have to do is take all that into account when calculating how much they need to update their beliefs, and if they succeed in doing so, it doesn't matter in which order they apply the evidence.
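The order-independence the comment leans on can be sketched numerically (my illustration, with made-up likelihood ratios): in odds form, each piece of evidence multiplies the odds by its likelihood ratio, and multiplication commutes, so the update order cannot change the final posterior.

```python
def update(prior_odds, likelihood_ratio):
    """One Bayesian update in odds form: posterior odds = prior odds * LR."""
    return prior_odds * likelihood_ratio

prior = 1.0                  # 1:1 odds before any evidence
evidence = [4.0, 0.5, 9.0]   # likelihood ratios, old and new observations alike

forward = prior
for lr in evidence:
    forward = update(forward, lr)

backward = prior
for lr in reversed(evidence):
    backward = update(backward, lr)

assert forward == backward  # same posterior odds either way
```

The same holds for any permutation of the evidence list, which is exactly the "chronology shouldn't matter" claim.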

Comment by Idan Arye on What is the right phrase for "theoretical evidence"? · 2020-11-02T12:15:42.812Z · LW · GW

You need to be very careful with this approach, as it can easily lead to circular logic where map X is evidence for map Y because they both come from the same territory, and map Y is evidence for map X because they both come from the same territory - so you get a positive feedback loop that updates them both toward 100% confidence.

Comment by Idan Arye on What is the right phrase for "theoretical evidence"? · 2020-11-02T12:01:53.858Z · LW · GW

This clarification gave me enough context to write a proper answer.

That sounds like a promising idea. It seems like it needs some tweaking though. I want to be able to say something like "the theoretical evidence suggests". If you replace "theoretical evidence" with "application", it wouldn't make sense. You'd have to replace it with something like "application of what we know about X", but that is too wordy.

Just call it "the theory" then - "the theory suggests" is both concise and conveys the meaning well.

Comment by Idan Arye on What is the right phrase for "theoretical evidence"? · 2020-11-02T11:58:54.525Z · LW · GW

I'm basing this answer on a clarifying example from the comments section:

I believe that what I am trying to point at is indeed evidence, in the Bayesian sense of the word. For example, consider masks and COVID. Imagine that we empirically observe that they are effective 20% of the time and ineffective 80% of the time. Should we stop there and take it as our belief that there is a 20% chance that they are effective? No!

Suppose now that we know that when someone with COVID breathes, particles containing COVID remain in the air. Further suppose that our knowledge of physics would tell us that someone standing two feet away is likely to breathe in these particles at some concentration. And further suppose that our knowledge of how other diseases work tell us that when that concentration of virus is ingested, it is likely that you will get infected. When you incorporate all of this knowledge about physics and biology, it should shift your belief that masks are effective. It shouldn't stay put at 20%. We'd want to shift it upward to something like 75% maybe.

When put like this, these "evidence" sound a lot like priors. The order should be different though:

1. First you deduce from the theory that masks are, say, 90% effective. These are the priors.
2. Then you run the experiments that show that masks are only effective 20% of the time.
3. Finally you update your beliefs downward and say that masks are 75% effective. These are the posteriors.

To a perfect Bayesian the order shouldn't matter, but we are not perfect Bayesians, and if we try to do it the other way around - applying the theory to update the probabilities we got from the experiments - we would be able to convince ourselves the probability is 75% no matter how much empirical evidence to the contrary we have accumulated.
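The three steps above can be sketched with Bayes' rule (the 90% prior is the hypothetical number from the example; the likelihoods are my own made-up stand-ins for "only 20% of trials showed an effect"):

```python
# Step 1: theory-derived prior that masks work (hypothetical number).
prior = 0.90

# Step 2: likelihood of the observed experimental result under each hypothesis
# (invented for illustration - the result is likelier if masks don't work).
p_obs_given_works = 0.2
p_obs_given_not = 0.8

# Step 3: update downward from the prior, rather than upward from 20%.
posterior = (prior * p_obs_given_works) / (
    prior * p_obs_given_works + (1 - prior) * p_obs_given_not
)
print(round(posterior, 2))  # ~0.69: shifted down from 0.90, not up from 0.20
```

With these stand-in numbers the posterior lands near the "something like 75%" ballpark the quoted example gestures at, arrived at in the order the comment argues for.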

Comment by Idan Arye on What is the right phrase for "theoretical evidence"? · 2020-11-02T00:10:30.220Z · LW · GW

These are not evidence at all! They are the opposite of evidence. Evidence is something from the territory that you use to update your map - what you are describing goes in the opposite direction - it comes from the map to say something specific about the territory.

"Using the map to say something about the territory" sounds like "predictions", but in this case it does not seem like you intend to update your beliefs based on whether or not the predictions come true - in fact, you specify that the empirical evidence is already going against these predictions, and you seem perfectly content with that.

So... maybe you could call it "application"? Since you are applying your knowledge?

Or, since they explicitly go against the empirical evidence, how about we just call it "stubbornness"?

Comment by Idan Arye on Raised in Technophilia · 2020-10-21T11:51:29.391Z · LW · GW

My father used to say that if the present system had been in place a hundred years ago, automobiles would have been outlawed to protect the saddle industry.

Maybe not outright outlawed, but automobiles used to be regulated to the point of uselessness: https://en.wikipedia.org/wiki/Red_flag_traffic_laws

Comment by Idan Arye on When (Not) To Use Probabilities · 2020-10-18T15:15:20.525Z · LW · GW

This reminds me of your comparison of vague vs precise theories in A Technical Explanation of Technical Explanation - if both are correct, then the precise theory is more accurate than the vague one. But if the precise theory is incorrect and the vague one is correct, the vague theory is more accurate. Preciseness is worthless without correctness.

While the distinction there was about granularity, I think the lesson - that preciseness is necessary but not sufficient for accuracy - applies here as well. Using numbers makes your argument seem more mathematical, but unless they are the correct numbers - or at least a close enough estimate of the correct numbers - they can't make your argument more accurate.

Comment by Idan Arye on Feeling Moral · 2020-10-16T13:27:57.150Z · LW · GW

"Lives saved don’t diminish in marginal utility", as you have said, but maybe hiccups do? A single person in a group of 10 hiccuppers is not as unfortunate as a lone hiccupper standing with 9 other people who don't have hiccups. So even if the total negative utility of 10 hiccuppers is worse than that of one hiccupper, it's not 10 times worse.

Since the utility function doesn't have to be a linear function of the number of hiccuppers (it only has to be monotonic), there is no reason why it can't be bounded - forever lower (in absolute value) than the value of a single human life.
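For instance (my sketch, with arbitrary constants), a disutility like 100(1 - e^(-n/10)) is strictly increasing in the number of hiccuppers yet bounded above by 100, so no number of hiccups ever outweighs a life:

```python
import math

LIFE = 1_000_000.0  # utility of a life, in arbitrary units (assumed)

def hiccup_disutility(n, scale=10.0, bound=100.0):
    """Monotonically increasing in n, but bounded above by `bound`."""
    return bound * (1 - math.exp(-n / scale))

# More hiccuppers is always worse...
assert hiccup_disutility(10) > hiccup_disutility(1)
# ...but the disutility never crosses the value of a single life,
# no matter how many hiccuppers there are.
assert all(hiccup_disutility(10**k) < LIFE for k in range(10))
```

Any bounded, monotonic curve would do; the exponential is just a convenient example of one.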

Comment by Idan Arye on Feeling Moral · 2020-10-16T13:05:09.830Z · LW · GW

Say we have a treatment for curing hiccups. Or some other inconvenience. Maybe even all medical inconveniences. We have done all the research and experiments and concluded that the treatment is perfectly safe - except there is no such thing as "certainty" in Bayesianism, so we must still allocate a tiny probability to the event that our treatment may kill a patient - say, a one in a googol chance. The expected utility of the treatment will now have a $10^{-100} \cdot (-\infty)$ component in it, which far outweighs any positive utility gained from the treatment, which only cures inconveniences - a mere real number that cannot overcome the negative $-\infty$, no matter how small the probability of that $-\infty$ is nor how much you multiply the positive utility of curing the inconveniences.
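The arithmetic of that trap can be made concrete (illustrative numbers only - the point is that any finite benefit vanishes next to an infinite penalty):

```python
# If death gets utility -infinity, any nonzero risk makes the
# expected utility -infinity, regardless of the finite benefit.
p_death = 1e-100           # one-in-a-googol chance (illustrative)
u_cure = 1e9               # any finite benefit you like
u_death = float("-inf")

expected = p_death * u_death + (1 - p_death) * u_cure
print(expected)  # -inf
```

Scaling `u_cure` up or `p_death` down by any finite factor leaves the result unchanged, which is exactly why an infinite disutility breaks expected-utility reasoning here.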

Comment by Idan Arye on Brainstorming positive visions of AI · 2020-10-08T17:59:03.218Z · LW · GW

Instead of creating a superintelligent AGI to perform some arbitrary task and watch it allocate all the Earth's resources (and the universe's resources later, but we won't be there to watch it) to optimize it, we decide to give it the one task that justifies that kind of power and control - ruling over humanity.

The AGI is more competent than any human leader, but we wouldn't want a human leader whose values we disagree with even if they are very competent - and the same applies to robotic overlords. So, we implement something like Futarchy, except:

• Instead of letting the officials generate policies, the AGI will do it.
• Instead of using betting markets we let the AGI decide which policy best fulfills the values.
• Instead of voting for representatives that'll define the values, the AGI will talk with each and every one of us to build a values profile, and then use the average of all our values profiles to build the values profile used for decision making.
• Even better - if it has enough computation power, it can store all the values profiles, calculate the utility of each decision according to each profile, calculate how much the decision will affect each voter, and take a weighted average.
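The last bullet could look something like this toy sketch - the profile format, function names, and numbers are entirely my own invention, not anything the scheme above specifies:

```python
# Score a policy by every voter's values profile, weighting each voter
# by how much the policy affects them.

def policy_score(policy, profiles):
    """profiles: list of (utility_fn, impact_fn) pairs, one per voter."""
    total_weight = sum(impact(policy) for _, impact in profiles)
    return sum(
        utility(policy) * impact(policy) for utility, impact in profiles
    ) / total_weight

# Two voters: one strongly affected and opposed, one barely affected and in favor.
profiles = [
    (lambda p: -10.0, lambda p: 0.9),  # heavily impacted, dislikes it
    (lambda p: +5.0, lambda p: 0.1),   # barely impacted, likes it
]
print(round(policy_score("build_dam", profiles), 2))  # -8.5
```

The heavily impacted voter dominates the weighted average, which is the intended behavior of the bullet's proposal.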

So the AGI takes over, but humanity is still deciding what it wants.

Comment by Idan Arye on Honoring Petrov Day on LessWrong, in 2019 · 2020-09-26T19:26:28.973Z · LW · GW

Petrov's choice was not about dismissing warnings - it was about picking on which side to err. Wrongfully alerting his superiors could cause a nuclear war, and wrongfully not alerting them would disadvantage his country in the nuclear war that had just started. I'm not saying he did all the numbers, used Bayes's law to figure the probability that an actual nuclear attack was going on, assigned utilities to all four cases and performed the final decision theory calculations - but his reasoning did take into account the possibility of error both ways. Though... it does seem like his intuition gave the utilities much more weight than the probabilities.

So, if we take that as a rule for deciding what to do with an AGI, it won't be just "ignore everything the instruments are saying" but "weigh the dangers of UFAI against the missed opportunities from not releasing it".

Which means the UFAI only needs to convince such a gatekeeper that releasing it is the only way to prevent a catastrophe - without having to convince the gatekeeper that the probability of the catastrophe is high or that the probability of the AI being unfriendly is low.

Comment by Idan Arye on A Priori · 2020-09-26T19:12:39.960Z · LW · GW

That isn't what you need to show. You need to show that the semantics have no ontological implications, that they say nothing about the territory.

Actually, what I need to show is that the semantics say nothing extra about the territory that is meaningful. My argument is that the predictions are a canonical representation of the belief, so it's fine if the semantics say things about the territory that the predictions can't say, as long as everything they say that does not affect the predictions is meaningless. At least, meaningless in the territory.

The semantics of gravity theory say that the force that pulls objects together over long range based on their mass is called "gravity". If you call that force "travigy" instead, it will cause no difference in the predictions. This is because the name of the force is a property of the map, not the territory - if it were meaningful in the territory it should have had an impact on the predictions.

And I claim that the "center of the universe" is similar - it has no meaning in the territory. The universe has no "center" - you can think of "center of mass" or "center of bounding volume" of a group of objects, but there is no single point you can naturally call "the center". There can be good or bad choices for the center, but not right or wrong choices - the center is a property of the map, not the territory.

If it had any effect at all on the territory, it should have somehow affected the predictions.

Comment by Idan Arye on A Priori · 2020-09-25T14:49:47.173Z · LW · GW

If you take a heliocentric theory, and substitute "geocentric" for "heliocentric", you get a theory that doesn't work in the sense of making correct predictions. You know this, because in previous comments you have already recognised the need for almost everything else to be changed in a geocentric theory in order to make it empirically equivalent to a heliocentric theory.

I only changed the title - I didn't change anything the theory says. So its predictions are still the same as the heliocentric model's.

But you are arguing against realism, in that you are arguing that theories have no content beyond their empirical content, ie their predictive power. You are denying that they have any semantic (non-empirical) content, and, as an implication of that, that they "mean" or "say" anything about the territory. So why would you care that one theory is more complex than another, so long as its predictions are accurate?

The semantics are still very important as a compact representation of the predictions. The predictions are infinite - the belief has to give a prediction for every possible scenario, and scenariospace is infinite. Even if the belief is only relevant to a finite subset of scenarios, it'd still have to say "I don't care about this scenario" an infinite number of times.

Actually, it would make more sense to talk about belief systems than individual beliefs, where the belief system is simply the probability function P. But we can still talk about single beliefs if we remember that they need to be connected to a belief system in order to give predictions, and that when we compare two competing beliefs we are actually comparing two belief systems where the only difference is that one has belief A and the other has belief B.

Human minds, being finite, cannot contain infinite representations - we need finite representations for our beliefs. And that's where the semantics come in - they are compact rules that can be used to generate predictions for every given scenario. They are also important because the number of predictions we can test is finite: even if we could comprehend the infinite prediction field over scenariospace, we wouldn't be able to confirm a belief based on a finite number of experiments.

Also, with that kind of representation, we can't even come up with the full representation of the belief. Consider a limited scenariospace with just three scenarios X, Y and Z. We know what happened in X and Y, and write a belief based on that. But what would that belief say about Z? If the belief is represented as just its predictions, with no connections between distinct predictions, how can we fill in the prediction table?
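One way to picture the rule-vs-table distinction (a sketch using a toy physics example of my choosing):

```python
# A finite rule generates a prediction for ANY scenario, where an
# explicit table can only answer scenarios someone already wrote down.

def gravity_rule(mass_kg, height_m):
    """Predicts fall time for any (mass, height) scenario from one rule."""
    g = 9.8  # m/s^2, assumed constant near Earth's surface
    return (2 * height_m / g) ** 0.5  # note: mass doesn't matter

# The table form covers only the scenarios it lists:
table = {(1.0, 10.0): 1.43, (2.0, 20.0): 2.02}

# The rule handles a scenario no table entry anticipated:
print(round(gravity_rule(3.0, 45.0), 2))
```

The rule also answers "what about Z?" for free, which is exactly what a bare prediction table cannot do.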

The semantics help us with that because they have fewer degrees of freedom. With $N$ degrees of freedom we can match any $N$ observations, so we need more than $N$ observations to even start counting them as evidence. I'm not sure how to come up with a formula for the number of degrees of freedom a semantic representation of a belief has - it depends not only on the numerical constants but also on the semantics themselves - but some properties of it are obvious:

1. The prediction table representation has infinite degrees of freedom, since it can give a prediction for each scenario independently from the predictions given to the other scenarios.
2. If a semantic representation is strictly simpler than another - that is, you can go from the simple one to the complex one just by adding rules - then the simpler one has fewer degrees of freedom than the complex one. This is because the complex one has all the degrees of freedom the simple one had, plus more degrees of freedom from the new rules (merely adding a rule contributes some degrees of freedom, even if the rule itself contains nothing that can be tweaked).

So the simplicity of the semantic representation is meaningful because it means fewer degrees of freedom and thus requires less evidence, but it does not make the belief "truer" - only the infinite prediction table determines how true the belief is.
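The degrees-of-freedom point can be illustrated with the smallest case (my sketch): a model with two free parameters can match any two observations exactly, so two observations alone are no evidence for it.

```python
def fit_line(p1, p2):
    """Two degrees of freedom (slope, intercept) match any two observations."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

# Whatever two observations we pick, a line "explains" them perfectly -
# so matching two points tells us nothing about the linear theory.
for obs in [((0, 3), (1, -2)), ((1, 5), (4, 5)), ((-1, 0), (2, 9))]:
    m, b = fit_line(*obs)
    for x, y in obs:
        assert abs(m * x + b - y) < 1e-9
```

Only a third observation, which the two-parameter line is not free to absorb, can start counting as evidence.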

Comment by Idan Arye on Decoherence is Falsifiable and Testable · 2020-09-24T23:56:31.906Z · LW · GW

We require new predictions not because the theory is newer than some other theory it could share predictions with, but because the predictions must come before the experimental results. If we allow theories to rely on the results of already known experiments, we run into two problems:

1. Overfitting. If the theory only needs to match existing results, it can be constructed in a way that matches all these results - instead of trying to match the underlying rules that generated them.
2. We may argue ourselves into believing our theory made predictions that match our results, regardless of whether the theory naturally makes these predictions.

Now, if the new theory is a strictly simpler version of an old one - as in "we don't even need X" simpler - then these two problems are nonissues:

1. If the more complicated theory did not overfit, the simpler version is as good as guaranteed not to overfit either.
2. We don't need to guess what our new theory would have predicted had we not known the results - the old theory already made those predictions before the results were known, and it should be straightforward to show it wasn't using the part we removed.

So... I will allow it.

Comment by Idan Arye on A Priori · 2020-09-23T12:44:32.248Z · LW · GW

OK, I continued reading, and in Decoherence is Simple Eliezer makes a good case for Occam's Razor as more than just a useful tool.

In my own words (:= how I understand it): more complicated explanations have a higher burden of proof and therefore require more bits of evidence. If they give the same predictions as the simpler explanations, then each bit of evidence counts for both the complicated and the simple beliefs - but the simpler beliefs had higher prior probabilities, so after adding the evidence their posterior probabilities remain higher.

So, if a simple belief A started with -10 decibels and a complicated belief B started with -20 decibels, and we get 15 decibels of evidence supporting both, the posterior credibilities of the beliefs are 5 and -5 - so we should favor A. Even if we get another 10 decibels of evidence and the credibility of B becomes 5, the credibility of A is now 15, so we should still favor it. The only way we can come to favor B is if we get enough evidence that supports B but not A.
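The decibel bookkeeping above can be checked mechanically (my sketch; the conversion back to probabilities is standard log-odds arithmetic):

```python
def decibels_to_prob(db):
    """Convert log-odds in decibels back to a probability."""
    odds = 10 ** (db / 10)
    return odds / (1 + odds)

prior_A, prior_B = -10.0, -20.0  # simple belief A, complicated belief B
evidence = 15.0                  # decibels of evidence supporting both equally

post_A = prior_A + evidence  # +5 dB
post_B = prior_B + evidence  # -5 dB

# Shared evidence shifts both but preserves A's head start:
assert post_A - post_B == prior_A - prior_B
```

In probability terms, +5 dB corresponds to about 0.76 and -5 dB to about 0.24, so A stays ahead until B gets evidence that A does not.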

Of course - this doesn't mean that A is true and B is false, only that we assign a higher probability to A.

So, if we go back to astronomy - our neo-geocentric model has a higher burden of proof than the modern model, because it contains additional mysterious forces. We prove gravity and relativity and work out how centrifugal forces behave, and that's (more or less) enough for the modern model. The exact same evidence also supports the neo-geocentric model - but it is not enough for that model, because we also need evidence for the new forces we came up with.

Do note, though, that the claim that "there is no mysterious force" is simpler than "there is a mysterious force" is taken for granted here...

Comment by Idan Arye on A Priori · 2020-09-20T16:55:21.669Z · LW · GW

They assert different things because they mean different things, because the dictionary meanings are different.

The Quotation is not the Referent. Just because the text describing them is different doesn't mean the assertions themselves are different.

Eliezer identified evolution with the blind idiot god Azathoth. Does this make evolution a religious Lovecraftian concept?

Scott Alexander identified the Canaanite god Moloch with the principle that forces you to sacrifice your values for the competition. Does this make that principle an actual god? Should we pray to it?

I'd argue not. Even though Eliezer and Scott brought the gods in for the theatrical and rhetorical impact, evolution is the same old evolution and competition is the same old competition. Describing the idea differently does not automatically make it a different idea - just like describing $f(x) = 2x$ as $f(x) = x + x$ does not make it a different function.

In the case of mathematical functions we have a simple equivalence law: $f = g \iff \forall x: f(x) = g(x)$. I'd argue we can have a similar equivalence law for beliefs - $A = B \iff \forall X: P(X|A) = P(X|B)$, where A and B are beliefs and X is an observation.

This condition is obviously necessary, because if we declared $A = B$ even though $P(X|A) > P(X|B)$ for some observation X, and we then found that X happened, that would support A and therefore also B (because they are equivalent) - which means an observation that does not match the belief's predictions supports it.

Is it sufficient? My argument for its sufficiency is not as analytical as the one for its necessity, so this may be the weak point of my claim, but here it goes: If $A \neq B$ even though they give the same predictions, then something other than the state and laws of the universe is deciding whether a belief is true or false (actually - how accurate it is). This undermines the core idea of both science and Bayesianism that beliefs should be judged by empirical evidence. Now, maybe this concept is wrong - but if it is, Occam's Razor itself becomes meaningless, because if the explanation does not need to match the evidence, then the simplest explanation can always be "Magic!".
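A minimal sketch of the proposed equivalence law, using toy beliefs and invented numbers (the names are illustrative, echoing the evolution/Azathoth example above):

```python
# Two beliefs are the same belief iff they assign the same likelihood
# to every observation in scenariospace.

scenarios = ["X1", "X2", "X3"]

evolution = {"X1": 0.9, "X2": 0.5, "X3": 0.1}
azathoth = {"X1": 0.9, "X2": 0.5, "X3": 0.1}  # same predictions, new name
magic = {"X1": 0.3, "X2": 0.5, "X3": 0.8}     # genuinely different predictions

def equivalent(a, b):
    """The equivalence law: A = B iff P(X|A) = P(X|B) for all X."""
    return all(a[x] == b[x] for x in scenarios)

assert equivalent(evolution, azathoth)   # same belief, different label
assert not equivalent(evolution, magic)  # different beliefs
```

Relabeling a belief changes nothing the law can see; only diverging predictions make two beliefs distinct.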

Comment by Idan Arye on A Priori · 2020-09-20T15:38:11.487Z · LW · GW

In the thought experiment we are considering, the contents of the box can never be tested. Nonetheless $20 and $100 mean different things.

I'm not sure you realize how strong a statement "the contents of the box can never be tested" is. It means even if we crack open the box we won't be able to read the writing on the bill. It means that even if we somehow tracked all the $20 and all the $100 bills that were ever printed, their current location, and whether or not they were destroyed, we won't be able to find one which is missing and deduce that it is inside the box. It means that even if we had a powerful atom-level scanner that can accurately map all the atoms in a given volume and put the box inside it, it won't be able to detect if the atoms are arranged like a $20 bill or like a $100 bill. It means that even if a superintelligent AI capable of time reversal calculations tried to simulate a time reversal, it wouldn't be able to determine the bill's value.

It means that the amount printed on that bill has no effect on the universe, and was never affected by the universe.

Can you think of a scenario where that happens, but the value of the dollar bill is still meaningful? Because I can easily describe a scenario where it isn't:

Dollar bills were originally "promises" for gold. They were signed by the Treasurer and the Secretary of the Treasury because the Treasury is the one responsible for fulfilling that promise. Even after the gold standard was abandoned, the principle that the Treasury is the one casting value into dollar bills remains. This is why the bills are still signed by the Treasury's representatives.

So, the scenario I have in mind is that the bill inside the box is a special bill - instead of a fixed amount, it says the Treasurer will decide whether it is worth 20 or 100 dollars. The bill is still signed by the Treasurer and the Secretary of the Treasury, and thus has the same authority as regular bills. And, in order to fulfill the condition that the value of the bill is never known, the Treasurer is committed to never deciding the worth of that bill.

Is it still meaningful to ask, in this scenario, whether the bill is worth $20 or $100?

Comment by Idan Arye on A Priori · 2020-09-20T14:59:23.914Z · LW · GW

I'm not sure I follow - what do you mean by "didn't work"? Shouldn't it work the same as the heliocentric theory, seeing how every detail in its description is identical to the heliocentric model?

Comment by Idan Arye on A Priori · 2020-09-20T12:28:12.165Z · LW · GW

So if I copied the encyclopedia definition of the heliocentric model, and changed the title to "geocentric" model, it would be a "bad, wrong, neo-geocentric theory [that] is still a geocentric theory"?

Comment by Idan Arye on A Priori · 2020-09-20T12:25:11.585Z · LW · GW

If A and B assert different things, we can test for these differences. Maybe not with current technology, but in principle. They yield different predictions and are therefore different beliefs.

Comment by Idan Arye on A Priori · 2020-09-19T22:40:16.696Z · LW · GW

But this is not the dictionary definition of the geocentric model we are talking about - it's one we have twisted to have the exact same predictions as the modern astronomical model. So it no longer asserts the same things about the territory as the original geocentric model - its assertions are now identical to the modern model's. So why should it still hold the same meaning as the original geocentric model?

Comment by Idan Arye on A Priori · 2020-09-19T22:17:07.411Z · LW · GW

If a universe where the statement is true is indistinguishable from a universe where the statement is false, then the statement is meaningless. And if the set of universes where statement A is true is identical to the set of universes where statement B is true, then statement A and statement B have the same meaning whether or not you can "algebraically" convert one to the other.

Comment by Idan Arye on A Priori · 2020-09-19T12:11:25.860Z · LW · GW

If the content of the box is unknown forever, that means it doesn't matter what's inside it, because we can't get it out.