You are confused.
How about this:
Heads: Two people with red jackets, one with blue.
Tails: Two people with red jackets, nine hundred and ninety-nine thousand, nine hundred and ninety-seven people with blue jackets.
Lights off.
Guess your jacket color. Guess what the coin came up. Write down your credences.
Light on.
Your jacket is red. What did the coin come up?
[ Also, re
Given that the implicit sampling method is random and independent (due to the fair coin), the credence in heads is a million to 1, thus you very likely are in the head's world.
Did you mean 'tails'? ]
no one arguing for Anthropic arguments can prove (or strongly motivate) the inverse position [that minds don't work like indistinguishable bosons]
And I can't prove there isn't a teapot circling Mars. It is a very strange default or prior, that two things that look distinct, would act like numerically or logically indistinguishable entities.
I happen not to like the paradigm of assuming independent random sampling either.
I skimmed your linked post.
First, a simple maybe-crux:
If you couldn't have possibly expected to observe the outcome not A, you do not get any new information by observing outcome A and there is nothing to update on.
There is no outcome-I-could-have-expected-to-observe that is the negation of existence. There are outcomes I could have expected to observe that are alternative characters of existence, to the one I experience. For example, "I was born in Connecticut" is not the outcome I actually observed, and yet I don't see how we can say that it's not a logically coherent counterfactual, if logically coherent counterfactuals can be said to exist at all.
Second, what is your answer to Carlsmith's "God's extreme coin toss with jackets"?
God flips a fair coin. If heads, he creates one person with a red jacket. If tails, he creates one person with a red jacket, and a million people with blue jackets.
- Darkness: God keeps the lights in all the rooms off. You wake up in darkness and can’t see your jacket. What should your credence be on heads?
- Light+Red: God keeps the lights in all the rooms on. You wake up and see that you have a red jacket. What should your credence be on heads?
Agreed.
And given that the earliness of our sentience is the very thing the Grabby Aliens argument is supposed to explain, I think this non-dependence is damning for it.
Incorrect how? Bayes doesn't say anything about the Standard Model.
You can't say "equiprobable" if you have no known set of possible outcomes to begin with.
Genuine question: what are your opinions on the breakfast hypothetical? [The idea that being able to give an answer to "how would you feel if you hadn't eaten breakfast today?" is a good intelligence test, because only idiots are resistant to "evaluating counterfactuals".]
This isn't just a gotcha; I have my own opinions and they're not exactly the conventional ones.
The level where there is suddenly "nowhere further to go" - the switch from exciting, meritocratic "Space Race" mode to boring, stagnant "Culture" mode - isn't dependent on whether you've overcome any particular physical boundary. It's dependent on whether you're still encountering meaningful enemies to grind against, or not. If your civilization first got the optimization power to get into space by, say, a cutthroat high-speed Internet market [on an alternate Earth this could have been what happened], the market for high-speed Internet isn't going to stop being cutthroat enough to encourage innovation just because people are now trying to cover parcels of 3D space instead of areas of 2D land. And even in stagnant "Culture" mode, I don't see why [members/branches of] your civilization would choose to get dumber [lose sentience or whatever other abilities got them into space].
Suppose you survive the attack, but all stars within 1000 light years around you are destroyed.
I question why you assign significant probability to this outcome in particular.
Have you read That Alien Message? A truly smart civilization has ways of intercepting asteroids before they hit, if they're sufficiently dumb/slow - even ones that are nominally really physically powerful.
Welcome to the Club of Wise Children Who Were Anthropically Worried About The Ants. I thought it was just me.
Just saying "it turned out this way, so I guess it had to be this way" doesn't resolve my confusion, in physical or anthropic domains. The boson thing is applicable [not just as a heuristic but as a logical deduction] because in the Standard Model, we consider ourselves to know literally everything relevant there is to know about the internal structures of the two bosons. About the internal structures of minds, and their anthropically-relevant differences, we know far less. Maybe we don't have to call it "randomness", but there is an ignorance there. We don't have a Standard Model of minds that predicts our subjectively having long continuous experiences, rather than just being Boltzmann brains.
it could happen as an accident; with billions of space probes across the universe, random mutations may happen, and the mutants that lost sentience but gained a little speed would outcompete the probes that follow the originally intended design.
This is, indeed, what I meant by "nonsentient mesa-optimizers" in OP:
This mesa-optimizer will then mesa-optimize all the sentience away [this is a natural conclusion of several convergent arguments originating from both computer science and evolutionary theory]
Why do you expect sentience to be a barrier to space travel in particular, and not interstellar warfare? Interstellar warfare with an intelligent civilization seems much harder than merely launching your von Neumann probe into space.
I agree with you that "civilizations get swept by nonsentient mesa-optimizers" is anthropically frequent. I think this resolves the Doomsday Argument problem. Hanson's position is different from both mine and yours.
It sounds to me like you're rejecting anthropic reasoning in full generality. That's an interesting position, but it's not a targeted rebuttal to my take here.
Random vs nonrandom is not a Boolean question. "Random" is the null value we can give as an answer to the question "What is our prior?" When we are asking ourselves "What is our prior?", we cannot sensibly give the answer "Yes, we have a prior". If we want to give a more detailed answer to the question "What is our prior?" than "random"/"nothing"/"null"/"I don't know", it must have particular contents; otherwise it is meaningless.
I was anthropically sampled out of some space, having some shape; that I can say definite things about what this space must be, such as "it had to be able to support conscious processes", does not obviate that, for many purposes, I was sampled out of a space having higher cardinality than the empty set.
As I learn more and more about the logical structure by which my anthropic position was sampled, it will look less and less "random". For example, my answer to "How were you sampled from the space of all possible universes?" is basically, "Well, I know I had to be in a universe that can support conscious processes". But ask me "Okay, how were you sampled from the space of conscious processes?", and I'll say "I don't know". It looks random.
Huh, I didn't know Hanson rejected the Doomsday Argument! Thanks for the context.
What do you mean [in your linked comment] by weighting civilizations by population?
What do you mean by "update our credences-about-astrobiology-etc. accordingly [with our earliness relative to later humans]"?
Life is really, really rare, because it takes very long to develop, and it's possible that Earth got extremely lucky in ways that are essentially unreplicable across the entire accessible universe.
I am not sure how you think this is different from what I said in the post, i.e. that I think most Kolmogorov-simple universes that contain 1 civilization, contain exactly 1 civilization.
All sampling is nonrandom if you bother to overcome your own ignorance about the sampling mechanism.
Physical dependencies, yes. But past and future people don't have qualitatively more logical dependencies on one another, than multiversal neighbors.
You could make a Grabby Aliens argument without assuming alien sentience, and in fact Hanson doesn't always explicitly state this assumption. However, as far as I understand Hanson's world-model, he does indeed believe these alien civilizations [and the successors of humanity] will by default be sentient.
If you did make a Grabby Aliens argument that did not assume alien sentience, it would still have the additional burden of explaining why successful alien civilizations [which come later] are nonsentient, while sentient human civilization [which is early and gets wiped out soon by aliens] is not so successful. It does not seem to make very much sense to model our strong rivals as, most frequently, versions of us with the sentience cut out.
You want Mèngzi [ -300s Chinese quasi-anti-Machiavellian ] [silly Latinization Mencius], Ibn Sina [ 900s Islamic Aristotelian who wrote on medicine ] [silly Latinization Avicenna], and the Nyaya and Vaiśeṣika schools [ 100-1000 Hindu analytic philosophy ] [if possible try the Praśastapāda [ c600 ]].
- Whichever coordinate system we choose, the charge will keep flowing in the same "arbitrary" direction, relative to the magnetic field. This is the conundrum we seek to explain; why does it not go the other way? What is so special about this way?
- If I'm a negligibly small body, gravitating toward a ~stationary larger body, capture in a ~stable orbit subtracts exactly one dimension from my available "linear velocity", in the sense that, maybe the other two components are fixed [over a certain period] now, but exactly one component must go to zero.
Ptolemaically, this looks like the ~stationary larger body, dragging the rest of spacetime with it in a 2-D fixed velocity [that is, fixed over the orbit's period] around me - with exactly one dimension, the one we see as ~Polaris vs ~anti-Polaris, fixed in place, relative to the me/larger-body system. That is, the universe begins rotating around me cylindrically. The major diameter and minor diameter of the cylinder are dependent on the linear velocity I entered at [ adding in my mass and the mass of the heavy body, you get the period ] - but, assuming the larger body is stationary, nothing else about my fate in the capturing orbit appears dependent on anything else about my previous history - the rest is ~erased - even though generally-relative spacetime doesn't seem to preclude more, or fewer, dependencies surviving. My question is, why is this? Why don't more, or fewer, dependencies on my past momenta ["angular" or otherwise] survive?
[ TBC, I know orbits can oscillate. However, most 3D shell orbits do not look like oscillating, but locally stable, 2D orbits. ]
Two physics riddles, since my last riddle has positive karma:
-
Why do we use the right-hand rule to calculate the Lorentz force, rather than using the left-hand rule?
-
Why do planetary orbits stabilize in two dimensions, rather than three dimensions [i.e. a shell] or zero [i.e. relative fixity]? [ It's clear why they don't stabilize in one dimension, at least: they would have to pass through the center of mass of the system, which the EMF usually prevents. ]
Crossposting a riddle from Twitter:
Karl Marx writes in 1859 on currency debasement and inflation:
One finds a number of occasions in the history of the debasement of currency by English and French governments when the rise in prices was not proportionate to the debasement of the silver coins. The reason was simply that the increase in the volume of currency was not proportional to its debasement; in other words, if the exchange-value of commodities was in future to be evaluated in terms of the lower standard of value and to be realised in coins corresponding to this lower standard, then an inadequate number of coins with lower metal content had been issued. This is the solution of the difficulty which was not resolved by the controversy between Locke and Lowndes. The rate at which a token of value – whether it consists of paper or bogus gold and silver is quite irrelevant – can take the place of definite quantities of gold and silver calculated according to the mint-price depends on the number of tokens in circulation and by no means on the material of which they are made. The difficulty in grasping this relation is due to the fact that the two functions of money – as a standard of value and a medium of circulation – are governed not only by conflicting laws, but by laws which appear to be at variance with the antithetical features of the two functions. As regards its function as a standard of value, when money serves solely as money of account and gold merely as nominal gold, it is the physical material used which is the crucial factor. [ . . . ] [W]hen it functions as a medium of circulation, when money is not just imaginary but must be present as a real thing side by side with other commodities, its material is irrelevant and its quantity becomes the crucial factor. Although whether it is a pound of gold, of silver or of copper is decisive for the standard measure, mere number makes the coin an adequate embodiment of any of these standard measures, quite irrespective of its own material. 
But it is at variance with common sense that in the case of purely imaginary money everything should depend on the physical substance, whereas in the case of the corporeal coin everything should depend on a numerical relation that is nominal.
This paradox has an explanation, which resolves everything such that it stops feeling unnatural and in fact feels neatly inevitable in retrospect. I'll post it as soon as I have a paycheck to "tell the time by" again.
Until then, I'm curious whether anyone* can give the answer.
*who hasn't already heard it from me on a Discord call - this isn't very many people and I expect none of them are commenters here
Newton's laws of motion are already laws of futurology.
"The antithesis is not so heterodox as it sounds, for every active mind will form opinions without direct evidence, else the evidence too often would never be collected."
I already know, upon reading this sentence [source] that I'm going to be quoting it constantly.
It's too perfect a rebuttal to the daily-experienced circumstance of people imagining that things - ideas, facts, heuristics, truisms - that are obvious to the people they consider politically "normal" [e.g., 2024 politically-cosmopolitan Americans, or LessWrong], must be or have been obvious to everyone of their cognitive intelligence level, at all times and in all places -
- or the converse, that what seems obvious to the people they consider politically "normal", must be true.
Separately from how pithy it is, regarding the substance of the quote: it strikes me hard that of all people remembered by history who could have said this, the one who did was R.A. Fisher. You know, the original "frequentist"? I'd associated his having originated the now-endemic tic of "testing for statistical significance" with a kind of bureaucratic indifference to unfamiliar, "fringe" ideas, which I'd assumed he'd shared.
But the meditation surrounding this quote is a paean to the mental process of "asking after the actual causes of things, without assuming that the true answers are necessarily contained within your current mental framework".
"That Charles Darwin accepted the fusion or blending theory of inheritance, just as all men accept many of the undisputed beliefs of their time, is universally admitted. [ . . . ] To modern readers [the argument from the variability within domestic species] will seem a very strange argument with which to introduce the case for Natural Selection [ . . . ] It should be remembered that, at the time of the essays, Darwin had little direct evidence on [the] point [of whether variation existed within species] [ . . . ] The antithesis is not so heterodox as it sounds, for every active mind will form opinions without direct evidence, else the evidence too often would never be collected."
This comes on the heels of me finding out that Jakob Bernoulli, the ostensible great-granddaddy of the frequentists, believed himself to be using frequencies to study probabilities, and was only cast in the light of history as having discovered that probabilities really "were" frequencies.
"This result [Jakob Bernoulli's discovery of the Law of Large Numbers in population statistics] can be viewed as a justification of the frequentist definition of probability: 'proportion of times a given event happens'. Bernoulli saw it differently: it provided a theoretical justification for using proportions in experiments to deduce the underlying probabilities. This is close to the modern axiomatic view of probability theory." [ Ian Stewart, Do Dice Play God, pg 34 ]
Bernoulli:
"Both [the] novelty [ of the Law of Large Numbers ] and its great utility combined with its equally great difficulty can add to the weight and value of all the other chapters of this theory. But before I convey its solution, let me remove a few objections that certain learned men have raised. 1. They object first that the ratio of tokens is different from the ratio of diseases or changes in the air: the former have a determinate number, the latter an indeterminate and varying one. I reply to this that both are posited to be equally uncertain and indeterminate with respect to our knowledge. On the other hand, that either is indeterminate in itself and with respect to its nature can no more be conceived by us than it can be conceived that the same thing at the same time is both created and not created by the Author of nature: for whatever God has done, God has, by that very deed, also determined at the same time." [ Jakob Bernoulli's "The Art of Conjecturing", translated by Edith Dudley Sylla ]
It makes me wonder how many great names modern "frequentism" can even accurately count among its endorsers.
Edit:
Fisher on the philosophy of probability [ PLEASE click through, it's kind of a take-your-breath-away read if you're familiar with the modern use of "p-values" ]:
"Now suppose there were knowledge a priori of the distribution of μ. Then the method of Bayes would give a probability statement, probably a different one. This would supersede the fiducial value, for a very simple reason. If there were knowledge a priori, the fiducial method of reasoning would be clearly erroneous because it would have ignored some of the data. I need give no stronger reason than that. Therefore, the first condition [of employing the frequentist definition of probability] is that there should be no knowledge a priori.
[T]here is quite a lot of continental influence in favor of regarding probability theory as a self-supporting branch of mathematics, and treating it in the traditionally abstract and, I think, fruitless way [ . . . ] Certainly there is grave confusion of thought. We are quite in danger of sending highly-trained and highly intelligent young men out into the world with tables of erroneous numbers under their arms, and with a dense fog in the place where their brains ought to be. In this century, of course, they will be working on guided missiles and advising the medical profession on the control of disease, and there is no limit to the extent to which they could impede every sort of national effort."
[ R.A. Fisher, 1957 ]
I made the Pascal's triangle smaller, good idea.
Thank you!
Thank you for your kind comment! I disagree with the johnswentworth post you linked; it's misleading to frame NN interpretability as though we started out having any graph with any labels, weird-looking labels or not. I have sent you a DM.
While writing a recent post, I had to decide whether to mention that Nicolaus Bernoulli had written his letter posing the St. Petersburg problem specifically to Pierre Raymond de Montmort, given that my audience and I probably have no other shared semantic anchor for Pierre's existence, and he doesn't visibly appear elsewhere in the story.
I decided Yes. I think the idea of awarding credit to otherwise-silent muses in general is interesting.
Footnote to my impending post about the history of value and utility:
After Pascal's and Fermat's work on the problem of points, and Huygens's work on expected value, the next major work on probability was Jakob Bernoulli's Ars conjectandi, written between 1684 and 1689 and published posthumously by his nephew Nicolaus Bernoulli in 1713. Ars conjectandi made 3 important contributions to probability theory:
[1] The concept that expected experience is conserved, or that probabilities must sum to 1.
Bernoulli generalized Huygens's principle of expected value in a random event as

$$E = \frac{p_1 a_1 + p_2 a_2 + \dots + p_n a_n}{p_1 + p_2 + \dots + p_n}$$

[ where $p_i$ is the probability of the $i$th outcome, and $a_i$ is the payout from the $i$th outcome ]

and said that, in every case, the denominator - i.e. the probabilities of all possible events - must sum to 1, because only one thing can happen to you

[ making the expected value formula just

$$E = p_1 a_1 + p_2 a_2 + \dots + p_n a_n$$

with normalized probabilities! ]
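As a sanity check on this principle, here is a minimal Python sketch of Huygens-style expected value [the example game and its payouts are made-up illustrations, not taken from Bernoulli]:

```python
from fractions import Fraction

def expected_value(outcomes):
    """Huygens/Bernoulli expected value: the sum of probability-weighted
    payouts, divided by the total probability mass - the 'denominator'
    Bernoulli insisted must sum to 1."""
    total_mass = sum(p for p, _ in outcomes)
    return sum(p * a for p, a in outcomes) / total_mass

# A hypothetical biased game: win 10 with probability 1/4, else win 1.
game = [(Fraction(1, 4), 10), (Fraction(3, 4), 1)]
print(expected_value(game))  # 13/4
```

With already-normalized probabilities the denominator is 1 and the formula reduces to the plain weighted sum, which is Bernoulli's point.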
[2] The explicit application of strategies starting with the binomial theorem [ known to ancient mathematicians as the triangle pattern studied by Pascal, and first successfully analyzed algebraically by Newton ] to combinatorics in random games [which could be biased] - resulting in e.g. the binomial coefficient [ the formula for the number of ways to choose k items of equivalent type, from a lineup of n [unique-identity] items ] [useful for calculating the expected distribution of outcomes in many-turn fair random games, or random games where all more-probable outcomes are modeled as being exactly twice, three times, etc. as probable as some other outcome], written as:

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
[ A series of random events [a "stochastic process"] can be viewed as a zig-zaggy path moving down the triangle, with the tiers as events, [whether we just moved LEFT or RIGHT] as the discrete outcome of an event, and the numbers as the relative probability density of our current score, or count of preferred events.

When we calculate $\binom{n}{k}$, we're calculating one of those relative probability densities. We're thinking of $n$ as our total long-run number of events, and $k$ as our target score, or count of preferred events.

We calculate $\binom{n}{k}$ by first "weighting in" all possible orderings of the $n$ events, by taking $n!$, and then by "factoring out" the $k!$ possible orderings of ways to achieve our chosen W condition [since we always take the same count of W-type outcomes as interchangeable], and "factoring out" the $(n-k)!$ possible orderings of our chosen L condition [since we're indifferent between those too].
[My explanation here has no particular relation to how Bernoulli reasoned through this.] ]
Bernoulli did not stop with $\binom{n}{k}$ and discrete probability analysis, however; he went on to analyze probabilities [in games with discrete outcomes] as real-valued, resulting in the Bernoulli probability distribution.
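A small sketch tying the factorial formula to the triangle [the row-building step uses the standard additive Pascal rule; no claim that this mirrors Bernoulli's own derivation]:

```python
from math import factorial

def choose(n, k):
    # n! "weights in" all orderings of the n events; k! and (n-k)!
    # "factor out" the interchangeable W-type and L-type orderings.
    return factorial(n) // (factorial(k) * factorial(n - k))

def pascal_row(n):
    # Build row n of Pascal's triangle by the additive rule:
    # each entry is the sum of the two entries above it.
    row = [1]
    for _ in range(n):
        row = [a + b for a, b in zip([0] + row, row + [0])]
    return row

# The factorial formula and the triangle agree, row by row.
print(pascal_row(5))                         # [1, 5, 10, 10, 5, 1]
print([choose(5, k) for k in range(6)])      # [1, 5, 10, 10, 5, 1]
```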
[3] The empirical "Law of Large Numbers", which says that, after you repeat a random game for many turns and add up all the outcomes, the total final outcome will approach the number of turns, times the expected distribution of outcomes in a single turn. E.g. if a die is biased to roll
a 6 40% of the time
a 5 25% of the time
a 4 20% of the time
a 3 8% of the time
a 2 4% of the time, and
a 1 3% of the time
then after 1,000 rolls, your counts should be "close" to
6: .4*1,000 = 400
5: .25*1,000 = 250
4: .2*1,000 = 200
3: .08*1,000 = 80
2: .04*1,000 = 40
1: .03*1,000 = 30
and even "closer" to these ideal ratios after 1,000,000 rolls
- which Bernoulli brought up in the fourth and final section of the book, in the context of analyzing sociological data and policymaking.
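A quick simulation sketch of the convergence claim, using the hypothetical biased die above [the seed and sample sizes are arbitrary choices]:

```python
import random
from collections import Counter

# The biased die from the example: face -> probability.
weights = {6: 0.40, 5: 0.25, 4: 0.20, 3: 0.08, 2: 0.04, 1: 0.03}

def roll_counts(n_rolls, rng):
    faces = list(weights)
    rolls = rng.choices(faces, weights=[weights[f] for f in faces], k=n_rolls)
    return Counter(rolls)

rng = random.Random(0)
for n in (1_000, 1_000_000):
    counts = roll_counts(n, rng)
    # The worst deviation of observed frequency from true probability
    # shrinks as n grows - Bernoulli's empirical law.
    worst = max(abs(counts[f] / n - p) for f, p in weights.items())
    print(n, worst)
```

At a million rolls the observed frequencies sit within a fraction of a percent of the true probabilities, i.e. "closer" in exactly the sense above.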
One source: "Do Dice Play God?" by Ian Stewart
[ Please DM me if you would like the author of this post to explain this stuff better. I don't have much idea how clear I am being to a LessWrong audience! ]
This is one of those subject areas that'd be unfortunately bad to get into publicly. If you or any other individual wants to grill me on this, feel free to DM me or contact me by any of the above methods and I will take disclosure case by case.
This is a just ask.
Also, even though it's not locally rhetorically convenient [ where making an isolated demand for rigor of people making claims like "scaling has hit a wall [therefore AI risk is far]" that are inconvenient for AInotkilleveryoneism, is locally rhetorically convenient for us ], we should demand the same specificity of people who are claiming that "scaling works", so we end up with a correct world-model and so people who just want to build AGI see that we are fair.
Update: My best current theory [ hasn't changed in a few months but I figured it might be worth posting ] is that composite smell data [i.e. the better part of smell processing] is passed directly from the olfactory bulb to somewhere in the entorhinal-amygdalar-temporal area, while there are a few scents that function as pheromones in the sense that we have innate responses to the scents as opposed to their associated experiences [ so, skunk and feces as well as the scent of eligible mates ] and data about these scents is relayed by thin, almost invisible projections to the hypothalamus or other nuclei in the "emotional motor system" so the behavioral responses can bootstrap.
What happens at that point depends a lot on the details of the lawbreaker's creation. [ . . . ] The probability seems unlikely to me to be zero for the sorts of qualities which would make such an AI agent dangerous.
Have you read The Sun is big, but superintelligences will not spare Earth a little sunlight?
I'll address each of your 4 critiques:
[ 1. ] In public policy making, you have a set of preferences, which you get from votes or surveys, and you formulate policy based on your best objective understanding of cause and effect. The preferences don't have to be objective, because they are taken as given.
The point I'm making in the post
Well, I reject the presumption of guilt.
is that no matter whether you have to treat the preferences as objective, there is an objective fact of the matter about what someone's preferences are, in the real world [ real, even if not physical ].
[ 2. ] [ Agreeing on such basic elements of our ontology/epistemology ] isn't all that relevant to AI safety, because an AI only needs some potentially dangerous capabilities.
Whether or not an AI "only needs some potentially dangerous capabilities" for your local PR purposes, the global truth of the matter is that "randomly-rolled" superintelligences will have convergent instrumental desires that have to do with making use of the resources we are currently using [like the negentropy that would make Earth's oceans a great sink for 3 x 10^27 joules], but not desires that tightly converge with our terminal desires that make boiling the oceans without evacuating all the humans first a Bad Idea.
[ 3. ] You haven't defined consciousness and you haven't explained how [ we can know something that lives in a physical substrate that is unlike ours is conscious ].
My intent is not to say "I/we understand consciousness, therefore we can derive objectively sound-valid-and-therefore-true statements from theories with mentalistic atoms". The arguments I actually give for why it's true that we can derive objective abstract facts about the mental world, begin at "So why am I saying this premise is false?", and end at ". . . and agree that the results came out favoring one theory or another." If we can derive objectively true abstract statements about the mental world, the same way we can derive such statements about the physical world [e.g. "the force experienced by a moving charge in a magnetic field is orthogonal both to the direction of the field and to the direction of its motion"] this implies that we can understand consciousness well, whether or not we already do.
[ 4. ] there doesn't need to be [ some degree of objective truth as to what is valuable ]. You don't have to solve ethics to set policy.
My point, again, isn't that there needs to be, for whatever local practical purpose. My point is that there is.
I think, in retrospect, the view that abstract statements about shared non-reductionist reality can be objectively sound-valid-and-therefore-true follows pretty naturally from combining the common-on-LessWrong view that logical or abstract physical theories can make sound-valid-and-therefore-true abstract conclusions about Reality, with the view, also common on LessWrong, that we make a lot of decisions by modeling other people as copies of ourselves, instead of as entities primarily obeying reductionist physics.
It's just that, despite the fact that all the pieces are there, it goes on being a not-obvious way to think, if for years and years you've heard about how we can only have objective theories if we can do experiments that are "in the territory" in the sense that they are outside of anyone's map. [ Contrast with celebrity examples of "shared thought experiments" from which many people drew similar conclusions because they took place in a shared map - Singer's Drowning Child, the Trolley Problem, Rawls's Veil of Ignorance, Zeno's story about Achilles and the Tortoise, Pascal's Wager, Newcomb's Problem, Parfit's Hitchhiker, the St. Petersburg paradox, etc. ]
? Yes, that is the bad post I am rebutting.
Recently, Raginrayguns and Philosophy Bear both [presumably] read "Cargo Cult Science" [not necessarily for the first time] on /r/slatestarcodex. I follow both of them, so I looked into it. And TIL that's where "cargo-culting" comes from. He doesn't say why it's wrong, he just waves his hands and says it doesn't work and it's silly. Well, now I feel silly. I've been cargo-culting "cargo-culting". I'm a logical decision theorist. Cargo cults work. If they work unreliably, so do reductionistic methods.
I once thought "slack mattered more than any outcome". But whose slack? It's wonderful for all humans to have more slack. But there's a huge game-theoretic difference between the species being wealthier, and thus wealthier per capita, and being wealthy/high-status/dominant/powerful relative to other people. The first is what I was getting at by "things orthogonal to the lineup"; the second is "the lineup". Trying to improve your position relative to copies of yourself in a way that is zero-sum is "the rat race", or "the Red Queen's race", where running will ~only ever keep you in the same place, and cause you and your mirror-selves to expend a lot of effort that is useless if you don't enjoy it.
[I think I enjoy any amount of "the rat race", which is part of why I find myself doing any of it, even though I can easily imagine tweaking my mind such that I stop doing it and thus exit an LDT negotiation equilibrium where I need to do it all the time. But I only like it so much, and only certain kinds.]
! I'm genuinely impressed if you wrote this post without having a mental frame for the concepts drawn from LDT.
LDT says that, for the purposes of making quasi-Kantian [not really Kantian but that's the closest thing I can gesture at OTOH that isn't just "read the Yudkowsky"] correct decisions, you have to treat the hostile telepaths as copies of yourself.
Indexical uncertainty, i.e. not knowing whether you're in Omega's simulation or the real world, means that, even if "I would never do that", if someone is "doing that" to me, in ways I can't ignore, I have to act as though I might ever be in a situation where I'm basically forced to "do that".
I can still preferentially withhold reward from copies of myself that are executing quasi-threats, though. And in fact this is correct because it minimizes quasi-threats in the mutual copies-of-myself negotiating equilibrium.
"Acquire the ability to coerce, rather than being coerced by, other agents in my environment", is not a solution to anything - because the quasi-Rawlsian [again, not really Rawlsian, but I don't have any better non-Yudkowsky reference points OTOH] perspective means that if you precommit to acquire power, you end up in expectation getting trodden on just as much as you trod on the other copies of you. So you're right back where you started.
Basically, you have to control things orthogonal to your position in the lineup, to robustly improve your algorithm for negotiating with others.
And I think "be willing to back deceptions" is in fact such a socially-orthogonal improvement.
I think this means that if you care both about (a) wholesomeness and (b) ending self-deception, it's helpful to give yourself full permission to lie as a temporary measure as needed. Creating space for yourself so you can (say) coherently build power such that it's safe for you to eventually be fully honest.
The first sentence here, I think, verbalizes something important.
The second [instrumental-power] is a bad justification, to the extent that we're talking about game-theoretic power [as opposed to power over reductionistic, non-mentalizing Nature]. LDT is about dealing with copies of myself. They'll all just do the same thing [lie for power] and create needless problems.
You do give a good justification that, I think, doesn't create any needless aggression between copies of oneself, and which I think suffices to justify "backing self-deception" as promising:
I mean something more wholehearted. If I self-deceive, it's because it's the best solution I have to some hostile telepath problem. If I don't have a better solution, then I want to keep deceiving myself. I don't just tolerate it. I actively want it there. I'll fight to keep it there! [...]
This works way better if I trust my occlumency skills here. If I don't feel like I have to reveal the self-deceptions I notice to others, and I trust that I can and will hide it from others if need be, then I'm still safe from hostile telepaths.
[emphases mine]
"I'm not going to draw first, but drawing second and shooting faster is what I'm all about" but for information theory.
Dath ilani are canonically 3 Earthling standard deviations smarter than Earthlings, partly because they have been deliberately optimizing their kids' genomes for hundreds of years.
A decision tree that's ostensibly both normative and exhaustive of the space at hand.
I don't know, I'm not familiar with the history; probably zero. It's a metaphor. The things the two scenarios are supposed to have in common are first-time-ness, danger, and technical difficulty. I point out in the post that the AGI scenario is actually irreducibly harder than first-time heavier-than-air flight: you can't safely directly simulate intelligent computations themselves for testing, because then you're just running the actual computation.
But as for the application of "green light" standards - the actual Wright brothers were only risking their own lives. Why should someone else need to judge their project for safety?
Changed to "RLHF as actually implemented." I'm aware of its theoretical origin story with Paul Christiano; I'm going a little "the purpose of a system is what it does".
Unlike with obvious epistemic predicates over some generality [ eg "Does it snow at 40 degrees N?", "Can birds heavier than 44lb fly?" - or even more generally the skills of predicting the weather and building flying machines ], to which [parts of] the answers can be usefully remembered as monolithic invariants, obvious deontic predicates over generalities [ eg "Should I keep trying when I am exhausted?", "Will it pay to fold under pressure?" - and the surrounding general skills ] don't have generalizable answers that are independent of one's acute strategic situation. I am not against trying to formulate invariant answers to these questions by spelling out every contingency; I am unsure whether LessWrong is the place, except when there's some motivating or illustrative question of fact that makes your advice falsifiable [ I think Eliezer's recent The Sun is big, but superintelligences will not spare Earth a little sunlight is a good example of this ].
Interested to hear how you would put this with "research" tabooed. Personally I don't care if it's research as long as it works.
I agree that the somatosensory cortex [in the case of arm movements, actually mostly the parietal cortex, but also somewhat the somatosensory] needs to be getting information from the motor cortex [actually mostly the DLPFC, but also somewhat the motor] about what to expect the arm to do!
This necessary predictive-processing "attenuate your sensory input!" feedback signal could be framed as "A [C]", such that "weak A [C]" might start giving you hallucinations.
However, in order for the somatosensory cortex to notice a prediction error and start hallucinating, it has to be receiving a stronger signal [let's say "D"], from the arm, signifying that the arm is moving, than the "weak A [C]" signal signifying that we moved the arm.
I don't think your theory predicts or accounts for this in any way?
My "Q/P" theory does.
[ "B" in your theory maps to my "quasi-volition", ie anterior cortex, or top-down cortical infrastructure.
Every other letter in your theory - the "A", "C", and "D" - all map to my "perception", ie posterior cortex, or bottom-up cortical infrastructure. ]
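The comparison I'm describing can be put as a toy sketch [everything here is illustrative: the names "C" and "D" are from the discussion above, but the numbers and the threshold are made-up assumptions, not from any model or dataset]:

```python
# Toy sketch of the prediction-error framing (all values illustrative):
# a sensory region receives an attenuation signal C ("expect the arm to move")
# and an afferent signal D ("the arm is actually moving"). If D sufficiently
# exceeds C, the movement registers as a prediction error and is attributed
# to an external agency - the hallucination case under a "weak C".

def attribution(c_strength: float, d_strength: float, threshold: float = 0.5) -> str:
    """Return 'self' if the attenuation signal covers the afferent signal,
    else 'external'."""
    prediction_error = d_strength - c_strength
    return "external" if prediction_error > threshold else "self"

# Normal case: strong attenuation, the movement feels self-generated.
print(attribution(c_strength=1.0, d_strength=1.2))  # self

# Weak-C case: same afferent signal, but the attenuation signal is too weak,
# so the very same movement is misattributed to an outside agency.
print(attribution(c_strength=0.2, d_strength=1.2))  # external
```

Note that in this framing everything turns on the relative strength of "C" and "D" at the comparison point, which is exactly why I keep asking where signal speed, as opposed to strength, comes into your version.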
Thanks for clarifying. The point where I'm at now is, as I said in my previous comment,
if it's just signal strength, rather than signal speed, why bring "A" [cortico-cortical connections] into it, why not just have "B" ["quasi-volitional" connections] and "C" ["perceptual" connections]?
Thanks for being patient enough to go through and clarify your confusion!
About the pyramidal cells - I should have been more specific and said that the prefrontal cortex [as opposed to primary motor cortex], AFAIK, does not have output pyramidal cells in Layer V. Those are Betz cells, and basically only the primary motor cortex has them, although Wikipedia (on Betz cells and pyramidal cells) tells me the PFC does have neurons that qualify as "pyramidal neurons" too; it looks like their role in processing is markedly different from that of the giant pyramidal neurons found in Layer 5 of the primary motor cortex.
Pyramidal neurons in the prefrontal cortex are implicated in cognitive ability. In mammals, the complexity of pyramidal cells increases from posterior to anterior brain regions. The degree of complexity of pyramidal neurons is likely linked to the cognitive capabilities of different anthropoid species. Pyramidal cells within the prefrontal cortex appear to be responsible for processing input from the primary auditory cortex, primary somatosensory cortex, and primary visual cortex, all of which process sensory modalities.[21] These cells might also play a critical role in complex object recognition within the visual processing areas of the cortex.[3] Relative to other species, the larger cell size and complexity of pyramidal neurons, along with certain patterns of cellular organization and function, correlates with the evolution of human cognition.[22]
This is technically compatible with "pyramidal neurons" playing a role in schizophrenic hallucinations, but it's not clear how there's any correspondence with the A vs B [vs C] ratio concept.
Maybe I slightly misunderstood your original theory, if you are not trying to say here, in general, that there is an "action-origination" signal that originates from one region of cortex, and in schizophrenics the major data packet ["B"] and the warning signal ["C"] experience different delays getting to the target.
If you are postulating a priori that the brain has a capability to begin the "B" and "C" signals in different regions of cortex at the same time, then how can you propose to explain schizophrenia based on intra-brain communication delay differentials at all? And if it's just signal strength, rather than signal speed, why bring "A" into it, why not just have "B" and "C"?
My own view is: Antipsychotics block dopamine receptors. I think you're right that they reduce a ratio that's something like your B/A ratio. But I can't draw a simple wiring diagram about it based on any few tracts. I would call it a Q/P ratio - a ratio of "quasi-volition", based on dopamine signaling originating in the frontal cortex and basal ganglia, to "perception", originating in the back half of the cortex and not relying on dopamine.
Illustration of how the Q/P idea is compatible with antipsychotics reducing something like a B/A ratio in the motor case: Antipsychotics cause "extrapyramidal" symptoms, which are stereotyped motions of the extremities caused by neurons external to the pyramidal [corticospinal] tract. As I understand it, this is because one effect of blocking dopamine receptors in the frontal lobe is to inhibit the activity of Betz cells.
No, when I say "in parallel", I'm not talking about two signals originating from different regions of cortex. I'm talking about two signals originating from the same region of cortex, at the time the decision is made - one of which [your "B" above] carries the information "move your arm"[/"subvocalize this sentence"] and the other of which [the right downward-pointing arrow in your diagram above, which you haven't named, and which I'll call "C"] carries the information "don't perceive an external agency moving your arm"[/"don't perceive an external agency subvocalizing this sentence"].
AFAICT, schizophrenic auditory hallucinations in general don't pass through the brainstem. Neither do the other schizophrenic "positive symptoms" of delusional and disordered cognition. So in order to actually explain schizophrenic symptoms and the ameliorating effect of antipsychotics, "B" and "C" themselves have to be instantiated without reference to the brainstem.
With respect to auditory hallucinations, "B" and "C" should both originate further down the frontal cortex, in the DLPFC, where there are no pyramidal neurons, and "C" should terminate in the auditory-perceptual regions of the temporal lobe, not the brainstem.
If you can't come up with a reason we should assume the strength of the "B" signal [modeled as jointly originating with the "C" signal] here is varying, but the strength of the "C" signal [modeled as sometimes terminating in the auditory-perceptual regions of the temporal lobe] is not, I don't see what weight your theory can bear except in the special case of motor symptoms - not auditory-hallucination or cognitive symptoms.
Doesn't this all rely on the idea that
Command to move my arm
and
Signal that I expect my arm to move (thus suppressing the orienting / startle reaction which would otherwise occur when my arm motion is detected)
are sent through separate [if parallel] channels/pathways? What substantiates this?
Something else I later noticed should confuse me:
OK, now let’s look at the sentences in the above excerpt one-by-one:
For the first sentence (on genes): Consider a gene that says “Hey neurons! When in doubt, make more synapses! Grow more axons! Make bigger dendritic trees!” This gene would probably be protective against schizophrenia and a risk factor for autism, for reasons discussed just above. And vice-versa for the opposite kind of gene. Nice!
Why would your theory [which says that schizophrenia is about deficient connections] predict that a gene that predisposes toward a more fully-connected connectome would protect against schizophrenia?
Great post!
I think schizophrenia is generally recognized as involving more deficiencies in local than long-distance cortex-to-cortex communication. I don't have any particular knockdown studies for this and your Google Scholar judo seems on a level with mine. But as far as I understand it, autism is the disorder associated with deficits in longer white matter tracts; I believe this is because axons grow too densely and mesh too tightly during fetal development, while in schizophrenia the opposite happens and you end up with fewer axons [also probably fewer neurons because fewer neural progenitor divisions but this wouldn't a priori affect the shape of the connectome].
I figure, when a neurotypical person is subvocalizing, there’s communication between the motor cortex parts that are issuing the subvocalization commands (assuming that that’s how subvocalization works, I dunno), and the sensory cortex parts that are detecting the subvocalization which is now happening. Basically, the sensory cortex has ample warning that the subvocalization is coming. It’s not surprised when it arrives.
But in schizophrenia, different parts of the cortex can’t reliably talk to each other. So maybe sometimes the sensory cortex detects that a subvocalization is now happening, but hadn’t gotten any signal in advance that this subvocalization was about to be produced endogenously, by a different part of the same cortex. So when it arrives, it’s a surprise, and thus is interpreted as exogenous, i.e. it feels like it’s coming from the outside.
I don't see how your theory would make this prediction. To me it seems like your theory predicts, if anything, weaker subvocalization in schizophrenics - not subvocalization that's more perceptible. The "it inhibits the warning shot but not the actual data packet" thing frankly feels like an epicycle.
I don't see how Dehaene clarifies anything or how your theory here relies on him.