Thank you!
Thank you for your kind comment! I disagree with the johnswentworth post you linked; it's misleading to frame NN interpretability as though we started out having any graph with any labels, weird-looking labels or not. I have sent you a DM.
While writing a recent post, I had to decide whether to mention that Nicolaus Bernoulli had written his letter posing the St. Petersburg problem specifically to Pierre Raymond de Montmort, given that my audience and I probably have no other shared semantic anchor for Pierre's existence, and he doesn't visibly appear elsewhere in the story.
I decided Yes. I think the idea of awarding credit to otherwise-silent muses in general is interesting.
Footnote to my impending post about the history of value and utility:
After Pascal's and Fermat's work on the problem of points, and Huygens's work on expected value, the next major work on probability was Jakob Bernoulli's Ars conjectandi, written between 1684 and 1689 and published posthumously by his nephew Nicolaus Bernoulli in 1713. Ars conjectandi made 3 important contributions to probability theory:
[1] The concept that expected experience is conserved, or that probabilities must sum to 1.
Bernoulli generalized Huygens's principle of expected value in a random event as
[ E = ( p_1*x_1 + p_2*x_2 + ... + p_n*x_n ) / ( p_1 + p_2 + ... + p_n ), where p_i is the probability of the i-th outcome, and x_i is the payout from the i-th outcome ]
and said that, in every case, the denominator - i.e. the probabilities of all possible events - must sum to 1, because only one thing can happen to you
[ making the expected value formula just
E = p_1*x_1 + p_2*x_2 + ... + p_n*x_n
with normalized probabilities! ]
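A quick sketch of the two forms of the expected-value formula (function names and the fair-die example are mine, not Bernoulli's):

```python
def expected_value(weights, payouts):
    """General Huygens/Bernoulli form: E = sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for w, x in zip(weights, payouts)) / sum(weights)

def expected_value_normalized(probs, payouts):
    """Normalized form: E = sum(p_i * x_i), valid once the probabilities sum to 1."""
    assert abs(sum(probs) - 1) < 1e-9  # "only one thing can happen to you"
    return sum(p * x for p, x in zip(probs, payouts))

# A fair die: both forms agree.
weights = [1, 1, 1, 1, 1, 1]
payouts = [1, 2, 3, 4, 5, 6]
probs = [w / sum(weights) for w in weights]
print(expected_value(weights, payouts))           # 3.5
print(expected_value_normalized(probs, payouts))  # 3.5
```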
[2] The explicit application of strategies starting with the binomial theorem [ known to ancient mathematicians as the triangle pattern studied by Pascal
and first successfully analyzed algebraically by Newton ] to combinatorics in random games [which could be biased] - resulting in e.g. [ the formula for the number of ways to choose k items of equivalent type, from a lineup of n [unique-identity] items ] [useful for calculating the expected distribution of outcomes in many-turn fair random games, or random games where all more-probable outcomes are modeled as being exactly twice, three times, etc. as probable as some other outcome],
written as: C(n, k) = n! / ( k! * (n-k)! ).
[ A series of random events [a "stochastic process"] can be viewed as a zig-zaggy path moving down the triangle, with the tiers as events, [whether we just moved LEFT or RIGHT] as the discrete outcome of an event, and the numbers as the relative probability density of our current score, or count of preferred events.
When we calculate C(n, k), we're calculating one of those relative probability densities. We're thinking of n as our total long-run number of events, and k as our target score, or count of preferred events.
We calculate C(n, k) by first "weighting in" all possible orderings of the n events, by taking n!, and then by "factoring out" all k! possible orderings of ways to achieve our chosen W condition [since we always take the same count of W-type outcomes as interchangeable], and "factoring out" all (n-k)! possible orderings of our chosen L condition [since we're indifferent between those too].
[My explanation here has no particular relation to how Bernoulli reasoned through this.] ]
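A sketch of the two equivalent ways to compute this - the factorial formula, and brute-force counting of LEFT/RIGHT paths down the triangle (function names are mine):

```python
from math import factorial
from itertools import product

def choose(n, k):
    """C(n, k) = n! / (k! * (n-k)!): weight in all n! orderings, then factor
    out the interchangeable orderings of the k wins and the n-k losses."""
    return factorial(n) // (factorial(k) * factorial(n - k))

def paths_to_score(n, k):
    """Brute force: count length-n LEFT/RIGHT paths with exactly k RIGHTs."""
    return sum(1 for path in product("LR", repeat=n) if path.count("R") == k)

# The formula agrees with path-counting on every tier of the triangle.
for n in range(8):
    for k in range(n + 1):
        assert choose(n, k) == paths_to_score(n, k)
print(choose(10, 4))  # 210
```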
Bernoulli did not stop with C(n, k) and discrete probability analysis, however; he went on to analyze probabilities [in games with discrete outcomes] as real-valued, resulting in the Bernoulli probability distribution.
[3] The empirical "Law of Large Numbers", which says that, after you repeat a random game for many turns and tally up all the outcomes, your count of each outcome will approach the number of turns times that outcome's probability in a single turn. E.g. if a die is biased to roll
a 6 40% of the time
a 5 25% of the time
a 4 20% of the time
a 3 8% of the time
a 2 4% of the time, and
a 1 3% of the time
then after 1,000 rolls, your counts should be "close" to
6: .4*1,000 = 400
5: .25*1,000 = 250
4: .2*1,000 = 200
3: .08*1,000 = 80
2: .04*1,000 = 40
1: .03*1,000 = 30
and even "closer" to these ideal ratios after 1,000,000 rolls
- which Bernoulli brought up in the fourth and final section of the book, in the context of analyzing sociological data and policymaking.
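The biased-die example above can be simulated directly; a minimal sketch (the face probabilities are the ones from the example):

```python
import random

random.seed(0)  # fixed seed so the run is reproducible
faces = [6, 5, 4, 3, 2, 1]
probs = [0.40, 0.25, 0.20, 0.08, 0.04, 0.03]

# Observed frequencies drift toward the single-turn probabilities as n grows.
for n_rolls in (1_000, 1_000_000):
    rolls = random.choices(faces, weights=probs, k=n_rolls)
    for face, p in zip(faces, probs):
        observed = rolls.count(face) / n_rolls
        print(f"{n_rolls:>9} rolls, face {face}: expected {p:.3f}, got {observed:.3f}")
```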
One source: "Do Dice Play God?" by Ian Stewart
[ Please DM me if you would like the author of this post to explain this stuff better. I don't have much idea how clear I am being to a LessWrong audience! ]
This is one of those subject areas that'd be unfortunately bad to get into publicly. If you or any other individual wants to grill me on this, feel free to DM me or contact me by any of the above methods and I will take disclosure case by case.
This is a just ask.
Also, even though it's not locally rhetorically convenient [ where making an isolated demand for rigor of people making claims like "scaling has hit a wall [therefore AI risk is far]" that are inconvenient for AInotkilleveryoneism, is locally rhetorically convenient for us ], we should demand the same specificity of people who are claiming that "scaling works", so we end up with a correct world-model and so people who just want to build AGI see that we are fair.
Update: My best current theory [ hasn't changed in a few months but I figured it might be worth posting ] is that composite smell data [i.e. the better part of smell processing] is passed directly from the olfactory bulb to somewhere in the entorhinal-amygdalar-temporal area. Meanwhile, a few scents function as pheromones, in the sense that we have innate responses to the scents as opposed to their associated experiences [ so, skunk and feces as well as the scent of eligible mates ]; data about these scents is relayed by thin, almost invisible projections to the hypothalamus or other nuclei in the "emotional motor system" so the behavioral responses can bootstrap.
What happens at that point depends a lot on the details of the lawbreaker's creation. [ . . . ] The probability seems unlikely to me to be zero for the sorts of qualities which would make such an AI agent dangerous.
Have you read The Sun is big, but superintelligences will not spare Earth a little sunlight?
I'll address each of your 4 critiques:
[ 1. ] In public policy making, you have a set of preferences, which you get from votes or surveys, and you formulate policy based on your best objective understanding of cause and effect. The preferences don't have to be objective, because they are taken as given.
The point I'm making in the post
Well, I reject the presumption of guilt.
is that no matter whether you have to treat the preferences as objective, there is an objective fact of the matter about what someone's preferences are, in the real world [ real, even if not physical ].
[ 2. ] [ Agreeing on such basic elements of our ontology/epistemology ] isn't all that relevant to AI safety, because an AI only needs some potentially dangerous capabilities.
Whether or not an AI "only needs some potentially dangerous capabilities" for your local PR purposes, the global truth of the matter is that "randomly-rolled" superintelligences will have convergent instrumental desires that have to do with making use of the resources we are currently using [like the negentropy that would make Earth's oceans a great sink for 3 x 10^27 joules], but not desires that tightly converge with our terminal desires that make boiling the oceans without evacuating all the humans first a Bad Idea.
[ 3. ] You haven't defined consciousness and you haven't explained how [ we can know something that lives in a physical substrate that is unlike ours is conscious ].
My intent is not to say "I/we understand consciousness, therefore we can derive objectively sound-valid-and-therefore-true statements from theories with mentalistic atoms". The arguments I actually give for why it's true that we can derive objective abstract facts about the mental world, begin at "So why am I saying this premise is false?", and end at ". . . and agree that the results came out favoring one theory or another." If we can derive objectively true abstract statements about the mental world, the same way we can derive such statements about the physical world [e.g. "the force experienced by a moving charge in a magnetic field is orthogonal both to the direction of the field and to the direction of its motion"] this implies that we can understand consciousness well, whether or not we already do.
[ 4. ] there doesn't need to be [ some degree of objective truth as to what is valuable ]. You don't have to solve ethics to set policy.
My point, again, isn't that there needs to be, for whatever local practical purpose. My point is that there is.
I think, in retrospect, the view that abstract statements about shared non-reductionist reality can be objectively sound-valid-and-therefore-true follows pretty naturally from combining the common-on-LessWrong view that logical or abstract physical theories can yield sound-valid-and-therefore-true abstract conclusions about Reality, with the view, also common on LessWrong, that we make a lot of decisions by modeling other people as copies of ourselves, instead of as entities primarily obeying reductionist physics.
It's just that, despite the fact that all the pieces are there, it goes on being a not-obvious way to think, if for years and years you've heard about how we can only have objective theories if we can do experiments that are "in the territory" in the sense that they are outside of anyone's map. [ Contrast with celebrity examples of "shared thought experiments" from which many people drew similar conclusions because they took place in a shared map - Singer's Drowning Child, the Trolley Problem, Rawls's Veil of Ignorance, Zeno's story about Achilles and the Tortoise, Pascal's Wager, Newcomb's Problem, Parfit's Hitchhiker, the St. Petersburg paradox, etc. ]
? Yes, that is the bad post I am rebutting.
Recently, Raginrayguns and Philosophy Bear both [presumably] read "Cargo Cult Science" [not necessarily for the first time] on /r/slatestarcodex. I follow both of them, so I looked into it. And TIL that's where "cargo-culting" comes from. He doesn't say why it's wrong, he just waves his hands and says it doesn't work and it's silly. Well, now I feel silly. I've been cargo-culting "cargo-culting". I'm a logical decision theorist. Cargo cults work. If they work unreliably, so do reductionistic methods.
I once thought "slack mattered more than any outcome". But whose slack? It's wonderful for all humans to have more slack. But there's a huge game-theoretic difference between the species being wealthier, and thus wealthier per capita, and being wealthy/high-status/dominant/powerful relative to other people. The first is what I was getting at by "things orthogonal to the lineup"; the second is "the lineup". Trying to improve your position relative to copies of yourself in a way that is zero-sum is "the rat race", or "the Red Queen's race", where running will ~only ever keep you in the same place, and cause you and your mirror-selves to expend a lot of effort that is useless if you don't enjoy it.
[I think I enjoy any amount of "the rat race", which is part of why I find myself doing any of it, even though I can easily imagine tweaking my mind such that I stop doing it and thus exit an LDT negotiation equilibrium where I need to do it all the time. But I only like it so much, and only certain kinds.]
! I'm genuinely impressed if you wrote this post without having a mental frame for the concepts drawn from LDT.
LDT says that, for the purposes of making quasi-Kantian [not really Kantian but that's the closest thing I can gesture at OTOH that isn't just "read the Yudkowsky"] correct decisions, you have to treat the hostile telepaths as copies of yourself.
Indexical uncertainty, ie not knowing whether you're in Omega's simulation or the real world, means that, even if "I would never do that", if someone is "doing that" to me, in ways I can't ignore, I have to act as though I might ever be in a situation where I'm basically forced to "do that".
I can still preferentially withhold reward from copies of myself that are executing quasi-threats, though. And in fact this is correct because it minimizes quasi-threats in the mutual copies-of-myself negotiating equilibrium.
"Acquire the ability to coerce, rather than being coerced by, other agents in my environment", is not a solution to anything - because the quasi-Rawlsian [again, not really Rawlsian, but I don't have any better non-Yudkowsky reference points OTOH] perspective means that if you precommit to acquire power, you end up in expectation getting trodden on just as much as you trod on the other copies of you. So you're right back where you started.
Basically, you have to control things orthogonal to your position in the lineup, to robustly improve your algorithm for negotiating with others.
And I think "be willing to back deceptions" is in fact such a socially-orthogonal improvement.
I think this means that if you care both about (a) wholesomeness and (b) ending self-deception, it's helpful to give yourself full permission to lie as a temporary measure as needed. Creating space for yourself so you can (say) coherently build power such that it's safe for you to eventually be fully honest.
The first sentence here, I think, verbalizes something important.
The second [instrumental-power] is a bad justification, to the extent that we're talking about game-theoretic power [as opposed to power over reductionistic, non-mentalizing Nature]. LDT is about dealing with copies of myself. They'll all just do the same thing [lie for power] and create needless problems.
You do give a good justification that, I think, doesn't create any needless aggression between copies of oneself, and which I think suffices to justify "backing self-deception" as promising:
I mean something more wholehearted. If I self-deceive, it's because it's the best solution I have to some hostile telepath problem. If I don't have a better solution, then I want to keep deceiving myself. I don't just tolerate it. I actively want it there. I'll fight to keep it there! [...]
This works way better if I trust my occlumency skills here. If I don't feel like I have to reveal the self-deceptions I notice to others, and I trust that I can and will hide it from others if need be, then I'm still safe from hostile telepaths.
[emphases mine]
"I'm not going to draw first, but drawing second and shooting faster is what I'm all about" but for information theory.
Dath ilani are canonically 3 Earthling standard deviations smarter than Earthlings, partly because they have been deliberately optimizing their kids' genomes for hundreds of years.
A decision tree that's ostensibly both normative and exhaustive of the space at hand.
I don't know, I'm not familiar with the history; probably zero. It's a metaphor. The things the two scenarios are supposed to have in common are first-time-ness, danger, and technical difficulty. I point out in the post that the AGI scenario is actually irreducibly harder than first-time heavier-than-air flight: you can't safely directly simulate intelligent computations themselves for testing, because then you're just running the actual computation.
But as for the application of "green light" standards - the actual Wright brothers were only risking their own lives. Why should someone else need to judge their project for safety?
Changed to "RLHF as actually implemented." I'm aware of its theoretical origin story with Paul Christiano; I'm going a little "the purpose of a system is what it does".
Unlike with obvious epistemic predicates over some generality [ eg "Does it snow at 40 degrees N?", "Can birds heavier than 44lb fly?" - or even more generally the skills of predicting the weather and building flying machines ], to which [parts of] the answers can be usefully remembered as monolithic invariants, obvious deontic predicates over generalities [ eg "Should I keep trying when I am exhausted?", "Will it pay to fold under pressure?" - and the surrounding general skills ] don't have generalizable answers that are independent of one's acute strategic situation. I am not against trying to formulate invariant answers to these questions by spelling out every contingency; I am unsure whether LessWrong is the place, except when there's some motivating or illustrative question of fact that makes your advice falsifiable [ I think Eliezer's recent The Sun is big, but superintelligences will not spare Earth a little sunlight is a good example of this ].
Interested to hear how you would put this with "research" tabooed. Personally I don't care if it's research as long as it works.
I agree that the somatosensory cortex [in the case of arm movements, actually mostly the parietal cortex, but also somewhat the somatosensory] needs to be getting information from the motor cortex [actually mostly the DLPFC, but also somewhat the motor] about what to expect the arm to do!
This necessary predictive-processing "attenuate your sensory input!" feedback signal, could be framed as "A [C]", such that "weak A [C]" might start giving you hallucinations.
However, in order for the somatosensory cortex to notice a prediction error and start hallucinating, it has to be receiving a stronger [let's say "D"] signal, from the arm, signifying that the arm is moving, than the "weak A [C]" signal signifying that we moved the arm.
I don't think your theory predicts this or accounts for this anyhow?
My "Q/P" theory does.
[ "B" in your theory maps to my "quasi-volition", ie anterior cortex, or top-down cortical infrastructure.
Every other letter in your theory - the "A", "C", and "D" - all map to my "perception", ie posterior cortex, or bottom-up cortical infrastructure. ]
Thanks for clarifying. The point where I'm at now is, as I said in my previous comment,
if it's just signal strength, rather than signal speed, why bring "A" [cortico-cortical connections] into it, why not just have "B" ["quasi-volitional" connections] and "C" ["perceptual" connections]?
Thanks for being patient enough to go through and clarify your confusion!
About the pyramidal cells - I should have been more specific and said that the prefrontal cortex [as opposed to the primary motor cortex], AFAIK, does not have output pyramidal cells in Layer V. Those are Betz cells, and basically only the primary motor cortex has them, although Wikipedia (on Betz cells and pyramidal cells) tells me the PFC does have neurons that qualify as "pyramidal neurons" too; it looks like their role in processing is markedly different from that of the giant pyramidal neurons found in the primary motor cortex's Layer 5.
Pyramidal neurons in the prefrontal cortex are implicated in cognitive ability. In mammals, the complexity of pyramidal cells increases from posterior to anterior brain regions. The degree of complexity of pyramidal neurons is likely linked to the cognitive capabilities of different anthropoid species. Pyramidal cells within the prefrontal cortex appear to be responsible for processing input from the primary auditory cortex, primary somatosensory cortex, and primary visual cortex, all of which process sensory modalities.[21] These cells might also play a critical role in complex object recognition within the visual processing areas of the cortex.[3] Relative to other species, the larger cell size and complexity of pyramidal neurons, along with certain patterns of cellular organization and function, correlates with the evolution of human cognition. [22]
This is technically compatible with "pyramidal neurons" playing a role in schizophrenic hallucinations, but it's not clear how there's any correspondence with the A vs B [vs C] ratio concept.
Maybe I slightly misunderstood your original theory, if you are not trying to say here, in general, that there is an "action-origination" signal that originates from one region of cortex, and in schizophrenics the major data packet ["B"] and the warning signal ["C"] experience different delays getting to the target.
If you are postulating a priori that the brain has a capability to begin the "B" and "C" signals in different regions of cortex at the same time, then how can you propose to explain schizophrenia based on intra-brain communication delay differentials at all? And if it's just signal strength, rather than signal speed, why bring "A" into it, why not just have "B" and "C"?
My own view is: Antipsychotics block dopamine receptors. I think you're right that they reduce a ratio that's something like your B/A ratio. But I can't draw a simple wiring diagram about it based on any few tracts. I would call it a Q/P ratio - a ratio of "quasi-volition", based on dopamine signaling originating with the frontal cortex and basal ganglia, and perception, originating in the back half of the cortex and not relying on dopamine.
Illustration of how the Q/P idea is compatible with antipsychotics reducing something like a B/A ratio in the motor case: Antipsychotics cause "extrapyramidal" symptoms, which are stereotyped motions of the extremities caused by neurons external to the pyramidal [corticospinal] tract. As I understand it, this is because one effect of blocking dopamine receptors in the frontal lobe is to inhibit the activity of Betz cells.
No, when I say "in parallel", I'm not talking about two signals originating from different regions of cortex. I'm talking about two signals originating from the same region of cortex, at the time the decision is made - one of which [your "B" above] carries the information "move your arm"[/"subvocalize this sentence"] and the other of which [the right downward-pointing arrow in your diagram above, which you haven't named, and which I'll call "C"] carries the information "don't perceive an external agency moving your arm"[/"don't perceive an external agency subvocalizing this sentence"].
AFAICT, schizophrenic auditory hallucinations in general don't pass through the brainstem. Neither do the other schizophrenic "positive symptoms" of delusional and disordered cognition. So in order to actually explain schizophrenic symptoms and the meliorating effect of antipsychotics, "B" and "C" themselves have to be instantiated without reference to the brainstem.
With respect to auditory hallucinations, "B" and "C" should both originate further down the frontal cortex, in the DLPFC, where there are no pyramidal neurons, and "C" should terminate in the auditory-perceptual regions of the temporal lobe, not the brainstem.
If you can't come up with a reason we should assume the strength of the "B" signal [modeled as jointly originating with the "C" signal] here is varying, but the strength of the "C" signal [modeled as sometimes terminating in the auditory-perceptual regions of the temporal lobe] is not, I don't see what weight your theory can bear except in the special case of motor symptoms - not auditory-hallucination or cognitive symptoms.
Doesn't this all rely on the idea that
Command to move my arm
and
Signal that I expect my arm to move (thus suppressing the orienting / startle reaction which would otherwise occur when my arm motion is detected)
are sent through separate [if parallel] channels/pathways? What substantiates this?
Something else I later noticed should confuse me:
OK, now let’s look at the sentences in the above excerpt one-by-one:
For the first sentence (on genes): Consider a gene that says “Hey neurons! When in doubt, make more synapses! Grow more axons! Make bigger dendritic trees!” This gene would probably be protective against schizophrenia and a risk factor for autism, for reasons discussed just above. And vice-versa for the opposite kind of gene. Nice!
Why would your theory [which says that schizophrenia is about deficient connections] predict that a gene that predisposed toward a more fully-connected connectome, would protect against schizophrenia?
Great post!
I think schizophrenia is generally recognized as involving more deficiencies in local than long-distance cortex-to-cortex communication. I don't have any particular knockdown studies for this and your Google Scholar judo seems on a level with mine. But as far as I understand it, autism is the disorder associated with deficits in longer white matter tracts; I believe this is because axons grow too densely and mesh too tightly during fetal development, while in schizophrenia the opposite happens and you end up with fewer axons [also probably fewer neurons because fewer neural progenitor divisions but this wouldn't a priori affect the shape of the connectome].
I figure, when a neurotypical person is subvocalizing, there’s communication between the motor cortex parts that are issuing the subvocalization commands (assuming that that’s how subvocalization works, I dunno), and the sensory cortex parts that are detecting the subvocalization which is now happening. Basically, the sensory cortex has ample warning that the subvocalization is coming. It’s not surprised when it arrives.
But in schizophrenia, different parts of the cortex can’t reliably talk to each other. So maybe sometimes the sensory cortex detects that a subvocalization is now happening, but hadn’t gotten any signal in advance that this subvocalization was about to be produced endogenously, by a different part of the same cortex. So when it arrives, it’s a surprise, and thus is interpreted as exogenous, i.e. it feels like it’s coming from the outside.
I don't see how your theory would make this prediction. To me it seems like your theory predicts, if anything, weaker subvocalization in schizophrenics - not subvocalization that's more perceptible. The "it inhibits the warning shot but not the actual data packet" thing frankly feels like an epicycle.
I don't see how Dehaene clarifies anything or how your theory here relies on him.
I missed this being compiled and posted here when it came out! I typed up a summary [ of the Twitter thread ] and posted it to Substack. I'll post it here.
"It's easier to build foomy agent-type-things than nonfoomy ones. If you don't trust in the logical arguments for this [foomy agents are the computationally cheapest utility satisficers for most conceivable nontrivial local-utility-satisfaction tasks], the evidence for this is all around us, in the form of America-shaped-things, technology, and 'greed' having eaten the world despite not starting off very high-prevalence in humanity's cultural repertoire.
WITH THE TWIST that: while America-shaped-things, technology, and 'greed' have worked out great for us and work out great in textbook economics, textbook economics fails to account for the physical contingency of weaker economic participants [such as horses in 1920 and Native Americans in 1492] on the benevolence of stronger economic participants, who found their raw resources more valuable than their labor."
As I say on Substack, this post goes hard and now I think I have something better to link people to, who are genuinely not convinced yet that the alignment problem is hard, than List of Lethalities.
Thank you! Someone else noticed! For my part, I'll update this if I find anything.
As far as I can tell, density-dependent selection is an entirely different concept from what I'm trying to get at here, and operates entirely within the usual paradigm that says natural selection is always optimizing for relative fitness. Yes, the authors are trying to say, "We have to be careful to pay attention to the baseline that selection is relative to", but I think biologists were always implicitly careful to pay attention to varying baseline populations - which are usually understood to be species, and not ecological at all.
I'm trying to take a step back and say, look, we can't actually take it for granted that selection is optimizing for reproductive fitness relative to ANY baseline in the first place; it's an empirical observation external to the theory that "what reproduces, increases in prevalence" that evolution within sexual species seems to optimize for %-prevalence-within-the-species, rather than absolute size of descendants-cone.
Which is?
Have you read Byrnes on how we should expect certain emotional reactions to seemingly prohibitively complex stimuli, to be basically hardcoded?
I think I have learned a certain amount of my eye-contact-aversion, but I also suspect it is hardcodedly unusually difficult for me. When I was younger, my parents and teachers constantly harangued me to "get better at eye contact", and I really tried [I hated their haranguing!] but the eye contact itself was just too emotionally painful.
When I first went on Ritalin in November of 2020, I immediately noticed that it became much more possible to voluntarily choose to sustain eye contact with most people for ~3-10-second periods; I was excited and thought maybe I could make the whole aversion go away. But that didn't work either. It stayed effortful, although easier.
I still don't make much eye contact, and at this point in my life nobody ever remarks on it, and it causes no problem at all. I regularly meet people who are worse at it than I am, and some of them seem to have less of a general social handicap than I do!
Is Bostrom's original Simulation Hypothesis, the version involving ancestor-simulations, unconvincing to you? If you have decided to implement an epistemic exclusion in yourself with respect to the question of whether we are in a simulation, it is not my business to interfere with that. But we do, for predictive purposes, have to think about the fact that Bostrom's Simulation Hypothesis and other arguments in that vein will probably not be entirely unconvincing [by default] to any ASIs we build, given that they are not entirely unconvincing to the majority of the intelligent human population.
If a human being doesn't automatically qualify as a program to you, then we are having a much deeper disagreement than I anticipated. I doubt we can go any further until we reach agreement on whether all human beings are programs.
My attempt to answer the question you just restated anyway:
The idea is that you would figure out what the distant superintelligence wanted you to do the same way you would figure out what another human being who wasn't being verbally straight with you, wanted you to do: by picking up on its hints.
Of course this prototypically goes disastrously. Hence the vast cross-cultural literature warning against bargaining with demons and ~0 stories depicting it going well. So you should not actually do it.
How would you know that you were a program and Omega had a copy of you? If you knew that, how would you know that you weren't that copy?
Do you want to fully double-crux this? If so, do you one-box?
Not a woman, sadly.
I believe it, especially if one takes a view of "success" that's about popularity rather than fiat power.
But FYI to future advisors: the thing I would want to prospectively optimize for, along the gov path, when making this decision, is about fiat power. I'm highly uncertain about whether viable paths exist from a standing start to [benevolent] bureaucratic fiat power over AI governance, and if so, where those viable paths originate.
If it was just about reach, I'd probably look for a columnist position instead.
In what sense do you consider the mechinterp paradigm that originated with Olah, to be working?
"Endpoints are easier to predict than trajectories"; eventual singularity is such an endpoint; on our current trajectory, the person who is going to do it does not necessarily know they are going to do it until it is done.
Tweet link removed.
[ Sorry about the wrecked formatting in this comment, I'm on mobile and may come back and fix it later ]
They call it "burning" calories because it's oxidation. Like fire. More oxygen should help. Less oxygen should hurt.*
At least, if you buy CICO and correspondingly think that the quantity of food oxidized versus the quantity of fat oxidized is almost all that matters, metabolically, and know that, medically, according to all CICO-compatible convention, the quantity of food oxidized is almost quota-ed at the level of food intake, while the quantity of fat oxidized is not. [ Hence why "food intake" is not considered a dependent variable in the CICO equation [ daily weight delta = 3600*[ food intake ] - [ RMR + exercise ] ] ].
Yet you, Scott, SMTM, and several others I've spoken with, who otherwise vastly disagree on obesity science [but most of whom say CICO is "basically true" or "trivially true"] independently think the "low O2 mediates the altitude effect" idea is plausible.
I even independently generated it myself, once, in 2021, back when I was still a CICO believer, before realizing 2 years later that, according to CICO, "low O2 results in weight loss" doesn't actually make any physiological sense.
I think people intuitively feel like it makes sense because marginally suffocating feels bad, and most other things that make you lose weight according to CICO [caloric restriction, forcing yourself to exercise, wearing fewer clothes so you shiver a bit in the cold] feel bad.
But many things that feel bad, don't make you lose weight. Like back problems. Or cancer. Or gaining weight.
And drinking unflavored olive oil in the middle of a fast period [ https://www.lesswrong.com/posts/BD4oExxQguTgpESd ] makes me lose more weight than anything else I've tried, and it doesn't feel bad at all. Keto also works for many people, who say it doesn't feel bad for them.
My intention in pointing these things out is not to infantilize or condemn you or anyone else who's had the "low O2 mediates the altitude effect" idea and accepted it without noticing that it went against CICO.
My intention is to help create common knowledge of just how fucked the discourse around this topic, and the CICO paradigm specifically, is proven to be by the fact that people keep coming up with that hypothesis and uncritically running with it.
*It's true that it's also "burning" calories when it's not fat calories but the calories in your food - such that less oxygen could conceivably hurt the process of gaining weight, like less food could.

But the conventional wisdom is that the body treats eaten food as an ~absolute lower bar for its "energy" intake quota - ie acts as though it should always oxidize all eaten food and turn it into glycogen, fat, ATP, or heat, no matter how inefficient this is - while the amount of oxidation the body does per hour at rest actually is a dependent variable that could conceivably vary closely with the amount of O2 in the air. Medically, it would fly in the face of a lot if people were actually doing less oxidation of eaten food, rather than just using more oxygen per food molecule, under hypoxic conditions.

Conceivably, this decrease in metabolic efficiency, incurred to meet the oxidation quota, correspondingly slows fat gain - and I think it maaaaaybe does, and I think this is an actually plausible mechanism here. That's something that also, empirically, happens in severe caloric restriction - but importantly for CICO, it's not something that happens, at all, in moderate caloric restriction, of the "cut 200 calories per day" stripe that CICOists suggest to lose weight. And moderate caloric restriction is also sometimes effective, simply by reducing the quantity of "energy" intake, just like CICOists say it should be.

My point is that CICOists' favored method of "reduce the quantity of 'energy' intake" can hardly medically be what chronic hypoxia is doing - and in fact CICO would lead us to expect the opposite effect, of less body fat being ~passively oxidized at rest, because the body just cannot do as much metabolism per hour [as opposed to per calorie eaten] when there is less O2 to work with.
After re-looking at the graph because of this post, I'm surprised by how exactly overweight does correspond with low altitude, and not anything about water tables, as I originally thought. And I do find "hypoxia makes food oxidation [and baseline-necessary homeostatic oxidation] less efficient, in the same way severe CR does" plausible as a mechanism. It's making me question my initial conviction that the Thing Causing Lipostat Damage Since 1900 necessarily had to be some kind of 1900-era waterborne endocrine disruptor like a heavy metal, and putting other, weirder stuff, like food-borne toxins and viruses, back on the table.
But from the perspective of someone who's seen the old rat studies [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1225664/] and the old nutrition tables [ https://ageconsearch.umn.edu/nanna/record/136596/files/fris-1961-02-02-438.pdf?withWatermark=0&withMetadata=0&version=1&registerDownload=1, https://tinyurl.com/jnzxkk5z, https://ageconsearch.umn.edu/nanna/record/234624/files/frvol25i3a.pdf?withWatermark=0&withMetadata=0&version=1&registerDownload=1 ] and knows that calories don't explain it, "The SMTM overweight-altitude pattern is in fact downstream of relative hypoxia" deconfuses somewhat about the SMTM overweight-altitude pattern, but doesn't deconfuse at all about why we've all been getting fatter since 1900 in the first place. From the perspective of someone who knows calories can't explain it, the relatively lower rate of obesity in China [even if, surprisingly-to-me, only by a few percentage points - around 35% for the US vs around 31% for China, according to the first Google result I saw], which stands at an overall much lower elevation than the US [especially its populated areas], looks more potentially fruitful as an area of investigation. And . . . it looks like in China, there's a regional gradient in obesity [higher in the north] that seems obviously not to be tracking altitude at all. And what about Korea, which is basically at sea level? They sandbag and say 40% obesity in self-reporting [to alarm the locals?], but when measured the same way as the US [ https://www.koreaherald.com/view.php?ud=20230425000613 ], their "obesity" rate is around 6% compared to the US's 35%-40%. It makes me suspect the China thing is a reporting issue, too. Altitude clearly isn't most of the inter-regional variance worldwide.
This changed my mind on whether lithium was at all plausible. I had no idea about the youth-faster-weight-gain thing.
The one thing you don't seem to have written about is the possibility that people's lipostats might be getting broken primarily during fetal development, while neuronal proliferation is happening, the lowermost layers of the brain are getting wired, hormones [and correspondingly, endocrine disruptors] have an outsized influence, and most other neurological [/neuroendocrine] disorders are contracted.
I think some toxin is probably doing this, and I think the overweight epidemic started picking up [around ~1910] in the US too early for it to have been microplastics [?] or another newfangled endocrine disruptor [?].
But I now see that the cause being relative levels of elemental lithium, even during fetal development, wouldn't make any sense.
Yooo this is sick! Thank you!
Thanks for the encouraging feedback!
It is true that in future posts I should account for availability of calories over time, and physical activity over time.
Possibly I would get a better reception if I waded into all the sub-possibilities for what could be causing the increase in self-reported queerness, but that issue is so political that I doubt more positive reception from the audience would correspond to more accurate Bayesian updates from the audience. As it is, I feel "you can lead a LessWronger to a hypothesis, but you can't make them subordinate their political arguments-are-soldiers brain to their adult brain".
"AI alignment is not in the category 'alarmingly impossible problems for the time we have left'" is certainly a position many people hold. I am doing my best to make them correct. Alas, going along with their fantasy world where it's already true, will not help make it true.