In some but not all imaginable Truly Stochastic worlds, perhaps it's like the probability distribution of the whole state of the universe, but OP's intuition-pumping example seems to be imagining a case where A is some small bit of the universe.
Oops, I guess I missed this part when reading your comment. No, I meant for A to refer to the whole configuration of the universe.
The issue with this idea is that it seems pretty much impossible
I think your position here is approximately optimal within the framework of consequentialism.
It's just that I worry that consequentialism itself is the reason we have problems like AI x-risk, in the sense that the thing that drives x-risk scenarios may be the theory of agency that is shared with consequentialism.
I've been working on a post (actually, I'm going to temporarily add you as a coauthor, so you can see the draft and add comments if you're interested) where I discuss the flaws and how I think one should approach it differently. One of the major inspirations is Against responsibility, but I've sort of taken inspiration from multiple places, including critics of EA and critics of economics.
The ideal of consequentialism is essentially flawless; it's when you hand it to sex-obsessed murder monkeys as an excuse to do things that shit hits the fan.
I've come to think that isn't actually the case. E.g. while I disagree with Being nicer than clippy, it quite precisely nails how consequentialism isn't essentially flawless:
Now, of course, utilitarianism-in-theory was never, erm, actually very tolerant. Utilitarianism is actually kinda pissed about all these hobbies. For example: did you notice the way they aren't hedonium? Seriously tragic. And even setting aside the not-hedonium problem (it applies to all-the-things), I checked Jim's pleasure levels for the trashy TV, and they're way lower than if he got into Mozart; Mary's stamp-collecting is actually a bit obsessive and out-of-balance; and Mormonism seems too confident about the optimal amount of coffee. Oh noes! Can we optimize these backyards somehow? And Yudkowsky's paradigm misaligned AIs are thinking along the same lines – and they've got the nanobots to make it happen.
Unbounded utility maximization aspires to optimize the entire world. This is pretty funky for just about any optimization criterion people can come up with, even if people are perfectly flawless in how well they follow it. There have been a bunch of attempts to patch this, but none have really worked so far, and it doesn't seem like any ever will.
Upvoted, but I think I disagree on a tangent.
The consequences of someone's actions are nonetheless partial evidence of their morality. If you discover that embezzled funds have been building up in Bob's bank account, that's evidence Bob is an unethical guy; most people who embezzle funds are unethical. But then you might discover that, before he was caught and the money confiscated, Bob was embezzling funds to build an orphanage. The consequences haven't changed, but Bob's final (unrealized) intentions are attenuating circumstances. If I had to hang around with either your typical fund-embezzler or Bob, I would pick Bob.
An orphanage is sort of a funky example, because I don't intuitively associate it with cost-effectiveness, but I don't know much about it. If it's not cost-effective to build an orphanage, then what logic does Bob see in it? Under ordinary circumstances, I associate non-cost-effective charity with just doing what you've cached as good without thinking too much about it, but embezzlement doesn't sound like something you'd cache as good, so that doesn't sound likely. Maybe he's trying to do charity to build reputation that he can leverage into other stuff?
Anyway, if I don't fight the hypothetical, and assume Bob's embezzling for an orphanage was cost-effective, then that's evidence that he's engaging in fully unbounded consequentialism, aspiring to do the globally utility-maximizing action regardless of his personal responsibilities, his attention levels and his comparative advantages.
This allows you to predict that in the future, he might do similar things: e.g. secretly charge ahead with creating an AI that takes over the world 0.1% more quickly and 0.1% more safely than its competitors, even if there's a 99.8% chance everyone dies, in order to capture the extra utility in that sliver he gains. Or that he might suppress allegations of rape within his circles if he fears the drama will push his group off track from saving the world.
If, on the other hand, someone was embezzling funds to spend on parties for himself and his friends, then while that's still criminal, it's a much more limited form of criminality, where he still wouldn't want to be part of the team that destroys the world, and wouldn't want to protect rapists. (I mean, he might still want to protect rapists if he's closer friends with the person who is raping than with the victims, but the point is he's trying to help at least some of the people around himself.)
Honestly the one who embezzles funds for unbounded consequentialist purposes sounds much more intellectually interesting, and so I would probably still prefer to hang around him, but the one who embezzles funds for parties seems much safer, and so I think a moral principle along the lines of "unbounded consequentialists are especially evil and must be suppressed" makes sense. You know, the whole thing where we understand that "the ends justify the means" is a villainous thing to say.
I think this is actually pretty cruxy for consequentialism. Of course, you can try to patch consequentialism in various ways, but these problems show up all over the place and are subject to a lot of optimization pressure because resources are useful for many things, so one needs a really robust solution in order for it to be viable. I think the solution lies in recognizing that healthy systems follow a different kind of agency that doesn't aspire to have unbounded impact, and consequentialists need to develop a proper model of that to have a chance.
Measure theory and probability theory were developed to describe stochasticity and uncertainty, but they formalize it in many-worlds terms, closely analogous to how the wavefunction is formalized in quantum mechanics. If one takes the wavefunction formalism literally, to the point of believing that quantum mechanics must have many worlds, it seems natural to take the probability distribution formalism equally literally, to the point of believing that probability must have many worlds too. Or well, you can have a hidden-variables theory of probability too, but the point is that it seems like you would have to abandon True Stochasticity.
True Stochasticity vs probability distributions provides a non-quantum example of the non-native embedding, so if you accept the existence of True Stochasticity as distinct from many worlds of simultaneous possibility or ignorance of hidden variables, then that provides a way to understand my objection. Otherwise, I don't yet know a way to explain it, and am not sure one exists.
As for the case of how a new branch of math could describe wavefunctions more natively, there's a tradeoff where you can put in a ton of work and philosophy to make a field of math that describes an object completely natively, but it doesn't actually help the day-to-day work of a mathematician, and it often restricts the tools you can work with (e.g. no excluded middle and no axiom of choice), so people usually don't. Instead they develop their branch of math within classical math with some informal shortcuts.
Okay, so by "wavefunction as a classical mathematical object" you mean a vector in Hilbert space?
Yes.
In that case, what do you mean by the adjective "classical"?
There are a lot of variants of math: e.g. homotopy type theory, abstract stone duality, nonstandard analysis, etc. Maybe one could make up a variant of math that could embed wavefunctions more natively.
Hi? (Edit: the parent comment originally just had a single word saying "Test".)
Do you actually need any other reason to not believe in True Randomness?
I think I used to accept this argument, but then came to believe that simplicity of formalisms usually originates from renormalization more than from the simplicity being Literally True?
As a matter of fact, it is modeled this way. To define a probability function you need a sample space, from which exactly one outcome is "sampled" in every iteration of the probability experiment.
No, that's for random variables, but in order to have random variables you first need a probability distribution over the outcome space.
And this is why I have trouble with the idea of "true randomness" being philosophically coherent. If there is no mathematical way to describe it, in what way can we say that it's coherent?
You could use a mathematical formalism that contains True Randomness, but 1. such formalisms are unwieldy, 2. that's just passing the buck to the one who interprets the formalism.
The wavefunction in quantum mechanics is not like the probability distribution of (say) where a dart lands when you throw it at a dartboard. (In some but not all imaginable Truly Stochastic worlds, perhaps it's like the probability distribution of the whole state of the universe, but OP's intuition-pumping example seems to be imagining a case where A is some small bit of the universe.)
The reason why it's not like that is that the laws describing the evolution of the system explicitly refer to what's in the wavefunction. We don't have any way to understand and describe what a quantum universe does other than in terms of the evolution of the wavefunction or something basically equivalent thereto.
In my view, the big similarity is in the principle of superposition. The evolution of the system in a sense may depend on the wavefunction, but it is an extremely rigid sense, which requires the evolution to be invariant under chopping up a superposition into a bunch of independent pieces, or chopping up a simple state into an extremely pathological superposition.
I have the impression (which may well be very unfair) that at some early stage OP imbibed the idea that what "quantum" fundamentally means is something very like "random", so that a system that's deterministic is ipso facto less "quantum" than a system that's stochastic. But that seems wrong to me. We don't presently have any way to distinguish random from deterministic versions of quantum physics; randomness or something very like it shows up in our experience of quantum phenomena, but the fact that a many-worlds interpretation is workable at all means that that doesn't tell us much about whether randomness is essential to quantumness.
It's worth emphasizing that the OP isn't really how I originally thought of QM. One of my earliest memories was of my dad explaining quantum collapse to me, and me reinventing decoherence by asking why it couldn't just be that you got entangled with the thing you were observing. It's only now, years later, that I've come to take issue with QM.
In my mind, there's four things that strongly distinguish QM systems from ordinary stochastic systems:
- Destructive interference
- Principle of least action (you could in principle have this and the next in deterministic/stochastic systems, but it doesn't fall out of the structure of the ontology as easily, without additional laws)
- Preservation of information (though of course, since the universe is actually quantum, this means the universe doesn't resemble a deterministic or stochastic system at the large scale, because we have thermodynamics, and neither deterministic nor stochastic systems need thermodynamics)
- Pauli exclusion principle (technically you could have this in a stochastic system too, but it feels quantum-mechanical because it can be derived from fermion products being antisymmetric, and antisymmetry only makes sense in quantum systems)
Almost certainly this list isn't complete, since I'm mostly an autodidact (got taught a bit by my dad, read standard rationalist intros to quantum like The Sequences and Scott Aaronson, took a mathematical physics course, coded a few qubit simulations, and binged some Wikipedia and YouTube). Of these, only destructive interference really seems like an obstacle, and only a mild one.
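To see the destructive-interference point concretely, here is a toy calculation (my own illustration, not anything from the comments above): amplitudes can cancel, whereas probability mass can only spread.

```python
import numpy as np

# Hadamard gate: the amplitude-level analogue of a 50/50 "mixing" step.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

# Applying it twice returns |0> exactly: the |1> amplitudes cancel out.
amp = H @ (H @ np.array([1.0, 0.0]))

# Stochastic analogue: a 50/50 mixing matrix applied twice. Nothing ever
# cancels; the distribution just stays spread out.
M = np.array([[0.5, 0.5], [0.5, 0.5]])
prob = M @ (M @ np.array([1.0, 0.0]))
```

The quantum system ends back at `[1, 0]` while the stochastic one stays at `[0.5, 0.5]`, which is exactly the behavior a diffusion-like process can never reproduce.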
(And, incidentally, if we had a model of Truly Stochastic physics in which the evolution of the system is driven by what's inside those probability distributions - why, then, I would rather like the idea of claiming that the probability distributions are what's real, rather than just their outcomes.)
I would say this is cruxy for me, in the sense that if I didn't believe Truly Stochastic systems were ontologically fine, then I would take similar issue with Truly Quantum systems.
In the absence of a measurement/collapse postulate, quantum mechanics is a deterministic theory
You can make a deterministic theory of stochasticity using many-worlds too.
In the absence of a postulate that the wavefunction is Literally The Underlying State, rather than just a way we describe the system deterministically, quantum dynamics doesn't fit under a deterministic ontology.
Also, what do you mean by "the wavefunction as a classical mathematical object"?
If you have some basis B, you can represent quantum systems using functions B → ℂ (or perhaps more naturally, as F(B), where F denotes the free complex vector space functor, but then we get into category theory, and that's a giant nerd-snipe).
For any well-controlled isolated system, if it starts in a state |Ψ⟩, then at a later time it will be in state U|Ψ⟩, where U is a certain deterministic unitary operator. So far this is indisputable: you can do quantum state tomography, you can measure the interference effects, etc. Right?
It will certainly be mathematically welldescribed by an expression like that. But when you flip a coin without looking at it, it will also be welldescribed by a probability distribution 0.5 H + 0.5 T, and this doesn't mean that we insist that after the flip, the coin is Really In That Distribution.
Now it's true that in quantum systems, you can measure a bunch of additional properties that allow you to rule out alternative models. But my OP is more claiming that the wavefunction is a model of the universe, and the actual universe is presumably the disquotation of this, so by construction the wavefunction acts identically to how I'm claiming the universe acts, and therefore these measurements wouldn't be ruling out that the universe works that way.
Or as a thought experiment: say you're considering a simple quantum system with a handful of qubits. It can be described with a wavefunction that assigns each combination of qubit values a complex number. Now say you code up a classical computer to run a quantum simulator, which you do by using a hash map to connect the qubit combos to their amplitudes. The quantum simulator runs in our quantum universe.
Now here's the question: what happens if you have a superposition in the original quantum system? It turns into a tensor product in the universe the simulator runs in, because the quantum simulator represents each branch of the wavefunction separately.
This phenomenon, where a superposition within the system gets represented by a product outside of the system, is basically a consequence of modelling the system using wavefunctions. Contrast this to if you were just running a quantum computer with a bunch of qubits, so the superposition in the internal system would map to a superposition in the external system.
I claim that this extra product comes from modelling the system as a wavefunction, and that much of the "many worlds" aspect of the manyworlds interpretation arises from this (since products represent things that both occur, whereas things in superposition are represented with just sums).
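A minimal sketch of the hash-map simulator described above (toy illustration; the `hadamard` helper and its conventions are made up for this example):

```python
import math

# Wavefunction as a hash map from classical bit strings to amplitudes.
def hadamard(state, i):
    """Apply a Hadamard gate to qubit i of a dict-based state."""
    s = 1 / math.sqrt(2)
    new = {}
    for bits, amp in state.items():
        b0 = bits[:i] + "0" + bits[i + 1:]
        b1 = bits[:i] + "1" + bits[i + 1:]
        sign = -1.0 if bits[i] == "1" else 1.0
        new[b0] = new.get(b0, 0.0) + s * amp
        new[b1] = new.get(b1, 0.0) + sign * s * amp
    return new

state = {"00": 1.0}           # the simulated system starts in |00>
state = hadamard(state, 0)    # internal superposition: (|00> + |10>)/sqrt(2)
```

After the gate, the simulator stores both branches as separate, co-existing entries in classical memory, which is the "superposition inside maps to a product outside" phenomenon being described.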
OK, so then you say: “Well, a very big well-controlled isolated system could be a box with my friend Harry and his cat in it, and if the same principle holds, then there will be deterministic unitary evolution from |Ψ⟩ into U|Ψ⟩, and hey, I just did the math and it turns out that U|Ψ⟩ will have a 50/50 mix of ‘Harry sees his cat alive’ and ‘Harry sees his cat dead and is sad’.” This is beyond what’s possible to directly experimentally verify, but I think it should be a very strong presumption by extrapolating from the first paragraph. (As you say, “quantum computers prove larger and larger superpositions to be stable”.)
Yes, if you assume the wavefunction is the actual state of the system, rather than a deterministic model of the system, then it automatically follows that something-like-many-worlds must be true.
…And then there’s an indexicality issue, and you need another axiom to resolve it. For example: “as quantum amplitude of a piece of the wavefunction goes to zero, the probability that I will ‘find myself’ in that piece also goes to zero” is one such axiom, and equivalent (it turns out) to the Born rule. It’s another axiom for sure; I just like that particular formulation because it “feels more natural” or something.
Huh, I didn't know this was equivalent to the Born rule. It does feel pretty natural; do you have a reference to the proof?
I’m really unsympathetic to the second bullet-point attitude, but I don’t think I’ve ever successfully talked somebody out of it, so evidently it’s a pretty deep gap, or at any rate I for one am apparently unable to communicate past it.
I agree with the former bullet point rather than the latter.
FWIW last I heard, nobody has constructed a pilot-wave theory that agrees with quantum field theory (QFT) in general and the standard model of particle physics in particular. The tricky part is that in QFT there’s observable interference between states that have different numbers of particles in them, e.g. a virtual electron can appear then disappear in one branch but not appear at all in another, and those branches have easily-observable interference in collision cross-sections etc. That messes with the pilot-wave formalism, I think.
Someone in the comments of the last thread claimed maybe some people have found out how to generalize pilot-wave theory to QFT. But I'm not overly attached to that claim; pilot-wave theory is obviously directionally incorrect with respect to the ontology of the universe, and even if it can be forced to work with QFT, I can definitely see how it is in tension with it.
I guess it's hard to answer because it depends on three degrees of freedom:
- Whether you agree with my assessment that it's mostly arbitrary to demand the fundamental ontology be deterministic rather than stochastic or quantum,
- Whether you count "many worlds" as literally asserting that the wavefunction as a classical mathematical object is real, or as simply distancing oneself from collapse/hidden variables,
- Whether you even aim to describe what is ontologically fundamental in the first place.
I'm personally inclined to say the manyworlds interpretation is technically wrong, hence the title. But I have basically suggested people could give different answers to these sorts of degrees of freedom, and so I could see other people having different takeaways.
The observer is highly sensitive to differences along a specific basis, and therefore changes a lot in response to that basis. Due to chaos, this then leads to everything else on earth getting entangled with the observer in that same basis, implying earth-wide decoherence.
This is just chaos theory, isn't it? If one person sees that Schrodinger's cat is dead, then they're going to change their future behavior, which changes the behavior of everyone they interact with, and this then butterflies up to entangle the entire earth in the same superposition.
Uncharitable punchline is "if you take pilot wave but keep track of every possible position that any particle could have been (and ignore where they actually were in the actual experiment) then you get many worlds." Seems like a dumb thing to do to me.
How would you formalize pilot wave theory without keeping "track of every possible position that any particle could have been" (which I assume refers to, not throwing away the wavefunction)?
We'd still expect strongly interacting systems e.g. the earth (and really, the solar system?) to have an objective splitting. But it seems correct to say that I basically don't know how far that extends.
Let's say you have some unitary transformation U. If you were to apply it to a coherent superposition |a⟩ + |b⟩, it seems like it would pretty much always make you end up with a decoherent superposition. So it doesn't seem like there's anything left to explain.
Kind of, because "multiple future outcomes are possible, rather than one inevitable outcome" could sort of be said to apply to both true stochasticity and true quantum mechanics. With true stochasticity, it has to evolve by a diffusion-like process with no destructive interference, whereas for true quantum mechanics, it has to evolve by a unitary-like process with no information loss.
So to a mind that can comprehend probability distributions, but intuitively thinks they always describe hidden variables or frequencies or whatever, how does one express true stochasticity: the notion where a whole probability distribution of future outcomes is possible (even if one knew all the information that currently exists), but only one of them happens?
Before I answer that question: do you know what I mean by a truly stochastic universe? If so, how would you explain the concept of true ontologically fundamental stochasticity to a mind that does not know what it means?
But |cat alive⟩ + |cat dead⟩ is a natural basis because that's the basis in which the interaction occurs. No mystery there; you can't perceive something without interacting with it, and an interaction is likely to have some sort of privileged basis.
Gonna post a top-level post about it once it's made it through editing, but basically the wavefunction is a way to embed a quantum system in a deterministic system, very closely analogous to how a probability function allows you to embed a stochastic system into a deterministic system. So just like how taking the math literally for QM means believing that you live in a multiverse, taking the math literally for probability also means believing that you live in a multiverse. But it seems philosophically coherent to me to believe that we live in a truly stochastic universe rather than just a deterministic probability multiverse, so it also feels like it should be philosophically coherent that we live in a truly quantum universe.
I'm confused about what you're saying. In particular while I know what "decoherence" means, it sounds like you are talking about some special formal thing when you say "decoherent branches".
Let's consider the case of Schrodinger's cat. Surely the math itself says that when you open the box, you end up in a superposition |see the cat alive⟩ + |see the cat dead⟩.
Or from a comp sci PoV, I imagine having some initial bit sequence, |0101010001100010⟩, and then applying a Hadamard gate to end up with a superposition (sqrt(1/2)|0⟩ + sqrt(1/2)|1⟩) ⊗ |101010001100010⟩. Next I imagine a bunch of CNOTs that mix together this bit in superposition with the other bits, making the superpositions very distant from each other and therefore unlikely to interact.
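The Hadamard-then-CNOT picture can be written out with an explicit two-qubit state vector (a toy sketch of the picture above; the qubit-ordering convention is mine):

```python
import numpy as np

# Gates in the basis |q1 q0>: index 0 -> |00>, 1 -> |01>, 2 -> |10>, 3 -> |11>.
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
I = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])  # control = qubit 1, target = qubit 0

psi = np.zeros(4)
psi[0] = 1.0                 # start in |00>
psi = np.kron(H, I) @ psi    # Hadamard on qubit 1: (|00> + |10>)/sqrt(2)
psi = CNOT @ psi             # CNOT spreads the superposition: (|00> + |11>)/sqrt(2)
```

The two branches now differ in both qubits; further CNOTs onto more qubits would make them increasingly distant in the sense described above.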
What are you saying goes wrong in these pictures?
I'm confused about what distinction you are talking about, possibly because I haven't read Everett's original proposal.
The multiverse interpretation takes the wavefunction literally and says that since the math describes a multiverse, there is a multiverse.
YMMV about how literally you take the math. I've come to have a technical objection to it such that I'd be inclined to say that the multiverse theory is wrong, but also it is very technical and I think a substantial fraction of multiverse theorists would say "yeah that's what I meant" or "I suppose that's plausible too".
But "take the math literally" sure seems like good reason/evidence.
And when it comes to pilot wave theory, its math also postulates a wavefunction, so if you take the math literally for pilot wave theory, you get the Everettian multiverse; you just additionally declare one of the branches Real in a vague sense.
Ahh, now I've got it:
First, each morphism f : A → 0 induces a unique morphism f ∘ pr₁ : A × 0 → 0. Proof: suppose f ∘ pr₁ = g ∘ pr₁. Then we have f = f ∘ pr₁ ∘ (id, f) = g ∘ pr₁ ∘ (id, f) = g.
Corollary: if you have exponential objects, then if you have any f : A → 0, then A ≅ 0, because there's only one morphism 0 → 0^A.
But, if you have coexponential objects, any hom set A → B can instead be expressed as a hom set A − B → 0. This shows A − B ≅ 0, and also that all homs are equal.
I think exponentials and coexponentials are relevant here, since they are good at shuffling things back and forth between the sides of morphisms, which matters for limits and colimits as they are adjunctions (and a particularly nice kind of adjunction, at that).
I can't remember the entire proof, and maybe I misstated it, but IIRC part of the logic goes as follows:
With exponentials, you can prove that 0 × A ≅ 0, because any morphism 0 × A → B curries into 0 → B^A, of which, by the universal property of 0, there's only one.
Similarly, with coexponentials, you can prove that 1 + A ≅ 1, because any morphism A → B + 1 cocurries into A − B → 1, of which there is only one.
So this at least proves that all the objects built out of 0, 1, × and + are trivial. I think there was something funky where you made use of the fact that A → B can be expressed as 1 − (B^A) → 0 and 1 → 0^(A − B) to further prove all morphisms trivial, but I can't remember it exactly.
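Restating the two (co)currying steps above in symbols (same notation as the parent comment; the compression into two lines is mine):

```latex
\operatorname{Hom}(0 \times A,\, B) \;\cong\; \operatorname{Hom}(0,\, B^A) \;\cong\; \mathbf{1},
\qquad
\operatorname{Hom}(A,\, B + 1) \;\cong\; \operatorname{Hom}(A - B,\, 1) \;\cong\; \mathbf{1}.
```

Since these hold naturally for every B (resp. every A), the Yoneda lemma gives 0 × A ≅ 0 from the first line and B + 1 ≅ 1 from the second.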
I agree that Scott Alexander's position is that it's not selfevidently good for the truth about his own views to be known. I'm just saying there's a bunch of times he's alluded to or outright endorsed it being selfevidently good for the truth to be known in general, in order to defend himself when criticized for being interested in the truth about taboo topics.
Speaking for myself: I don't prefer to be alone or tend to hide information about myself. Quite the opposite; I like to have company but rare is the company that likes to have me, and I like sharing, though it's rare that someone cares to hear it.
Sounds like you aren't avoidant, since introversion-related items tend to be the ones most highly endorsed by the avoidant profile.
Now if I were in Scott's position? I find social media enemies terrifying and would want to hide as much as possible from them. And Scott's desire for his name not to be broadcast? He's explained it as related to his profession, and I don't see why I should disbelieve that. Yet Scott also schedules regular meetups where strangers can come, which doesn't sound "avoidant". More broadly, labeling famous-ish people who talk frequently online as "avoidant" doesn't sound right.
Scott Alexander's MBTI type is INTJ. The INT part is all aligned with avoidant, so I still say he's avoidant. Do you think all the meetups and such mean that he's really ENTJ?
As for wanting to hide from social media enemies, I'd speculate that this causally contributes to avoidant personality.
Also, "schizoid" as in schizophrenia? By reputation, rationalists are more likely to be autistic, which tends not to co-occur with schizophrenia, and the ACX survey is correlated with this reputation. (Could say more but I think this suffices.)
Schizoid as in schizoid.
Actually one does need to read The Bell Curve to know what's in it. There's a lot of slander going around about it.
Do you know programming? A coexponential A − B is intuitively, roughly speaking, an A together with a return position where you can place a B. It's how function calls are implemented in computers, as morphisms A − B → 0, corresponding to the fact that you have the parameters (the A) and the call stack (the return position for the B).
(More formally: given a coproduct (~disjoint union) B + C, a coexponential A − B is defined based on Hom(A − B, C) being equivalent to Hom(A, B + C).)
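In programming terms, the "parameters plus a return position" reading can be illustrated with continuation-passing style, where invoking a function of type A → B amounts to having an A plus a slot to put a B in (a loose illustration of the intuition, not a formal categorical model; the helper name is made up):

```python
# The "return position" is modelled as a callback -- the slot on the
# call stack where the B result will be placed.
def call_with_return_position(f, a, return_position):
    return_position(f(a))

results = []
call_with_return_position(lambda x: x + 1, 41, results.append)
# results now holds the returned B (here, 42)
```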
I wonder if it is related to exponentials vs coexponentials, since categories with both exponentials and coexponentials are posetal. I don't have any particular argument for how that'd work, though.
Trivially, a coX in C^op is the same but flipped as an X in C.
Then as you know, Stone duality says that CABA = Set^op.
So a coX in CABA is the same but flipped as an X in Set.
(I think it works constructively too if one replaces Boolean with Heyting?)
A notable counterexample is FinVect, which has an equivalence of categories FinVect → FinVect^op.
Are you familiar with the category of complete atomic boolean algebras?
For agents, the "largescale property" of interest is maximizing utility over some stuff "far away"  e.g. far in the future, for the examples in this post.
One consideration that coherence theorems often seem to lack:
It seems to me that often, optimizers establish a boundary and do most of their optimization within that boundary. E.g. animals have a skin that they maintain homeostasis under, companies have offices and factories where they perform their work, states have borders and people have homes.
These don't entirely dodge coherence theorems; typically a substantial part of the point of these boundaries is to optimize some other thing in the future. But they do set something up, I feel.
The unrolling of the episode is still very cheap. It's a lot cheaper to unroll a DreamerV3 for 16 steps than it is to go out into the world, run a robot on a real-world task for 16 steps, and try to get the NN to propagate updated value estimates the entire way...
But I'm not advocating against MBRL, so this isn't the relevant counterfactual. A pure MBRL-based approach would update the value function to match the rollouts, but e.g. DreamerV3 also uses the value function in a Bellman-like manner, e.g. to impute the future reward at the end of an episode. This allows it to plan further than the 16 steps it rolls out, but it would be computationally intractable to roll out as far as this ends up planning.
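The imputation step being described can be sketched as follows (a toy illustration of the general bootstrapping pattern, not DreamerV3's actual λ-return computation):

```python
def bootstrapped_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns for a short imagined rollout, with the value
    function's estimate at the horizon standing in for all later reward."""
    returns = []
    future = bootstrap_value
    for r in reversed(rewards):
        future = r + gamma * future
        returns.append(future)
    return list(reversed(returns))

# A 16-step imagined rollout with no in-rollout reward: every target is
# driven purely by the value estimate at the 16-step horizon.
targets = bootstrapped_returns([0.0] * 16, bootstrap_value=1.0)
```

Because the horizon value is itself a learned estimate of everything beyond step 16, the effective planning horizon is much longer than the rollout itself.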
If the environment is difficult, a tree search with a very small planning budget, like just a few rollouts, is probably going to have quite noisy choices/estimates too. No free lunches.
It's possible for there to be a kind of chaos where the analytic gradients blow up yet discrete differences have predictable effects. Bifurcations etc..
They won't be controlled by something as simple as a single fixed reward function; I think we can agree on that. But I don't find successor-function-like representations to be too promising as a direction for how to generalize agents, or, in fact, any attempt to fancily hand-engineer these sorts of approaches into DRL agents.
These things should be learned. For example, leaning into Decision Transformers and using a lot more conditionalizing through metadata and relying on metalearning seems much more promising. (When it comes to generative models, if conditioning isn't solving your problems, you're just not using enough conditioning or generative modeling.) A prompt can describe agents and reward functions and the base agent executes that, and whatever is useful about successorlike representations just emerges automatically internally as the solution to the overall family of tasks in turning histories into actions.
I agree with things needing to be learned; using the actual states themselves was more of a toy model (because we have mathematical models for MDPs but we don't have mathematical models for "capabilities researchers will find something that can be Learned"), and I'd expect something else to happen. If I was to run off to implement this now, I'd be using learned embeddings of states, rather than states themselves. Though of course even learned embeddings have their problems.
The trouble with just saying "let's use decision transformers" is twofold. First, we still need to actually define the feedback system. One option is to just define reward as the feedback, but as you mention, that's not nuanced enough. You could use some system that's trained to mimic human labels as the ground truth, but this kind of system has flaws for standard alignment reasons.
It seems to me that capabilities researchers are eventually going to find some clever feedback system to use. It will to a great extent be learned, but they're going to need to figure out the learning method too.
Same as training the neural network: once it's differentiable, backprop can 'chain the estimates backwards' so efficiently you barely even think about it anymore.
I don't think this is true in general. Unrolling an episode for more steps takes more resources, and the later steps in the episode become more chaotic. DreamerV3 only unrolls for 16 steps.
Or distilling a tree search into a NN: the tree search needs to do backwards induction of updated estimates from all the terminal nodes all the way up to the root where the next action is chosen, but that's very fast and explicit and can be distilled down into a NN forward pass.
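That backward induction can be sketched on a toy tree (the structure and terminal values below are made up for illustration):

```python
# Toy tree: node -> {action: child}; structure and values made up.
tree = {
    "root": {"a1": "n1", "a2": "n2"},
    "n1":   {"b1": "t1", "b2": "t2"},
    "n2":   {"c1": "t3"},
}
terminal_values = {"t1": 0.2, "t2": 0.9, "t3": 0.5}

def backup(node):
    # Terminal values are given; an internal node's value is the max over
    # its children -- backward induction chained up to the root.
    if node in terminal_values:
        return terminal_values[node]
    return max(backup(child) for child in tree[node].values())

# The (state -> value / best action) pairs produced by these backups are
# the training targets one would distill into a NN forward pass.
best_action = max(tree["root"], key=lambda a: backup(tree["root"][a]))
print(best_action, backup("root"))  # a1 0.9
```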
But when you distill a tree search, you basically learn value estimates, i.e. something similar to a Q function (realistically, a V function). Thus, here you also have an opportunity to bubble up some additional information.
And aside from being able to update within-episode or take actions entirely unobserved before, when you do MBRL, you get to do it at arbitrary scale (thus potentially extremely little wall-clock time, like an AlphaZero), offline (no environment interactions), potentially highly sample-efficient (if the dataset is adequate or one can do optimal experimentation to acquire the most useful data, like PILCO), with transfer learning to all other problems in related environments (because value functions are mostly worthless outside the exact setting, which is why model-free DRL agents are notorious for overfitting and having zero transfer), easily eliciting meta-learning and zero-shot capabilities, etc.*
I'm not doubting the relevance of MBRL, I expect that to take off too. What I'm doubting is that future agents will be controlled using scalar utilities/rewards/etc. rather than something more nuanced.
With MBRL, don't you end up with the same problem, but when planning in the model instead? E.g. DreamerV3 still learns a value function in its actor-critic reinforcement learning that occurs "in the model". This value function still needs to chain the estimates backwards.
Also, if you expect this to take off, then by your own admission you are mostly accelerating the current trajectory (which I consider mostly doomed) rather than changing it. Unless you expect it to take off mostly thanks to you?
Surely your expectation that the current trajectory is mostly doomed depends on your expectation of the technical details of the extension of the current trajectory. If technical specifics emerge that shows the current trajectory to be going in a more alignable direction, it may be fine to accelerate.
Could this be explained by SAEs only finding a subset of the features? Then the reconstructions would be entirely missing random features, whereas random noise is just random and therefore mostly ignored.
It's capability research that is coupled to alignment:
Furthermore it seems like a win for interpretability and alignment as it gives greater feedback on how the AI intends to earn rewards, and better ability to control those rewards.
Coupling alignment to capabilities is basically what we need to survive, because the danger of capabilities comes from the fact that capabilities research is self-funding, thereby risking outracing alignment. If alignment can absorb enough success from capabilities, we survive.
Thanks for the link! It does look somewhat relevant.
But I think the weighting by reward (or other significant variables) is pretty important, since it generates a goal to pursue, making it emphasize things that can be achieved rather than just things that might randomly happen.
Though this makes me think about whether there are natural variables in the state space that could be weighted by, without using reward per se. E.g. the size of (s' − s) in some natural embedding, or the variance in s' over all the possible actions that could be taken. Hmm. 🤔
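A minimal sketch of those two reward-free weightings, on hypothetical 2-D state embeddings (all coordinates made up for illustration):

```python
import math

# Hypothetical 2-D embeddings of a state and its successors under two actions.
embed = {
    "s":       (0.0, 0.0),
    "s'_stay": (0.1, 0.0),   # the "stay" action barely moves the state
    "s'_jump": (3.0, 4.0),   # the "jump" action changes it a lot
}

def change_size(s, s_next):
    # ||s' - s||: weight transitions by how much they change the state.
    return math.dist(embed[s_next], embed[s])

def variance_over_actions(successors):
    # Per-dimension variance of s' over the available actions, summed.
    total = 0.0
    for dim in zip(*successors):
        mean = sum(dim) / len(dim)
        total += sum((x - mean) ** 2 for x in dim) / len(dim)
    return total

print(change_size("s", "s'_jump"))  # 5.0 -- big transition, big weight
print(variance_over_actions([embed["s'_stay"], embed["s'_jump"]]))  # ~6.1025
```

Either quantity could stand in for reward as the weighting in the Hope-style distributions discussed elsewhere in the thread.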
I have a concept that I expect to take off in reinforcement learning. I don't have time to test it right now, though hopefully I'd find time later. Until then, I want to put it out here, either as inspiration for others, or as a "called it"/prediction, or as a way to hear critique/about similar projects others might have made:
Reinforcement learning currently tries to do stuff like learning to model the sum of future rewards, e.g. expectations using V, A and Q functions in many algorithms, or the entire probability distribution in algorithms like DreamerV3.
Mechanistically, the reason these methods work is that they stitch together experience from different trajectories. So e.g. if one trajectory goes A > B > C and earns a reward at the end, it learns that states A, B and C are valuable. If another trajectory goes D > A > E > F and gets punished at the end, it learns that E and F are low-value but D and A are high-value, because its experience from the first trajectory shows that it could've just gone D > A > B > C instead.
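That stitching mechanism can be demonstrated with tabular Q-learning on a toy version of these trajectories (the rewards and discount factor are made up for illustration):

```python
gamma = 0.9
# Transitions harvested from the two trajectories: (state, action, reward,
# next_state), with "T" as terminal. Reward +1 after C, -1 after F.
transitions = [
    ("A", "toB", 0, "B"), ("B", "toC", 0, "C"), ("C", "end", 1, "T"),
    ("D", "toA", 0, "A"), ("A", "toE", 0, "E"),
    ("E", "toF", 0, "F"), ("F", "end", -1, "T"),
]

Q = {}

def V(s):
    # State value: best Q estimate over the actions seen from s.
    return max((q for (st, _), q in Q.items() if st == s), default=0.0)

# Repeated sweeps chain the terminal rewards backwards step by step.
for _ in range(10):
    for (s, a, r, s2) in transitions:
        Q[(s, a)] = r + gamma * (0.0 if s2 == "T" else V(s2))

print(V("D"))  # positive (~0.73): D stitched onto the rewarded path via A
print(V("E"))  # negative (~-0.9): E only leads to punishment
```

D ends up high-value despite only appearing in the punished trajectory, exactly because the backups route it through A's good continuation.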
But what if it learns of a path E > B? Or a shortcut A > C? Or a path F > G that gives a huge amount of reward? Because these techniques work by chaining the reward backwards step-by-step, it seems like such discoveries would be hard to integrate well: the old value estimates will still approximately satisfy the Bellman equation, for instance, so there's little pressure to revise them.
Ok, so that's the problem, but how could it be fixed? Speculation time:
You want to learn an embedding of the opportunities you have in a given state (or for a given state-action pair), rather than just its potential rewards. Rewards are too sparse a signal.
More formally, let's say instead of the Q function, we consider what I would call the Hope function, which, given a state-action pair (s, a), gives you a distribution over states it expects to visit, weighted by the rewards it will get. This can still be phrased using the Bellman equation:

Hope(s, a) = r·δ_{s'} + f·Hope(s', a')

Where s' is the resulting state that experience has shown comes after s when doing a, δ_{s'} is the point distribution on s', r is the reward received, f is the discounting factor, and a' is the optimal action in s'.
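A tabular sketch of this recursion on a deterministic toy chain (states, rewards, and discount made up for illustration; the distribution is represented as a plain dict of reward-weighted visitation mass):

```python
f = 0.9  # discount factor, matching the equation above
# Deterministic toy model: (state, action) -> (next_state, reward, next_action);
# next_action is None when the episode ends.
model = {("A", "a"): ("B", 0.5, "b"), ("B", "b"): ("C", 1.0, None)}

def hope(s, a):
    # Hope(s, a) = r * delta_{s'} + f * Hope(s', a'), as a dict over states.
    s2, r, a2 = model[(s, a)]
    h = {s2: r}  # r * delta_{s'}: point mass on s', weighted by the reward
    if a2 is not None:
        for state, w in hope(s2, a2).items():
            h[state] = h.get(state, 0.0) + f * w
    return h

print(hope("A", "a"))  # {'B': 0.5, 'C': 0.9}
```

Each visited state carries its own discounted reward mass, so the target is a vector over states rather than a single scalar return.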
Because the Hope function is multidimensional, the learning signal is much richer, and one should therefore maybe expect its internal activations to be richer and more flexible in the face of new experience.
Here's another thing to notice: let's say for the policy, we use the Hope function as a target to feed into a decision transformer. We now have a natural parameterization for the policy, based on which Hope it pursues.
In particular, we could define another function, maybe called the Result function, which in addition to s and a takes a target distribution w as a parameter, subject to the Bellman equation:

Result(s, a, w) = r·δ_{s'} + f·Result(s', a', (w − r·δ_{s'})/f)

Where a' is the action recommended by the decision transformer when asked to achieve (w − r·δ_{s'})/f from state s'.
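A tabular sketch of the Result recursion on the same kind of toy chain; the decision transformer is stubbed out with a trivial policy, since the point here is just the bookkeeping of the residual target:

```python
f = 0.9
# Toy chain: (state, action) -> (next_state, reward); episode ends at C.
model = {("A", "a"): ("B", 0.5), ("B", "b"): ("C", 1.0)}
only_action = {"A": "a", "B": "b", "C": None}

def dt_policy(state, remaining_target):
    # Stand-in for a decision transformer conditioned on the residual target.
    return only_action[state]

def result(s, a, w):
    # Result(s, a, w) = r * delta_{s'} + f * Result(s', a', (w - r*delta_{s'})/f)
    s2, r = model[(s, a)]
    out = {s2: r}
    # The residual target handed down: what remains to be achieved from s'.
    residual = {k: (v - (r if k == s2 else 0.0)) / f for k, v in w.items()}
    a2 = dt_policy(s2, residual)
    if a2 is not None:
        for state, v in result(s2, a2, residual).items():
            out[state] = out.get(state, 0.0) + f * v
    return out

target = {"B": 0.5, "C": 0.9}    # the Hope achievable from ("A", "a")
print(result("A", "a", target))  # reproduces the target when achievable
```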
This Result function ought to be invariant under many changes in policy, which should make it more stable to learn, boosting capabilities. Furthermore it seems like a win for interpretability and alignment as it gives greater feedback on how the AI intends to earn rewards, and better ability to control those rewards.
An obvious challenge with this proposal is that states are really latent variables and also too complex to learn distributions over. While this is true, that seems like an orthogonal problem to solve.
Also this mindset seems to pave the way for other approaches, e.g. you could maybe have a Halfway function that factors an ambitious hope into smaller ones or something. Though it's a bit tricky, because one needs to distinguish correlation and causation.
I guess for reference, here's a slightly more complete version of the personality taxonomy:

- Normative: Happy, social, emotionally expressive. Respects authority and expects others to do so too.
- Anxious: Afraid of speaking up, of breaking the rules, and of getting noticed. Tries to be alone as a result. Doesn't trust that others mean well.
- Wild: Parties, swears, and is emotionally unstable. Breaks rules and supports others (... in doing the same?)
- Avoidant: Contrarian, intellectual, and secretive. Likes to be alone and doesn't respect rules or cleanliness.
In practice people would be combinations of these archetypes, rather than purely being one of them. In some versions, the Normative type splits into three:
 Jockish: Parties and avoids intellectual topics.
 Steadfast: Conservative yet patient and supportive.
 Perfectionistic: Gets upset over other people's mistakes and tries to take control as a result.
This would make it as fully expressive as the Big Five.
... but there was some mathematical trouble in getting it to be replicable and "nice" if I included 6 profiles, so I'm expecting to be stuck at 4 types unless I discover some new mathematical tricks.
I still don't really understand the avoidant/non-avoidant taxonomy. I am confused that avoidant is both "introverted... and prefer to be alone" and "avoidants... being disturbing to others", when Scott never intended to disturb Metz's life?
The part about being disturbing wasn't supposed to refer to Scott's treatment of Cade Metz; it was supposed to refer to rationalists' interests in taboo and disagreeable topics. And as for trying to be disturbing, I said that I think the non-avoidant people were being unfair in their characterization of avoidants, as it's not that simple, and often it's a correction to genuine deception by non-avoidants.
And the claim about Scott being low in conscientiousness? Gwern being low in conscientiousness? If it is "varying from person to person" so much, is it even descriptive?
My model is an affine transformation applied to Big Five scores, constrained to make the relationship from transformed scores to items linear rather than affine, and optimized to make people's scores sparse.
This is rather technical, but the consequence is that my model is mathematically equivalent to a subspace of the Big Five, and the Big Five has similar issues where it can tend to lump different stuff together. Like one could just as well turn it around and say that the Big Five lumps my anxious and avoidant profiles together under the label of "introverted". (Well, the Big Five has two more dimensions than my model does, so it lumps fewer things together, but other models have more dimensions than the Big Five, so the Big Five lumps things together relative to those models.)
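For concreteness, here's what such an affine transformation looks like mechanically. The weights below are entirely made up for illustration; the actual model's coefficients are fit from data.

```python
LABELS = ["Normative", "Anxious", "Wild", "Avoidant"]
# Columns: openness, conscientiousness, extraversion, agreeableness,
# neuroticism (inputs as z-scores). Hypothetical weights, not the real model.
W = [
    [ 0.0,  0.5,  0.5,  0.5, -0.5],   # Normative
    [ 0.0,  0.0, -0.5, -0.2,  0.9],   # Anxious
    [ 0.0, -0.6,  0.6, -0.3,  0.3],   # Wild
    [ 0.7, -0.3, -0.6, -0.6, -0.3],   # Avoidant
]
b = [0.0, 0.0, 0.0, 0.0]  # the affine offset

def archetype_scores(big5):
    # Affine map: archetype scores = W @ big5 + b.
    return [sum(w * x for w, x in zip(row, big5)) + bi
            for row, bi in zip(W, b)]

# A profile like the one described below: high O, mid C, low E, low A, low N.
profile = [1.5, 0.0, -1.0, -1.0, -1.0]
scores = archetype_scores(profile)
print(LABELS[scores.index(max(scores))])  # "Avoidant" under these toy weights
```

The sparsity optimization mentioned above would then push W toward making most people score near zero on most archetypes.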
My model is new, so I'm still experimenting with it to see how much utility I find in it. Maybe I'll abandon it as I get bored and it stops giving results.
You made a claim of Gwern being avoidant, and Gwern said that he is not. It might be the case that Gwern is lying, but that seems far-fetched and not yet substantiated. It also seemed confusing enough that Gwern couldn't tell how widely the concept applies.
Gwern said that he's not avoidant of journalists, but he's low extraversion, low agreeableness, low neuroticism, high openness, mid conscientiousness, so that definitionally makes him avoidant under my personality model (which as mentioned is just an affine transformation of the Big Five). He also alludes to having schizoid personality disorder, which I think is relevant to being avoidant. As I said, this is a model of general personality profiles, not of interactions with journalists specifically.
I get that this is an argument one could make. But the reason I started this tangent was because you said:
Here CM doesn’t directly argue that there was any benefit to doxxing; instead he kinda conveys a vibe / ideology that if something is true then it is self-evidently intrinsically good to publish it
That is, my original argument was not in response to the "Anyway, if the true benefit is zero (as I believe), then we don’t have to quibble over whether the cost was big or small" part of your post, it was to the vibe/ideology part.
Where I was trying to say, it doesn't seem to me that Cade Metz was the one who introduced this vibe/ideology, rather it seems to have been introduced by rationalists prior to this, specifically to defend tinkering with taboo topics.
Like, you mention that Cade Metz conveys this vibe/ideology that you disagree with, and you didn't try to rebut it directly, I assume because Cade Metz didn't defend it but just treated it as obvious.
And that's where I'm saying, since many rationalists including Scott Alexander have endorsed this ideology, there's a sense in which it seems wrong, almost rude, not to address it directly. Like a sort of motte-and-bailey tactic.
Why the downvotes? Because it's an irrelevant/tangential ramble? Or some more specific reason?