Posts

Are Intelligence and Generality Orthogonal? 2022-07-18T20:07:44.694Z

Comments

Comment by cubefox on Modern Transformers are AGI, and Human-Level · 2024-03-28T19:33:16.235Z · LW · GW

Well, backpropagation alone wasn't even enough to make efficient LLMs feasible. It took decades, till the invention of transformers, to make them work. Similarly, knowing how to make LLMs is not yet sufficient to implement predictive coding. LeCun talks about the problem in a short section here from 10:55 to 14:19.

Comment by cubefox on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-28T00:21:53.229Z · LW · GW

If lots of people have a false belief X, that’s prima facie evidence that “X is false” is newsworthy. There’s probably some reason that X rose to attention in the first place; and if nothing else, “X is false” at the very least should update our priors about what fraction of popular beliefs are true vs false.

I think this argument would be more transparent with examples. Whenever I try to think of popular beliefs whose support it would be reasonable to revise in light of this, the examples end up involving highly politicized taboos.

It is not surprising when the reason a lot of people hold a false belief is the existence of a taboo. Otherwise the belief would probably already have been corrected, or wouldn't have gained popularity in the first place. And giving examples of such beliefs is of course not really possible, precisely because it is taboo to argue that they are false.

Comment by cubefox on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-27T23:56:16.133Z · LW · GW

If you mean by "statement" an action (a physical utterance) then I disagree. If you mean an abstract object, a proposition, for which someone could have more or less evidence, or reason to believe, then I agree.

Comment by cubefox on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-27T23:47:41.959Z · LW · GW

But it wasn't a cancellation attempt.

In effect Cade Metz indirectly accused Scott of racism. Which arguably counts as a cancellation attempt.

Comment by cubefox on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-27T23:28:42.382Z · LW · GW

Beliefs can only be epistemically legitimate; actions can only be morally legitimate. To "bring something up" is an action, not a belief. My point is that this action wasn't legitimate, at least not in this heavily abridged form.

Comment by cubefox on rhollerith_dot_com's Shortform · 2024-03-27T23:03:36.767Z · LW · GW

I also find them irksome for some reason. They feel like pollution. Like AI generated websites in my Google results.

An exception was the ghost cartoon here. The AI spelling errors added to the humor, similar to the bad spelling of lolcats.

Comment by cubefox on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-27T22:24:31.773Z · LW · GW

What you're suggesting amounts to saying that on some topics, it is not OK to mention important people's true views because other people find those views objectionable.

It's okay to mention an author's taboo views on a complex and sensitive topic when they are discussed in a longer format which does justice to how they were originally presented. Just giving a necessarily offensive-sounding short summary is only useful as a weapon to damage the author's reputation.

Comment by cubefox on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-27T21:58:44.903Z · LW · GW

On the one hand you say

So I think it's actually pretty legitimate for Metz to bring up incidences like this

but also

This is not to say that I think Scott should be "canceled" for these views or whatever, not at all

which seems like a double standard. E.g. assume the NYT article had actually led to Scott's cancellation, which wasn't an implausible outcome for Metz to expect.

(As a historical analogy, Scott's case seems quite similar to that of Baruch Spinoza. Spinoza could be (and was) accused of employing a similar strategy: using his pantheist philosophy to bring the highly taboo topic of atheism into mainstream philosophical discourse. If so, the strategy was successful.)

Comment by cubefox on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-27T21:24:16.402Z · LW · GW

Wait a minute. Please think through this objection. You are saying that if the NYT encountered factually true criticisms of an important public figure, it would be immoral of them to mention this in an article about that figure?

No, not in general. But in the specific case at hand, yes. We know Metz did read quite a few of Scott's blog posts, and all the necessary context and careful subtlety with which he (Scott) approaches this topic (e.g. in Against Murderism) are totally lost in an offhand remark in a NYT article. It's like someone in the 17th century writing about Spinoza, mentioning, as a sidenote, "and oh by the way, he denies the existence of a personal God", and then moving on to something else. Shortening his position like this, so that it must seem outrageous and immoral, is in effect defamatory.

If some highly sensitive topic can't be addressed in a short article with the required carefulness, it should simply not be addressed at all. That's especially true for Scott, who wrote about countless other topics. There is no requirement to mention everything. (For Spinoza an argument could be made that his, at the time, outrageous position plays a fairly central role in his work, but that's not the case for Scott.)

Does it bother you that your prediction didn't actually happen? Scott is not dying in prison!

Luckily Scott didn't have to fear legal consequences. But substantial social consequences were very much on the table. We know of other people who lost their job or entire career prospects for similar reasons. Nick Bostrom probably dodged the bullet by a narrow margin.

Comment by cubefox on Wei Dai's Shortform · 2024-03-27T11:04:28.521Z · LW · GW

There are several levels in which humans can be bad or evil:

  1. Doing bad things because they believe them to be good
  2. Doing bad things while not caring whether they are bad or not
  3. Doing bad things because they believe them to be bad (Kant calls this "devilish")

I guess when humans are bad, they usually do 1). Even Hitler may have genuinely thought he was doing the morally right thing.

Humans also sometimes do 2), for minor things. But rarely if the anticipated bad consequences are substantial. People who consistently act according to 2) are called psychopaths. They have no inherent empathy for other people. Most humans are not psychopathic.

Humans don't do 3), they don't act evil for the sake of it. They aren't devils.

Comment by cubefox on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-27T10:20:18.950Z · LW · GW

Imagine you are a philosopher in the 17th century, and someone accuses you of atheism, or says "He aligns himself with Baruch Spinoza". This could easily have massive consequences for you. You may face extensive social and legal punishment. You can't even honestly defend yourself, because the accusation of heresy is an asymmetric discourse situation. Is your accuser off the hook when you end up dying in prison? He can just say: Sucks for him, but it's not my fault, I just innocently reported his beliefs.

Comment by cubefox on Modern Transformers are AGI, and Human-Level · 2024-03-26T23:39:27.772Z · LW · GW

No, I was talking about the results. lsusr seems to use the term in a different sense than Scott Alexander or Yann LeCun. In their sense it's not an alternative to backpropagation, but a way of constantly predicting future experience and constantly updating a world model depending on how far off those predictions are. Somewhat analogous to conditionalization in Bayesian probability theory.

LeCun talks about the technical issues in the interview above. In contrast to next-token prediction, the problem of predicting appropriate sense data is not yet solved for AI. Apart from doing it in real time, the other issue is that (e.g.) for video frames a probability distribution over all possible experiences is not feasible, in contrast to text tokens. The space of possibilities is too large, so some form of closeness measure is required, or imprecise predictions that only cover "relevant" parts of future experience.

In the meantime OpenAI did present Sora, a video generation model. But according to the announcement, it is a diffusion model which generates all frames in parallel. So it doesn't seem like a step toward solving predictive coding.

Edit: Maybe it eventually turns out to be possible to implement predictive coding using transformers. Assuming this works, it wouldn't be appropriate to call transformers AGI before that achievement was made. Otherwise we would have to identify the invention of "artificial neural networks" decades ago with the invention of AGI, since AGI will probably be based on ANNs. My main point is that AGI (a system with high generality) is something that could be scaled up (e.g. by training a larger model) to superintelligence without requiring major new intellectual breakthroughs, breakthroughs like figuring out how to get predictive coding to work. This is similar to how a human brain seems to be broadly similar to a dog brain, but larger, and thus didn't involve a major "breakthrough" in the way it works. Smarter animals are mostly smarter in the sense that they are better at prediction.

Comment by cubefox on Modern Transformers are AGI, and Human-Level · 2024-03-26T18:41:27.716Z · LW · GW

I agree it is not sensible to make "AGI" a synonym for superintelligence (ASI) or the like. But your approach to compare it to human intelligence seems unprincipled as well.

In terms of architecture, there is likely no fundamental difference between humans and dogs. Humans are probably just a lot smarter than dogs, but not significantly more general. Similar to how a larger LLM is smarter than a smaller one, but not more general. If you doubt this, imagine we had a dog-level robotic AI. Plausibly, we soon thereafter would also have human-level AI by growing its brain / parameter count. For all we know, our brain architectures seem quite similar.

I would go so far as to argue that most animals are about equally general. Humans are more intelligent than other animals, but intelligence and generality seem orthogonal. All animals can do both inference and training in real time. They can do predictive coding. Yann LeCun calls it the dark matter of intelligence. Animals have the real world as their domain. They fully implement the general AI task "robotics".

Raw transformers don't really achieve that. They don't implement predictive coding, and they don't work in real time. LLMs may be in some sense (e.g. in language understanding) more intelligent than animals, but that was already true, in some sense, of the even narrower AI AlphaGo. AGI signifies a high degree of generality, not necessarily a particularly high degree of intelligence in a less general (more narrow) system.

Edit: One argument for why predictive coding is so significant is that it can straightforwardly scale to superintelligence. Modern LLMs get their ability mainly from trying to predict human-written text, even when they additionally process other modalities. Insofar as text is a human artifact, this imposes a capability ceiling. Predictive coding instead tries to predict future sensory experiences. Sensory experiences causally reflect base reality quite directly, unlike text produced by humans. An agent with superhuman predictive coding would be able to predict the future, including conditioned on possible actions, much better than humans.

Comment by cubefox on All About Concave and Convex Agents · 2024-03-25T19:51:26.915Z · LW · GW

Whether it is possible to justify Kelly betting even when your utility is linear in money (SBF said it was for him) is very much an open research problem. There are various posts on this topic when you search LessWrong for "Kelly". I wouldn't assume Wikipedia contains authoritative information on this question yet.

Comment by cubefox on All About Concave and Convex Agents · 2024-03-25T11:48:42.661Z · LW · GW

These classifications are very general. Concave utility functions seem more rational than convex ones. But can we be more specific?

Intuitively, it seems a simple rational relation between resources and utility should be such that the same relative increases in resources are assigned the same utility. So doubling your current resources should be assigned the same utility (desirability) irrespective of how many resources you currently have. E.g. doubling your money while already rich seems approximately as good as doubling your money while not rich.

Can we still be more specific? Arguably, quadrupling (x4) your resources should be judged twice as good (assigned twice as much utility) as doubling (x2) your resources.

Can we still be more specific? Arguably, the prospect of halving your resources should be judged as being as bad as doubling your resources is good. If you are faced with a choice between two options A and B, where A does nothing, and B either halves your resources or doubles them, depending on a fair coin flip, you should assign equal utility to choosing A and to choosing B.

I don't know what this function is in formal terms. But it seems that rational agents shouldn't have utility functions that are very dissimilar to it.

The strongest counterargument I can think of is that the prospect of losing half your resources may seem significantly worse than the prospect of doubling your resources. But I'm not sure this has a rational basis. Imagine you are not dealing with uncertainty between two options, but with two things happening sequentially in time. Either first you double your money, then you halve it. Or first you halve your money, then you double it. In either case, you end up with the same amount you started with. So doubling and halving seem to cancel out in terms of utility, i.e. they should be regarded as having equal but opposite utility.
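For what it's worth, these constraints seem to be satisfied by a logarithmic utility function. A minimal numerical check (my own sketch, assuming U(x) = log x over resources x):

```python
import math

def utility(x):
    # Candidate utility function: logarithmic in resources.
    return math.log(x)

for x in [10, 1_000, 1_000_000]:
    gain_double = utility(2 * x) - utility(x)   # gain from doubling
    gain_quad = utility(4 * x) - utility(x)     # gain from quadrupling
    loss_halve = utility(x / 2) - utility(x)    # loss from halving
    print(gain_double, gain_quad / gain_double, loss_halve / gain_double)
```

Every line prints log 2 ≈ 0.693, 2.0 and -1.0, independent of x: the same gain from doubling at every wealth level, quadrupling twice as good as doubling, and halving exactly as bad as doubling is good.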

Comment by cubefox on Victor Ashioya's Shortform · 2024-03-24T04:44:43.598Z · LW · GW

A shorter, more high level alternative is Axis of Ordinary, which is also available via Facebook and Telegram.

Comment by cubefox on Shortform · 2024-03-23T17:16:08.783Z · LW · GW

"The x are more y than they actually are" seems like a contradiction?

Comment by cubefox on Shortform · 2024-03-15T03:56:32.303Z · LW · GW

You mentioned positions I described as straw men or weak men. Darwinist utilitarianism would be more like a steel man.

Comment by cubefox on Shortform · 2024-03-15T03:27:16.481Z · LW · GW

Probably because from the outset, only one sort of answer is inside the realm of acceptable answers. Anything else would be far outside the Overton window. If they already know what sort of answer they have to produce, doing the actual calculations has no benefit. It's like a theologian evaluating arguments about the existence of God.

Comment by cubefox on Shortform · 2024-03-15T02:59:24.146Z · LW · GW

The above seems like a strawman or weakman argument. Consider instead Nietzsche's Critique of Utilitarianism:

Thus Nietzsche thinks utilitarians are committed to ensuring the survival and happiness of human beings, yet they fail to grasp the unsavory consequences which that commitment may entail. In particular, utilitarians tend to ignore the fact that effective long-run utility promotion might require the forcible destruction of people who either enfeeble the gene pool or who have trouble converting resources into utility—incurable depressives, the severely handicapped, and exceptionally fastidious people all seem potential targets.

Comment by cubefox on 'Empiricism!' as Anti-Epistemology · 2024-03-14T22:27:09.693Z · LW · GW

I'll add that sometimes, there is a big difference between verbally agreeing with a short summary, even if it is accurate, and really understanding and appreciating it and its implications. That often requires long explanations with many examples and looking at the same issue from various angles. The two Scott Alexander posts you mentioned are a good example.

Comment by cubefox on 'Empiricism!' as Anti-Epistemology · 2024-03-14T22:01:38.831Z · LW · GW

Yeah, but I do actually think this paragraph is wrong on the existence of easy rules. It is a bit like saying: There are only the laws of fundamental physics, don't bother with trying to find high level laws, you just have to do the hard work of learning to apply fundamental physics when you are trying to understand a pendulum or a hot gas. Or biology.

Similarly, for induction there are actually easy rules applicable to certain domains of interest. Like Laplace's rule of succession, which assumes random i.i.d. sampling. Which implies the sample distribution tends to resemble the population distribution. The same assumption is made by supervised learning about the training distribution, which works very well in many cases. There are other examples like the Lindy effect (mentioned in another comment) and various popular models in statistics. Induction heads also come to mind.
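To make the Laplace example concrete, here is the rule as a tiny function (a sketch; it assumes i.i.d. binary trials and a uniform prior over the unknown success rate):

```python
def laplace_rule_of_succession(successes: int, trials: int) -> float:
    # Probability that the next i.i.d. binary trial succeeds,
    # given `successes` out of `trials` observed so far and a uniform prior.
    return (successes + 1) / (trials + 2)

print(laplace_rule_of_succession(0, 0))   # 0.5 -- no data yet, pure indifference
print(laplace_rule_of_succession(9, 10))  # ~0.83 after 9 successes in 10 trials
```

It's an "easy rule" in exactly the sense above: trivial to apply, and reliable precisely to the extent that its domain assumption (random i.i.d. sampling) holds.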

Even if there is just one, complex, fully general method applicable to science or induction, there may still exist "easy" specialized methods, with applicability restricted to a certain domain.

Comment by cubefox on 'Empiricism!' as Anti-Epistemology · 2024-03-14T13:06:32.471Z · LW · GW

Yeah, one has to correct, when possible, for likelihood of observing a particular part of the lifetime of the trend. Though absent any further information our probability distribution should arguably be even. Which does suggest there is indeed a sort of "straight rule" of induction when extrapolating trends, as the scientist in the dialogue suspected. It is just that it serves as a weak prior that is easily changed by additional information.

Comment by cubefox on 'Empiricism!' as Anti-Epistemology · 2024-03-14T03:24:02.813Z · LW · GW

There does actually seem to be a simple and general rule of extrapolation that can be used when no other data is available: If a trend has so far held for some timespan t, it will continue to hold, in expectation, for another timespan t, and then break down.

In other words, if we ask ourselves how long an observed trend will continue to hold, it does seem, absent further data, a good indifference assumption to think that we are currently in the middle of the trend; that we have so far seen half of it.

Of course it is possible that we are currently near the beginning of the trend, in which case it would continue longer than it has held so far; or near the end, in which case it would continue less long than it has held so far. But on average we should expect that we are in the middle.

So if we know nothing about the investment scheme in the post, except that it has worked for two years so far, our expectation should be that it breaks down after a further two years.
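Here is a rough Monte Carlo sketch of that indifference assumption (my own toy model: the observation point is drawn uniformly within the trend's total lifetime; note the result comes out most cleanly as a statement about the median, since the mean of the ratio is dominated by the rare cases where we happen to be near the very beginning):

```python
import random
import statistics

ratios = []
for _ in range(100_000):
    elapsed = 1 - random.random()   # elapsed fraction of the lifetime, in (0, 1]
    remaining = 1 - elapsed
    ratios.append(remaining / elapsed)

# Median remaining/elapsed ratio is ~1: the trend is expected (in the median
# sense) to continue for about as long as it has already held.
print(statistics.median(ratios))
```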

Comment by cubefox on I was raised by devout Mormons, AMA [&|] Soliciting Advice · 2024-03-13T21:06:41.686Z · LW · GW

Thanks. My question wasn't bait. It comes from repurposing the innocent but (for a two-boxer) uncomfortable "why ain'tcha winning?" question, by applying it to the population level. As a population, South Korea (TFR=0.72 and falling) doesn't look like it's winning the Malthusian game. 2.04 sounds almost sustainable. And Africa has a TFR>4.

No, I think it's still a bad thing because (as with most religions) it fuels beliefs that prevent people from even considering trying to solve problems like aging and death because "heaven will be better than mortality", "God will make everything better", etc.

Yeah, fair enough. Something like that would be my response too. Though I would add that solving aging is not quite the same as solving a low total fertility rate. There is also the broader issue of dysgenic trends, with a negative correlation between TFR and IQ, but that would take us too far afield here.

Comment by cubefox on OpenAI's Sora is an agent · 2024-03-13T19:35:15.476Z · LW · GW

Can you say a bit more on how the idea in this post relates to DeepMind's SIMA?

Comment by cubefox on I was raised by devout Mormons, AMA [&|] Soliciting Advice · 2024-03-13T18:14:38.174Z · LW · GW

Currently the fertility rate is collapsing around the world. In most industrialized countries it is far below 2.1 children per woman. Which suggests that these societies will go extinct, unless some magical AI solution appears. Even Mormon fertility rates are plummeting, but they are still higher than those of most other people in the US. Which suggests Mormons are actually less misaligned with the "goal" of evolution than supposedly more rational people. Mormons are also less responsible for a potential future disappearance of the society they live in.

Do you think this gives Mormonism some practical, if not epistemic, justification?

Comment by cubefox on 0th Person and 1st Person Logic · 2024-03-12T23:02:38.943Z · LW · GW

Suppose I tell a stranger, "It's raining." Under possible worlds semantics, this seems pretty straightforward: I and the stranger share a similar map from sentences to sets of possible worlds, so with this sentence I'm trying to point them to a certain set of possible worlds that match the sentence, and telling them that I think the real world is in this set.

Can you tell a similar story of what I'm trying to do when I say something like this, under your proposed semantics?

So my conjecture of what happens here is: You and the stranger assume a similar relation of degrees of confirmation between the sentence "It's raining" and possible experiences. For example, you both expect visual experiences of raindrops, when looking out of the window, to confirm the sentence pretty strongly. Or rain-like sounds on the roof. So by asserting this sentence you try to tell the stranger that you predict/expect certain forms of experiences, which presumably makes the stranger predict similar things (if they assume you are honest and well-informed).

The problem with agents mapping a sentence to certain possible worlds is that this mapping has to occur "in our head", internally to the agent. But possible worlds / truth conditions are external, at least for sentences about the external world. We can only create a mapping between things we have access to. So it seems we cannot create such a mapping. It's basically the same thing Nate Showell said in a neighboring comment.

(We could replace possible worlds / truth conditions themselves with other beliefs, presumably a disjunction of beliefs that are more specific than the original statement. Beliefs are internal, so a mapping is possible. But beliefs have content (i.e. meaning) themselves, just like statements. So how then to account for these meanings? To explain them with more beliefs would lead to an infinite regress. It all has to bottom out in experiences, which is something we simply have as a given. Or really any robot with sensory inputs, as Adele Lopez remarked.)

No, in that post I also consider interpretations of probability where it's subjective. I linked to that post mainly to show you some ideas for how to quantify sizes of sets of possible worlds, in response to your assertion that we don't have any ideas for this. Maybe try re-reading it with this in mind?

Okay, I admit I have a hard time understanding the post. To comment on the "mainstream view":

"1. Only one possible world is real, and probabilities represent beliefs about which one is real."

(While I wouldn't personally call this a way of "estimating the size" of sets of possible worlds,) I think this interpretation has some plausibility. And I guess it may be broadly compatible with the confirmation/prediction theory of meaning. This is speculative, but truth seems to be the "limit" of confirmation or prediction, something that is approached, in some sense, as the evidence gets stronger. And truth is about what the external world is like. Which is just a way of saying that there is some possible way the world is, which rules out other possible worlds.

Your counterargument against interpretation 1 seems to be that it is merely subjective and not objective, which is true. Though this doesn't rule out the existence of some unknown rationality standards which restrict the admissible beliefs to something more objective.

Interpretation 2, I would argue, is confusing possibilities with indexicals. These are really different. A possible world is not a location in a large multiverse world. Me in a different possible world is still me, at least if not too dissimilar, but a doppelganger of me in this world is someone else, even if he is perfectly similar to me. (It seems trivially true to say that I could have had different desires, and consequently something else for dinner. If this is true, it is possible that I could have wanted something else for dinner. Which is another way of saying there is a possible world where I had a different preference for food. So this person in that possible world is me. But to say there are certain possible worlds is just a metaphysically sounding way of saying that certain things are possible. Different counterfactual statements could be true of me, but I can't exist at different locations. So indexical location is different from possible existence.)

I don't quite understand interpretation 3. But interpretation 4 I understand even less. Beliefs seem to be clearly different from desires. The desire that p is different from the belief that p. They can even be seen as opposites in terms of direction of fit. I don't understand what you find plausible about this theory, but I also don't know much about UDT.

Comment by cubefox on 0th Person and 1st Person Logic · 2024-03-12T18:00:35.814Z · LW · GW

Yeah, this is a good point. The meaning of a statement is explained by experiences E, so the statement can't be assumed from the outset to be a proposition (the meaning of a statement), as that would be circular. We have to assume that it is a potential utterance, something like a probabilistic disposition to assent to it. The synonymity condition can be clarified by writing the statements in quotation marks:

For all possible experiences E: P("A"|E) = P("B"|E).

Additionally, the quantifier ranges only over experiences E, which can't be just any statements, but only potential experiences of the agent. Experiences are certain once you have them, while ordinary beliefs about external affairs are not.

By the way, the above is the synonymity condition which defines when two statements are synonymous or not. A somewhat awkward way to define the meaning of an individual statement would be as the equivalence class of all synonymous statements. But a possibility to define the meaning of an individual statement more directly would be to regard the meaning as the set of all pairwise odds ratios between the statement and any possible evidence. The odds ratio measures the degree of probabilistic dependence between two events. Which accords with the Bayesian idea that evidence is basically just dependence.

Then one could define synonymity alternatively as the meanings of two statements, their odds ratio sets, being equal. The above definition of synonymity would then no longer be required. This would have the advantage that we don't have to assign some mysterious unconditional value to P("A"|E)=P("A") if we think A and E are independent. Because independence just means OddsRatio("A",E)=1.

Another interesting thing to note is that Yudkowsky sometimes seems to express his theory of "anticipated experiences" in the reverse of what I've done above. He seems to think of prediction instead of confirmation. That would reverse things:

For all possible experiences E: P(E|"A") = P(E|"B").

I don't think it makes much of a difference, since probabilistic dependence is ultimately symmetric, i.e. OddsRatio(X,Y)=OddsRatio(Y,X).
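A concrete sketch of that symmetry claim (the function and the numbers are just my illustration, using the standard 2x2 odds ratio computed from a joint distribution over a statement A and a piece of evidence E):

```python
def odds_ratio(p_a_and_e, p_a, p_e):
    # Odds ratio between two events A and E, given P(A and E), P(A) and P(E).
    p_a_not_e = p_a - p_a_and_e
    p_not_a_e = p_e - p_a_and_e
    p_not_a_not_e = 1 - p_a - p_e + p_a_and_e
    return (p_a_and_e * p_not_a_not_e) / (p_a_not_e * p_not_a_e)

# Swapping the roles of A and E leaves the value unchanged (symmetry):
print(odds_ratio(0.2, 0.5, 0.3), odds_ratio(0.2, 0.3, 0.5))
# Independence, P(A and E) = P(A) * P(E), gives an odds ratio of exactly 1:
print(odds_ratio(0.5 * 0.3, 0.5, 0.3))
```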

Maybe there is some other reason though to prefer the prediction approach over the confirmation approach. Like, for independence we would, instead of P("A"|E)=P("A"), have P(E|"A")=P(E). The latter refers to the unconditional probability of an experience, which may be less problematic than to rely on the unconditional probability of a statement.

And how does someone compute the degree to which they expect some experience to confirm a statement? I leave that outside the theory. The theory only says that what you mean with a statement is determined by what you expect to confirm or disconfirm it. I think that has a lot of plausibility once you think about synonymity. How could we say two different statements have different meanings when we regard them as empirically equivalent under any possible evidence?

The approach can be generalized to account for the meaning of sub-sentence terms, i.e. individual words. A standard solution is to say that two words are synonymous iff they can be substituted for each other in any statement without affecting the meaning of the whole statement. Then there are tautologies, which are independent of any evidence, so they would all count as synonymous according to the standard approach. I think we could say their meanings differ in the sense that the meanings of the individual words differ. For other sentence types, like commands, we could e.g. rely on evidence that the command is executed - instead of true, like in statements. An open problem is to account for the meaning of expressions that don't have any obvious satisfaction conditions (like being true or executed), e.g. greetings.

Regarding "What Are Probabilities, Anyway?". The problem you discuss there is how to define an objective notion of probability. Subjective probabilities are simple, they are are just the degrees of belief of some agent at a particular point in time. But it is plausible that some subjective probability distributions are better than others, which suggests there is some objective, ideally rational probability distribution. It is unclear how to define such a thing, so this remains an open philosophical problem. But I think a theory of meaning works reasonably well with subjective probability.

Comment by cubefox on 0th Person and 1st Person Logic · 2024-03-12T01:41:01.166Z · LW · GW

You can interpret them as subjective probability functions, where the conditional probability P(A|B) is the probability you currently expect for A under the assumption that you are certain that B. With the restriction that P(A and B)=P(A|B)P(B)=P(A)P(B|A).

I don't think possible worlds help us to calculate either of the two values in the ratio P(A and B)/P(B). That would only be possible if you could say something about the share of possible worlds in which "A and B" is true, or "B".

Like: "A and B" is true in 20% of all possible worlds, "B" is true in 50%, therefore "A" is true in 40% of the "B" worlds. So P(A|B)=0.4.

But that obviously doesn't work. Each statement is true in infinitely many possible worlds and we have no idea how to count them to assign numbers like 20%.

Comment by cubefox on 0th Person and 1st Person Logic · 2024-03-11T23:13:46.448Z · LW · GW

Instead of directly having a 0P-preference for "a square of the grid is red," the robot would have to have a 1P-preference for "I believe that a square of the grid is red."

It would be more precise to say the robot would prefer to get evidence which raises its degree of belief that a square of the grid is red.

Comment by cubefox on 0th Person and 1st Person Logic · 2024-03-11T22:40:56.745Z · LW · GW

This approach implies there are two possible types of meanings: Sets of possible worlds and sets of possible experiences. A set of possible worlds would constitute truth conditions for "objective" statements about the external world, while a set of experience conditions would constitute verification conditions for subjective statements, i.e. statements about the current internal states of the agent.

However, it seems like a statement can mix both external and internal affairs, which would make the 0P/1P distinction problematic. Consider Wei Dai's example of "I will see red". It expresses a relation between the current agent ("I") and its hypothetical "future self". "I" is presumably an internal object, since the agent can refer to itself or its experiences independently of how the external world turns out to be constituted. The future agent, however, is an external object relative to the current agent which makes the statement. It must be external because its existence is uncertain to the present agent. Same for the future experience of red.

Then the statement "I will see red" could be formalized as follows, where i ("I"/"me"/"myself") is an individual constant which refers to the present agent:

∃x ∃y (Becomes(i, x) ∧ Red(y) ∧ Sees(x, y))

Somewhat less formally: "There is an x such that I will become x and there is an experience of red y such that x sees y."

(The quantifiers are assumed to range over all objects irrespective of when they exist in time.)

If there is a future object and a future experience that make this statement true, they would be external to the present agent who is making the statement. But i is internal to the present agent, as it is the (present) agent itself. (Consider Descartes' demon currently misleading you about the existence of the external world. Even in that case you could be certain that you exist. So you aren't something external.)

So Wei's statement seems partially internal and partially external, and it is not clear whether its meaning can be either a set of experiences or a set of possible worlds on the 0P/1P theory. So it seems a unified account is needed.


Here is an alternative theory.

Assume the meaning of a statement is instead a set of experience/degree-of-confirmation pairs. That is, two statements have the same meaning if they get confirmed/disconfirmed to the same degree by all possible experiences E. So statement A has the same meaning as a statement B iff:

For all possible experiences E: P(A|E) = P(B|E)

where P is a probability function describing conditional beliefs. (See Yudkowsky's anticipated experiences. Or Rudolf Carnap's liberal verificationism, which considers degrees of confirmation instead of Wittgenstein's strict verification.)

Now this arguably makes sense for statements about external affairs: If I make two statements, and I would regard them to be confirmed or disconfirmed to the same degree by the same evidence, that would plausibly mean I regard them as synonymous. And if two people disagree regarding the confirmation conditions of a statement, that would imply they don't mean the same (or completely the same) thing when they express that statement, even if they use the same words.

It also makes sense for internal affairs. I make a statement about some internal affair, like "I see red", formally Sees(i, r). Here i refers to myself and r to my current experience of red. Then this is true iff there is some piece of evidence E which is equivalent to that internal statement, namely the experience that I see red. Then P(Sees(i, r)|E) = 1 if I have that experience, and otherwise P(Sees(i, r)|E) = 0.

Again, the "I" here is logically an individual constant internal to the agent, likewise the experience . That is, only my own experience verifies that statement. If there is another agent, who also sees red, those experiences are numerically different. There are two different constants which refer to numerically different agents, and two constants which refer to two different experiences.

That is even the case if the two agents are perfectly correlated, qualitatively identical doppelgangers with qualitatively identical experiences (on, say, some duplicate versions of Earth, far away from each other). If one agent stubs its toe, the other agent also stubs its toe, but the first agent only feels the pain caused by the first agent's toe, while the second only feels the pain caused by the second agent's toe, and neither feels the experience of the other. Their experiences are only qualitatively but not numerically identical. We are talking about two experiences here, as one could have occurred without the other. They are only contingently correlated.

Now then, what about the mixed case "I will see red"? We need an analysis here such that the confirming evidence is different for statements expressed by two different agents who both say "I will see red". My statement would be (now) confirmed, to some degree, by any evidence (experiences) suggesting that a) I will become some future person x such that b) that future person will see red. That is different from the internal "I see red" experience that this future person would have themselves.

An example. I may see a notice indicating that a red umbrella I ordered will arrive later today, which would confirm that I will see red. Seeing this notice would constitute such a confirming experience. Again, my perfect doppelganger on a perfect twin Earth would also see such a notice, but our experiences would not be numerically identical. Just like my doppelganger wouldn't feel my pain when we both, synchronously, stub our toes. My experience of seeing the umbrella notice is caused (explained) by the umbrella notice here on Earth, not by the far away umbrella notice on twin Earth. When I say "this notice" I refer to the hypothetical object which causes my experience of a notice. So every instance of the indexical "this" involves reference to myself and to an experience I have. Both are internal, and thus numerically different even for agents with qualitatively identical experiences. So if we both say "This notice says I will see a red umbrella later today", we would express different statements. Their meaning would be different.


In summary, I think this is a good alternative to the 0P/1P theory. It provides a unified account of meanings, and it correctly deals with distinct agents using indexicals while having qualitatively identical experiences. Because it has a unified account of meaning, it has no in-principle problem with "mixed" (internal/external) statements.

It does omit possible worlds. So one objection would be that it would assign the same meaning to two hypotheses which make distinct but (in principle) unverifiable predictions. Like, perhaps, two different interpretations of quantum mechanics. I would say that a) these theories may differ in other aspects which are subject to some possible degree of (dis)confirmation and b) if even such indirect empirical comparisons are excluded a priori, regarding them as synonymous doesn't sound so bad, I would argue.

The problem with using possible worlds to determine meanings is that you can always claim that the meaning of "The mome raths outgrabe" is the set of possible worlds where the mome raths outgrabe. Since possible worlds (unlike anticipated degrees of confirmation by different possible experiences) are objects external to an agent, there is no possibility of a decision procedure which determines that an expression is meaningless. Nor can there, with the possible worlds theory, be a decision procedure which determines that two expressions have the same or different meanings. It only says the meaning of "Bob is a bachelor" is determined by the possible worlds where Bob is a bachelor, and that the meaning of "Bob is an unmarried man" is determined by the worlds where Bob is an unmarried man, but it doesn't say anything which would allow an agent to compare those meanings.

Comment by cubefox on Why correlation, though? · 2024-03-10T18:00:13.165Z · LW · GW

That's not quite right. It measures the strength of monotonic relationships, which may also be linear. So this measure is more general than Pearson correlation. It just measures whether, if one value increases, the other value increases as well, not whether they increase at the same rate.
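A quick illustration of the difference, assuming the measure in question is something like Spearman's rank correlation (my example, not from the original post):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

x = np.linspace(1, 10, 100)
y = x ** 3                   # monotonic, but far from linear

print(pearsonr(x, y)[0])     # < 1: penalized for the nonlinearity
print(spearmanr(x, y)[0])    # 1.0: a perfect monotonic relationship
```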

Comment by cubefox on Completion Estimates · 2024-03-10T00:15:08.068Z · LW · GW

One way to estimate completion times of a person or organization is to compile a list of their own past predictions and actual outcomes, and compute in each case how much longer (or shorter) the actual time to completion was in relative terms.

Since an outcome that took 100% longer than predicted (twice as long) and an outcome that took 50% less time (half as long) should intuitively cancel out, the geometric mean has to be used to compute the average. In the previous case that would be the square root of the product of those two factors, (2 * 0.5)^(1/2) = 1. In that case we should multiply future completion estimates by 1, i.e. leave them as is.
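A minimal sketch of that correction factor (the past ratios here are made up for illustration):

```python
import math

# Hypothetical track record: actual completion time divided by predicted time.
past_ratios = [2.0, 0.5, 1.5, 3.0]

# Geometric mean, so that "took twice as long" and "took half as long" cancel.
correction = math.prod(past_ratios) ** (1 / len(past_ratios))

new_estimate_days = 30
print(new_estimate_days * correction)  # adjusted completion estimate, ~44 days
```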

That only works if we have some past history of time estimates and outcomes. Another way would be to look at prediction markets, should one exist for the event at hand. Though that is ultimately a way of outsourcing the problem rather than one of computing it.

Comment by cubefox on Using axis lines for good or evil · 2024-03-09T13:24:10.386Z · LW · GW

These suggestions seem plausible. A few notes:

  • Tick marks for years are ambiguous. Is the tick mark indicating the start of the year? The middle of the year? The end of the year? I have worked with chart libraries, and sometimes it's even the current date, n years ago. Like a tick mark labelled "2010" = March 9, 2010. A better alternative to tick marks is to have "year separators", where the "2010" is placed between two "tick marks" rather than under one, which can only be interpreted as start and end of the year.
  • Regarding temperature. Physically speaking, only 0 Kelvin is an objective zero point, such that something at 20 Kelvin has "twice as much" temperature as something at 10 Kelvin. Kelvin is a "ratio" scale. Celsius and Fahrenheit are only "interval" scales, so 20°C is not twice as hot as 10°C, but only ~3.5% hotter (see the short calculation after this list). (See also this interesting Wikipedia article on the various types of scales.) This is even though 0°C (water freezes) seems more objective than 0°F.
    Nonetheless, Kelvin is not relevant to what we perceive as "small" and "large" differences in everyday life. We wouldn't say 20°C only feels a mere 3.5% warmer than 10°C.
    I guess it helps to include two familiar "reference points" in the temperature axis of a chart, like freezing and boiling of water (0°C and 100°C, or explicit labels for Fahrenheit) or "fridge temperature" and "room temperature". That should give some intuitive sense of distance in the temperature axis.
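The ~3.5% figure above comes from converting to Kelvin before taking the ratio (a quick check of my own):

(273.15 + 20) K / (273.15 + 10) K = 293.15 / 283.15 ≈ 1.035

i.e. about 3.5% hotter in the ratio-scale (Kelvin) sense.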
Comment by cubefox on Using axis lines for good or evil · 2024-03-09T12:33:21.593Z · LW · GW

This is a good point. Beginning in medias res also seems to be one of the reasons why posts by Eliezer Yudkowsky and Scott Alexander are so readable.

But for long posts I think a short abstract in the beginning is actually helpful, perhaps highlighted in italics. Unfortunately some people use the abstract as a mere teaser ("... wanna know how I came to that startling conclusion? Guess you have to read the whole paper/post, hehe") rather than as a proper spoiler of the main insights.

"Spoiler" sounds bad from the perspective of the author ("will people still read my post when I already revealed the punchline in the abstract?"), but a spoiler can actually provide motivation to read the whole post for "fun" reasons. E.g. by going "I already agree with that claim in the abstract, let's indulge in confirming my preconceptions!" or "I disagree with that claim, guess I have to read the post so I can write a rebuttal in the comments!" Not very rational, but better than not being motivated to read the post at all.

Though you probably use other tricks to make a post more readable. From your post above I inferred these points:

  • Use examples
  • Include images if possible
  • Don't clutter the post with a lot of distracting links and footnotes
  • Include rhetorical questions
  • Short sentences
  • Delete unnecessary tangents to make the post shorter

That's what I thought anyway. Maybe you could share your own tips? "How to Write Readable Posts"

Comment by cubefox on Why does generalization work? · 2024-03-06T21:16:19.838Z · LW · GW

As I said, he assumes there is some objectively correct way to define the "thingspace" and a probability distribution on it. Should this rather strong assumption hold, his argument seems plausible that categories (like "mortal") should, and presumably usually do, correspond to clusters of high probability density.

(By the way, macrostates, or at least categories, don't generally form a partition, because something can be both mortal and a biped.)

So I don't think he takes certain categories for granted, but rather the existence of an objective thingspace and probability distribution, which in turn would enable objective categories. But he doesn't argue for it (except very tangentially in a comment), so you may well doubt such an objective background exists.

I think some small ground to believe his theory is right is that most intuitively natural categories seem to be also objectively better than others, in the sense that they form, or have in the past formed, projectible predicates:

A property of predicates, measuring the degree to which past instances can be taken to be guides to future ones. The fact that all the cows I have observed have been four-legged may be a reasonable basis from which to predict that future cows will be four-legged. This means that four-leggedness is a projectible predicate. The fact that they have all been living in the late 20th or early 21st century is not a reasonable basis for predicting that future cows will be. See also entrenchment, Goodman's paradox.

Projectibility seems to me itself a rather objective statistical category.

Comment by cubefox on indexical uncertainty and the Axiom of Independence · 2024-03-06T20:35:14.892Z · LW · GW

Hm, interesting point about causal decision theory. It seems to me that even with CDT I should expect, as a (causal) consequence of pressing C, a higher probability that we get different vaccines than if I had only randomized between button A and B. Because I can expect some probability that the other guy also presses C (which then means we both do). Which would at least increase the overall probability that we get different vaccines, even if I'm not certain that we both press C. Though I find this confusing to reason about.

But anyway, this discussion of indexicals got me thinking of how to precisely express "actions" and "consequences" (outcomes?) in decision theory. And it seems that they should always trivially include an explicit or implicit indexical, not just in cases like the example above. Like for an action X, "I make X true", and for an outcome Y, "I'm in a world where Y is true". Something like that. Not sure how significant this is and whether there are counterexamples.

Comment by cubefox on indexical uncertainty and the Axiom of Independence · 2024-03-06T13:21:52.172Z · LW · GW

Thanks. My second interpretation of the independence axiom seemed to be on track. The car example in the post you linked is formally analogous to your vaccine example. The mother is indifferent between giving the car to her son (A) or daughter (B) but prefers to throw a coin (C, such that C=0.5A+0.5B) to decide who gets it. Constructing it like this, according to the post, would contradict Independence. But the author argues that throwing the coin is not quite the same as simply 0.5A+0.5B, so independence isn't violated.

This is similar to what I wrote, at the end, about your example above:

"I receive either vaccine A or B, but the opposite of what the other guy receives" isn't equivalent to "I receive either vaccine A or B".

Which would mean the example is compatible with the independence axiom.

Maybe there is a different example which would show that rational indexical preferences may contradict Independence, but I struggle to think of one.

Comment by cubefox on indexical uncertainty and the Axiom of Independence · 2024-03-05T16:13:38.794Z · LW · GW

This is a very old post, but I have to say I don't even understand the Axiom of Independence as presented here. It is stated:

The Axiom of Independence says that for any A, B, C, and p, you prefer A to B if and only if you prefer p A + (1-p) C to p B + (1-p) C.

If p A + (1-p) C and p B + (1-p) C are read as probability assignments (A with probability p and C with probability 1-p, and likewise for B), this means that both A and B are true if and only if C is false (two probabilities sum to 1 if and only if they are mutually exclusive and exhaustive). Which means A is true if and only if B is true, i.e. A ↔ B. Since A and B have the same truth value with certainty, they are equivalent events. Like being an unmarried man is equivalent to being a bachelor. If we have to treat A and B as the same event, this would strongly suggest we have to assign them the same utility. But U(A) = U(B) is the same as A ∼ B, i.e. being indifferent between A and B. Which contradicts the assumption that A is preferred to B. Which would make the axiom, at least as stated here, necessarily false, not just false in some more exotic case involving indexical events.

This is also the point where I get stuck understanding the rest of the post. I assume I'm missing something basic here.

Maybe the axiom is intended as something like this:

A ≻ B if and only if G_A ≻ G_B

where

G_A is a gamble that yields A with probability p and C with probability 1−p, and G_B is a gamble that yields B with probability p and C with probability 1−p.

Which could be interpreted as saying we should prefer A to B if and only if we would prefer "gambling between" (i.e. being uncertain about) A and C to gambling between B and C, assuming the relative odds are the same in both gambles.

But even in that case, it is unclear how the above can be interpreted like this:

This makes sense if p is a probability about the state of the world. (In the following, I'll use “state” and “possible world” interchangeably.) In that case, what it’s saying is that what you prefer (e.g., A to B) in one possible world shouldn’t be affected by what occurs (C) in other possible worlds.

Unfortunately the last example with the vaccine also doesn't help. It presents a case with just three options: I press A, I press B, I press C. It is unclear (to me) how preferring the third option (and being indifferent between the first two?) would contradict the independence axiom.

Even if we interpret the third option as a gamble between the first two, I'm not sure how this would be incompatible with Independence. But this interpretation doesn't work anyway, since "I receive either vaccine A or B, but the opposite of what the other guy receives" isn't equivalent to "I receive either vaccine A or B".

Comment by cubefox on Wei Dai's Shortform · 2024-03-02T03:08:46.303Z · LW · GW

Many parts of academia have a strong Not Invented Here tendency. Not only is research outside of academia usually ignored, but so is research outside a specific academic citation bubble, even if another bubble investigates a pretty similar issue. For example, economic decision theorists ignore philosophical decision theorists, who in turn mostly ignore the economic decision theorists. They each have their own writing style and concerns and canonical examples or texts. Which makes it hard for outsiders to read the literature or even contribute to it, so they don't.

A striking example is statistics, where various fields talk about the same mathematical thing with their own idiosyncratic names, unaware or unconcerned whether it already had a different name elsewhere.

Edit: Though LessWrong is also a citation bubble to some degree.

Comment by cubefox on Deep and obvious points in the gap between your thoughts and your pictures of thought · 2024-02-25T01:57:12.902Z · LW · GW

Something like the dual process model applied to me in early 2020. My "rational self" (system 2) judged it likely that the novel coronavirus was no longer containable at this point. That we would get a catastrophic global pandemic, like the Spanish flu. Mainly because of a chart I saw on Twitter that compared 2003 SARS case number growth with nCov case number growth. The number of confirmed cases was still very small, but it was increasing exponentially. Though my gut feeling (system 1) was still judging a global pandemic as unlikely. After all, something like that never happened in my lifetime, and getting double digits of new infections per day didn't yet seem worrying in the grand scheme of things. Exponential growth isn't intuitive. Moreover, most people, including rationalists on Twitter, were still talking about other stuff. Only some time later did my gut feeling "catch up" and the realization hit like a hammer. I think it's important not to forget how early 2020 felt.

Or another example: I currently think (system 2) that a devastating AI catastrophe will occur with some significant probability. But my gut feeling /system 1 still says that everything will surely work differently from how the doomers expect and that we will look naive in hindsight, just as a few years ago nobody expected LLMs to produce oracle AI that basically solves the Turing test, until shortly before it happened.

Those are examples of the system 1 thinking: The situation still looks fairly normal currently, so it will stay normal.

Comment by cubefox on Why does generalization work? · 2024-02-24T23:47:20.084Z · LW · GW

Eliezer has a defense of there being "objectively correct" macrostates / categories. See Mutual Information, and Density in Thingspace. He concludes:

And the way to carve reality at its joints, is to draw your boundaries around concentrations of unusually high probability density in Thingspace.

The open problems with this approach seem to be that it requires some objective notion of probability, and that there has to be an objectively preferred way of defining "thingspace". (Regarding the latter, I guess the dimensions of thingspace should fulfill some statistical properties, like being as probabilistically independent as possible, or something like that.) Otherwise everyone could have their own subjective probability and their own subjective thingspace, and "carving reality at its joints" wouldn't be possible.

But it seems to me that techniques from unsupervised / self-supervised learning do suggest that there are indeed some statistical features that allow for some objectively superior clustering of data.

Comment by cubefox on In set theory, everything is a set · 2024-02-23T21:20:13.773Z · LW · GW

I'm not metachirality, but I would recommend any introduction to simple type theory (basically: higher-order logic). If you already know first-order logic, it's a natural extension. This is a short one: https://www.sciencedirect.com/science/article/pii/S157086830700081X

Church's simple theory of types is pretty much the "basic" type theory; other type theories extend it in various ways. It's used, e.g., in computer science (some formal proof checkers), linguistics (formal semantics) and philosophy (metaphysics). It can also be used in mathematics as an alternative to the theory of ZFC, which is axiomatized in first-order logic.

Comment by cubefox on The Hidden Complexity of Wishes · 2024-02-15T19:14:26.401Z · LW · GW

I'm well aware of and agree there is a fundamental difference between knowing what we want and being motivated to do what we want. But as I wrote in the first paragraph:

Already LaMDA or InstructGPT (language models fine-tuned with supervised learning to follow instructions, essentially ChatGPT without any RLHF applied), are in fact pretty safe Oracles in regard to fulfilling wishes without misinterpreting you, and an Oracle AI is just a special kind of Genie whose actions are restricted to outputting text. If you tell InstructGPT what you want, it will very much try to give you just what you want, not something unintended, at least if it can be produced using text.

That is, instruction-tuned language models do not just understand (epistemically) what we want them to do; they additionally, to a large extent, do what we want them to do. They are good at executing our instructions, not just at understanding them and then doing something unintended.

(However, I agree they are probably not perfect at executing our instructions as we intended them. We might ask them to answer to the best of their knowledge, and they may instead answer with something that "sounds good" but is not what they in fact believe. Or, perhaps, as Gwern pointed out, they exhibit things like a strange tendency to answer our request for a non-rhyming poem with a rhyming poem, even though they may be well-aware, internally, that this isn't what was requested.)

Comment by cubefox on Conditional prediction markets are evidential, not causal · 2024-02-08T17:20:28.495Z · LW · GW

See also: Dynomight - Prediction market does not imply causation

Comment by cubefox on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-06T17:34:26.882Z · LW · GW

The problem with "lab-leak is unlikely, look at this 17-hour debate" is that it is too short an argument, not a too long one. Arguments don't get substituted by merely referring to them. The expression "the Riemann hypothesis" is not synonymous with the Riemann hypothesis. The former just refers to the latter. You can understand one without understanding the other.

On average, every argument gets more indirect the less accessible it is, and inaccessibility is strongly dependent on length, as well as language, intelligibility, format etc.

It may be that the 17-hour debate cannot be summarized adequately (depending on some standard of adequacy) in a nine-minute post, but a nine-minute post would be much more adequate than "lab-leak is unlikely, look at this 17-hour debate".

Comment by cubefox on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-04T12:52:45.342Z · LW · GW

I tried to look at Daniel Filan's tweets. The roughly 240-post megathread is a largely unstructured stream of consciousness that is hard to read. It doesn't so much make direct arguments as serve as a scratchpad for thoughts and questions that occurred to the author while listening to the debate. The mere act of pointing to such a borderline unreadable thread IMHO doesn't itself constitute a significant counterargument, nor does the act of referring to the prospect of a future summary article. Reasonable norms of good debate suggest relevant counterarguments should be proportional in length and readability to the original argument, which in this case is Roko's compact nine-minute post.

Again, imagine me making a case for or against AI risk by pointing to a 17-hour YouTube debate and to an overly long and convoluted Twitter thread of the thoughts of some guy who listened to the debate. I think few would take me seriously.

I also note that you originally seemed to suggest you watched the debate in full, while you now sound as if you watched only parts of it and read Daniel Filan's thread. If this alone really got you "from about 70% likelihood of a lab-leak to about 1-5%", then I think it should be easy for you to post an object-level counterargument to Roko's post.

(Apart from that, Daniel Filan himself says he is 75-80% convinced of the zoonosis hypothesis, in contrast to your 95-99%, while also noting that this estimate doesn't even include considerations about genetic evidence which he apparently expects to favor the lab-leak hypothesis. He also says: "Oh also this is influenced by the zoonosis guy being more impressive, which may or may not be a bias." Though pointing to other people's credences doesn't provide a significant argument for anything unless there is also evidence this individual is unusually reliable at making such judgements.)

Comment by cubefox on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-04T11:30:35.971Z · LW · GW

I think it's less your obligation to weed through a 17 hour debate than it is Algon's obligation, who implies he watched the debate, to list here the arguments that convinced him that the lab-leak hypothesis is false.

Comment by cubefox on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-04T11:21:52.030Z · LW · GW

Could you please list your relevant object-level arguments for the lab-leak being unlikely?

(Posting a link to an extremely long, almost 17 hours, YouTube debate, and to bets on it based on a judgement of unknown judges, is not very helpful and doesn't itself constitute a strong counterargument to the lab-leak arguments in this post. This is similar to how pointing to the supposed "winner" of a recent AI risk debate isn't a strong argument against AI risk.)