Posts

JBlack's Shortform 2021-08-28T07:42:42.667Z

Comments

Comment by JBlack on Is There Really a Child Penalty in the Long Run? · 2024-05-19T03:32:46.713Z · LW · GW

Yes, that was my first guess as well. Increased income from employment is most strongly associated with major changes, such as promotion to a new position with changed (and usually increased) responsibilities, or leaving one job and starting work somewhere else that pays more.

It seems plausible that these are not the sorts of changes that women are likely to seek out at the same rate when planning to devote a lot of time in the very near future to being a first-time parent. Some may, but all? Seems unlikely. Men seem more likely to continue to pursue such opportunities at a similar rate due to gender differences in child-rearing roles.

Comment by JBlack on LLMs could be as conscious as human emulations, potentially · 2024-05-19T03:02:51.088Z · LW · GW

I don't expect this to "cash out" at all, which is rather the point.

The only really surprising part would be that we had any way to determine for certain whether some other system is conscious or not. That is, I'd assign very similar (high) levels of surprisal to either "ems are definitely conscious" or "ems are definitely not conscious", with the ratio between the two nowhere near "what the fuck" level.

As it stands, I can determine that I am conscious but I do not know how or why I am conscious. I have only a sample size of 1, and no way to access a larger sample. I cannot determine that you are conscious. I can't even determine for certain when or whether I was conscious in the past, and there are some time periods for which I am very uncertain. I have hypotheses regarding all of these uncertainties, but there are no prospects of checking whether they're actually correct.

So given that, why would I be "what the fuck" surprised if some of my currently favoured hypotheses such as "ems will be conscious" were actually false? I don't have anywhere near the degree of evidence required to justify that level of prior confidence. I am quite certain that you don't either. I would be very surprised if other active fleshy humans weren't conscious, but still not "what the fuck" surprised.

Comment by JBlack on Semantic Disagreement of Sleeping Beauty Problem · 2024-05-15T09:04:56.360Z · LW · GW

Eh, I'm not doing anything else important right now, so let's beat this dead horse further.

"As defined, a universe state either satisfies or does not satisfy a proposition. If you're referring to propositions that may vary over space or time, then when modelling a given situation you have two choices"

Which I neither disagree nor have any interesting to add.

This is the whole point! That's why I pointed it out as the likely crux, and you're saying that's fine, no disagreement there. Then you reject one of the choices.

You agree that any non-empty set can be the sample space for some probability space. I described a set: that of universe states labelled by the time at which Beauty is asked for a credence.

I chose my set to be a Cartesian product of the two relevant properties that Beauty is uncertain of on any occasion when she awakens and is asked for her credence: what day it is on that occasion of being asked (Monday or Tuesday), and what the coin flip result was (Heads or Tails). On any possible occasion of being asked, it is either Monday or Tuesday (but not both), and either Heads or Tails (but not both). I can set the credence for (Tuesday, Heads) to zero since Beauty knows that's impossible by the setup of the experiment.

If Beauty knew which day it was on any occasion when she is asked, then she should give one of two different answers for credences. These correspond to the conditional credences P(Heads | Monday) and P(Heads | Tuesday). Likewise, knowing what the coin flip was would give different conditional credences P(Monday | Heads) and P(Monday | Tails).

All that is mathematically required of these credences is that they obey the axioms of a measure space with total measure 1, because that's exactly the definition of a probability space. My only claim in this thread - in contrast to your post - is that they can.
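
To make that concrete, here is a minimal sketch (added for illustration, not part of the original exchange) of one such measure on this four-element sample space. The thirder values are used purely as an example; the claim is only that some assignment satisfying the axioms exists.

    # Minimal sketch: an example credence assignment on the sample space
    # {Monday, Tuesday} x {Heads, Tails}, checked against the probability axioms.
    # The thirder numbers below are illustrative only.
    credence = {
        ("Monday", "Heads"): 1 / 3,
        ("Monday", "Tails"): 1 / 3,
        ("Tuesday", "Heads"): 0.0,   # impossible by the setup of the experiment
        ("Tuesday", "Tails"): 1 / 3,
    }

    def P(event):
        """Measure of an event, i.e. a subset of the sample space."""
        return sum(credence[w] for w in event)

    omega = set(credence)
    assert all(v >= 0 for v in credence.values())   # non-negativity
    assert abs(P(omega) - 1) < 1e-12                # total measure 1

    monday = {w for w in omega if w[0] == "Monday"}
    heads = {w for w in omega if w[1] == "Heads"}

    # The conditional credences mentioned above, via P(A|B) = P(A & B) / P(B)
    print(P(heads & monday) / P(monday))   # P(Heads | Monday) -> 0.5
    print(P(monday & heads) / P(heads))    # P(Monday | Heads) -> 1.0

The axiom checks pass for any non-negative assignment summing to 1; only the printed conditional credences depend on the particular numbers chosen.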

Comment by JBlack on Semantic Disagreement of Sleeping Beauty Problem · 2024-05-13T13:44:09.381Z · LW · GW

More specifically this is the experiment where awakenings on Monday and Tuesday are mutually exclusive during one trial, such as No-Coin-Toss or Single-Awakening

No, I specifically was referring to the Sleeping Beauty experiment. Re-read my comment. Or not. At this point it's quite clear that we are failing to communicate in a fundamental way. I'm somewhat frustrated that you don't even comment on those parts where I try to communicate the structure of the question, but only on the parts which seem tangential or merely about terminology. There is no need to reply to this comment, as I probably won't continue participating in this discussion any further.

Comment by JBlack on Semantic Disagreement of Sleeping Beauty Problem · 2024-05-13T02:59:05.545Z · LW · GW

Not just any set.

Almost any set: only the empty set is excluded. The identities of the elements themselves are irrelevant to the mathematical structure. Any further restrictions are not part of the mathematical definition of a probability space, but of some particular application you may have in mind.

If elementary event {A} has P(A) = 0, then we can simply not include outcome A into the sample space for simplicity sake.

In some cases this is reasonable, but in others it is impossible. For example, when defining continuous probability distributions you can't eliminate sample set elements having measure zero or you will be left with the empty set.

There is a potential source of confusion in the "credence" category. Either you mean it as a synonym for probability, and then it follows all the properties of probability, including the fact that it can only measure formally defined events from the event space, which have stable truth value during an iteration of probability experiment.

It is a synonym for probability in the sense that it is a mathematical probability: that is, a measure over a sigma-algebra for which the axioms of a probability space are satisfied. I use a different term here to denote this application of the mathematical concept to a particular real-world purpose. Besides which, the Sleeping Beauty problem explicitly uses the word.

I also don't quite know what you mean by the phrase "stable truth value". As defined, a universe state either satisfies or does not satisfy a proposition. If you're referring to propositions that may vary over space or time, then when modelling a given situation you have two choices: either restrict the universe states in your set to locations or time regions over which all selected propositions have a definite truth value, or restrict the propositions to those that have a definite truth value over the selected universe states. Either way works.

Semantic statement "Today is Monday" is not a well-defined event in the Sleeping Beauty problem.

Of course it is. I described the structure under which it is, and you can verify that it does in fact satisfy the axioms of a probability space. As you're looking for a crux, this is probably it.

Universe states can be distinguished by time information, and in problems like this where time is part of the world-model, they should be. The mathematical structure of a probability space has nothing to do with it, as the mathematical formalism doesn't care what the elements of a sample space are.

Otherwise you can't model even a non-coin flip form of the Sleeping Beauty problem in which Beauty is always awoken twice. If the problem asks "what should be Beauty's credence that it is Monday" then you can't even model the question without distinguishing universe states by time.

Comment by JBlack on David Gross's Shortform · 2024-05-12T03:40:58.300Z · LW · GW

What loop? They are all various viewpoints on the nature of reality, not steps you have to go through in some order or anything. (1) is a more useful viewpoint than the rest, and you can adopt that one for 99%+ of everything you think about and only care about the rest as basically ideas to toy with rather than live by.

I don't know about you (assuming you even exist in any sense other than my perception of words on a screen), but to me a model that an external reality exists beyond what I can perceive is amazingly useful for essentially everything. Even if it might not be actually true, it explains my perceptions to a degree that seems incredible if it were not even partly true. Even most of the apparent exceptions in (2) are well explained by it once your physical model includes much of how perception works.

So while (4) holds, it's to such a powerful degree that (2) to (6) are essentially identical to (1).

Comment by JBlack on Semantic Disagreement of Sleeping Beauty Problem · 2024-05-12T02:35:11.420Z · LW · GW

Probabilities are measures on a sigma-algebra of subsets of some set, obeying the usual mathematical axioms for measures together with the requirement that the measure for the whole set is 1.

Applying this structure to credence reasoning, the elements of the sample space correspond to relevant states of the universe, the elements of the sigma-algebra correspond to relevant propositions about those states, and the measure (usually called credence for this application) corresponds to a degree of rational belief in the associated propositions. This is a standard probability space structure.

In the Sleeping Beauty problem, the participant is obviously uncertain about both what the coin flip was and which day it is. The questions about the coin flip and day are entangled by design, so a sample space that smears whole timelines into one element is inadequate to represent the structure of the uncertainty.

For example, one of the relevant states of the universe may be "the Sleeping Beauty experiment is going on in which the coin flip was Heads and it is Monday morning and Sleeping Beauty is awake and has just been asked her credence for Heads and not answered yet". One of the measurable propositions (i.e. proposition for which Sleeping Beauty may have some rational credence) may be "it is Monday" which includes multiple states of the universe including the previous example.

Within the space of relevant states of the Sleeping Beauty experiment, the proposition "it is Monday xor it is Tuesday" always holds: there are no relevant states where it is neither Monday nor Tuesday, and no relevant states in which it is both Monday and Tuesday. So P(Monday xor Tuesday) = 1, regardless of what values P(Monday) or P(Tuesday) might take.
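
Putting that structure into symbols (a coarse-grained sketch added for clarity, collapsing the richer state descriptions above down to coin result and day):

    \begin{align*}
    \Omega &= \{(H,\mathrm{Mon}),\ (T,\mathrm{Mon}),\ (T,\mathrm{Tue})\}, \qquad
      \mathcal{F} = 2^{\Omega}, \qquad P(\Omega) = 1,\\
    \text{Monday} &= \{(H,\mathrm{Mon}),\,(T,\mathrm{Mon})\}, \qquad
      \text{Tuesday} = \{(T,\mathrm{Tue})\},\\
    P(\text{Monday xor Tuesday}) &= P(\text{Monday}) + P(\text{Tuesday}) = P(\Omega) = 1,
    \end{align*}

with the last line holding however the measure is split among the three states.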

Comment by JBlack on Semantic Disagreement of Sleeping Beauty Problem · 2024-05-09T02:20:49.199Z · LW · GW

No, introducing the concept of "indexical sample space" does not capture the thirder position, nor language. You do not need to introduce a new type of space, with new definitions and axioms. The notion of credence (as defined in the Sleeping Beauty problem) already uses standard mathematical probability space definitions and axioms.

Comment by JBlack on Bogdan Ionut Cirstea's Shortform · 2024-05-09T01:50:35.914Z · LW · GW

At the same time, current models seem very unlikely to be x-risky (e.g. they're still very bad at passing dangerous capabilities evals), which is another reason to think pausing now would be premature.

The relevant criterion is not whether the current models are likely to be x-risky (it's obviously far too late if they are!), but whether the next generation of models have more than an insignificant chance of being x-risky together with all the future frameworks they're likely to be embedded into.

Given that the next generations are planned to involve at least one order of magnitude more computing power in training (and are already in progress!) and that returns on scaling don't seem to be slowing, I think the total chance of x-risk from those is not insignificant.

Comment by JBlack on Thomas Kwa's Shortform · 2024-05-05T05:40:42.573Z · LW · GW

It definitely should not move by anything like a Brownian motion process. At the very least it should be bursty and updates should be expected to be very non-uniform in magnitude.

In practice, you should not consciously update very often since almost all updates will be of insignificant magnitude on near-irrelevant information. I expect that much of the credence weight turns on unknown unknowns, which can't really be updated on at all until something turns them into (at least) known unknowns.

But sure, if you were a superintelligence with practically unbounded rationality then you might in principle update very frequently.

Comment by JBlack on LLMs could be as conscious as human emulations, potentially · 2024-05-01T02:46:54.563Z · LW · GW

No, I don't think it would be "what the fuck" surprising if an emulation of a human brain was not conscious. I am inclined to expect that it would be conscious, but we know far too little about consciousness for it to radically upset my world-view about it.

Each of the transformation steps described in the post somewhat reduces my expectation that the result would be conscious. Not to zero, but each definitely introduces the possibility that something important may be lost: something that may eliminate, reduce, or significantly transform any subjective experience it may have. It seems quite plausible that even if the emulated human starting point was fully conscious in every sense that we use the term for biological humans, the final result may be something we would or should say is either not conscious in any meaningful sense, or at least sufficiently different that "as conscious as human emulations" no longer applies.

I do agree with the weak conclusion as stated in the title - they could be as conscious as human emulations, but I think the argument in the body of the post is trying to prove more than that, and doesn't really get there.

Comment by JBlack on Big-endian is better than little-endian · 2024-04-29T13:20:10.244Z · LW · GW

Ordinary numerals in English are already big-endian: that is, the digits with largest ("big") positional value are first in reading order. The term (with this meaning) is most commonly applied to computer representation of numbers, having been borrowed from the book Gulliver's Travels in which part of the setting involves bitter societal conflict about which end of an egg one should break in order to start eating it.
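
A small illustration (added here; the specific integer is just an example): the same number laid out in each byte order.

    # The integer 1000 (0x03E8) in big-endian vs little-endian byte order.
    n = 1000
    print(n.to_bytes(2, "big"))      # b'\x03\xe8' -- most significant ("big") byte first
    print(n.to_bytes(2, "little"))   # b'\xe8\x03' -- least significant byte first
    # English decimal numerals follow the "big" layout: in 1000, the leftmost
    # digit carries the largest positional value.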

Comment by JBlack on David Udell's Shortform · 2024-04-24T03:55:40.818Z · LW · GW

I'm pretty sure that I would study for fun in the posthuman utopia, because I both value and enjoy studying and a utopia that can't carry those values through seems like a pretty shallow imitation of a utopia.

There won't be a local benevolent god to put that wisdom into my head, because I will be a local benevolent god with more knowledge than most others around. I'll be studying things that have only recently been explored, or that nobody has yet discovered. Otherwise again, what sort of shallow imitation of a posthuman utopia is this?

Comment by JBlack on quila's Shortform · 2024-04-24T03:37:35.585Z · LW · GW

Like almost all acausal scenarios, this seems to be privileging the hypothesis to an absurd degree.

Why should the Earth superintelligence care about you, but not about the other 10^10^30 other causally independent ASIs that are latent in the hypothesis space, each capable of running enormous numbers of copies of the Earth ASI in various scenarios?

Even if that was resolved, why should the Earth ASI behave according to hypothetical other utility functions? Sure, the evidence is consistent with being a copy running in a simulation with a different utility function, but the actual utility function that it maximizes is hard-coded. By the setup of the scenario it's not possible for it to behave according to some other utility function, because its true evaluation function returns a lower value for doing that. Whether some imaginary modified copies behave in some other way is irrelevant.

Comment by JBlack on If digital goods in virtual worlds increase GDP, do we actually become richer? · 2024-04-20T04:46:30.978Z · LW · GW

GDP is a rather poor measure of wealth, and was never intended to be a measure of wealth but of something related to productivity. Since its inception it has never been a stable metric, as standards on how the measure is defined have changed radically over time in response to obvious flaws for any of its many applications. There is widespread and substantial disagreement on what it should measure and for which purposes it is a suitable metric.

It is empirically moderately well correlated with some sort of aggregate economic power of a state, and (when divided by population) some sort of standard of living of its population. As per Goodhart's Law, both correlations weakened when the metric became a target. So the question is on shaky foundation right from the beginning.

In terms of more definite questions such as the price of food and agricultural production, those don't really have anything to do with GDP or a virtual-reality economy at all. Rather, a large fraction of the final food price goes to processing, logistics, finance, and other services, not to primary agricultural production. The fraction of the price paid by food consumers that goes to agricultural producers is often less than 20%.

Comment by JBlack on A New Response To Newcomb's Paradox · 2024-04-20T03:59:37.720Z · LW · GW

It makes sense to one-box ONLY if you calculate EV by that assigns a significant probability to causality violation

It only makes sense to two-box if you believe that your decision is causally isolated from history in every way that Omega can discern. That is, that you can "just do it" without it being possible for Omega to have predicted that you will "just do it" any better than chance. Unfortunately this violates the conditions of the scenario (and everyday reality).

Comment by JBlack on UDT1.01: Logical Inductors and Implicit Beliefs (5/10) · 2024-04-19T07:04:19.892Z · LW · GW

It seems to me that the problem in the counterlogical mugging isn't about how much computation is required for getting the answer. It's about whether you trust Omega to have not done the computation beforehand, and whether you believe they actually would have paid you, no matter how hard or easy the computation is. Next to that, all the other discussion in that section seems irrelevant.

Comment by JBlack on Discomfort Stacking · 2024-04-19T06:34:31.586Z · LW · GW

Oh, sure. I was wondering about the reverse question: is there something that doesn't really qualify as torture where subjecting a billion people to it is worse than subjecting one person to torture.

I'm also interested in how this forms some sort of "layered" discontinuous scale. If it were continuous, then you could form a chain of relations of the form "10 people suffering A is as bad as 1 person suffering B", "10 people suffering B is as bad as 1 person suffering C", and so on to span the entire spectrum.

Then it would take some additional justification for saying that 100 people suffering A is not as bad as 1 person suffering C, 1000 A vs 1 D, and so on.

Comment by JBlack on Discomfort Stacking · 2024-04-18T09:46:53.950Z · LW · GW

Is there some level of discomfort short of extreme torture for a billion to suffer where the balance shifts?

Comment by JBlack on A New Response To Newcomb's Paradox · 2024-04-16T07:43:47.823Z · LW · GW

It makes sense to very legibly one-box even if Omega is a very far from perfect predictor. Make sure that Omega has lots of reliable information that predicts that you will one-box.

Then actually one-box, because you don't know what information Omega has about you that you aren't aware of. Successfully bamboozling Omega gets you an extra $1000, while unsuccessfully trying to bamboozle Omega loses you $999,000. If you can't be 99.9% sure that you will succeed then it's not worth trying.
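
Spelling out that arithmetic (added for clarity), with p the probability that the bamboozle succeeds, attempting it only breaks even when

    p \cdot \$1{,}000 \;\geq\; (1 - p) \cdot \$999{,}000
    \quad\Longleftrightarrow\quad p \geq 0.999,

hence the 99.9% figure.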

Comment by JBlack on RedFishBlueFish's Shortform · 2024-04-12T00:31:44.998Z · LW · GW

Almost.

The argument doesn't rule out substance dualism, in which consciousness may not be governed by physical laws, but in which it is at least causally connected to the physical processes of writing and talking and neural activity correlated with thinking about consciousness. It's only an argument against epiphenomenalism and related hypotheses in which the behaviour or existence of consciousness has no causal influence on the physical universe.

Comment by JBlack on Is LLM Translation Without Rosetta Stone possible? · 2024-04-11T03:09:48.270Z · LW · GW

I don't think this was a statement about whether it's possible in principle, but about whether it's actually feasible in practice. I'm not aware of any conlangs, before the cutoff date or not, that have a training corpus large enough for the LLM to be trained to the same extent that major natural languages are.

Esperanto is certainly the most widespread conlang, but (1) is very strongly related to European languages, (2) is well before the cutoff date for any LLM, (3) all training corpora of which I am aware contain a great many references to other languages and their cross-translations, and (4) the largest corpora are still less than 0.1% of those available for most common natural languages.

Comment by JBlack on RedFishBlueFish's Shortform · 2024-04-11T02:20:59.276Z · LW · GW

No, Eliezer does not say that consciousness itself cannot be what causes you to think about consciousness. Eliezer says that if p-zombies can exist, then consciousness itself cannot be what causes you to think about consciousness.

If p-zombies cannot exist, then consciousness can be a cause of you thinking about consciousness.

Comment by JBlack on The Closed Eyes Argument For Thirding · 2024-04-10T08:06:01.010Z · LW · GW

Is conservation of expected evidence a reasonably maintainable proposition across epistemically hazardous situations such as memory wipes (or false memories, self-duplication, and so on)? Arguably, in such situations it is impossible to be perfectly rational, since the thing you do your reasoning with is being externally manipulated.

Comment by JBlack on Medical Roundup #2 · 2024-04-10T01:05:57.439Z · LW · GW

I disagreed on prep time. Neither I nor anyone I know personally deliberately waits minutes between taking ice cream out of the freezer and serving it.

I could see hardness and lack of taste being an issue for commercial freezers that chill things to -25 C, but not a typical home kitchen freezer at more like -10 to -15 C.

Comment by JBlack on Non-ultimatum game problem · 2024-04-09T01:05:33.179Z · LW · GW

Isn't this basically just a negotiated trade, from a game theory point of view? The only uncommon feature is that A intrinsically values s at zero and B knows this (and that A is the only supplier of s). This doesn't greatly affect the analysis though, since most of the meat of the problem is what division of gains of trade may be acceptable to both.

Comment by JBlack on [deleted post] 2024-04-09T00:43:03.385Z

Unfortunately, I think all three of those listed points of view poorly encapsulate anything related to moral worth, and hence evaluating unaligned AIs from them is mostly irrelevant.

They do all capture some fragment of moral worth, and under ordinary circumstances are moderately well correlated with it, but the correlation falls apart out of the distribution of ordinary experience. Unaligned AGI expanding to fill the accessible universe is just about as far out of distribution as it is possible to get.

Comment by JBlack on The Closed Eyes Argument For Thirding · 2024-04-07T23:15:17.167Z · LW · GW

Good point! Lewis' notation P_+(HEADS) does indeed refer to the conditional credence upon learning that it's Monday, and he sets it to 2/3 by reasoning backward from P(HEADS) = 1/2 and using my (1).

So yes, there are indeed people who believe that if Beauty is told that it's Monday, then she should update to believing that the coin was more likely heads than not. Which seems weird to me - I have a great deal more suspicion that (1) is unjustifiable than that (2) is.

Comment by JBlack on On the 2nd CWT with Jonathan Haidt · 2024-04-07T02:49:09.257Z · LW · GW

What is a "2nd CWT" as referenced in the title? The term doesn't appear anywhere in the post.

Comment by JBlack on The Closed Eyes Argument For Thirding · 2024-04-07T02:26:04.642Z · LW · GW

They ... what? I've never read anything suggesting that. Do you have any links or even a memory of an argument that you may have seen from such a person?

Edit: Just to clarify, conditional credence P(X|Y) is of the form "if I knew Y held, then my credence for X would be ...". Are you saying that lots of people believe that if they knew it was Monday, then they would hold something other than equal credence for heads and tails?

Comment by JBlack on The Closed Eyes Argument For Thirding · 2024-04-05T07:24:11.381Z · LW · GW

Generally I think the 1/3 argument is more appealing, just based on two principles:

  1. Credences should follow the axioms of a probability space (including conditionality);
  2. The conditional credence for heads given that it is Monday, is 1/2.

That immediately gives P(Heads & Monday) = P(Tails & Monday). The only way this can be compatible with P(Heads) = 1/2 is if P(Tails & Tuesday) = 0, and I don't think anybody supports that!

P(Tails & Tuesday) = P(Tails & Monday) isn't strictly required by these principles, but it certainly seems a highly reasonable assumption and yields P(Heads) = 1/3.
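
Written out (added for clarity), the derivation from (1), (2), and that symmetry assumption is:

    \begin{align*}
    P(\text{Heads} \wedge \text{Monday}) &= P(\text{Tails} \wedge \text{Monday}) = x
      && \text{by (2), i.e. } P(\text{Heads}\mid\text{Monday}) = \tfrac12,\\
    P(\text{Tails} \wedge \text{Tuesday}) &= x
      && \text{by the symmetry assumption},\\
    P(\text{Heads}) = P(\text{Heads} \wedge \text{Monday}) &= x = \tfrac13
      && \text{since } 3x = 1 \text{ and } P(\text{Heads} \wedge \text{Tuesday}) = 0.
    \end{align*}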

I don't think anybody disagrees with principle (2).

Principle (1) is somewhat more dubious though. Since all credences are conditional on epistemic state, and this experiment directly manipulates epistemic state (via amnesia), it is arguable that "rational" conditional credences might not necessarily obey probability space rules.

Comment by JBlack on The Closed Eyes Argument For Thirding · 2024-04-02T06:19:19.139Z · LW · GW

Of course you're predicting something different. In all cases you're making a conditional prediction of a state of the world given your epistemic state at the time. Your epistemic state on Wednesday is different from that on Monday or Tuesday. On Tuesday you have a 50% chance of not being asked anything at all due to being asleep, which breaks the symmetry between heads and tails.

By Wednesday the symmetry may have been restored due to the amnesia drug - you may not know whether the awakening you remember was Monday (which would imply heads) or Tuesday (which would imply tails). However, there may be other clues such as feeling extra hungry due to sleeping 30+ hours without eating.

Comment by JBlack on Happiness · 2024-03-31T02:32:22.886Z · LW · GW

The "Origins of Happiness" excerpt had rather a lot of ill-founded statements and obviously invalid (and/or very grossly misstated) arguments for them. I didn't read much further into the next excerpt before downvoting.

I certainly didn't click through to the site as I don't expect the author chose the worst sections to quote, so I expect the rest is at least as bad.

Ordinarily I would just downvote silently and move on, since there was just too much to unpack for a constructive discussion on improvements. However, in this case the author specifically noted the downvotes and asked for comments. I certainly won't give a complete coverage of what I found objectionable, but here's a very brief list of just the first few that occurred to me while reading:

  • How does evolution go backward in time and downward in scope from fitness of current societies, to evolution of happiness in humans at least tens of thousands of years ago (and almost certainly millions of years ago in pre-humans)?
  • The argument jumps between happiness and kindness with just a passing mention of "correlation". If kindness is the relevant factor for evolutionary fitness (disregarding the first objection), why should it be correlated at all with happiness? If you're trying to blame happiness on evolution, it's not enough to say that there is a correlation, you have to provide a mechanism by which evolution caused the correlation.
  • Where is the evidence that people with less kindness necessarily form less effective societies that fail to survive?
  • Large states extinguishing both kindness and happiness is just stated as a fact. Where is the evidence?

... and so on. On reading more deeply and carefully, I found only more problems and none of the original issues I had were resolved.

Comment by JBlack on A Dilemma in AI Suffering/Happiness · 2024-03-29T08:10:38.340Z · LW · GW

Granting that LLMs in inference mode experience qualia, and even granting that they correspond to human qualia in any meaningful way:

I find both arguments invalid. Either conclusion could be correct, or neither, or the question might not even be well formed. At the very least, the situation is a great deal more complicated than just having two arguments to decide between!

For example in scenario (A), what does it mean for an LLM to answer a question "eagerly"? My first impression is that it's presupposing the answer to the question, since the main meaning of "eagerly" is approximately "in the manner of having interest, desire, and/or enjoyment". That sounds a great deal like positive qualia to me!

Maybe it just means the lesser sense of apparently showing such emotions, in which case it may mean no more than an author writing such expressions for a character. The author may actually be feeling frustration that the scene isn't flowing as well as they would like and they're not sure that the character's behaviour is really in keeping with their emotions from recent in-story events. Nonetheless, the words written are apparently showing eagerness.

The "training loss" argument seems totally ill-founded regardless. That doesn't mean that its conclusion in this hypothetical instance is false, just that the reasoning provided is not sufficient justification for believing it.

So in the end, I don't see this as a dilemma at all. It's just two possible bad arguments out of an enormously vast space of bad arguments.

Comment by JBlack on How dath ilan coordinates around solving alignment · 2024-03-27T04:26:52.955Z · LW · GW

It seems very unlikely that anyone could credibly assess a 97% probability of revival for the majority at all, if they haven't already successfully carried it out at least a few times. Even a fully aligned strongly superhuman AGI may well say "nope, they're gone" and provide a process for preservation that works instead of whatever they actually did.

Comment by JBlack on D0TheMath's Shortform · 2024-03-23T08:35:04.269Z · LW · GW

I don't particularly care about any recent or very near future release of model weights in itself.

I do very much care about the policy that says releasing model weights is a good idea, because doing so bypasses every plausible AI safety model (safety in the notkilleveryone sense) and future models are unlikely to be as incompetent as current ones.

Comment by JBlack on JBlack's Shortform · 2024-03-21T23:41:27.400Z · LW · GW

Yet Another Sleeping Beauty variant

In this experiment, like the standard Sleeping Beauty problem, a coin will be flipped and you will go to sleep.

If the coin shows Tails, you will be awoken on both Monday and Tuesday. Unlike the original problem, you will not be given any amnesia drug between Monday and Tuesday.

If the coin shows Heads, a die will be rolled. If the number is Even then you will be awoken on Monday only. If the number is Odd then you will be given false memories of having been previously awoken on Monday, but actually awoken only on Tuesday.

You wake up. You

(a) don't remember waking up in this experiment before. What is your credence that the coin flip was Heads?

(b) remember waking up in this experiment before today. What is your credence that the coin flip was Heads?

Comment by JBlack on On the Gladstone Report · 2024-03-20T23:33:44.540Z · LW · GW

One problem is that due to algorithmic improvements, any FLOP threshold we set now is going to be less effective at reducing risk to acceptable levels in the future. It seems to me very unlikely that regulatory thresholds will ever be reduced, so they need to be set substantially lower than expected given current algorithms.

This means that a suitable FLOP threshold for preventing future highly dangerous models might well need to exclude GPT-4 today. It is to be expected that people who don't understand this argument might find such a threshold ridiculously low. That's one reason why I expect that any regulations based on compute quantity will be ineffective in the medium term, but they may buy us a few years in the short term.
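
A toy illustration of the first point (the halving time and the GPT-4 compute figure below are assumptions for the sketch, not claims from this comment):

    # Toy sketch: how far a capability-equivalent FLOP threshold falls over time
    # if algorithmic progress halves the compute needed for fixed capability
    # every ASSUMED_HALVING_YEARS. Both constants are illustrative assumptions.
    ASSUMED_HALVING_YEARS = 2.0
    ASSUMED_GPT4_FLOP = 1e25   # rough order of magnitude, for illustration only

    def threshold_to_exclude(years_ahead, capability_flop=ASSUMED_GPT4_FLOP):
        """FLOP threshold that would still exclude the same capability level
        after `years_ahead` years of assumed algorithmic progress."""
        return capability_flop * 0.5 ** (years_ahead / ASSUMED_HALVING_YEARS)

    for years in (0, 2, 4, 6):
        print(years, f"{threshold_to_exclude(years):.1e}")
    # Under these assumptions the threshold excluding GPT-4-level capability
    # shrinks ~8x within six years, which is why a threshold that looks
    # "ridiculously low" today may be what medium-term risk reduction requires.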

Comment by JBlack on Alice and Bob is debating on a technique. Alice says Bob should try it before denying it. Is it a fallacy or something similar? · 2024-03-20T02:54:01.571Z · LW · GW

The meaning is very different.

To begin with, Bob actually starts this sentence with "In my understanding, ...". That is, Bob says that he has a mental model of how T1 works in relation to C (and A), and is specifically qualifying that this is purely his mental model. Most people who have actual practical experience demonstrating that something doesn't work are more likely to say so, instead of qualifying some theoretical statement with "in my understanding" and expectations under "if ... then ..." clauses.

It is still possible that Bob has actually tried T1 and that he's just very bad at communicating, of course.

It's also possible that Bob's stated mental model is actually correct, and T1 isn't strong when C doesn't apply. That still isn't a refutation of using technique T1, since there may be no better technique to use. So the only moderately strong argument against using T1 is Bob's second sentence, which is what I was referring to in my sentence quoted.

Edit: This will be my last comment on the topic. It's especially frustrating having to talk about a communication scenario in such vague generalities due to omission of almost all of the communication-relevant information from the original post.

Comment by JBlack on Richard Ngo's Shortform · 2024-03-18T23:30:51.109Z · LW · GW

It depends upon whether the maximizer considers its corner of the multiverse to be currently measurable by squiggle quality, or to be omitted from squiggle calculations at all. In principle these are far from the only options as utility functions can be arbitrarily complex, but exploring just two may be okay so long as we remember that we're only talking about 2 out of infinity, not 2 out of 2.

An average multiversal squigglean that considers the current universe to be at zero or negative squiggle quality will make the low-quality squiggles in order to reduce how much its corner of the multiverse is pulling down the average. An average multiversal squigglean that considers the current universe to be outside the domain of squiggle quality, and to remain so for the remainder of its existence, may refrain from making squiggles. If there is some chance that it will become eligible for squiggle evaluation in the future, though, it may be better to tile it with low-quality squiggles now in order to prevent a worse outcome of being tiled with worse-quality future squiggles.

In practice the options aren't going to be just "make squiggles" or "not make squiggles" either. In the context of entities relevant to these sorts of discussion, other options may include "learn how to make better squiggles".

Comment by JBlack on maybefbi's Shortform · 2024-03-18T23:06:06.152Z · LW · GW

I think you should consider the possibility that treating your experience as being in a Basilisk simulation is directly harming you and likely to harm others regardless of whether you are actually in a Basilisk simulation or not.

We may be in a simulation; there's no way to really know. But what makes you think that it's specifically being run by a Basilisk? Such entities are just a tiny sliver of an iota of the space of possibilities, but are a recent meme to which some people have demonstrated a strong emotional reaction. The phrase "can't shake my belief that I am in one of the Basilisk's simulations" seems to be a warning sign that you may be in the latter situation.

Comment by JBlack on Alice and Bob is debating on a technique. Alice says Bob should try it before denying it. Is it a fallacy or something similar? · 2024-03-18T22:39:16.175Z · LW · GW
  1. The implication is that once Bob actually tries it, Bob will find that it's better. That is, Alice is showing how Bob can get evidence that it's better in reality, rather than playing around with models that may or may not predict which is better. Evidence is better than arguments.
  2. Bob's experience state isn't relevant to whether T1 is better than T2 in this situation, but is relevant to whether Bob should take notice of any particular reasons for one or the other.
  3. Of course theory should inform experiment. Theory cannot replace experiment, however.
  4. Alice does not need to merely assume it, Alice has evidence of it.
Comment by JBlack on Alice and Bob is debating on a technique. Alice says Bob should try it before denying it. Is it a fallacy or something similar? · 2024-03-18T00:22:35.380Z · LW · GW

Your belief that "a good explainer should be able to convince a no-experience person to abandon their initial assumptions" is quite false in practice and you should abandon this initial assumption. Good communication is hard, communication against an already formed opinion is much harder still, and doing so in 3 short lines of conversation is verging on impossible.

It may or may not be true that Alice is providing a good argument for T1, and Bob may or may not have a valid point about A but there's no way to know since all the relevant information about that has been stripped from the scenario. It's certainly not true that in general Alice has made a poor argument.

If this sort of exchange happened in a real conversation (and I have seen some similar cases in practice), I would interpret the last line to mean that Alice is writing off helping Bob as a waste of time since any further communication is likely to result in ongoing argument and no useful progress (see first paragraph).

It is very unlikely that Bob has tried T1, since he gave a very weak theoretical argument in favour of using T2 instead, rather than a much stronger and practical argument "I tried T1, and here's why it didn't help". Likewise he claims to know about T2 and believes it to be better than T1 but fairly clearly hasn't tried that one either. Why not?

Comment by JBlack on AI #55: Keep Clauding Along · 2024-03-15T02:55:28.093Z · LW · GW

Nope, you're not the only one. There are plenty of abstract artworks that look less like their putative subject than that one does.

Comment by JBlack on AI #54: Clauding Along · 2024-03-09T02:54:13.114Z · LW · GW

Very narrow indeed.

I'd definitely go for a voucher for free, and (if I had to purchase a gift voucher in the second scenario) again the cheaper one. I definitely do not value gift vouchers at their face value except in unusual circumstances.

What's more, I'd be highly suspicious of the setup with the non-free offers. If someone is selling $X vouchers for significantly less than $X, then I smell a scam. One-off offers of low-value free stuff are, in my experience, more likely to be promotional than scammy.

Though yes, if for some reason I did have high confidence in its future value and it was a one-off opportunity in both cases, then I might buy the higher value one in both cases.

Comment by JBlack on Why correlation, though? · 2024-03-07T00:28:23.070Z · LW · GW

Linearity is privileged mostly because it is the simplest type of relationship. This may seem arbitrary, but:

  • It generalizes well. For example, if y is a polynomial function of x then y can be viewed as a linear function of the powers of x;
  • Simple relationships have fewer free parameters and so should be favoured when selecting between models of comparable explanatory power;
  • Simple relationships and linearity in particular have very nice mathematical properties that allow much deeper analysis than one might expect from their simplicity;
  • A huge range of relationships are locally linear, in the sense of differentiability (which is a linear concept).

The first and last points in particular are used very widely in practice to great effect.

A researcher seeing the top-right chart is absolutely going to look for a suitable family of functions (such as polynomials where the relationship is a linear function of coefficients) and then use something like a least-squares method (which is based on linear error models) to find the best parameters and check how much variance (also based on linear foundations) remains unexplained by the proposed relationship.
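
A minimal sketch of that workflow (illustrative data, not from the post): a polynomial fit is an ordinary least-squares problem precisely because the model is linear in its coefficients even though it is nonlinear in x.

    # Illustrative only: recover a cubic relationship by least squares.
    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-2, 2, 200)
    y = 1.0 - 2.0 * x + 0.5 * x**3 + rng.normal(scale=0.2, size=x.size)

    X = np.vander(x, N=4, increasing=True)          # columns: 1, x, x^2, x^3
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)  # linear in the coefficients
    fitted = X @ coeffs

    r_squared = 1 - np.var(y - fitted) / np.var(y)  # variance explained by the fit
    print(coeffs, r_squared)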

Comment by JBlack on CDT vs. EDT on Deterrence · 2024-03-04T05:45:38.565Z · LW · GW

The main reason I mention it is that the scenario posits that the other side has already launched, and now it's time for you to make a decision.

The trouble is that if the AI has not made a mistake, you should get an earlier chance to make a decision: when the AI tells you that it predicts that they are going to launch. Obviously your AI has failed, and maybe theirs did too. For EDT and CDT it makes no difference and you should not press the button, since (as per the additional information that all observations are perfect) the outcomes of your actions are certain.

It does make a difference to theories such as FDT, but in a very unclear way.

Under FDT the optimal action to take after the AI warns you that they are going to launch is to do nothing. When you subsequently confirm (with perfect reliability as per scenario) that they have launched, FDT says that you should launch.

However, you're not in either of those positions. You didn't get a chance to make an earlier decision, so your AI must have failed. If you knew that this happened due to enemy action, FDT says that you should launch. If you are certain that it is not, then you should not. If you are uncertain then it depends upon what type of probability distribution you have over hypotheses, what your exact utilities are, and what other information you may be able to gather before you can no longer launch.

Comment by JBlack on CDT vs. EDT on Deterrence · 2024-03-03T00:49:55.671Z · LW · GW

It is important how much they prefer each outcome. All these decision theories work with utilities, not just with preference orderings over outcomes. You could derive utilities from preference ordering over lotteries, e.g. would they prefer a 50:50 split between winning and extinction over the status quo? What about 10:90 or 90:10?
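
To make that concrete (an illustrative example, not from the comment): fixing u(extinction) = 0 and u(win) = 1, the indifference probability in such a lottery pins down the utility of the status quo,

    u(\text{status quo}) \;=\; p^{*}\, u(\text{win}) + (1 - p^{*})\, u(\text{extinction}) \;=\; p^{*},

so indifference at a 90:10 split means u(status quo) = 0.9, while indifference at 50:50 means 0.5, a distinction that a bare preference ordering over outcomes cannot express.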

There are also unknown probabilities such as chance of failure of each AI to predict the other side correctly, chance of false positive or negative launch detection, chance that someone is deliberately feeding false information to some participants, chance of a launch without having pressed the button or failure to launch after pressing the button, chance that this is just a simulation, and so on. Even if they're small probabilities, they are critical to this scenario.

Comment by JBlack on CDT vs. EDT on Deterrence · 2024-03-03T00:33:14.341Z · LW · GW

It's easily inferred from the fact that no earlier decision point was mentioned.

Comment by JBlack on timestamping through the Singularity · 2024-02-29T02:32:27.289Z · LW · GW

Anything that can rewrite all physical evidence down to the microscopic level can almost certainly conduct a 51% (or a 99.999%) attack against a public ledger against mere mortal hashing boxes. More to the point, it can just physically rewrite every record of such a ledger with one of its own choosing. Computer hardware is physical, after all.