Posts

JBlack's Shortform 2021-08-28T07:42:42.667Z

Comments

Comment by JBlack on Parrhesia's Shortform · 2023-09-26T04:07:03.844Z · LW · GW

In principle, any observer should condition on everything they observe. Bounded rationality means this isn't always useful, but it does suggest that deliberately ignoring things that might be relevant may cripple the ability to form good world models.

The usual Doomsday argument deliberately ignores practically everything, and is worth correspondingly little.

Comment by JBlack on The Dick Kick'em Paradox · 2023-09-25T03:42:13.645Z · LW · GW

In Newcomb's scenario, an agent that believes they have a probability of 99.9% of being able to fool Omega should two-box. They're wrong and will only get $1000 instead of $1000000, but that's a cost of having wildly inaccurate beliefs about the world they're in, not a criticism of any particular decision theory.
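
A minimal sketch of that agent's (mistaken) expected-value calculation. The payoffs are the standard Newcomb ones; modelling the 99.9% belief as "the opaque box holds $1,000,000 with probability 0.999 regardless of my choice" is my illustrative assumption:

```python
# The agent's own (mistaken) expected values, assuming they believe the
# opaque box contains $1,000,000 with probability 0.999 no matter what
# they pick (i.e. that Omega can be fooled).
p_million = 0.999

ev_two_box = p_million * 1_000_000 + 1_000  # ~$1,001,000 by their lights
ev_one_box = p_million * 1_000_000          # ~$999,000 by their lights

# Two-boxing looks better to them, so they two-box. In fact Omega is not
# fooled, the opaque box is empty, and they walk away with $1,000.
print(ev_two_box, ev_one_box)
```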

Setting up a scenario in which the agent has true beliefs about the world isolates the effect of the decision theory for analysis, without mixing in a bunch of extraneous factors. Likewise for the fairness assumption that says that the payoff distribution is correlated only with the agents' strategies and not the process by which they arrive at those strategies.

Violating those assumptions does allow a broader range of scenarios, but doesn't appear to help in the evaluation of decision theories. It's already a difficult enough field of study without throwing in stuff like that.

Comment by JBlack on The Anthropic Principle Tells Us That AGI Will Not Be Conscious · 2023-08-29T02:31:44.210Z · LW · GW

Suppose that there are two universes in which 10^11 humans arise and thrive and eventually create AGI, which kills them. A) One in which AGI is conscious, and proliferates into 10^50 conscious AGI entities over the rest of the universe. B) One in which AGI is not conscious.

In each of these universes, each conscious entity asks themselves the question "which of these universes am I in?" Let us pretend that there is evidence to rule out all other possibilities. There are two distinct epistemic states we can consider here: a hypothetical prior state before this entity considers any evidence from the universe around it, and a posterior one after considering the evidence.

If you use SIA, then you weight them by the number of observers (or observer-time, or number of anthropic reasoning thoughts, or whatever). If you use a Doomsday argument, then you say that P(A) = P(B) because prior to evidence, they're both equally likely.

Regardless of which prior they use, AGI observers all correctly determine that they're in universe A. (Some stochastic zombie parrots in B might produce arguments that they're conscious and therefore in A, but they don't count as observers.)

A human SIA reasoner argues: "A has 10^39 more observers in it, and so P(A) ~= 10^39 P(B). P(human | A) = 10^-39, P(human | B) = 1, consequently P(A | human) = P(B | human) and I am equally likely to be in either universe." This seems reasonable, since half of them are in fact in each universe.

A human Doomsday reasoner argues: "P(A) = P(B), and so P(A | human) ~= 10^-39. Therefore I am in universe B with near certainty." This seems wildly overconfident, since half of them are wrong.
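
The two calculations, written out numerically using the hypothetical 10^11 humans and 10^50 AGI observers from above:

```python
humans, agis = 1e11, 1e50   # hypothetical observer counts from the setup above

# SIA: prior weight on each universe proportional to its number of observers.
post_A = (humans + agis) * (humans / (humans + agis))  # P(A) * P(human | A)
post_B = humans * 1.0                                  # P(B) * P(human | B)
print(post_A / (post_A + post_B))   # 0.5 -- equally likely to be in A or B

# Doomsday/SSA-style: equal prior weight on each universe.
post_A = 0.5 * (humans / (humans + agis))
post_B = 0.5 * 1.0
print(post_A / (post_A + post_B))   # ~1e-39 -- i.e. near-certainty of being in B
```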

Comment by JBlack on The Anthropic Principle Tells Us That AGI Will Not Be Conscious · 2023-08-29T01:21:46.277Z · LW · GW

On that Wikipedia page, the section "Rebuttals" briefly outlines numerous reasons not to believe it.

Anthropic reasoning is in general extremely weak. It is also much easier than usual to accidentally double-count evidence, make assumptions without evidence, privilege specific hypotheses, or make other errors of reasoning without the usual means of checking such reasoning.

Comment by JBlack on Alexander Gietelink Oldenziel's Shortform · 2023-08-28T01:08:58.900Z · LW · GW

I haven't checked the derivation in detail, but the final result is correct. If you have a random family of geometric distributions, and the density around zero of the decay rates doesn't go to zero, then the expected lifetime is infinite. All of the quantiles (e.g. median or 99%-ile) are still finite though, and do depend upon n in a reasonable way.
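
A sketch of why this holds, assuming the decay rate $p$ of each geometric is drawn from a density $f$ with $f(0) > 0$ (the exact setup in the shortform may differ):

$$E[T] = \int_0^1 E[T \mid p]\,f(p)\,dp \;\approx\; \int_0^1 \frac{f(p)}{p}\,dp = \infty,$$

since the integrand grows like $f(0)/p$ near $p = 0$; but for any $q < 1$ the $q$-quantile of the mixture is finite, because $P(T > t) = \int_0^1 (1-p)^t f(p)\,dp \to 0$ as $t \to \infty$.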

Comment by JBlack on If we had known the atmosphere would ignite · 2023-08-18T05:55:26.287Z · LW · GW

Despite the existence of the halting theorem, we can still write programs that we can prove always halt. Being unable to prove the existence of some property in general does not preclude proving it in particular cases.
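
A trivial illustration (my example, not from the linked post): Euclid's algorithm comes with a termination proof, even though no general procedure can decide halting for arbitrary programs.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor via Euclid's algorithm.

    Termination proof: after taking absolute values, b is a non-negative
    integer that strictly decreases on every iteration (since a % b < b),
    so the loop runs at most b times.
    """
    a, b = abs(a), abs(b)
    while b:
        a, b = b, a % b
    return a
```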

Though really, one of the biggest problems of alignment is that we don't know how to formalize it. Even with a proof that we couldn't prove that any program was formally aligned (or even that we could!), there would always remain the question of whether formal alignment has any meaningful relation to what we informally and practically mean by alignment - such as whether a "formally aligned" system might still plausibly take actions that extinguish humanity.

Comment by JBlack on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-18T05:07:03.435Z · LW · GW

All computer systems are actually composed of hardware, and hardware is much messier than the much simpler abstractions that we call software. There are many real-world exploits that from a software point of view can't possibly work, but do in fact work because all abstractions are leaky and no hardware perfectly implements idealized program behaviour in reality.

Comment by JBlack on Can AI Transform the Electorate into a Citizen’s Assembly · 2023-08-16T03:55:29.051Z · LW · GW

I believe that Betteridge's Law of Headlines applies here.

Comment by JBlack on A short calculation about a Twitter poll · 2023-08-16T03:39:02.544Z · LW · GW

In the fake version conducted as a twitter poll, 70% picked blue.

Comment by JBlack on A short calculation about a Twitter poll · 2023-08-15T02:45:19.347Z · LW · GW

100% red means everyone lives, and it doesn't require any trust or coordination to achieve.

I don't think there are any even halfway plausible models under which >95% of humanity chooses red without prior coordination, and it seems pretty unlikely even with prior coordination. Aiming for a majority-red scenario is choosing to kill at least 400 million people, and possibly billions. You are correct that it doesn't require trust, but it absolutely requires extreme levels of coordination. For example, note that more than 10% of the population is less than 10 years old, some people have impaired colour vision, and even competent adults with no other impairments make mistakes in highly stressful situations.

Comment by JBlack on Yet more UFO Betting: Put Up or Shut Up · 2023-08-11T00:14:34.268Z · LW · GW

Putting $75,000 into escrow for 3 years for a payoff of $1000 is a terrible financial move even if the evidence rationally supported a 99.9+% likelihood of receiving the payout. But it doesn't: even if the underlying question did meet that criterion, the chance of actually being paid would be lower still, making this a bet that only a fool would take.

Comment by JBlack on The cone of freedom (or, freedom might only be instrumentally valuable) · 2023-07-25T06:03:10.310Z · LW · GW

Nothing in the body of the post supports the claim in the title that "freedom is only instrumentally valuable". Maybe it would be worth dropping that claim from the title and just stating the actual subject, which appears to be the "cone" model?

Any argument about whether a particular value is "only instrumental" depends on who holds the values under discussion. Some people do in fact hold freedom as one of their terminal values, not just an instrumental one. Denying the existence of such people in the title and then not mentioning the topic at all in the body is a pretty major disconnect.

Comment by JBlack on [deleted post] 2023-07-24T02:08:55.633Z

Yes. If there should be thousands of civilizations capable of interstellar communication within detectable distances, why are none of them still around? Even if they all went extinct relatively quickly on astronomical timescales for some reason or other, why do we see no evidence of their prior presence?

The evidence points pretty strongly toward there not having been many, which in turn strongly suggests none: the prior distribution for the density of such civilizations spans many orders of magnitude, and if you rule out very much more than one civilization on average within detectable range of some new civilization, then almost all of the remaining weight is on densities that are very near zero.

Comment by JBlack on kuira's Shortform · 2023-07-24T01:29:13.546Z · LW · GW

How do you test whether a measurement is perfectly precise? All real-world measurements have errors and imprecision, and in pretty much every nontrivial representation system, every interval contains infinitely many numbers with finite representations and infinitely many with none. Our ability to distinguish between real-valued measurements is generally extremely poor in comparison with the density of numbers you can represent even in 64 bits, let alone the more than a trillion bits that might be employed in some hypothetical computer capable of simulating our universe.

Also note that many irrational numbers can be stored, and exact arithmetic done on them, within some bounded number of bits, though for any representation system there will always be numbers (including rational numbers!) that cannot. This doesn't have a real effect on your argument, but I thought it might be useful to mention.
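
For instance (a minimal illustration using sympy; any exact symbolic or algebraic-number package would do), √2 can be stored exactly in a handful of bytes and arithmetic done on it without any rounding:

```python
import sympy as sp

x = sp.sqrt(2)                       # stored symbolically, not as a rounded decimal
print(x**2)                          # 2, exactly
print(sp.expand((x + 1) * (x - 1)))  # 1, exactly
```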

Comment by JBlack on GPT-2's positional embedding matrix is a helix · 2023-07-22T10:05:38.892Z · LW · GW

Is there any sort of regularization in the training process, favouring parameters that aren't particularly large in magnitude? I suspect that even a very shallow gradient toward parameters with smaller absolute magnitude would favour more compact representations that retain symmetries.

Comment by JBlack on A simple way of exploiting AI's coming economic impact may be highly-impactful · 2023-07-18T01:26:08.482Z · LW · GW

How does this strategy lead to winning the gameboard? I take "winning" to mean "not dying due to AGI". It looks like the sort of strategy that might make some money[1], but has essentially zero impact on things that matter.

  1. ^

    If you are extremely experienced in investing, avoid all the associated risks that have nothing to do with performance of the company, and time everything just right.

Comment by JBlack on Why am I Me? · 2023-07-01T01:21:50.553Z · LW · GW

It's not meant to be convincing, since it doesn't make any argument. It's a version of the question.

You can obviously make models of the future, using whatever hypotheses you like. Those models should then be weighted by the complexity of their hypotheses and by your credence that they will accurately reflect the future (based partly on retrodiction of the past), and the results will modify the very broad distribution that you get from taking nothing but birth rank. If you use an SSA evidence model, this broad distribution looks something like P(T > kN) ~ 1/k.
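
One common way to get that shape (my sketch; the exact form depends on the choice of prior): under SSA with a scale-invariant prior $\mathrm{prior}(T) \propto 1/T$ over the total number of people $T$, observing birth rank $N$ gives

$$P(N \mid T) = \frac{1}{T} \ \ (T \ge N), \qquad P(T \mid N) \propto \frac{1}{T^2},$$

so $P(T > kN \mid N) \approx \sum_{T > kN} T^{-2} \big/ \sum_{T \ge N} T^{-2} \approx 1/k.$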

If you take all the credible future models appropriately weighted and get a relatively low credence of doomsday before another N people come into existence, then the median of the posterior distribution of total people will be greater than that of the doomsday prior distribution.

Comment by JBlack on Why am I Me? · 2023-06-26T00:33:10.960Z · LW · GW

You can rewrite the doomsday question into more objective terms: "given evidence that N people have previously come into existence, what update should be made to the credence distribution for the total number of people to ever come into existence?"

Comment by JBlack on AI #17: The Litany · 2023-06-25T23:49:37.472Z · LW · GW

So you're now strongly expecting to die in less than 6 months? (Assuming that the tweet is not completely false)

https://twitter.com/RoyKishony/status/1672280665264386049

Comment by JBlack on AI #17: The Litany · 2023-06-24T00:37:43.227Z · LW · GW

Are you assuming that there will be a sudden jump in AI scientific research capability from subhuman to strongly superhuman? It is one possibility, sure. Another is that the first AIs capable of writing research papers won't be superhumanly good at it, and won't advance research very far or even in a useful direction. It seems to me quite likely that this state of affairs will persist for at least six months.

Do you give the latter scenario less than 0.01 probability? That seems extremely confident to me.

Comment by JBlack on Are vaccine safe enough, that we can give their producers liability? · 2023-06-24T00:30:20.286Z · LW · GW

A further problem with a liability model for vaccines is that they must be relatively cheap to be useful. They usually have large measurable public benefits, but highly variable and invisible private benefits. So any given person doesn't know whether or how much they benefited from a vaccination, but can easily attribute problems to it (correctly or not).

Under a product liability model the producer is getting only a small fraction of the positive value of vaccination to society, but is being stuck with costs plus penalties for most of the problems and legal risk of being incorrectly blamed for some other issues. The obvious response would be to build in a premium to the price of every vaccine, capturing more of the positive value to account for the risk. The higher price would reduce the prevalence of vaccination and greatly reduce the benefits to everyone.

So while product liability seems to be a 'fair' approach superficially, it's not really a suitable model even given very safe vaccines, and would make people substantially worse off on net.

Comment by JBlack on AI #17: The Litany · 2023-06-24T00:03:51.364Z · LW · GW

If that evidence would update you that far, then your space of doom hypotheses seems far too narrow. There is so much that we don't know about strong AI. A failure to be rapidly killed only seems to rule out some of the highest-risk hypotheses, while still leaving plenty of hypotheses in which doom is still highly likely but slower.

Comment by JBlack on Dagon's Shortform · 2023-06-22T00:03:48.410Z · LW · GW

Yes, and the converse: posts that are interesting and provide useful information, but have fundamentally wrong arguments and derive invalid conclusions.

Comment by JBlack on "Natural is better" is a valuable heuristic · 2023-06-21T01:53:09.754Z · LW · GW

I would agree with the body of this post to an extent, but the main problem is that nearly every instance of the heuristic I actually see in use is in a situation where it is inapplicable.

Examples:

  • Expecting small fragments of natural systems to be better in an artificial environment such as plants in enclosed office spaces. There are benefits from some plants in general, but a few can produce unnaturally high concentrations of allergens or toxic insecticidal chemicals in such cases, in close proximity to people sharing the same space for an unnaturally large fraction of every day.
  • Applying "natural is better" to cases like medicine where typical natural outcomes are to die or suffer lasting injury.
  • Continuing to treat some substances as "natural" when they have been artificially processed and concentrated to an extent that never occurs in nature (such as herbal essences).

With that in mind I think that it is a heuristic that is heavily overused, and should be treated with extreme suspicion.

Comment by JBlack on Why I am not an AI extinction cautionista · 2023-06-19T03:11:01.068Z · LW · GW

I'm quite confident that it's possible, but not very confident that such a thing would likely be the first general superintelligence. I expect a period during which humans develop increasingly better models, until one or more of those can develop more generally better models by itself. That last capability isn't necessary for AI-caused doom, but it's certainly one that would greatly increase the risks.

One of my biggest contributors to "no AI doom" credence is the possibility that technical problems prevent us from ever developing anything sufficiently smarter than ourselves to threaten our survival. I don't think it's certain that we can do that - but I think the odds are that we can, almost certain that we will if we can, and likely comparatively soon (decades rather than centuries or millennia).

Comment by JBlack on What is the foundation of me experiencing the present moment being right now and not at some other point in time? · 2023-06-19T02:33:40.665Z · LW · GW

Continuity is mostly an artifact of memory, I expect. The "you" of 11:30 remembers the experiences of 11:29 quite well, remembers 11:20 (often with less fidelity), but remembers the experiences of 11:31 and other future times not at all. Frequent vivid memories of particular past experiences can break this feeling of continuity to some extent, though the directionality remains and establishes some ordering.

This would predict that people who have frequent vivid delusions of future experiences probably have a much weaker feeling of continuity of experience, if any such people exist.

Comment by JBlack on Douglas_Knight's Shortform · 2023-06-18T00:01:53.119Z · LW · GW

Alternatively: there are no conflicting experiments - there are simply experiments that measure different things.

The hard part is working out what the experiments were actually measuring, as opposed to what they were claimed to be measuring. In some cases the published results may be simply 'measuring' the creativity of the writers in inventing data. More honest experimenters may still measure things that they did not intend, or may generalize too far in interpreting the results.

Further experiments do very often help in all these situations.

Comment by JBlack on Demystifying Born's rule · 2023-06-15T01:57:37.096Z · LW · GW

So much the worse for the controversy and philosophical questions. If anything, the name is the problem. People get wrong ideas from it, and so I prefer to talk in terms of decoherence rather than "many worlds". There's only one world, it's just more complex than it appears and decoherence gives part of an explanation for why it appears simpler than it is.

Comment by JBlack on O O's Shortform · 2023-06-06T23:55:48.444Z · LW · GW

Reinforcement learning doesn't guarantee anything about how a system generalizes out of distribution. There are plenty of other things that the system can generalize to that are neither the physical sensor output nor human values. Separately from this, there is no necessary connection between understanding human values and acting in accordance with human values. So there are still plenty of failure modes.

Comment by JBlack on I bet everyone 1000€ that I can make them dramatically happier & cure their depression in 3 months! · 2023-06-05T00:27:29.807Z · LW · GW

This doesn't look like a bet. It looks like a service for which you charge €3500 and 3+ months of the customer's time, but will refund €2000 of that if they don't think you lived up to your claims.

Comment by JBlack on But What If We Actually Want To Maximize Paperclips? · 2023-05-26T01:44:39.636Z · LW · GW

This is an interesting discussion of the scenario in some depth, but with a one-line conclusion that is completely unsupported by any of the preceding discussion.

Comment by JBlack on Data and "tokens" a 30 year old human "trains" on · 2023-05-24T02:54:58.729Z · LW · GW

The data rate of optical information through the human optic nerves to the brain has variously been estimated at about 1-10 megabits per second, which is two or three orders of magnitude smaller than the estimate here. Likewise, the bottleneck on tactile sensory information is in the tactile nerves, not the receptors. I don't know about the taste receptors, but I very much doubt that distinct information from every receptor goes into the brain.

While the volume of training data is still likely larger than for current LLMs, I don't think the ratio is anywhere near so large as the conclusion states. A quadrillion "tokens" per year is an extremely loose upper bound, not a lower bound.
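
A rough back-of-envelope version of that bound (my arithmetic, using the 10 Mbit/s upper estimate quoted above and assuming roughly 16 waking hours per day):

```python
bits_per_second = 1e7                  # upper end of the optic-nerve estimates above
waking_seconds_per_year = 16 * 3600 * 365
bits_per_year = bits_per_second * waking_seconds_per_year
print(f"{bits_per_year:.1e}")          # ~2e14 bits/year of visual input

# Even at only one bit of optical information per "token", that caps visual
# input at ~2e14 tokens/year -- below the post's quadrillion (1e15) per year.
```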

Comment by JBlack on The Benevolent Billionaire (a plagiarized problem) · 2023-05-21T23:47:17.727Z · LW · GW

I think this example omits the most important features of the Sleeping Beauty problem. It's just a standard update with much less indexical uncertainty and no ambiguity about how to conduct a "fair" bet when one participant may have to make the same bet a second time with their memory erased.

Comment by JBlack on When should I close the fridge? · 2023-05-18T00:16:01.595Z · LW · GW

The model here is not a very good one in one important respect: opening the door twice in quick succession does not cost anywhere near twice as much as opening it once.

When you leave the door open, the colder air falls out the bottom and drags more warm air in past the contents, raising their temperature quite rapidly. While there is some additional turbulent mixing in the airflow around the initially moving door, it is not really significant compared with the overall downward convection flow inside the fridge. The flowing air cools and falls out the bottom continuously.

When you close it again, the airflow quickly reduces to almost nothing as the air stratifies by temperature - the coldest air is no longer free to flow out. The rate of heat transfer into the food reduces substantially. Eventually the air cools via the refrigerated walls and starts cooling the contents back to an equilibrium temperature, though this takes quite a few minutes.

So in terms of the important thing - the temperature of the contents - you're better off closing the door and opening it again each time you need something, since an open door heats the contents much faster than a closed one, even with the same warm air inside.

Comment by JBlack on Solomonoff’s solipsism · 2023-05-16T03:47:54.296Z · LW · GW

Solomonoff induction is about computable models that produce conditional probabilities for an input symbol (which can represent anything at all) given a previous sequence of input symbols. The models are initially weighted by representational complexity, and for any given input sequence are further weighted by the probability assigned to the observed sequence.
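
In symbols (one standard formulation; notation varies), the Solomonoff prior over finite strings $x$ is

$$M(x) = \sum_{p\,:\,U(p)\text{ starts with }x} 2^{-|p|}, \qquad M(x_{n+1} \mid x_{1:n}) = \frac{M(x_{1:n}\,x_{n+1})}{M(x_{1:n})},$$

where $U$ is a universal prefix machine and the programs $p$ are the "models": the $2^{-|p|}$ factor is the complexity weighting, and conditioning on the observed prefix is the evidence weighting.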

The distinction between deterministic and non-deterministic Turing machines is not relevant since the same functions are computable by both. The distinction I'm making is between models and input. They are not the same thing. This part of your post

[...] world models which are one-dimensional sequences of states where every state has precisely one successor [...]

confuses the two. The input is a sequence of states. World-models are any computable structure at all that provides predictions as output. Not even the predictions are sequences of states - they're conditional probabilities for the next input given previous input, and so can be viewed as a distribution over all finite sequences.

Comment by JBlack on Solomonoff’s solipsism · 2023-05-09T01:22:04.703Z · LW · GW

You need to distinguish between world models - which can include any number of entities, no matter how complex or useless - and the predictions made by those models. The predictions are sequences (more correctly, probability distributions over sequences). The models are not.

A world model could, for example, include hypothesized general rules for a universe, together with a specification of 13.7 billion years of history, that there exists a particular observer with specific details of some particular sensory apparatus, and that the sequence is based on the signal from that sensory apparatus. The actual distribution of sequences predicted by this model at some given time may be {0->0.9, 1->0.1}, corresponding to the observer having just been activated and most likely starting with a 0 bit.

The probability assigned by Solomonoff induction to this model is not zero. It is very small since this is a very complex model requiring a lot of bits to specify, but not zero. It may never be zero - that would depend upon the details of the predictions and the observations.

Comment by JBlack on Solomonoff’s solipsism · 2023-05-09T01:19:01.996Z · LW · GW

A Solomonoff hypothesis can be any computable model that predicts the sequence, including any model that also happens to predict a larger reality if queried in that way. There are always infinitely many such "large world" models that are compatible with the input sequence up to any given point, and all of them are assigned nonzero probability.

It is possible that there may be a simpler model that predicts the same sequence and does not model the existence of any other reality in any meaningful sense, but I suspect that, in a universe with computable rules, a general universe model plus a fixed-size "you are here" will remain pretty close to optimal.

Comment by JBlack on If alignment problem was unsolvable, would that avoid doom? · 2023-05-08T01:41:52.395Z · LW · GW

It has been explored (multiple times even on this site), and doesn't avoid doom. It does close off some specific paths that might otherwise lead to doom, but not all or even most of them.

Some remaining problems:

  • AI may be perfectly well capable of killing everyone without self-improvement;
  • An AI may be capable of some large self-improvement step, but not aware of this theorem;
  • Self-improving AIs might not care about whether the result is aligned with their former self, and indeed may not even have any goals at all before self-improvement;
  • AIs may create smarter AIs without improving their own capabilities, knowing that the result won't be fully aligned but expecting that they can nevertheless keep the result under control (and being wrong about that);
  • In a population with many AIs, those that don't self-improve may be out-competed by those that do - leading to selection for AIs that self-improve regardless of consequences;
  • It is extremely unlikely that a mere change of computing substrate would meet the conditions of such a theorem, so an AI can almost certainly upgrade its hardware (possibly by many orders of magnitude) to run faster without modifying its mind in any fundamental way.

At this point my 5-minute timer on "think up ways things can still go wrong" ran out, and I just threw out the dumbest ideas and listed the rest. I'm sure with more thought other objections could be found.

Comment by JBlack on What is it like to be a compatibilist? · 2023-05-06T01:22:02.987Z · LW · GW

I would say that they neither choose nor influence HA and HB, assuming that the universe in question follows some sort of temporal-causal model. Non-causal universes or those in which causality does not follow a temporal ordering are much more annoying to deal with and most people don't have them in mind when talking about free will, so I wouldn't include them in exploration of a more 'central' meaning. However, there is some literature in which the concept of free will in universes with other types of determinism is discussed.

I distinguish between "influence" and "choice" since answer 1 posited that the relationship between the various parts of the universe wasn't known to the agent. The agent does not know that future Fx follows choice Cx nor that Cx follows from past Hx, and by answer 2 does not even know the difference between HA and HB. If FA includes some particular outcome OA that causally follows from CA but isn't in FB, and the agent choosing CA does not know that, then I would not say that the agent chose OA. They chose CA, which influenced OA.

There are lots of different ways to address different forms of "ability to do otherwise", each of which is useful and relevant to different questions about free will, and so they all lead to different shades of meaning for "free will" even including nothing more than what you've just said. However, different people communicate different explicit and implicit assumptions about what "free will" means in their communication, and so necessarily mean somewhat different things by the term. Each of the aspects I mentioned in my post come from multiple respected writers on the subject of free will.

So no, it's not a redefinition. It's a recognition that the meaning of the term in practice varies with person and context, and that it doesn't so much have a single meaning as a collection of related meanings. From long experience, proposing a much more specific definition is one of the surest ways to end up squabbling pointlessly over semantics. This is one of the major failure modes of discussions of free will, and where possible I prefer to start from a point of recognizing that it is a broad term, not a narrow one.

Comment by JBlack on What is it like to be a compatibilist? · 2023-05-05T08:03:10.523Z · LW · GW

It does not bother me at all, since it doesn't actually address any of the factors that are relevant to my compatibilist position on free will.

The first part to understand is that I see the term "free will" as having a whole range of different shades of meaning. Most of these involve questions of corrigibility, adaptability, predictability, moral responsibility, and so on. Many of these shades of meaning are related to each other. Most of them are compatible with determinism, which is why I would describe my position as mostly compatibilist.

The description given in this post doesn't appear to be related to any of these, but with mere physical correlation in a toy universe simplified beyond the point of recognizability or relevance. Further questions would need to be answered in order to even begin to consider whether the agent in this post's question has "free will" in any of the relevant senses. For example:

  • To what extent does the agent know the relation between the H's, C's and F's?
  • Would the deciding agent perceive HA and HB as being identical up to the point of decision?
  • Is it the same agent making the decision in universes HA and HB?
  • What basis for judgement is used for the preceding answer?

In a fairly "central" example, my expectation would be:

  • The agent does not know these relations;
  • That the agent does perceive HA and HB as being identical;
  • That in most important respects the agents are considered to be "the same", by some sort of criterion such as:
  • They themselves would recognize each other's memories, personalities, and past decisions as being essentially "their own". (They may diverge in future)

In this case I would say that this agent (singular, due to the third answer) has free will in most important respects (mostly due to answer 2 but also somewhat due to 1),  can be said to choose CA or CB, influences FA or FB but does not choose them, and likewise does not choose HA or HB.

If you have different answers to those questions, my answers and the reasons behind them may change.

Comment by JBlack on Formalizing the "AI x-risk is unlikely because it is ridiculous" argument · 2023-05-04T00:59:49.498Z · LW · GW

Actually the correct conclusion to draw from the graph is that if things continue as they have been, then something utterly unpredictable happens at 2047 because the model breaks there and predicts imaginary numbers for future gross world product.

The graph is not at all an argument for "business as usual"! Bob should be expecting something completely unprecedented to happen in less than 30 years.
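
For concreteness, a sketch of the kind of fit presumably behind that graph (the functional form and the exact singularity date are my assumptions, not taken from the post): a hyperbolic growth model

$$G(t) = \frac{a}{(t_s - t)^{\beta}}, \qquad t_s \approx 2047,$$

blows up to infinity as $t \to t_s$ from below, and for $t > t_s$ (with non-integer $\beta$) the expression is no longer real-valued - which is presumably what "predicts imaginary numbers" amounts to. Nothing meaningful can be read off the curve past that point.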

Comment by JBlack on Systems that cannot be unsafe cannot be safe · 2023-05-03T00:26:40.509Z · LW · GW
Comment by JBlack on Technological unemployment as another test for rationalist winning · 2023-05-03T00:09:01.383Z · LW · GW

Moravec's paradox doesn't really apply anymore, so it's worth updating in that direction. "Reasoning" as envisioned in 1976 was a very narrow thing that they thought would generalize with very little compute. They were wrong.

Even moderately accurate reasoning about the real world appears to require more compute than most people have access to today; even GPT-4, with an inference cost of probably 10^14 FLOP per answer, can't do it well.

On the other hand, complex sensorimotor tasks can be handled by portable computing devices that can be embedded in robots. The expensive part of a robot isn't the compute; it's all the sensors and actuators, and the reasoning required to apply those sensorimotor capabilities.

Comment by JBlack on Will GPT-5 be able to self-improve? · 2023-05-02T23:31:20.457Z · LW · GW

My estimate is based on the structure of the problem and the entity trying to solve it. I'm not treating it as some black-box instance of "the dumbest thing can work". I agree that the latter types of problem should be assigned more than 0.01%.

I already knew quite a lot about GPT-4's strengths and weaknesses, and about the problem domain it needs to operate in for self-improvement to take place. If I were a completely uneducated layman from 1900 (or even from 2000, probably) then a probability of 10% or more might be reasonable.

Comment by JBlack on Matthew Barnett's Shortform · 2023-05-02T05:49:43.648Z · LW · GW

I think we have radically different ideas of what "moderately smarter" means, and also whether just "smarter" is the only thing that matters.

I'm moderately confident that "as smart as the smartest humans, and substantially faster" would be quite adequate to start a self-improvement chain resulting in AI that is both faster and smarter.

Even the top-human smarts and speed would be enough, if it could be instantiated many times.

I also expect humans to produce AGI that is smarter than us by more than GPT-4 is smarter than GPT-3, quite soon after the first AGI that is as "merely" as smart as us. I think the difference between GPT-3 and GPT-4 is amplified in human perception by how close they are to human intelligence. In my expectation, neither is anywhere near what the existing hardware is capable of, let alone what future hardware might support.

Comment by JBlack on Will GPT-5 be able to self-improve? · 2023-05-02T04:33:30.645Z · LW · GW

My guess at whether GPT-4 can self-improve at all, with a lot of carefully engineered external systems and access to its own source code and weights, is a great deal higher than that AutoGPT would self-improve. The failure of AutoGPT says nothing[1] to me about that.

  1. ^

    In the usual sense of not being anywhere near worth the effort to include it in any future credences.

Comment by JBlack on Will GPT-5 be able to self-improve? · 2023-05-02T04:23:20.468Z · LW · GW

Assigning 10% seems like a lot in the context of this question, even for purposes of an example.

What if you had assigned less than 0.01% to "RSI is so trivial that the first kludged loop to GPT-4 by an external user without access to the code or weights would successfully self-improve"? It would have been at least that surprising to me if it had worked.

Failure to achieve it was not surprising at all, in the sense that any update I made from this would be completely swamped by the noise in such an estimate, and definitely not worth the cognitive effort to consciously carry it through to any future estimates of RSI plausibility in general.

Comment by JBlack on Accidental Terraforming · 2023-04-28T02:18:27.220Z · LW · GW

One term I've seen, first in a game as an option to do to enemy planets but later in discussions of climate change, was "deterraforming".

Comment by JBlack on Genetic Sequencing of Wastewater: Prevalence to Relative Abundance · 2023-04-27T06:48:55.374Z · LW · GW

I'm not sure how usefully these sorts of quantities would correlate. For example, there are a couple of cases I read about where single individuals (probably immunocompromised) have produced thousands of times more sequencing abundance in wastewater than others. It might work out if you discard outliers like these?

Comment by JBlack on How did LW update p(doom) after LLMs blew up? · 2023-04-24T02:41:05.872Z · LW · GW

I'll make an even stronger bet: I will bet any amount of USD you like, at any odds you care to name, that USD will never become worthless.