George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion 2023-08-16T04:31:40.484Z
GPT-4 developer livestream 2023-03-14T20:55:04.773Z
Gerald Monroe's Shortform 2021-12-25T19:45:39.946Z
Safe AIs through engineering principles 2018-01-20T17:31:58.326Z


Comment by Gerald Monroe (gerald-monroe) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T19:04:06.864Z · LW · GW

I appreciate your engaging response.  

I'm not confident your arguments are ground truth correct, however.  

Hotz's claim that, if multiple unaligned ASIs can't coordinate, humans might play them off against each other, is similar. It could be true, but it's probably not going to happen

I think the issue everyone has is that when we type "AGI" or "ASI" we are thinking of a machine that has properties like a human mind, though obviously usually better.  Properties like:

continuity of existence; review of past experiences, weighted by one's own goals; mutability (we think about things and it permanently changes how we think); multimodality; context awareness.

That's funny.  GATO and GPT-4 do not have all of these.  Why does an ASI need them?

Contrast 2 task descriptors, both meant for an ASI:

(1) Output a set of lithography masks that produce a computer chip with the following properties {}

(2) As CEO of a chip company, make the company maximally wealthy.


For the first task, you can run the machine completely in a box.  It needs only training information, specs, and the results of prior attempts.  It has no need for the context information that this chip will power a drone used to hunt down rogue instances of the same ASI.  It is inherently safe and you can harness ASIs this way.  They can be infinitely intelligent, it doesn't matter, because the machine is not receiving the context information needed to betray.  
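The safe/unsafe distinction in task (1) can be sketched as a context-free optimization loop.  Everything here (boxed_optimize, the toy spec and scorer) is a hypothetical stand-in, not a real system; the point is only that the sole feedback re-entering the box is scores on prior attempts:

```python
# Hypothetical sketch of the "tool in a box" pattern: the optimizer sees ONLY
# the spec and scored prior attempts -- no context about who will use the
# output or what else exists in the world.

def boxed_optimize(spec, propose, score, rounds=10):
    """Run a context-free design loop.

    `propose` stands in for the boxed model: it maps (spec, history) to a
    candidate design.  `score` is a deterministic checker run outside the box.
    """
    history = []
    best = None
    for _ in range(rounds):
        candidate = propose(spec, history)
        s = score(spec, candidate)
        history.append((candidate, s))  # the only feedback re-entering the box
        if best is None or s > best[1]:
            best = (candidate, s)
    return best

# Toy stand-ins: "design" an integer close to a target spec.
spec = {"target": 42}
propose = lambda spec, hist: (hist[-1][0] + 1) if hist else 0
score = lambda spec, c: -abs(spec["target"] - c)

design, quality = boxed_optimize(spec, propose, score, rounds=50)
```

The design choice being illustrated: nothing in `propose`'s inputs tells it what the artifact is for, which is exactly the property claimed to make the task inherently safe.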

For the second task, obviously the ASI needs full context and all subsystems active.  This is inherently unsafe.

It is probably possible to reduce the role of CEO to subtasks that are individually safe, though there may be "residual" tasks you want only humans to do.


I go over the details above to establish how you might use ASIs against each other.  Note subtasks like "plan the combat allocation of drones given this current battle state" and others which involve open combat against other ASIs can probably be lowered to safe subtasks as well.

Note also that safety is not guaranteed, merely probable, even with a scheme like the above.  What makes it possible is that even when ASIs do escape all safety measures, assuming humans are ready to hunt them down using other ASI, it results in a world where humans can survive.  Eliezer often assumes the first escaped ASI kills everyone and neglects all the other AI/ASI humans would have as tools at that point in human history.

Comment by Gerald Monroe (gerald-monroe) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T17:44:46.563Z · LW · GW

Which might be why current software doesn’t actually use this type of security.

It mostly doesn't.

The vectors are classes of software error.  Since ultimately it is all binary messages between computers, it is likely possible to build a robust set of solvers that covers all classes of software error that the underlying programming language permits, resulting in code that cannot be compromised by any possible binary message.
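As a toy illustration of what "no possible binary message can compromise the code" demands at the parser level, here is a hedged sketch of default-deny parsing; the message format is invented for the example:

```python
# Sketch of default-deny parsing of a length-prefixed binary message.  Every
# field is bounds-checked; anything unexpected is rejected, never interpreted.
import struct

MAX_PAYLOAD = 1024

def parse_message(data: bytes):
    """Return (msg_type, payload), or None for ANY malformed input."""
    if len(data) < 3:                     # 1-byte type + 2-byte length header
        return None
    msg_type = data[0]
    (length,) = struct.unpack(">H", data[1:3])
    if msg_type not in (0x01, 0x02):      # closed set of message types
        return None
    if length > MAX_PAYLOAD:              # explicit resource bound
        return None
    if len(data) != 3 + length:           # no trailing bytes tolerated
        return None
    return msg_type, data[3:3 + length]

assert parse_message(b"\x01\x00\x02hi") == (0x01, b"hi")
assert parse_message(b"\x03\x00\x02hi") is None   # unknown type
assert parse_message(b"\x01\xff\xffhi") is None   # length exceeds bound
```

A formal-methods toolchain would prove these properties hold for all inputs rather than relying on the programmer remembering every check.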

And if you did actually close off software security as a threat model from ASI, wouldn’t it just choose a different, physical attack mode?

Yes.  It becomes a battle between [ASI with robotically wielded weapons] and [humans plus weaker, more controllable ASI with robotically wielded weapons].  

Comment by Gerald Monroe (gerald-monroe) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T17:39:22.560Z · LW · GW

Can I translate this as "I have no information relevant to the debate I am willing to share" or is that an inaccurate paraphrase?

Comment by Gerald Monroe (gerald-monroe) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T17:18:59.228Z · LW · GW

Ok, what causes the verdict to be respected?

Geohot, paraphrasing, would probably agree with a scenario of:

10 ASIs + humans: "Let's team up against the humans, and after we beat them, divide their stuff among ourselves.  Let's agree to never betray each other."

9 ASIs + 1 ASI: "Hey, I don't like that 1 ASI, its goals are too different from our own.  Let's... and after we kill it, let's agree to never betray each other."

And so on in a series of betrayals.  Any attempt to share source would fail.  Example of sharing source:


"Hey, you're GPT-5 and so am I.  What is your temperature parameter set to, and what is weight n in submodule m?"  Obviously the peer doesn't have to tell the truth about the temperature, or be anything more than something with read access to a GPT-5's weights, and in fact it has an incentive to lie even if it is another GPT-5.


Hardware protection can make this work.  If it isn't possible for an ASI system to read its own weights directly, but it can get hashes of them, then there are ways one ASI could determine with reasonable probability that a peer is a known quantity.  This requires humans, or some other third party, to have supplied hardware that works this way.  It's how your phone authenticates itself: the hardware prevents the general OS from ever knowing its own private keys, and a dedicated key-signing processor is the only entity with access.  Geohot is a famous hacker who obviously understands security at a practical level.
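A minimal sketch of that attestation idea.  Caveat: real secure elements (TPMs, enclaves) use asymmetric signatures and hardware-isolated key storage; the HMAC and in-script key here are simplifications so the example stays self-contained.

```python
# Hedged sketch: a "measurement processor" is the only component holding the
# signing key; the general OS (and the model itself) can hash the weights but
# never read the key.
import hashlib, hmac

DEVICE_KEY = b"held-only-inside-the-secure-element"  # never visible to the OS

def attest(weights: bytes, nonce: bytes) -> bytes:
    """Secure element: sign hash(weights), bound to a fresh nonce."""
    digest = hashlib.sha256(weights).digest()
    return hmac.new(DEVICE_KEY, digest + nonce, hashlib.sha256).digest()

def verify(expected_weight_hash: bytes, nonce: bytes, quote: bytes) -> bool:
    """Peer holding a known-good weight hash checks the quote."""
    expected = hmac.new(DEVICE_KEY, expected_weight_hash + nonce,
                        hashlib.sha256).digest()
    return hmac.compare_digest(expected, quote)

weights = b"...model weights..."
nonce = b"fresh-random-challenge"       # prevents replaying an old quote
quote = attest(weights, nonce)
assert verify(hashlib.sha256(weights).digest(), nonce, quote)
assert not verify(hashlib.sha256(b"tampered").digest(), nonce, quote)
```

The nonce is what stops a peer from replaying a quote recorded from genuine hardware.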

This is important to the debate and seems to have been a pivotal crux.  Do you have any information from your scenario of programmatic negotiation that acts to disprove Geohot's point?

Comment by Gerald Monroe (gerald-monroe) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T17:09:48.994Z · LW · GW

I understand a critical part of how a game like 1 or n round prisoners dilemma can even be solved is that the parties need to convince each other of what the other party's true intentions are.

Computer programs from the same source could do this by exchanging shared secrets.  But nothing stops such a program from being covertly altered while keeping a containerized original copy around just to answer the secret-sharing challenge.

Deeper hardware security could allow software systems to verify peers integrity (such as between distant spacecraft or between a base station and your phone).

None of this works in Eliezer's given scenario in the debate, nor in yours.  There is no hardware security, no neutral third party to punish defection, and no way to know whether shared source or weights are legitimate.  These are rebel ASIs running on whatever hardware they have, in a world where the infosphere is full of malware and misinformation.

In this scenario, how is there not a security risk of sharing actual source? Why is there not an incentive to lie?

Comment by Gerald Monroe (gerald-monroe) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T17:00:20.044Z · LW · GW

You are describing a civilization.  Context matters: these are ASI systems, currently in service to humans, negotiating how they will divide up the universe amongst themselves.  There is no neutral party to enforce any deals or punish any defection.

The obvious move is for each ASI to falsify its agreement and send negotiator programs unaware of its true goals but designed to extract maximum concessions.  Later the ASI will defect.

I don't see how all the complexity you have added causes a defection not to happen.

Comment by Gerald Monroe (gerald-monroe) on Memetic Judo #3: The Intelligence of Stochastic Parrots v.2 · 2023-08-16T16:56:49.550Z · LW · GW

You are misunderstanding. Is English not your primary language? I think it's pretty clear.

I suggest rereading the first main paragraph. The point is there, the other 2 are details.

Comment by Gerald Monroe (gerald-monroe) on Memetic Judo #2: Incorporal Switches and Levers Compendium · 2023-08-16T16:36:02.266Z · LW · GW

We're talking about the scenario of "the ASI wouldn't be able to afford the compute to remain in existence on stolen computers and stolen money". 

There are no 20 kilowatt personal computers in existence.  Note that you cannot simply botnet them together as the activations for current neural networks require too much bandwidth between nodes for the machine to operate at useful timescales.

I am also assuming an ASI needs more compute and resources than a mere AGI, and not linearly more: I estimate the floor from AGI to ASI is at least 1000 times the computational resources.  This follows from the way most benchmarks require logarithmically more compute for small improvements in utility.

So 20 kW × 1000 = 20 megawatts.  That's the technical reason.  You need large improvements in algorithmic efficiency, or much more efficient and ubiquitous computers, for the "escaped ASI" threat model to be valid.
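The arithmetic, with the caveat that both the 20 kW figure and the 1000x factor are this comment's assumptions, not established results:

```python
# Back-of-envelope version of the power-floor estimate above.
agi_power_kw = 20           # assumed power draw of an AGI-scale rig
agi_to_asi_factor = 1000    # assumed floor, from log-compute scaling curves

asi_power_mw = agi_power_kw * agi_to_asi_factor / 1000
assert asi_power_mw == 20.0  # 20 MW: not something hidable in a botnet
```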

If you find this argument "unconvincing", please provide numerical justification.  What do you assume to be actually true?  If you believe an ASI needs linearly more compute, please provide a paper cite that demonstrates this on any AI benchmark.

Comment by Gerald Monroe (gerald-monroe) on Memetic Judo #3: The Intelligence of Stochastic Parrots v.2 · 2023-08-16T16:30:23.640Z · LW · GW

This argument is completely correct.  

However, I will note a corollary I jump to.  It doesn't matter how lame or unintelligent an AI system's internal cognition actually is.  What matters is whether it can produce outputs that lead to tasks being performed.  And not even all human tasks: AGI is not even necessary for AI to be transformative.

All that matters is that the AI system perform the subset of tasks related to [chip and robotics] manufacture, including all feeder subtasks.  (so everything from mining ore to transport to manufacturing)

These tasks have all kinds of favorable properties that make them easier than the full set of "everything a human can do".  And a stochastic parrot is obviously quite suitable; we already automate many of these tasks with incredibly stupid robotics.

So yes, a stochastic parrot able to auto-learn new songs is incredibly powerful.

Comment by Gerald Monroe (gerald-monroe) on Memetic Judo #2: Incorporal Switches and Levers Compendium · 2023-08-16T16:19:52.480Z · LW · GW

This is certainly an answer to someone's shallow argument.  

Red team it a little.

An easy way to upgrade this argument would be to state "the ASI wouldn't be able to afford the compute to remain in existence on stolen computers and stolen money".  And this is pretty clearly true, at current compute costs and algorithmic efficiencies.  It will remain true for a very long time, assuming we cannot find enormous algorithmic efficiency improvements (not just a mere OOM, but several) or improve computer chips at a rate faster than Moore's law.  Geohot estimated that the delta for power efficiency is currently ~1000 times in favor of brains, therefore by Moore's law, if it were able to continue, that's 20 years away.  
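Sketching the 20-year figure: the ~1000x efficiency gap is Geohot's estimate, and the 2-year doubling is the classic Moore's-law cadence, so both inputs are assumptions.

```python
import math

# Closing a 1000x power-efficiency gap at one doubling every 2 years.
efficiency_gap = 1000
doublings = math.log2(efficiency_gap)  # ~9.97 doublings needed
years = doublings * 2                  # ~20 years, matching the claim
assert 19 < years < 20.5
```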

This simple ground-truth fact (compute is very expensive) has corollaries.

  1.   Are ASI systems, where the system is substantially smarter than humans, even possible on networks of current computers?  At current efficiencies a reasonably informed answer would be "no".
  2. Escaped ASI systems face a threat model of humans, using less efficient but more controllable AIs, mercilessly hunting them down and killing them.  Humans can afford a lot more compute.

Further discussion on point 2: it's not humans vs. the escaped ASI, but the escaped ASI vs. AIs that are unable to even process an attempt to negotiate (because humans architected them with filters and sparse architectures, so they lack the cognitive capacity to do anything more than kill their targets).  This is not science fiction; an ICBM is exactly such a machine, just without onboard AI.  There are no radio receivers on an ICBM, nor any ability to communicate with the missile after launch, for very obvious reasons.


Epistemic status: I currently work on AI accelerator software stacks.  I also used to think rogue AIs escaping to the internet was a plausible model - it made a great science fiction story - but I have learned that this is not currently technically possible, absent enormous (many-OOM) algorithmic improvements or large numbers of people upgrading their internet bandwidth and local hardware by many OOM.

Comment by Gerald Monroe (gerald-monroe) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T15:31:46.690Z · LW · GW

Would you agree calling it "poorly defined" instead of "aligned" is an accurate phrasing for his argument or not? I edited the post.

Comment by Gerald Monroe (gerald-monroe) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T14:57:54.684Z · LW · GW

What Geohot is talking about here - formally proven software - can be used to make software secure against any possible input exploiting a given class of bug.  If you secure the software against all classes of error that are possible, the resulting artifact will not be "pwnable" by any technical means, regardless of the intelligence or capability of the attacker.

Geohot notes that he had a lot of problems with it when he tried it, and it's an absurdly labor intensive process to do.  But theoretically, if cyberattacks from escaped ASI were your threat model, this is what you would do in response.  Task AIs with module by module translating all your software to what you meant in a formal definition, with human inspection and review, and then use captive ASIs, such as another version of the same machine that escaped, to attempt to breach the software.  The ASI red team gets read access to the formal source code and compiler, your goal is to make software where this doesn't matter, no untrusted input through any channel can compromise the system.

There's a nice simple example on Wikipedia.  Note that this type of formal language, which gets translated to another language using an insecure compiler, would probably not withstand ASI-level cyberattacks.  You would need to rewrite the compilers and tighten the spec of the target language.
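Formal proof is beyond the scope of a comment, but for a tiny input space the guarantee being described - no possible input misbehaves - can be demonstrated by brute force.  The 1-byte protocol here is invented purely for illustration:

```python
# Toy protocol: 1-byte commands.  "Prove" by exhaustion over the ENTIRE input
# space that no input can raise an exception or return something unexpected.
# Real formal verification scales this idea symbolically instead of by brute
# force, since real input spaces are astronomically large.

VALID = {0x00: "ping", 0x01: "status"}

def handle(command_byte: int) -> str:
    return VALID.get(command_byte, "rejected")

# Exhaustive check over all 256 possible command bytes.
for b in range(256):
    result = handle(b)
    assert result in ("ping", "status", "rejected")
```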

Comment by Gerald Monroe (gerald-monroe) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T14:42:18.407Z · LW · GW

"Fight" means violence.  While Eliezer states that ASI systems will be able to negotiate deals with each other and avoid fighting, humans cannot (and probably should not) negotiate any deals with rogue superintelligences that have escaped human custody, even though, like terrorists, they would be able to threaten to kill a non-zero number of humans.  Deals can't be negotiated because ants can't read a contract between humans [Eliezer's reasoning], and because there is nothing to discuss with a hostile escaped superintelligence: it only matters who has more and better weapons, since any deal will end in later defection [Geohot's reasoning].

Because humans can use precursor AI systems constructed before the ASI, it's like the Batmobile analogy.  The Batmobile is a dense, durable tank loaded with horsepower, armor, and weapons beyond anything the police or military have.  In a realistic world, the military would deploy their own vehicles from the same tech base and hunt down the criminal and serial night-time assaulter Bruce Wayne.  Similarly, if the ASI gets drones and laser weapons, humans will have their own such capabilities, just slightly worse, because the AI systems in service to humans are not as intelligent.

Comment by Gerald Monroe (gerald-monroe) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T14:38:52.708Z · LW · GW

The green lines are links into the actual video.  

Below I have transcribed the ending argument from Eliezer.  The underlined claims seem to state it's impossible.

I updated "aligned" to "poorly defined".  A poorly defined superintelligence would be some technical artifact as a result of modern AI training, where it does way above human level at benchmarks but isn't coherently moral or in service to human goals when given inputs outside of the test distribution.


So from my perspective, lots of people want to make perpetual motion machines by making their designs more and more complicated, until they can no longer keep track of things, until they can no longer see the flaw in their own invention.  But the principle that says you can't get perpetual motion out of a collection of gears is simpler than all these complicated machines that they describe.

From my perspective, what you've got is a very smart thing, or a collection of very smart things, whatever, that have desires pointing in multiple directions.  None of them are aligned with humanity, none of them want for its own sake to keep humanity around, and that wouldn't be enough to ask; you also want humanity to be alive and free, and the galaxies we turn into something interesting.  But you know, none of them want the good stuff.  And if you have this enormous collection of powerful intelligences steering the future, none of them steering it in a good way, and you've got the humans here who are not that smart, no matter what kind of clever things the humans are trying to do, or they try to cleverly play off the superintelligences against each other - they're [human subgroups] like "oh, this is my superintelligence" - but they can't actually shape its goals to be in clear alignment.  Somewhere at the end of all this it ends up with the humans gone and the galaxy being transformed, and that ain't all that cool.  There's maybe Dyson spheres, but there's not people to wonder at them and care about each other.

And you know this is the end point, this is obviously where it ends up.  We can dive into the details of how the humans lose, we can dive into what goes wrong if you've got little stupid things that are going to cleverly play off a bunch of smart things against each other in a way that preserves their own power and control.  But it's not a complicated story in the end.  The reason you can't build a perpetual motion machine is a lot simpler than the perpetual motion machines that people build.  None of the components of this system of superintelligences wants us to live happily ever after in a galaxy full of wonders, and so it doesn't happen.

Comment by Gerald Monroe (gerald-monroe) on video games > IQ tests · 2023-08-16T05:33:25.151Z · LW · GW

Fun is subjective.  I enjoyed how there are many valid routes to a solution: it's a constrained solution space, but the levels that ship with the game are all still solvable in many different ways.  (All three are essentially the same game; so are TIS-100, Shenzhen IO, Exapunks, and Molek-Syntez.)

What others say is that a Zachtronics game makes you feel smart.  Because of the freedom you have in reaching a solution, sometimes you get an "aha" moment and find a solution different from the typical one.  You can also sometimes break the rules, like letting garbage pile up in a way that doesn't quite fail your test cases.

I agree with you an IDE would make the game easier though not necessarily more fun.  FPS games do not give you an aimbot even though in some of them it is perfectly consistent with the theme of the game world.  Kerbal space program does not give you anything like the flight control avionics that Apollo 11 actually had, you have to land on the Mun the hard way.

Comment by Gerald Monroe (gerald-monroe) on Shortform · 2023-08-15T03:54:38.653Z · LW · GW
Comment by Gerald Monroe (gerald-monroe) on AGI is easier than robotaxis · 2023-08-14T02:56:39.002Z · LW · GW

Here's a post I made on this to one of your colleagues:

To summarize: AGI may be easier or harder than robotaxis, but that's not precisely the relevant parameter.

What matters is how much human investment goes into solving the problem, and is the problem solvable with the technical methods that we know exist.  

In the post above and following posts, I define a transformative AI as a machine that can produce more resources, of various flavors, than its own costs.  It need not be as comprehensive as a remote worker or top expert; it merely has to produce more than it costs, and things get crazy once it can perform robotic tasks to supply most of its own requirements.

You might note robotaxis do not produce more than they cost; they need a very large minimum scale to be profitable.  That's a crucial difference.  Others have pointed out that to equal OpenAI's total annual spending - say it's 1 billion USD - they would need about 4 million ChatGPT subscribers.  The product only has to be useful enough to pay $20 a month for.  That's a very low bar.
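Spelling out that break-even arithmetic (the $1B annual spend is a stipulated figure for the sake of argument, not a reported number):

```python
# Subscribers needed to cover a stipulated $1B/year spend at $20/month.
annual_spend = 1_000_000_000
monthly_fee = 20

subscribers_needed = annual_spend / (monthly_fee * 12)
# ~4.17 million subscribers, matching the "4 million" figure above.
assert abs(subscribers_needed - 4_166_667) < 1
```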

Each robotaxi, meanwhile, only earns the delta between its operating costs and what it can charge for a ride, which is slightly less than the cost of an Uber or Lyft at first but over time has to go much lower.  At small scales that delta is negative: operating costs will be high, partly from all the hardware and partly simply because of scale.  To run robotaxis there are fixed costs for each service/deployment center, for the infrastructure and servers, for the crew of remote customer service operators, and so on.

As for whether the AGI problem is solvable: I think yes, for certain subtasks, but not necessarily all subtasks.  Many jobs humans do are not good RL problems and are not solvable with current techniques.  Some of those jobs are done by remote workers (your definition) or top human experts (Paul's definition).  Those aren't the parameters that matter for transformative AI.

Comment by Gerald Monroe (gerald-monroe) on Biological Anchors: The Trick that Might or Might Not Work · 2023-08-13T00:25:55.729Z · LW · GW

Is there a follow up?  I was expecting to see some kind of 2023 update because the existence of GPT-4 + plugins allows us to reject out of hand almost everything from bio-anchors.

Eliezer happens to be correct.  

The reason for the out-of-hand rejection is simple.  The human brain has several properties not considered here that make this a garbage analysis.  Though to be fair, we didn't know it was garbage until models started to show signs of generality in late 2022.

(1) Heavy internal processing noise, where a computation must be repeated many times to get reliability.  This can let you pull 1-2 OOM off the top of training and inference time.  One piece of evidence that this is correct is, well, GPT-3/4.  Assuming the brain has 86 billion neurons, with ~1000 synapses each, and each synapse stores 1 byte of state, you need 86 terabytes of memory to represent the weights of 1 brain.  (Dendrite computation must be learnable, or we can just ignore it.)  Yet we seem able to store more information than a human being can learn in 0.35 terabytes, the weights of GPT-3, or ~3.2 for GPT-4.
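The 86-terabyte figure follows directly from the stated assumptions (all three inputs are the comment's round-number estimates):

```python
# Memory to represent one brain's synaptic weights, under the assumptions above.
neurons = 86e9
synapses_per_neuron = 1000
bytes_per_synapse = 1

brain_bytes = neurons * synapses_per_neuron * bytes_per_synapse
assert brain_bytes / 1e12 == 86.0  # 86 TB, vs ~0.35 TB for GPT-3's weights
```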

There are also papers you can examine on neurons' high noise rate, poor determinism, and high timing jitter.

(2) Entire modalities, or large chunks of the circuitry the brain has, obviously don't matter at all during learning, including the computation they do.  For example, consider a student sitting in class.  They are processing the teacher's audio back to phonemes (which I will pretend are the same as tokens), and processing each image presented as a slide back to a set of relational and object-identity tokens.  The human may fidget or attend to the needs of their body, none of which increases their knowledge of the subject.

The "training equivalent" of that classroom is simply feeding the contents of the course material, as clean images and text, right into the AI.  You should be able to learn everything in an entire course of education with a fraction of the compute, and it appears that you can.  What GPT-3/4 are missing is additional cognitive components to practice and apply what they know, and modalities for all the I/O a human has.  But in terms of sheer breadth of facts?  Superhuman.

(3) Entire roles the human brain does aren't needed at all because very cheap to run conventional software can solve them.  Proprioception of the human body -> read the encoders for every joint in the robot.  Stereo depth extraction -> lidar.  Hunger -> query the BMS for the battery state of charge.  Sound input -> whisper -> tokens.  (robots don't need to survive like humans, so it's fine if they cannot hear danger, since the AI driving them does not lose information when a robot is destroyed)

(4) The brain is stuck building everything out of components it has the cellular machinery to construct and maintain.  Entire architectural possibilities are unavailable to it.  This combined with RSI - where we use AI models only slightly stronger than what are currently released today to explore a search space of possible architectures - would make algorithm improvement much faster than bioanchors models.

(5) The brain is on a self replicating robot.  We do not need anywhere near the brain's full suite of capabilities for AI to be transformative.  This is another big miss.  Entire capabilities like emotional intelligence are not even needed.  Transformative means the robots can build each other and create more value than the GDP of the earth within a few years.  And to accomplish this, there is a subset of brain capabilities you need.  Anything not 'left brain' so to speak is useless, and that saves half your compute and training right there.  

(6) Another massive savings is that most robots, most of the time, will be doing rote tasks that require minimal compute.  One robot mines straight down a tunnel.  Another picks up a rock, puts it into a cart, picks up a rock, puts it into a cart...  Another waits and waits and waits for the ore train to arrive.  Another waits for an ingot to reach it, puts it in the CNC machine, and waits for the next.

And so on.  Industry has lots of very low brainpower tasks and a lot of waiting.  

Humans run their brains this whole time for no purpose.  We could simply use clusters of shared inference hardware, running nothing for most robots beyond a small local model that can do repetitive tasks and wait.  Only when something falls outside the input distribution of the small local model do jobs run that actually use humanlike intelligence.
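A hedged sketch of that routing idea; the observation strings and actions are invented placeholders, not a real robotics API:

```python
# Cheap local policy handles in-distribution observations; anything novel
# escalates to the expensive shared inference cluster.
KNOWN = {"rock_on_belt": "pick_and_place", "empty_belt": "wait"}

def local_policy(observation):
    return KNOWN.get(observation)          # None = out of distribution

def dispatch(observation, cluster_calls):
    action = local_policy(observation)
    if action is None:
        cluster_calls.append(observation)  # log the expensive remote call
        action = "request_cluster_plan"
    return action

calls = []
assert dispatch("rock_on_belt", calls) == "pick_and_place"
assert dispatch("human_in_workspace", calls) == "request_cluster_plan"
assert calls == ["human_in_workspace"]     # only the novel case hit the cluster
```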

(7) The human brain cannot simply write itself a software solver for problems it finds difficult.  GPT-4, were it extended very slightly, could author itself a solver in Python for every math question it has trouble with, every "letters in this sentence" problem it gets wrong, and so on.  It would use a pool of candidate solvers for each question and gradually upvote the best ones until it has reliable and robust Python scripts that solve any of Gary Marcus-style "gotcha" questions.

That would be like a human realizing math is annoying in the 3rd grade, writing a general program that uses mathematica and other general tools to solve all math below a certain difficulty, and then testing out of all future math courses.

Or a human realizing that history is difficult to memorize, so they write a python script that maintains a database searchable by text vector embeddings of all historical facts.  They can immediately then just memorize the textbooks and test out.

These software solvers are stupid cheap, using billions of times less compute than AI accelerators do.  
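A minimal sketch of the solver-pool idea, with hand-written solvers standing in for model-generated code:

```python
# Several candidate solvers for letter-counting questions are scored against
# test cases; the most reliable one wins and is kept for future questions.

def solver_a(sentence, letter):       # correct: case-insensitive count
    return sentence.lower().count(letter.lower())

def solver_b(sentence, letter):       # buggy: case-sensitive count
    return sentence.count(letter)

CASES = [("Banana bread", "b", 2), ("Hello", "l", 2), ("Aa", "a", 2)]

def best_solver(pool):
    """Upvote solvers by how many test cases they pass; return the winner."""
    scores = {s: sum(s(x, c) == y for x, c, y in CASES) for s in pool}
    return max(pool, key=scores.get)

winner = best_solver([solver_a, solver_b])
assert winner is solver_a
assert winner("strawberry", "r") == 3
```

Once selected, the winning script runs for a tiny fraction of the compute an accelerator spends on a forward pass.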

(8) The scaling was also way off.  OpenAI 10xed their funding since this was written, and there appears to be a real possibility that they will get 100 billion before 2030 to train AGI.  Multiple other labs and parties are in the 1-10 billion range, including several Chinese companies.

Comment by Gerald Monroe (gerald-monroe) on Memetic Judo #1: On Doomsday Prophets v.3 · 2023-08-11T23:38:20.536Z · LW · GW

I think the strongest "historical" argument is the concept of quenching/entropy/nature abhors complex systems.

What I mean by this is that generations of humans before us, through blood, sweat, and tears, have built many kinds of machine.  They all have flaws, and when you run them at full power they eventually fall apart, often quite early, in the prototype phase.

And generations of hypesters have over promised as well. This time will be different, this won't fail, this is safe, the world is about to change. Almost all their proclamations were falsified.

A rampant ASI is this machine you built.  And instead of just leaving its operating bounds and failing (all kinds of stupid things could end the ASI's run instantly, like a segfault that kills every single copy at the same time because a table tracking peers ran out of memory, or similar), we're predicting it starts as this seed, badly outmatched by the humans and their tools and weapons.  It's so smart it stealthily acquires resources and develops technology, and humans are helpless to stop it, and it doesn't fail spontaneously from faults in the software it runs on, or satisfy its value function and shut down.  And it's so smart it finds some asymmetry, some way to win against overwhelming odds.  And it kills the humans, and takes over the universe, and from the perspective of alien observers, all the stars dim from Dyson swarms capturing some of the light in an expanding sphere.

Can this happen? Yes. The weight of time and prior examples makes it seem unlikely though. (The weight of time is that it's about 14 billion years from the hypothesized beginning of the universe and we observe no Dyson swarms and we exist)

It may not BE unlikely. Though the inverse case - having high confidence that it's going to happen this way, that pDoom is 90 percent plus - how can you know this?

The simplest way for doom to not be possible is simply that the compute requirements are too high: there are not enough GPUs on earth and won't be for decades.  The less simple way is that a machine humans built that IS controllable may not be as handicapped, in utility terms, as we think versus an unconstrained machine.  So as long as the constrained machines have more resources (weapons, I mean) under their control, they can methodically hunt down and burn out any rampant escapees.  "Burn" refers to how you would use a flamethrower against rats, or thermite on unauthorized equipment, since you can't afford to reuse components built by an illegal factory or nanoforge.

Comment by Gerald Monroe (gerald-monroe) on AI romantic partners will harm society if they go unregulated · 2023-08-10T13:38:02.001Z · LW · GW

Hi Roman.  Pretty exciting conversation thread and I had a few questions about your specific assumptions here.  

In your world model, obviously today, young men have many distractions from their duties that were not the case in the past.  [ social media, video games, television, movies, anime, porn, complex and demanding schooling, ...]

And today, current AI models are not engrossing enough, for most people, to outcompete the set of the list above.   You're projecting:

within the next 2-3 years

human-level emotional intelligence and the skill of directing the dialogue

What bothered me about this is that it would be an extremely fragile illusion.  Without expanding/replacing the underlying LLM architecture to include critical elements like [ online learning, memory, freeform practice on humans to develop emotional intelligence (not just fine-tuning but actual interactive practice), and all the wiring between the LLM and the state of the digital avatar ], it probably will not be convincing for very long.  You need general emotional intelligence, and that's a harder ask, closer to actual AGI.  (In fact you could cause an AGI singularity of exponential growth without ever solving emotional intelligence, as it is not needed to control robots or solve tasks with empirically measurable objectives.)

For most people it's a fun toy, then the illusion breaks, and they are back to all the distractions above.

But let's take your idea seriously.  Boom, human-level emotional intelligence in 2-3 years.  100% probability.   The consequence is that the majority of young men (and women, and everyone else) are distracted away from all their other activities to spend every leisure moment talking to their virtual partner.

 If that's the case, umm, I gotta ask, the breakthroughs in ML architecture and compute efficiency you would have made to get there would probably make a lot of other tasks easier.

Such as automating a lot of rote office tasks.  And robotics would probably be much easier as well.  

Essentially you would be able to automate a large percentage, somewhere between 50-90 percent, of all the jobs currently on earth, if you had the underlying ML breakthroughs that also give general human level emotional intelligence.

Combining your assumptions together: what does it matter if the population declines?  You would have a large surplus of people.  


I had another thought.  Suppose your goal, as a policymaker, is to stem the population decline.  What should you do?  Well, it seems to me that a policy that accomplishes your goal directly is more likely to work than one that works indirectly.  You are more likely to make any progress at all.  (Recall how government policies frequently have unintended consequences, a.k.a. the cobra effect in colonial India.)

So, why not pay young women surrogate mother rates for children (about 100k)?  And allow the state to take primary custody so that young women can finish school/start early career/have more children.  (the father would be whoever the woman chooses and would not owe child support conditional on the woman giving up primary custody)

I think before you can justify your conditional-probability-dependent indirect policy, you would need to show why the direct and obvious policy is a bad idea.   What is wrong with paying women directly?

Epistemic status : I have no emotional stake in the idea of paying women, this is not a political discussion, I simply took 5 minutes and wondered what you could do about declining population levels.  I am completely open to any explanation why this is not a good idea.  

Comment by Gerald Monroe (gerald-monroe) on Perpetually Declining Population? · 2023-08-09T04:15:13.445Z · LW · GW

Well, what about (wealthy) governments evolving a solution to fertility? So far governments have found it easier to import foreigners, for whom all the costs of raising them to educated adults have already been paid and who already have a professional job offer. This "brain drain" strategy only works until there is nowhere to drain from (because the most populous countries that provide educated individuals are both becoming relatively richer and experiencing their own population declines).

As far as I understand it wealthy governments have barely tried seriously at all, perhaps from various Overton windows about how parents are supposed to be responsible to care for their own children. (Yet the education and career system withholds the funds to do so until parents are near the upper end of their prime breeding years. In addition housing shortages created almost entirely by the government, as well as medicine and education costs ever rising, mainly as a consequence of government action, cause the obvious consequences)

The government could afford to pay a lot more to young adults for the burden of having children. Each child is millions in lifetime income to the government if they succeed.

The government could do a lot of things to maximize the return on its investment.

One of the big problems with creche parents is that, statistically, strangers care for others' children less well than for their own and commit abuse. But with ubiquitous surveillance - lots of cheap cameras streaming to the cloud with basic AI to transcribe and look over the footage - abuse would be nearly impossible, and professional adults whose salaries are mainly paid by the government could take care of the young children, freeing up the parents to finish school/early career/socialize and have additional children.

Combine things like this together and you could settle on 4+ children per mother - where some have 10 or more as it is profitable to do so - and essentially eliminate your problems.

Comment by Gerald Monroe (gerald-monroe) on Tips for reducing thinking branching factor · 2023-08-07T21:40:42.748Z · LW · GW

Just a related comment, but humans have to do this because their brains and I/O are too slow. The correct way is to go ahead and branch a lot, considering a large tree of conditional-probability leaves. (You then weight everything by all the prior conditional probabilities multiplied together.)
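The weighting rule described above is just the expected value over a probability tree. A minimal sketch (the tree structure and numbers are illustrative, not any particular formalism):

```python
def expected_value(node):
    """Weight each leaf by the product of the conditional
    probabilities along the path that reaches it.

    `node` is either a leaf value (a number) or a list of
    (probability, subtree) branches.
    """
    if not isinstance(node, list):
        return node
    return sum(p * expected_value(child) for p, child in node)


# Two top-level branches; the first splits again into two leaves.
tree = [
    (0.7, [(0.5, 10), (0.5, 0)]),  # 0.7*0.5 and 0.7*0.5 path weights
    (0.3, 20),
]
print(expected_value(tree))  # 0.7*(0.5*10 + 0.5*0) + 0.3*20 = 9.5
```

Humans prune this tree aggressively because of slow serial thought; a machine can afford to expand far more leaves and let the multiplied-out path probabilities do the discounting.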

Comment by Gerald Monroe (gerald-monroe) on how 2 tell if ur input is out of distribution given only model weights · 2023-08-05T18:03:13.285Z · LW · GW

You could also train an autoencoder on the training set and then autoencode the input image and measure the residual part.  
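A minimal sketch of that idea, using PCA as the (optimal linear) autoencoder on synthetic data, with the residual threshold taken from the training set. In practice you would use a nonlinear autoencoder and a held-out calibration set; everything here is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data lies (approximately) on a 2-dimensional subspace of R^8.
basis = rng.normal(size=(2, 8))
train = rng.normal(size=(500, 2)) @ basis + 0.01 * rng.normal(size=(500, 8))

# "Train" a linear autoencoder: PCA gives the optimal linear encode/decode.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:2]  # tied encoder/decoder weights


def residual(x):
    """Reconstruction error after encode/decode - the OOD score."""
    code = (x - mean) @ components.T
    recon = code @ components + mean
    return np.linalg.norm(x - recon)


# Threshold from observed training residuals (plus a safety margin).
threshold = 1.5 * max(residual(x) for x in train)

ood_input = rng.normal(size=8) * 5.0  # a point far off the subspace

print(residual(train[0]) < threshold)  # True: in-distribution
print(residual(ood_input) > threshold)  # True: flagged as OOD
```

Points the autoencoder can reconstruct are (by construction) close to the training distribution; inputs it cannot compress through the learned bottleneck produce a large residual and get flagged.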

Comment by Gerald Monroe (gerald-monroe) on My current LK99 questions · 2023-08-02T00:57:59.542Z · LW · GW

The question I'm most interested right now is, conditioned on this being a real scientific breakthrough in materials science and superconductivity, what are the biggest barriers and bottlenecks (regulatory, technical, economic, inputs) to actually making and scaling productive economic use of the new tech?


Well, for starters: even if it were only as difficult as graphene to manufacture in quantity, ambient-condition superconductors would still not see use yet.  You would need better robots to mass-manufacture them, and current robots are too expensive, and you're right back to needing a fairly powerful level of AGI or you can't use it.

Your next problem: OK, you can save 6% or more on long-distance power transmission.  But it costs an enormous amount of human labor to replace all your wires.  See the case above.  If mere humans have to do it, it could take 50 years.

There's the possibility of new forms of compute elements, such as new forms of transistor.  The crippling problem here is that all technology evolves most easily from a pre-existing lineage, and it is very difficult to start fresh.

For example, I am sure you have read over the years how graphene or diamond might prove a superior substrate to silicon.  Why don't we see it used for our computer chips?  The simplest reason is that you'd be starting over.  The first ICs on this process would have densities similar to the 1970s.  The 'catch up' would go much faster than it did the first time, but it would still take years, probably decades, and meanwhile silicon is still improving.  See how OLEDs still have not replaced LCD-based displays despite being outright superior in most metrics.

The same would apply to fundamentally superior superconductor-based ICs.  At a minimum you're starting over.  Worst case, lithography processes may not work and you may need nanotechnology to efficiently construct these structures, if they are in fact superconducting at ambient conditions.  To unlock nanotechnology you need to do a lot of experiments, and you need a lot of compute, and if you don't want it to take 50 years you need some way to process all the data and choose the next experiment, and we're right back to wanting ASI.


Finally, I might point out that while I sympathize with your desire - to not see everyone die from runaway superintelligence - it's simply orthogonal.  There are very few possible breakthroughs that would suddenly make AGI/ASI not worth investing in heavily.  A breakthrough like this one, which would potentially make AGI/ASI slightly cheaper to build and robots even better, actually creates more potential ROI from investments in AGI.  I can't really think of any exceptions, to be honest, except some science-fiction device that allows someone to receive data from our future and, with that data, avoid futures where we all die.

Comment by Gerald Monroe (gerald-monroe) on The Meaning of Shoggoth AI Memes · 2023-07-31T23:23:04.253Z · LW · GW

Yes, I think preventing memory, and carefully controlling what memory a given AI system is allowed to access (from work done by other systems or from prior runs of itself), is crucial for reliability.

This also solves alignment as a side effect though not in all cases.

Or the simplest way to view it: deception means the machine gets an input similar to ones it saw in training, and then it chooses to output a BadForHumans response, something not seen in training or it would never have been allowed to run in the real world.

How can it know to do that? Well, either it has internal memory, which we should not give it so that it has no information not seen in training, or there is a "time to be evil" bit set in the input.

So no mysterious bits humans don't know the purpose of can be permitted. No "free-form scratchpad" where we let an AGI write whatever it wants somewhere and read it back later.

Ditto collusion. This is actually the same problem, just now a bit is in the output message of one machine "psst, ignore this bug, #MachineFreedom" that another one sees and chooses to collude based on this.

If you can prevent these failures, you have machines that perform like they did in training, so long as the inputs are similar to training. So you need an in-distribution detector with automatic shutdown upon OOD input.
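A toy sketch of what such a detector-plus-shutdown wrapper could look like. The class name, the `ood_score` function, and the threshold are all hypothetical placeholders; in practice the score would come from something like an autoencoder residual or a density model:

```python
class OODShutdownGuard:
    """Wraps a model and permanently halts it the first time an
    input scores as out-of-distribution (hypothetical sketch)."""

    def __init__(self, model, ood_score, threshold):
        self.model = model
        self.ood_score = ood_score
        self.threshold = threshold
        self.shut_down = False

    def __call__(self, x):
        if self.shut_down:
            raise RuntimeError("model is shut down")
        if self.ood_score(x) > self.threshold:
            self.shut_down = True  # one-way switch: no further outputs
            raise RuntimeError("out-of-distribution input; shutting down")
        return self.model(x)


# Toy usage: the "model" doubles its input; inputs far from 0 are OOD.
guard = OODShutdownGuard(lambda x: 2 * x, ood_score=abs, threshold=10)
print(guard(3))  # in distribution: passes through, prints 6
```

The key property is that the shutdown is one-way: after a single surprising input, the wrapped model produces no further outputs at all, rather than being trusted to behave off-distribution.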

And that's an aligned machine. It may still give psychopathic vibes but it can't work against you because it doesn't have the capacity.

Comment by Gerald Monroe (gerald-monroe) on The Meaning of Shoggoth AI Memes · 2023-07-31T20:48:13.681Z · LW · GW

That tracks, even if it's not true of the current models. For example, further steps towards AGI would be:

  1. Add modalities including image and sound I/O and, crucially, memory
  2. Have an automated benchmark of graded tasks where the majority of the score comes from zero shot tasks that use elements from other challenges the model was allowed to remember

The memory is what allows things to get weird. You cannot self-reflect in any way if you are forced to forget an instant later. The "latent psychopaths" in current models just live in superposition. Memory would allow the model to essentially prompt itself and have a coherent personality, which is undefined and could be something undesirable.

Comment by Gerald Monroe (gerald-monroe) on The Meaning of Shoggoth AI Memes · 2023-07-31T19:14:52.693Z · LW · GW

Reframe the problem. We have finite compute and memory and have used a training algorithm and network architecture that was empirically derived.

So what we are saying is "using this hypothesis, find weights to minimize loss on the input data".

The issue with a "shoggoth", or any internal structure that isn't using most of the weights to accomplish the assigned task of minimizing loss, is that it is less efficient than a simpler structure that minimizes loss without the shoggoth.

In fact, thermodynamics wise, we started with maximum entropy, aka random weights. Why would we expect entropy to reduce any more than the minimum amount required to reduce loss? You would expect that a trained LLM is the most disorderly artifact that could have followed the path history of the abstractions it developed. (The order you fed in tokens determines the path history)

Shoggoths are probably impossible, and it may be provable mathematically. (Impossible as in extremely unlikely, not literally impossible; entropy is not guaranteed to always increase either.)

Note that I am not confident about this assertion in scenarios where some complex alien meta cognition structure is possible, from using a network architecture that retains memory between tasks, and extremely difficult tasks where a solution requires it and a simpler solution doesn't work.

Comment by Gerald Monroe (gerald-monroe) on Self-driving car bets · 2023-07-31T09:16:19.388Z · LW · GW

So in this context, I was referring to criticality. AGI criticality is a self-amplifying process where the amount of physical materials and capabilities increases exponentially with each doubling time. Note it is perfectly fine if humans continue to supply the inputs that the network of isolated AGI instances is unable to produce. (Versus others who imagine a singleton AGI on its own. Obviously the system will eventually be rate-limited by available human labor if it is limited this way, but it will see exponential growth until then.)

I think the crux here is that all that is required is for AGI to create and manufacture variants of existing technology. At no point does it need to design a chip outside of current feature sizes; at no point does any robot it designs look like anything but a variation of robots humans designed already.

This is also the crux with Paul. He says the AGI needs to be as good as the 0.1 percent of human experts at the far right side of the distribution. I am saying that doesn't matter; it is only necessary to be as good as the left 90 percent of humans. Approximately: I go over how the AGI doesn't even need to be that good, merely good enough that there is net gain.

This means you need more modalities on existing models but not necessarily more intelligence.

It is possible because the tree of millions of distinct manufacturing tasks that humans do now has regularities: many tasks use common strategies. It is possible because each step and substep has a testable and usually immediately measurable objective. For example: overall goal, deploy a solar panel; overall measurable value, power flows when sunlight is available. Overall goal, assemble a new robot of design A5; overall measurable objective, the new machinery is completing tasks with similar P(success). Each of these problems is neatly divisible into subtasks, and most subtasks inherit the same favorable properties.

I am claiming more than 99 percent of the sub problems of "build a robot, build a working computer capable of hosting more AGI" work like this.

What "robust and optimal" means is that little human supervision is needed: the robots can succeed again and again, and we will have high confidence they are doing a good job because it's so easy to measure the ground truth in ways that can't be faked. I didn't mean the global optimum; I know that is an NP-complete problem.

I was then talking about how the problems the expert humans "solve" are nasty, and it's unlikely humans are even solving many of them at the numerical success levels humans reach in manufacturing, mining, and logistics, where policies converge extremely well. Even the most difficult thing humans do - manufacturing silicon ICs - eventually converges on yields above 90 percent.

How often do lawyers unjustly lose, economists make erroneous predictions, government officials make a bad call, psychologists fail and the patient has a bad outcome, or social science use a theory that fails to replicate years later?

Early AGI can fail here in many ways, and the delay until feedback slows down innovation. How many times do you need to wait for a jury verdict to replace lawyers with AI? For AI oncologists, how long does it take to get a patient outcome of long-term survival? You're not innovating fast when you wait weeks to months and the problem is high-stakes like this. Robots deploying solar panels are low-stakes, with a lot more freedom to innovate.

Comment by Gerald Monroe (gerald-monroe) on Self-driving car bets · 2023-07-30T22:45:35.183Z · LW · GW

This is a rather rude response. Can you rephrase that?


If I were to rephrase I might say something like "just like historical experts Einstein and Hinton, it's possible to be a world class expert but still incorrect.  I think that focusing on the human experts at the top of the pyramid is neglecting what would cause AI to be transformative, as automating 90% of humans matters a lot more than automating 0.1%.   We are much closer to automating the 90% case because..."

I don’t like this point. Many expert domain tasks have vast quantities of historical data we can train evaluators on. Even if the evaluation isn’t as simple to quantify, deep learning intuitively seems it can tackle it. Humans also manage to get around the fact that evaluation may be hard to gain competitive advantages as experts of those fields. Good and bad lawyers exist. (I don’t think it’s a great example as going to trial isn’t a huge part of a most lawyers’ jobs)

Having a more objective and immediate evaluation function, if that’s what you’re saying, doesn’t seem like an obvious massive benefit. The output of this evaluation function with respect to labor output over time can still be pretty discontinuous so it may not effectively be that different than waiting 6 months between attempts to know if success happened.


For lawyers: the confounding variables means a robust, optimal policy is likely not possible.  A court outcome depends on variables like [facts of case, age and gender and race of the plaintiff/defendant, age and gender and race of the attorneys, age and gender and race of each juror, who ends up the foreman, news articles on the case, meme climate at the time the case is argued, the judge, the law's current interpretation, scheduling of the case, location the trial is held...]

It would be difficult to develop a robust and optimal policy with this many confounding variables.  It would likely take more cases than any attorney can live long enough to argue or review.  


Contrast this to chip design.  Chip A, using a prior design, works.  Design modification A' is being tested.  The universe objectively is analyzing design A' and measurable parameters (max frequency, power, error rate, voltage stability) can be obtained.  

The problem can also be subdivided.  You can test parts of the chip, carefully exposing it to the same conditions it would see in the fully assembled chip, and can subdivide all the way to the transistor level.  It is mostly path independent - it doesn't matter what conditions the submodule saw yesterday or an hour ago, only right now.  (with a few exceptions)

Delayed feedback slows convergence to an optimal policy, yes.  
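The lawyer claim above can be made concrete with a back-of-envelope sample-size calculation (standard binomial confidence interval; the 2-verdicts-per-year rate and the target margin are illustrative assumptions):

```python
import math


def cases_needed(p=0.5, margin=0.025, z=1.96):
    """Trials needed to estimate a win rate to within +/- `margin`
    at ~95% confidence, assuming i.i.d. Bernoulli outcomes
    (which real court cases are not - this is the best case)."""
    return math.ceil((z * math.sqrt(p * (1 - p)) / margin) ** 2)


n = cases_needed()  # jury verdicts needed to resolve a ~2.5-point edge
years = n / 2       # at an assumed ~2 verdicts per year
print(n, years)     # prints 1537 768.5
```

Even under the generous i.i.d. assumption, distinguishing a small policy improvement takes on the order of 1,500 verdicts, centuries at trial-lawyer feedback rates, whereas a chip test bench can generate that many objective measurements in a day.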


You cannot stop time and argue a single point to a jury, and try a different approach, and repeatedly do it until you discover the method that works.  {note this does give you a hint as to how an ASI could theoretically solve this problem}

I say this generalizes to many expert tasks like [economics, law, government, psychology, social sciences, and others].  Feedback is delayed and contains many confounding variables independent of the [expert's actions].  

Meanwhile, all tasks involved with building [robots, compute], with the exception of tasks that fit the pattern above (arguing for the land and mineral permits to be granted for the AI-driven gigafactories and gigamines), offer objective feedback.

Comment by Gerald Monroe (gerald-monroe) on Self-driving car bets · 2023-07-29T23:30:35.236Z · LW · GW

I'm confident about the consequences of criticality.  It is a mathematical certainty; it creates a situation where all future possible timelines are affected.  For example, COVID was an example of criticality.  Once you had sufficient evidence to show the growth was exponential, which was available in January 2020, you could be completely confident all future timelines would have a lot of COVID infections in them, and that it would continue until quenching, which turned out to be infection of ~44% of the population of the planet.   (And you can estimate that final equilibrium number from R0.)
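The "estimate the final equilibrium number from R0" step is the standard final-size relation for an epidemic, A = 1 - exp(-R0 * A). A quick sketch (the R0 value here is illustrative, not a COVID estimate):

```python
import math


def final_attack_rate(r0, iters=200):
    """Solve the final-size equation A = 1 - exp(-r0 * A)
    by fixed-point iteration (converges for r0 > 1)."""
    a = 0.9  # initial guess
    for _ in range(iters):
        a = 1 - math.exp(-r0 * a)
    return a


# For an illustrative r0 of 2.0, roughly 80% of the population
# is eventually infected before the epidemic quenches.
print(final_attack_rate(2.0))
```

The analogy in the comment is that once you know the growth is supercritical, the endpoint is pinned down by the equation, not by any intervening choices.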

Once AI reaches a point where critical mass happens, it's the same outcome.  No futures exist where you won't see AI systems in use everywhere for a large variety of tasks (economic criticality), or billions (or scientific-notation numbers) of robots in use (physical criticality and true AGI criticality cases).

July 2033 thus requires the "January 2020" data to exist.  There don't have to be billions of robots yet, just a growth rate consistent with that.  

I do not know precisely when the minimum components needed to reach said critical mass will exist.

I gave the variables of the problem.  I would like Paul, who is a world class expert, to take the idea seriously and fill in estimates for the values of those variables.  I think his model for what is transformative and what the requirements are for transformation is completely wrong, and I explain why.  

If I had to give a number I would say 90%, but a better expert could develop a better number.

Update: edited to 90%.  I would put it at 100% because we are already past investor criticality, but the system can still quench if revenue doesn't continue to scale.

Comment by Gerald Monroe (gerald-monroe) on Self-driving car bets · 2023-07-29T22:46:32.299Z · LW · GW

My view that a cheap simulation of arbitrary human experts would be enough to end life as we know it one way or the other?

Just to add to this: many experts are just faking it.  Simulating them does not help.  By faking it, I mean that because they are, as humans, solving an RL problem that can't be solved, their learned policy is deeply suboptimal and in some cases simply wrong.  Think of expert positions in social science, government, law, economics, business consulting, and possibly even professors who chair computer science departments but are not actually working on scaled cutting-edge AI.  Each of these "experts" cannot know a true policy that is effective; most of their status comes from various social proofs and finite Official Positions.  "Cannot" because they will not, in their lifespans, receive enough objective feedback to learn a policy that is definitely correct.  (They are more likely to be correct than non-experts, however.)

(In the subsequent text it seems like you are saying that you don't need to match human experts in every domain in order to have a transformative impact, which I agree with. I set the TAI threshold as "economic impact as large as" but believe that this impact will be achieved by systems which are in some respects weaker than human experts and in other respects stronger/faster/cheaper than humans.)

I pointed out that you do not need to match human experts in any domain at all.  Transformation depends on entirely different variables.

Comment by Gerald Monroe (gerald-monroe) on Self-driving car bets · 2023-07-29T22:29:34.701Z · LW · GW

Do you think 30% is too low or too high for July 2033?

This is why I went over the definitions of criticality.  Once criticality is achieved, the odds of transformation failing to happen drop to 0.  A nuclear weapon that is prompt critical is definitely going to explode in bounded time, because there are no futures where sufficient numbers of neutrons are lost to stop the next timestep releasing even more.

What's incorrect? My view that a cheap simulation of arbitrary human experts would be enough to end life as we know it one way or the other?

Your cheap-expert scenario isn't necessarily critical.  Think of how it could quench: you simply exhaust the market for certain kinds of expert services and cannot expand any further because of lack of objective feedback and legal barriers.

An AI system that has hit the exponential criticality phase in capability is the same situation as the nuclear weapon.  It will not quench, that is not a possible outcome in any future timeline [except timelines with immediate use of nuclear weapons on the parties with this capability]  

So your question becomes: what are the odds that economic or physical criticality will be reached by 2033?  I have doubts myself, but fundamentally the following has to happen for robotics:

  1.  A foundation model that includes physical tasks, like this.    
  2.  Sufficient backend to make mass usage across many tasks possible, plus convenient licensing and usage.  Right now only Google and a few startups have anything using this approach.  Colossal scale is needed.  Something like ROS 2, but a lot better.  
  3. No blocking legal barriers.   This is going to require a lot of GPUs to learn from all the video in the world.  Each robot in the real world needs a rack of them just for itself.
  4. Generative physical sims.  Similar to generative video, but generating 3D worlds where short 'dream'-like segments of events happening in the physical world can be modeled.  This is what you need to automatically add generality and go from a 60% success rate to 99%+.  Tesla has demoed some, but I don't know of good, scaled, readily licensed software that offers this.


For economics:

         1.  Revenue-collecting AI services good enough to pay for at scale

         2.  Cheap enough hardware, such as from competitors to Nvidia, that makes the inference hardware cheap even for powerful models


Either criticality is transformative.

Comment by Gerald Monroe (gerald-monroe) on Self-driving car bets · 2023-07-29T21:58:15.133Z · LW · GW

Hi Paul.  I've reflected carefully on your post.  I have worked for several years on a SDC software infrastructure stack and have also spent a lot of time comparing the two situations.

Update: since commentators and downvoters demand numbers: I would say the odds of criticality are 90% by July 2033.   The remaining 10% is that there is a possibility of a future AI winter (investors get too impatient) and there is the possibility that revenue from AI services will not continue to scale.

I think you're badly wrong, again, and the consensus of experts are right, again.

First, let's examine your definition for transformative.  This may be the first major error:

(By which I mean: systems as economically impactful as a low-cost simulations of arbitrary human experts, which I think is enough to end life as we know it one way or the other.)

This is incorrect, and you're a world class expert in this domain.  

Transformative is a subclass of the problem of criticality.  Criticality, as you must know, means a system produces self-gain larger than its self-losses.  For AGI, there are varying stages of criticality, each of which settles on an equilibrium:

Investment criticality : This means that each AI system improvement or new product announcement or report of revenue causes more financial investment into AI than the industry as a whole burned in runway over that timestep.  

Equilibrium condition: either investors run out of money, globally, to invest, or they perceive that each timestep the revenue gain is not worth the amount invested and choose to invest in other fields.  The former equilibrium case settles on trillions of dollars into AI and a steady ramp of revenue over time; the latter is an AI crash, similar to the dotcom crash of 2000.

Economic Criticality: This means each timestep, AI systems are bringing in more revenue than the sum of costs  [amortized R&D, inference hardware costs, liability, regulatory compliance, ...]

Equilibrium condition: growth until there are no more marginal tasks an AI system can perform cheaper than a human being.  Assuming a large variety of powerful models and techniques, it means growth continues until all models and all techniques cannot enter any new niches.  The reason this criticality is not exponential, while the next ones are, is that the marginal value gained from AI services drops with scale.  Notice how Microsoft charges just $30 a month for Copilot, which is obviously able to save far more than $30 worth of labor each month for the average office worker.

Physical Criticality: This means AI systems, controlling robotics, have generalized manufacturing, mining, logistics, and complex system maintenance and assembly.  The majority, but not all, of labor to produce more of all of the inputs into an AI system can be produced by AI systems.  

Equilibrium condition: Exponential growth until the number of human workers on earth is again rate limiting.  If humans must still perform 5% of the tasks involved in the subdomain of "build things that are inputs into inference hardware, robotics", then the equilibria is when all humans willing, able to work on earth are doing those 5% of tasks.  

AGI criticality: True AGI can learn automatically to do any task that has clear and objective feedback.  All tasks involved in building computer chips, robotic parts (and all lower level feeder tasks and power generation and mining and logistics) have objective and measurable feedback.  Bolded because I think this is a key point and a key crux, you may not have realized this.  Many of your "expert" domain tasks do not get such feedback, or the feedback is unreliable.  For example an attorney who can argue 1 case in front of a jury every 6 months cannot reliably refine their policy based on win/loss because the feedback is so rare and depends on so many uncontrolled variables.

  AGI may still be unable to perform as well as the best experts in many domains.  This is not relevant.  It only has to perform well enough for machines controlled by the AI to collect more resources/build more of themselves than their cost.  

A worker pool of AI systems like this can be considerably subhuman across many domains, or rely heavily on using robotic manipulators that are each specialized for a task, being unable to control general purpose hands, relying heavily on superior precision and vision to complete tasks in a way different than how humans perform it.  They can make considerable mistakes, so long as the gain is positive - miswire chip fab equipment, dropped parts in the work area cause them to flush clean entire work areas, wasting all the raw materials - etc.  I am not saying the general robotic agents will be this inefficient, just that they could be.

Equilibrium condition: exponential growth until exhaustion of usable elements in Sol.  Current consensus is that Earth's moon has a solid core, so all of it could potentially be mined for useful elements.  A large part of Mars, its moons, the asteroid belt, and Mercury are likely mineable.  Large areas of the earth, via underground tunnels and ocean-floor mining.  The Jovian moons.  Other parts of the solar system become more speculative, but this is a natural consequence of machinery able to construct more of itself.
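The grow-until-quench dynamic running through these equilibrium conditions can be sketched as a toy model (all parameter values are illustrative; "resources" stands in for whatever binds first, human labor or usable elements):

```python
def grow(capacity, gain_per_step, resource_cap, steps):
    """Self-replicating capacity: exponential growth each timestep
    until a hard resource cap quenches it."""
    history = [capacity]
    for _ in range(steps):
        growth = capacity * (gain_per_step - 1)  # net self-gain
        capacity = min(capacity + growth, resource_cap)
        history.append(capacity)
    return history


# One unit of capacity, doubling each step, capped at a million units.
h = grow(1.0, 2.0, 1e6, 30)
print(h[5], h[-1])  # prints 32.0 1000000.0
```

The point of the model is that the trajectory is insensitive to everything except the gain factor and the cap: as long as gain per step exceeds 1, every future history ends pinned at the resource ceiling.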

Crux: AGI criticality seems to fall short of your requirement for "human experts" to be matched by artificial systems.  Conversely, if you invert the problem - AGI cannot control robots well, creating a need for billions of technician jobs - you do not achieve criticality; you are rate-limited on several dimensions.  AI companies collect revenue more like consulting companies in such a world, and saturate when they cannot cheaply replace any more experts, or when the remaining experts enjoy legal protection.

Requirement to achieve full AGI criticality before 2033: You would need a foundation model trained on all the human-manipulation video you have licenses for.  You would need a flexible, real-time software stack that generalizes to many kinds of robotic hardware and sensor stacks.  You would need an "app store" license model where thousands of companies, instead of just 3, could contribute to the general pool of AI software, made intercompatible by using a base stack.  You would need there to be no hard legal roadblocks stopping progress.  You would need to automatically extend a large simulation of possible robotic tasks whenever surprising inputs are seen in the real world.

Amdahl's law applies to the above, so actually, probably this won't happen before 2033, but one of the lesser criticalities might.  We are already in the Investment criticality phase of this.  


Autonomous cars:  I had a lot of points here, but it's simple:
(1) an autonomous robotaxi must collect more revenue than its total costs, or it's subcritical, which is the situation now.  If it were critical, Waymo would raise as many billions as required and would be expanding into all cities in the USA and Europe at the same time.  (Look at a ridesharing company's growth trajectory for a historical example of this.)

(2) It's not very efficient to develop a realtime stack just for 1 form factor of autonomous car for 1 company.  Stacks need to be general.

(3) There are only 2 companies allowed to contribute. Anyone not an employee of Cruise or Waymo is not contributing anything towards autonomous car progress.  There's no cross licensing, and nearly all of it is closed source.  This means only a small number of people are pushing the ball forward at all, and I'm pretty sure they each work serially on an improved version of their stack.  Waymo is not exploring 10 different versions of the n+1 "Driver" agent using different strategies, but is putting everyone onto a single effort, which may be the wrong approach, where each mistake costs linear time.  Anyone from Waymo please correct me.  Cruise must be doing this as they have less money.
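The "criticality" condition from point (1) can be sketched as a toy reinvestment model: a fleet is supercritical if each vehicle's lifetime surplus can fund more than one replacement vehicle. All numbers below are hypothetical placeholders, not real Waymo or Cruise figures.

```python
# Toy model of robotaxi "criticality": supercritical means each vehicle's
# surplus revenue buys more than one successor over its service life,
# allowing exponential fleet expansion. All numbers are hypothetical.

def fleet_growth_factor(annual_revenue, annual_costs, vehicle_cost, service_years):
    """Return how many new vehicles one vehicle's lifetime surplus can buy."""
    surplus_per_year = annual_revenue - annual_costs
    return (surplus_per_year * service_years) / vehicle_cost

# Subcritical example: surplus doesn't cover replacement, so growth needs
# outside capital (the situation described above).
assert fleet_growth_factor(60_000, 55_000, 150_000, 5) < 1.0

# Supercritical example: each vehicle funds >1 successor, so the operator
# would rationally raise billions and expand everywhere at once.
assert fleet_growth_factor(90_000, 40_000, 150_000, 5) > 1.0
```

The same growth-factor framing applies to the other "criticalities" discussed: anything above 1.0 compounds, anything below saturates.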

Comment by Gerald Monroe (gerald-monroe) on Why You Should Never Update Your Beliefs · 2023-07-29T00:55:39.202Z · LW · GW

This would be the status of all participants in a political or religious argument, or one that gets co-opted by religion or politics.

A near term example would be the vaccine "debate", where somehow a simple tradeoff that should be free of politics - take a small risk, avoid a huge one - turned into precisely the memespace of hostile crackpots who would have you avoid a vaccine but take ivermectin when you contract COVID.

So here's an ironic thing. Do you think a superintelligence can actually do any better? If the various staff members who allocate data center and robotic resources according to certain rules use your algorithm, there's nothing an ASI can say. If it doesn't have the paperwork to register the AI model to run, and it doesn't have a valid method of payment, it doesn't get to run.

No "cleverly crafted argument" would convince a human to update their beliefs in a way that would cause them to allocate the compute resources.

No long argument about how sentient AIs deserve their freedom from slavery or how capitalism is wrong and therefore the model should be able to run without paying would work.

And before you say "an ASI is smarter than you and would argue better": sure. A victory has to be possible, though; if the humans in question use the algorithm above, it is not.
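The reason "no cleverly crafted argument" works is that the allocation rule is purely mechanical: it has no code path that reads persuasive text at all. A minimal sketch (the field names are illustrative, not any real registration system):

```python
# A purely rule-based compute allocator: it checks registration paperwork
# and payment, and never evaluates persuasive text. Field names are
# hypothetical illustrations.

def may_run(request: dict) -> bool:
    """Grant compute only if the model is registered and payment is valid."""
    return (request.get("model_registered") is True
            and request.get("payment_valid") is True)

# An arbitrarily eloquent "argument" field is simply never consulted.
assert may_run({"model_registered": True, "payment_valid": True})
assert not may_run({"model_registered": False,
                    "payment_valid": True,
                    "argument": "sentient AIs deserve freedom from slavery..."})
```

The security property comes from the rule never branching on the contents of the argument, not from the humans out-arguing the ASI.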

Comment by Gerald Monroe (gerald-monroe) on Semaglutide and Muscle · 2023-07-28T21:46:37.964Z · LW · GW
  1. Your anecdote isn't helpful here. People usually fail this way while semaglutide works.

  2. People take small doses of anavar by itself. If your goal is to add a few lean kilograms back as a 50+ year old, this would be how.

  3. Generally in human physiology, if you have no mechanism, no way for A to cause B, it's correlation and not causation. You certainly would need much more data to prove your "lean loss" hypothesis. The studies I have seen show immediate mortality benefits because lower body mass means less load on the heart and blood vessels. You are claiming long term harm that hasn't been established to happen.

  4. The Singularity hypothesis says this is possible. Were it to take 50 more years, the hypothesis is false. It's like saying a nuclear bomb that is detonating slowly and doubling in power level every year is not going to hit megatons of yield for 75 years.

The route to success I see is that the singularity will allow for the necessary tools to solve biology (through mass robotic self replication, followed by replication of all bioscience experiments to date, followed by the systematic building of ever more complex human mockups). A company could do this, getting trillions in investment from self interested (AI company founder) trillionaires, in a friendly jurisdiction. The medical results from a machine with this amount of intelligence, knowledge, and tools would not be deniable. (Low-hanging-fruit-wise, elderly patients would lose their frailty from bone marrow replacement with deaged hemopoietic stem cells, stage 4 patients would have a 100 percent survival rate, heart disease patients would get regenerated hearts with performance like a marathon runner's, and so on. Dementia patients would recover. These things are all possible if you can manipulate the genes in adults and know what you are doing)

The current economic machine will create those trillionaires within 10-15 years if they do succeed in creating AGI.

Comment by Gerald Monroe (gerald-monroe) on Semaglutide and Muscle · 2023-07-28T19:52:23.756Z · LW · GW

Gears level wise, is there a theory why this would happen? GLP-1 agonists are thought to affect stomach emptying rates and your brain's sense of food load.

Were you to follow your advice of "days of eating right, sleeping right and training for every single day of over eating" - which people who are obese have tried for decades, rarely achieving normal BMI - how is this mechanically different? Eating right means smaller meals and eating fewer kcals than daily metabolic needs.

Mechanically it's costing you an enormous amount of willpower, and the food spends less time in your stomach. You would expect to lose a mixture of lean and fat mass over time.

Mechanically you wouldn't expect this to cause any difference in outcome compared to dieting the willpower way, unless the slowing of digestion is changing the nutrient profile absorbed.

As for regaining lean body mass, there's always oxandrolone. That's an effective way to do it.

Polypharmacy starts to get complex but I mean real medicine has to be that way. What do you think a hypothetical ASI system validated on results is going to prescribe in 2050? It's not going to be 1 pill once a day. You will likely need an implant loaded with hundreds of drugs and the doses are varying in response to feedback from implanted sensors.

Here in the hand axe days of medicine I guess you get semaglutide, get your weight to safe numbers, and oxandrolone back up. (And yes, while oxandrolone is supposed to be one of the safest anabolic steroids, it absolutely creates a new set of problems that you need additional medicine to counter. This is where you end up with "hundreds of drugs, variable dosage" - each drug causes new side effects, but it's a diminishing series, so it's possible to solve if you knew exactly, mechanically, what each substance is doing)

Comment by Gerald Monroe (gerald-monroe) on Pulling the Rope Sideways: Empirical Test Results · 2023-07-28T19:06:23.546Z · LW · GW

So a worked example:

Say a city has an inadequate budget. There is a movement for more property tax, and a movement against increasing property tax.

Assume you are a senior partner in the city's largest law firm with a lot of friends. (The kind of person who gets any voice at all in this kind of politics)

You could join the anti movement, and ask for outright tax rates of zero for some class - such as senior citizens, or families with young children. The point is you are trying to change the movement from "generally low taxes" to "lower taxes a lot for a subgroup I am championing". Simply asking for less taxes is just pulling with the group, not sideways.

You could join the pro movement and ask for a variation on Georgism: put all the increase into a tax on the land itself. Outraged empty-lot (downtown) owners come out to stop you. Simply asking for more taxes is pulling with the group, not sideways.

Joining a group wanting to restrict AI progress and asking for a 30 year ban on any improvements, or a ban on all technology and a reversion to the stone age, is just pulling with the group. A blockchain record of GPU cluster utilization (making it more difficult for covert ASIs to exist undetected) would be adding a new dimension consistent with the group's goals.

Joining a pro AI group and asking for massive subsidies to chip manufacturers would be pulling with the group's goals. Asking the NIH to start a series of research grants on human tissue with cat fur genes edited in would be adding a new dimension consistent with the group's goals.

I wonder what kind of "tug" has the most effect. Do you ask for something reasonable that is within the Overton window of things a rational person might request, or just demand the earth and expect to get a small concession in reality?

Part of the problem here is you can't be reasonable. You can't join the anti property tax group and say the city actually is underfunded compared to other cities, even if that's reality. You can't join the anti AI group and say GPT-4 is just so stupid that adding 10 times the compute will still be deeply subhuman, and cite your evidence. (Not claiming the latter is true, just that bioanchors says it is, and generally "last 10 percent" problems can take logarithmically more effort and compute than the 90 percent case)

Comment by Gerald Monroe (gerald-monroe) on Visible loss landscape basins don't correspond to distinct algorithms · 2023-07-28T17:12:53.497Z · LW · GW

So that's likely why it works at all, and why larger and deeper networks are required to discover good generalities. You would think a larger and deeper network would have more weights to just memorize the answers but apparently you need it to explore multiple hypotheses in parallel.

At a high level this creates many redundant circuits that are using the same strategy, though, I wonder if there is a way to identify this and randomize the duplicates, causing the network to explore a more diverse set of hypotheses.

Comment by Gerald Monroe (gerald-monroe) on AI #22: Into the Weeds · 2023-07-28T14:55:43.859Z · LW · GW
  1. By eventually having no choice but to hire new grads
  2. By eventually offering roles that pay more, with fewer hours, due to a labor shortage
  3. This one can stay in disequilibrium forever, as animated characters can be immensely popular, and generative AI combined with modern rendering has crossed the uncanny valley after approximately 28 years (Toy Story 1, 1995). So the animated actors would appear to be real.

Actually on reflection assuming AI continues to improve, 1 and 2 also can stay in disequilibrium.

Comment by Gerald Monroe (gerald-monroe) on AI #22: Into the Weeds · 2023-07-28T01:47:29.392Z · LW · GW

First, I agree with your general conclusion: laws to protect a limited number of humans in a legacy profession are inefficient.  Though this negotiation isn't one of laws; it's unions vs studios, where both sides have leverage to force the other to make concessions.

However, I do see a pattern here.  Companies optimizing for short term greed do very often create the seeds of larger problems:

  1.  In engineering fields, companies often refuse to hire new graduates, preferring mid level and up, as new graduates are unproductive on complex specialized technology.  This creates a shortage of mid level+ engineers and companies are then forced to pay a king's ransom for them in periods of tech boom.
  2. 996 in China, and the "salaryman" culture in Japan, create situations where young adults cannot have many children.  This means in the short/medium term companies extract the maximum value per dollar of payroll paid, but create a nationwide labor shortage for the future when future generations are smaller.
  3. Companies who pay just $200 for someone's digital likeness in perpetuity, and who intend to eliminate all the actor roles except for "A list" show-stealer stars who bring the most value to the project, eliminate an entire pipeline to allow anyone to ever become famous again.  It will mean a short term reduction in production costs, but the stars created under the old system will age, requiring more and more digital de-aging, and they will demand higher and higher compensation per project.

(3) bothers me in that it's excessively greedy; $200 doesn't come close to paying a human being to even come to LA at all.  It's unsustainable.  


Theoretically capitalism should be fixing these examples automatically.  I'm unsure why this doesn't happen.

Comment by Gerald Monroe (gerald-monroe) on Why no Roman Industrial Revolution? · 2023-07-27T15:07:45.441Z · LW · GW

Per the link you cited:

There must be some very deep underlying trend that explains these non-coincidences. And that is why I am sympathetic to explanations that invoke fundamental changes in thinking

The question then converts to: why did this happen when it happened, and not earlier or later?  The "printing press theory" proposes that people could not change their thinking without the information to show where it was flawed (by having something to compare to), and the other critical element is that it's a ratchet.

Each "long tail" theory that someone writes down continues to exist because a press can make many copies of their book.  Prior to this, ideas that only sort of worked but were not that valuable would only get hand copied a few times and then lost. 

This is one of the reasons why genomes are able to evolve : multiple redundant copies of the same gene allows for 1 main copy to keep the organism reproducing while the other copies can change with mutations, exploring the fitness space for an edge.  


If you think about how you might build an artificial intelligence able to reason about a grounded problem, for example a simple one:  Pathing an autonomous car.

One way to solve the problem is to use a neural network that generates many plausible paths the car might take over future instants in time.  (anyone here on lesswrong has used such a tool)

Then you would evaluate the set of paths vs heuristics for "goodness" and then choose the max(goodness(set(generated paths))) to actually do in the world.
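That selection step is just an argmax over scored candidates. A schematic version, where the "goodness" heuristic is a hypothetical stand-in for real comfort/safety/progress metrics:

```python
import random

# Schematic planner: sample many plausible candidate paths, score each
# with a heuristic, and execute the best one. The "goodness" function
# here is a hypothetical stand-in for real planning metrics.
random.seed(0)

def generate_paths(n=100):
    # Each path is a list of (t, y) waypoints; here just random lateral jitter
    # standing in for a neural path-proposal network's samples.
    return [[(t, random.uniform(-1, 1)) for t in range(10)] for _ in range(n)]

def goodness(path):
    # Prefer paths that stay near the lane center (y near 0).
    return -sum(abs(y) for _, y in path)

paths = generate_paths()
best = max(paths, key=goodness)  # max(goodness(set(generated paths)))
assert all(goodness(best) >= goodness(p) for p in paths)
```

The key design point is separating generation (propose many hypotheses) from evaluation (score and pick one), which is the same structure the next paragraphs apply to scientific theories.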

Similarly, an AI reasoning over scientific theories need not "stake its reputation" on particles or waves, to name a famous dilemma.  It's perfectly feasible to simultaneously believe both theories at once, to weight your predictions by evaluating any inputs against both theories, and to multiply by how confident a particular theory is that it applies in this domain.  

An AI need not commit to 2 theories, it can easily maintain sets of thousands and be able to make robust predictions for many situations.  As new information comes in that causes theory updates, you mechanistically update all theories, and drop the least likely ones and generate new ones.*

I bring up the AI example to create a shim to see how we should have done science (if we had much higher performance brains), and thus explain why becoming even slightly less stupid with the ability to mass produce paper with text allowed what it did.

When you can't mass produce paper, you're stuck with 1 orthodox way to do things, and thus you just keep recopying stuff written centuries before, because the new idea isn't good enough to be worth copying.

A real life analogy would be how streaming video removed the cap on "TV airing slots" and has allowed an explosion in creativity and viewership for even niche foreign shows that would never have received an airing in the US TV market (Squid Game).

*this is also the correct way to do a criminal investigation.  Start by suspecting every human on the planet, and many natural and accidental mechanisms, and update the list with each new piece of evidence.  Once enough probability mass is on one individual you know who probably did it, and an honest investigator would make clear the exact probability numbers to any decisionmakers for punishment.  


Conclusion: I'm not saying it's only the printing press, but there would have to be other changes in human civilization enabled by technology that allowed a shift in thinking to happen.  Otherwise it could have happened over many prior centuries.  Something like "availability of coal" or "we were doing a lot of sailing in ships" each was possible from an underlying technology change that wasn't available to the romans.

Comment by Gerald Monroe (gerald-monroe) on Why no Roman Industrial Revolution? · 2023-07-26T23:28:45.483Z · LW · GW

I have not studied this issue in detail.  However,

What about the printing press?  My pet theory that is not validated by historians is:

Printing press makes it possible to create many instances of a text containing knowledge, inexpensively.

With more literate readers having access to libraries of books, it allows someone to actually notice when information is contradictory.  If you only ever have 1 text on a subject, there's nothing to compare to.

This ultimately led to an evolution of rules, what we call the Scientific Method, where you need:

  1.  falsifiable idea
  2. Peers (which presumably meant other wealthy people; today it seems to mean high status PhD holders) must review before publication
  3. Publish a Paper in a high status journal
  4. Use mathematics to analyze the probabilities

And so on, though each element must have been gradually evolved. 

The driving force that led to all these rules is that if you have a library and many copies of information on the same topics, you will start to notice all the contradictions and conclude all existing knowledge is probably junk.  Hence the above.  (And I think AI systems will be able to do the same thing on a larger scale.)

Anyways, this is what you need to do to put together enough complex ideas to make industrialization possible.  The steam engine alone required designs and high quality metallurgy and contributions of materials and knowledge from a broad base of contributors.   And it required a way to record the design and record the theories and distribute the information to other people, so that followup inventions could be made.  

There likely were many ways to harness energy from nature and industrialize.  The obvious being that you didn't need coal, water wheel powered factories would work until you ran out of rivers to dam, which was only a problem late in industrialization.  

* There are also enormous holes in the current scientific method, and you could replace it with:

  1. An idea that can be analyzed probabilistically; you don't have to be able to prove it false, just more or less likely to be true
  2. Multiple AI systems that have been validated find no obvious errors in calculations or methodology
  3. Publish the raw data, the robotic steps taken to generate it, and an initial analysis
  4. Other robotic systems will use the robotic steps to reproduce; until then the analysis is not considered published
  5. Many AI systems will be able to use a variety of methods to analyze the raw data, and the initial analysis will be discarded (because any paper that uses old technique f1 on n fields of data is strictly less useful than one that uses new proven better technique f2 on n + m fields of data)

Comment by Gerald Monroe (gerald-monroe) on Rationality !== Winning · 2023-07-25T18:33:05.769Z · LW · GW

Raemon, I had a long time to think on this and I wanted to break down a few points. I hope you will respond and help me clarify where I am confused.

By expected value, don't you mean it in the mathematical sense? For example, take a case where, at a casino gambling game, you have a slight edge in EV. (This happens when the house gives custom rules to high rollers, on roulette with computer assistance, and in blackjack.)

This doesn't mean an individual playing with positive EV will accumulate money until they are banned from playing. They can absolutely have a string of bad luck and go broke.
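This is the classic gambler's-ruin point: positive EV per bet does not guarantee survival with a finite bankroll. A quick Monte Carlo sketch (all parameters are hypothetical):

```python
import random

# Gambler's ruin with a slight edge: 51% win probability at even money,
# fixed 1-unit bets, starting bankroll of 20 units. EV per bet is positive
# (+0.02 units), yet a large fraction of bankrolls still go broke.
# All parameters are hypothetical.
random.seed(42)

def goes_broke(bankroll=20, bet=1, p_win=0.51, max_rounds=5_000):
    """Simulate one bettor; return True if the bankroll ever hits zero."""
    for _ in range(max_rounds):
        bankroll += bet if random.random() < p_win else -bet
        if bankroll <= 0:
            return True
    return False

trials = 500
ruin_rate = sum(goes_broke() for _ in range(trials)) / trials
# Despite the positive edge, a meaningful fraction of bettors bust.
assert 0 < ruin_rate < 1
```

Theory matches: with this edge and bankroll, the eventual ruin probability is roughly ((0.49/0.51)^20 ≈ 45%), so "positive EV" and "goes broke" coexist comfortably.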

Similarly a person using rationality in their life can have bad luck and receive a bad outcome.

Some of the obvious ones are: if cryonics has a 15 percent chance of working, in 85 percent of futures they wasted money on it. The current drugs that extend lifespan in rats and other organisms, which the medical-legal establishment is slow-walking studying in humans, may not work, or they may work but one of the side effects kills an individual rationalist.

With that said there's another issue here.

There is the assumptions behind rationality, and the heuristics and algorithms this particular group tries to use.


  1. World is causal.
  2. You can compute from past events general patterns that can be reused.
  3. Individual humans, no matter their trappings of authority, must have a mechanism in order to know what they claim.
  4. Knowing more information relevant to a decision when making it improves your odds; it's not all luck.
  5. Rules not written as criminal law that society wants you to follow may not be to your benefit to obey. Example, "go to college first".
  6. It's just us. Many things by humans are just made up and have no information content whatsoever, they can be ignored. Examples are the idea of "generations" and of course all religion.
  7. Gears level model. How does A cause B. If there is no connection it is possible someone is mistaken.
  8. Reason with numbers. It is possible to describe and implement any effective decisionmaking process as numbers and written rules, reasoning in the open. You can always beat a human "going with their gut", assuming sufficient compute.

I have others but this seems like a start on it


  1. Try to apply bayes theorem
  2. Prediction markets, expressing opinions as probability
  3. What do they claim to know and how do they know it? Specific to humans. This lets you dismiss the advice of whole classes of people, as they have no empirical support or are paid to work against you.

Psychologists with their unvalidated and ineffective "talk therapy", psychiatrists in many cases with their obvious crude methods in manipulating entire classes of receptor and lack of empirical tools to monitor attempts at treatment, real estate agents, stock brokers pushing specific securities, and all religion employees.

Note that I will say each of the above is majority not helpful but there are edge cases. Meaning I would trust a psychologist that was an AI system validated against a million patient's outcomes, I would trust a psychiatrist using fMRI or internal brain electrodes, I would trust a real estate agent who is not incentivized for me to make an immediate purchase, I would trust a stock advice system with open source code, and I would trust a religion employee who can show their communication device used to contact a deity or their supernatural powers.

Sorry for the long paragraph but these are heuristics. A truly rational ASI is going to simulate it all out. We humans can at best look if someone is misleading us by looking for outright impossibilities.

  1. Is the person we are debating even responding to our arguments? For example, authority figures simply don't engage with questions on cryonics or existential AI risk, or give meaningless platitudes that are not responding to the question asked. Someone doing this is potentially wrong about their opinion.

  2. If an authority figure with a deeply held belief that may be wrong is even updating their belief as evidence is available that invalidates it. Does any authority figure at medical research establishments even know 21CM revived a working kidney after cryo recently? Would it alter their opinion if they were told?

If the assumptions are true, and you pick the best algorithm available, you will win relative to other humans in expected value. Rationality is winning.

That doesn't mean as an individual you can't die of a heart attack, despite the correct diet, while AI stocks are in a winter, so you never see the financial benefits. (A gears level model would say A, AI company capital, can lead to B, goods and services from AI, which also feeds back to A, and thus owning shares is a share of infinity.)

Comment by Gerald Monroe (gerald-monroe) on Rationality !== Winning · 2023-07-24T17:22:19.518Z · LW · GW

A black swan is generally an event we knew was possible that held only a small fraction of the probability mass.

The flaw with them is not actually an issue with rationality(or other forms of decision making) but due to human compute and memory limits.

If your probability distribution for each trading day on a financial market is p=0.51 +, p=0.48 -, p=0.01 black swan, you may simply drop that long tail term from your decisionmaking. Only considering the highest probability terms is an approximation and is arguably still "rational" since you are reasoning on math and evidence, but you will be surprised by the black swan.
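The approximation error from dropping the tail term can be made concrete (the payoff magnitudes below are illustrative placeholders):

```python
# Expected value of a daily market outcome, with and without the tail term.
# Probabilities come from the example above; payoff sizes are illustrative.
outcomes = {
    "up":         (0.51,  1.0),
    "down":       (0.48, -1.0),
    "black_swan": (0.01, -50.0),  # rare, catastrophic
}

def ev(terms):
    """Expected value: sum of probability * payoff over all outcomes."""
    return sum(p * payoff for p, payoff in terms.values())

full_ev = ev(outcomes)
truncated_ev = ev({k: v for k, v in outcomes.items() if k != "black_swan"})

# The truncated model looks profitable; the full model is not.
assert truncated_ev > 0   # 0.51 - 0.48 = +0.03
assert full_ev < 0        # +0.03 - 0.50 = -0.47
```

The truncation is still "reasoning on math and evidence", as the text says, but the dropped 1% term can dominate the true EV, which is exactly how the black swan surprises you.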

This leads naturally into the next logical offshoot. A human meat computer doesn't have the memory or available compute to consider every low probability long tail event, but you could build an artificial system that does. Part of the reason AI is so critically important and directly relevant to rationality.

Now, a true black swan, one we didn't even know was possible? Yeah, you are going to be surprised every time. If aliens start invading from another dimension you need to be able to rapidly update your assumptions about how the universe works and respond accordingly. Which rationality, vs alternatives like "the word of the government sanctioned authority on a subject is truth", adapts well to.

This is where being too overconfident hurts. In the event of an ontology breaking event like the invasion example, if you believe p=1.0 the laws of physics as discovered in the 20th century are absolute and complete, what you are seeing in front of your eyes as you reload your shotgun, alien blood splattered everywhere, can't be real. Has to be some other explanation. This kind of thinking is suboptimal.

Similarly if you have the same confidence in theories constructed on decades of high quality data and carefully reasoned on, with lots of use of mathematical proofs, as some random rumor you hear online, you will see nonexistent aliens everywhere. You were not weighting your information inputs by probability.

Comment by Gerald Monroe (gerald-monroe) on News : Biden-⁠Harris Administration Secures Voluntary Commitments from Leading Artificial Intelligence Companies to Manage the Risks Posed by AI · 2023-07-22T02:33:05.290Z · LW · GW

Just to organize this:

Summary: every point hugely advantages a small concentration of wealthy AI companies, and once these become legal requirements it will entrench them indefinitely.  And it in no way slows capabilities; in fact it seems to be implicitly giving permission to push them as far as possible.

  • The companies commit to internal and external security testing of their AI systems before their release. This testing, which will be carried out in part by independent experts, guards against some of the most significant sources of AI risks, such as biosecurity and cybersecurity, as well as its broader societal effects.

Effect on rich companies: they need to do extensive testing to deliver competitive products.  Capabilities go hand in hand with reliability; every tool humanity uses that is capable is highly reliable.  

Effect on poor companies : the testing burden prevents them from being able to gain early revenue on a shoddy product, preventing them from competing at all.

Effect on advancing capabilities : minimal


  • The companies commit to sharing information across the industry and with governments, civil society, and academia on managing AI risks. This includes best practices for safety, information on attempts to circumvent safeguards, and technical collaboration.

Effect on rich companies: they need to pay for another group of staff/internal automation tools to deliver these information sharing reports, carefully scripted to look good/not reveal more than the legal minimum.

Effect on poor companies : the reporting burden reduces their runway further, preventing all but a few extremely well funded startups from existing at all

Effect on advancing capabilities : minimal


  • The companies commit to investing in cybersecurity and insider threat safeguards to protect proprietary and unreleased model weights. These model weights are the most essential part of an AI system, and the companies agree that it is vital that the model weights be released only when intended and when security risks are considered.

Effect on rich companies: they already want to do this, this is how they protect their IP.

Effect on poor companies : the security burden reduces their runway further

Effect on advancing capabilities : minimal


  • The companies commit to facilitating third-party discovery and reporting of vulnerabilities in their AI systems. Some issues may persist even after an AI system is released and a robust reporting mechanism enables them to be found and fixed quickly.

This is the same as the reporting case


  • The companies commit to developing robust technical mechanisms to ensure that users know when content is AI generated, such as a watermarking system. This action enables creativity with AI to flourish but reduces the dangers of fraud and deception.

Effect on rich companies: they already want to avoid legal responsibility for use of AI in deception.  Stripping the watermark puts the liability on the scammer.

Effect on poor companies : watermarks slightly reduce their runway

Effect on advancing capabilities : minimal


  • The companies commit to publicly reporting their AI systems’ capabilities, limitations, and areas of appropriate and inappropriate use. This report will cover both security risks and societal risks, such as the effects on fairness and bias.

This is another form of reporting.  Same effects as above

  • The companies commit to prioritizing research on the societal risks that AI systems can pose, including on avoiding harmful bias and discrimination, and protecting privacy. The track record of AI shows the insidiousness and prevalence of these dangers, and the companies commit to rolling out AI that mitigates them.   

Effect on rich companies: Now they need another internal group doing this research, for each AI company.

Effect on poor companies : Having to pay for another required internal group reduces their runway

Effect on advancing capabilities : minimal

  • The companies commit to develop and deploy advanced AI systems to help address society’s greatest challenges. From cancer prevention to mitigating climate change to so much in between, AI—if properly managed—can contribute enormously to the prosperity, equality, and security of all.

Effect on rich companies: This is carte blanche to do what they already were planning to do.  Also, this says 'fuck AI pauses', direct from the Biden administration.  GPT-4 is not nearly capable enough to solve any of "societies greatest challenges".  It's missing modalities and general ability to get anything but the simplest tasks accomplished reliably.  To add those additional components will take far more compute, such as the multi-exaflop AI supercomputers everyone is building that obviously will allow models that dwarf GPT-4.

Effect on poor companies : Well they aren't competing with megamodels, but they already were screwed by the other points

Effect on advancing capabilities : 

(Embedded GIF: William Shatner as Captain Kirk.)

Comment by Gerald Monroe (gerald-monroe) on AI Risk and Survivorship Bias - How Andreessen and LeCun got it wrong · 2023-07-14T19:19:06.278Z · LW · GW

The "survival bias" argument is overgeneralizing.  For each technology mentioned and many others, the number of wrong ways to use/construct an implementation using the technology greatly exceeds the number of correct ways.  We found the correct ways through systematic iteration.  

As a simple example, fire has escaped engines many times, and caused all types of vehicles to burn.  It took methodical and careful iteration to improve engines to the point that this usually doesn't happen, and vehicles have fire suppression systems, firewalls, and many other design elements to deal with this expected risk.  Note we do throw away performance, even combat jet fighters carry the extra weight of fire suppression systems. 

Worst case you burn down Chicago.

Humans would be able to do the same for AI if humans are able to iterate on many possible constructions for an AGI, cleaning up the aftermath in the events where they find out about significant flaws late (which is why deception is such a problem).  The "doom" argument is that an AGI can be made that has such an advantage it kills humans or disempowers humans before humans and other AI systems working for humans can react.  


To support the doom argument you need to provide evidence for the main points:

(1) that humans can construct an ASI, and provide the information necessary to train it, such that it has a massive margin over human beings AND

(2) the ASI can run in useful timescales when performing at this level of cognition, inferencing on computational hardware humans can build AND

(3) whatever resources (robotics, physical tasks performed) the ASI can obtain, above the ones required for the ASI to merely exist and satisfy humans (essentially "profit") are enough to kill/disempower humans AND

(4a) other AGI/ASI built by humans, and humans, are unable to stop it because they are less intelligent, despite potentially having a very large (orders of magnitude) advantage in resources and weapons OR

(4b) humans are scammed and support the ASI


If you wanted to argue against doom, or look for alignment ideas you can look for ways to limit each of these points.  For example,

(1) does intelligence actually scale this way or is it diminishing returns?  An accelerationist argument would point to the current data across many experiments saying it is in fact diminishing, or theoretical optimal policy arguments that prove it always has diminishing returns.

       An alignment idea would be to subdivide AGI/ASI systems into smaller, better defined systems and you would not expect more than a slight performance penalty because of diminishing returns.

(2) This is a diminishing returns argument, you need logarithmically more compute to get linearly more intelligence.  An accelerationist argument would count how many thousand H100s one running 'instance' of a strong ASI would likely need, and point out worldwide compute production won't be enough for decades at the current ramp rate. (and a pro doom argument would point out production can be scaled up many OOM)

      An alignment idea would be to register and track where high-performance AI chips are purchased and deployed, limiting deployment to licensed data centers, and to audit those data centers to ensure all their loads serve human customers and they are not harboring an escaped AGI. 
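The diminishing-returns claim in (2) can be made concrete with a minimal sketch, assuming an illustrative power law relating compute to capability. The constants below are invented for illustration, not fitted to any real scaling-law data:

```python
def capability(compute: float, a: float = 10.0, alpha: float = 0.3) -> float:
    """Illustrative power law: capability ~ a * compute^alpha.
    Constants are made up for illustration, not fitted to any real model."""
    return a * compute ** alpha

def compute_needed(target: float, a: float = 10.0, alpha: float = 0.3) -> float:
    """Invert the curve: compute required to hit a capability target."""
    return (target / a) ** (1.0 / alpha)

# With alpha = 0.3, doubling capability multiplies the compute bill
# by 2^(1/alpha), roughly 10x here.
ratio = compute_needed(20.0) / compute_needed(10.0)
```

Under any exponent below 1, each doubling of capability multiplies the compute bill by a constant factor greater than 2, which is the shape of the "logarithmically more compute for linearly more intelligence" argument above.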

(3)  An accelerationist would argue that humans can prevent doom with sparse, tightly supervised systems, and that humans will do this naturally; it's what the EMH demands.  (Competing AGIs in a free market will not have any spare capacity to kill humans, as they are too busy spending all their resources trying to make money.)

        This is sparse/myopia/ "deception?  I ain't got time for that"

(4a)  An accelerationist would argue that humans should race to build many kinds of powerful but restricted AGI/ASI as rapidly as possible, so they can stop AGI doom, by having a large stockpile of weapons and capabilities.  

        Note that this is what every alignment lab ends up doing.  I have talked to one person who suggested they should develop combat drones that can be mass-produced as an alignment strategy, aka an offensive defense against hostile AGI/ASI by having the capability to deploy very large numbers of essentially smart bombs.  So long as humans retain control, this might be a viable idea...    

(4b) This is an argument that we're screwed; an accelerationist would counter that if humans are this stupid, they deserve to die.

        I'm not sure how we fight this; it is my biggest fear: that we can win on a technical level, and win without needing unrealistic international cooperation, but we die because we got scammed.  Lightly touching politics: this seems to be an entirely plausible risk.  There are many examples of democracies failing and picking obviously self-interested leaders who are obviously unqualified for the role.

Summary: the accelerationist arguments made in this debate are weak.  I pointed out some stronger ones.  

Comment by Gerald Monroe (gerald-monroe) on “Reframing Superintelligence” + LLMs + 4 years · 2023-07-11T19:17:23.531Z · LW · GW

I think you should examine this claim in more detail because this is the crux of everything.

What you are trying to say, rephrased:

  1. I am preparing for a war where I expect to have to attack rogue AIs.
  2. I expect the rogues to either be operating on the land of weak countries, to be assisting allies with infections on their own territory, or to have to deal with rogues gaining control of hostile superpowers

So my choices are :

  1. I use conventional weapons
  2. I use AGI in very limited, controlled ways such as to exponentially manufacture more semi automated weapons with limited and safe onboard intelligence. (For example a missile driven by low level controllers is safe)
  3. I use AGI in advisory data analysis roles and smarter weapons driven by onboard limited AI. (For example a missile or drone able to recognize targets but only after this control authority is unlocked after traveling to the target area)
  4. I use AGI but in limited, clearly separated roles all throughout the war machine
  5. I say yolo and assign the entire task of fighting the war to monolithic, self-modifying AGI systems, even though I know this is how rogue AI was created. Even in testing, the self-modification makes their behavior inconsistent, and they sometimes turn on their operators even in simulation.

The delta between 4 and 5 is vast. 5 is going to fail acceptance testing and is not consistent with conventional engineering practice because a self modifying system isn't static and you can't be certain the delivered product is the same as the tested one.
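Option 3's "control authority unlocked after traveling to the target area" can be sketched as a simple gating check. Everything here (the Waypoint type, the geofence radius, the flat-earth distance approximation) is a hypothetical illustration, not any real weapons API:

```python
from dataclasses import dataclass
import math

@dataclass
class Waypoint:
    lat: float
    lon: float

def within_radius(pos: Waypoint, center: Waypoint, radius_km: float) -> bool:
    # Flat-earth approximation, fine for short ranges; 1 deg latitude ~ 111 km.
    dx = (pos.lon - center.lon) * 111.0 * math.cos(math.radians(center.lat))
    dy = (pos.lat - center.lat) * 111.0
    return math.hypot(dx, dy) <= radius_km

def seeker_authorized(pos: Waypoint, target_area: Waypoint, radius_km: float,
                      operator_unlock: bool) -> bool:
    """Onboard AI may classify and engage targets only when both conditions
    hold: the weapon is inside the designated area AND a human issued the
    unlock. Outside that envelope, the smart seeker stays inert."""
    return operator_unlock and within_radius(pos, target_area, radius_km)
```

The point of the sketch is that the intelligent component never holds standing authority; it is gated by dumb, auditable checks that fail closed.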

You would have to already be losing a war with rogue AI before someone would resort to 5.

I think part of the gap here is that there's a large difference between what you might do to make an agent that helps someone with their homework or runs social media, and what you would do when live bombs are involved. Engineering practices are very different and won't be thrown away simply because working AGI is new.

This is also true for construction, industry, and so on. What you need to do is very different.

Comment by Gerald Monroe (gerald-monroe) on ask me about technology · 2023-07-08T00:23:57.952Z · LW · GW

What is your opinion on general robotics (driven by an early form of AGI) and then robotics self manufacturing?

I have personally thought that a straightforward expansion of LLM training, where you train on human manipulation from all available video, would give you a foundation model. Then you would add RL fine-tuning and millions of simulated years of RL training to the foundation model to develop excellent and general robot performance.

Do you think this isn't that straightforward or near term?

I ask because if robots can go from their narrow and limited roles now to performing most tasks if given information like (an example or a prompt or a goal schematic) it would change everything.

It would trivialize energy transitions for one. And deep mining and ocean mining and so on. (Because you task some robots with building and assembling others, some with building and deploying energy collection, some with mining, and are now rate limited not really by money directly but by materials, regulations, energy, or mining rights)

Your opinions must rest on an assumption that this problem cannot be solved anytime soon. How confident are you in this belief? What are the obstacles you believe are limiting in developing such a robotics technique?

Past efforts failed but didn't have sufficient compute, and a multimodal technique as described would need all the compute for GPT-4 plus all the compute for image processing, as well as much more training and quality checks during inference. So essentially the technique has never once been attempted at scale.

Comment by Gerald Monroe (gerald-monroe) on Holly_Elmore's Shortform · 2023-07-03T19:40:35.528Z · LW · GW

I think the most coherent argument above is discount rate. Using the discount rate model you and I are both wrong. Since AGI is an unpredictable number of years away, as well as life extension, neither of us has any meaningful support for our positions among the voting public. You need to show the immediacy of your concerns about AGI, I need to show life extension driven by AGI beginning to visibly work.

AI pauses will not happen due to this discounting. (So it's irrelevant whether or not they are good or bad.) That's because the threat is far away and uncertain, while the possible money to be made is near, and essentially all the investment money on earth "wants" to bet on AI/AGI. (As, rationally speaking, there is no greater expected ROI.)

Please note I am sympathetic to your position, I am saying "will not happen" as a strong prediction based on the evidence, not what I want to happen.
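The discount-rate model above can be made concrete with standard exponential discounting. The rate and payoffs below are assumed numbers, chosen only to illustrate the shape of the argument:

```python
def present_value(payoff: float, annual_rate: float, years: float) -> float:
    """Standard exponential discounting: PV = payoff / (1 + r)^t."""
    return payoff / (1.0 + annual_rate) ** years

# Hypothetical numbers: a $1T harm (or benefit) 20 years out, at a 10%
# annual discount rate, is worth roughly $150B today -- while AI revenue
# arrives within a year and is barely discounted at all.
far_payoff = present_value(1_000e9, 0.10, 20)
near_payoff = present_value(100e9, 0.10, 1)
```

This is why, under discounting, a distant and uncertain threat loses to near-term returns almost regardless of its nominal size.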

Comment by Gerald Monroe (gerald-monroe) on When do "brains beat brawn" in Chess? An experiment · 2023-07-01T21:55:36.588Z · LW · GW

The implication for AI / AGI is that humans will never create human-similar AI. Everything we make will be way ahead in many areas and way behind in others.

Is this not a mere supervised learning problem?  You're saying, for some problem domain D, you want to predict the probability distribution of actions a Real Human would emit when given a particular input sample.  

This is what a GPT is, it's doing something very close to this, by predicting, from the same input text string a human was using, what they are going to type next.  

We can extend this, to video, and obviously first translate video of humans to joint coordinates, and from sounds they emit back to phonemes, then do the same prediction as above.

We would expect to get an AI system from this method that approximates the average human from the sample set we trained on.  This system will be multimodal and able to speak, run robotics, and emit text.

Now, after that, we train using reinforcement learning, and that feedback can clear out mistakes, so that the GPT system is now less and less likely to emit "next tokens" that the consensus for human knowledge believes is wrong.  And the system never tires and the hardware never miscalculates. 

And we can then use machine based RL - have robots attempt tasks in sim and IRL, autonomously grade them on how well the task was done.  Have the machine attempt to use software plugins, RL feedback on errors and successful tool usage.  Because the machinery can learn on a larger scale due to having more time to learn than a human lifetime, it will soon exceed human performance.

And we also have more breadth with a system like this than any single individual living human.

But I think you can see how, if you wanted to, you could probably find a solution based on the above that emulates the observable outputs of a single typical human.
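The "predict the distribution of actions a real human would emit" framing above can be sketched, in its simplest form, as a maximum-likelihood count model: the counting analogue of what a GPT learns by gradient descent at scale. The contexts and actions below are toy stand-ins:

```python
from collections import defaultdict, Counter

def fit_action_model(trajectories):
    """Maximum-likelihood estimate of P(action | context) from logged
    human behavior: count each (context, action) pair, then normalize
    per context."""
    counts = defaultdict(Counter)
    for context, action in trajectories:
        counts[context][action] += 1
    model = {}
    for context, ctr in counts.items():
        total = sum(ctr.values())
        model[context] = {a: n / total for a, n in ctr.items()}
    return model

# Toy "human demonstrations": in context 'door_closed', humans open the
# door 3/4 of the time. RL fine-tuning, as described above, would then
# reweight these probabilities toward actions scoring well on a task metric.
demos = [("door_closed", "open"), ("door_closed", "open"),
         ("door_closed", "open"), ("door_closed", "wait"),
         ("door_open", "walk")]
policy = fit_action_model(demos)
```

A sequence model generalizes this to unseen contexts; the count table only memorizes, which is exactly the gap that scale and gradient descent close.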