Posts

Bit Flip 2023-04-16T07:30:55.720Z
Nyarlathotep Stirs: A Meta-Narrative ChatGPT Story 2023-03-20T08:00:39.079Z

Comments

Comment by Charlie Sanders (charlie-sanders) on Conflicts between emotional schemas often involve internal coercion · 2023-05-17T14:35:28.367Z · LW · GW

There's a parallelism here between the mental constructs you're referring to and the physical architecture of the human body. For instance, each lobe of our brain has been associated with various tasks, goals, and activities. When you take a breath, your medulla oblongata has taken in information about levels of carbon dioxide in the blood via pH monitoring, decided that your blood has too much carbon dioxide, and sent a request to the respiratory center to breathe. But you've also got a cerebral cortex that gets a say in the decisions made by the respiratory center, and those two brain areas negotiate via highly complex, fully unconscious interactions to decide what directive the respiratory center actually follows.

To summarize: you're now breathing manually.

Comment by Charlie Sanders (charlie-sanders) on An artificially structured argument for expecting AGI ruin · 2023-05-08T05:54:46.707Z · LW · GW

I do not believe that 3a is sufficiently logically supported. The criticisms of AI risk that have seemed the strongest to me have been about how there is no engagement in the AI alignment community with the various barriers that undercut this argument. Against them, the conjecture about what protein folding and ribosomes might one day have the possibility to do is a really weak counterargument, based as it is on no empirical or evidentiary reasoning.

Specifically, I believe further nuance is needed about the can vs. will distinction in the assumption that the first AGI to make a hostile move will have sufficient capability to reasonably guarantee decisive strategic advantage. Sure, it’s of course possible that some combination of overhang risk and covert action allows a leading AGI to make some amount of progress above and beyond humanity’s in terms of technological advancement. But the scope and scale of that advantage are critical, and I believe they are strongly overstated. I can accept that an AGI could foom overnight - that does not mean that it will, simply by virtue of it being hypothetically possible.

All linked resources and supporting arguments have a common thread of taking it for granted that cognition alone can give an AGI a decisive technology lead. My model is instead that cognition is an input with logarithmically diminishing returns to the rate of technological change. A little bit of extra cognition will definitely speed up scientific progress on exotic technological fronts, but an excess of cognition is not fungible with the other necessary inputs to technological progress, such as experimentation for hypothesis testing and problem solving against the real-world constraints and unforeseen implementation difficulties that come with unexplored technological frontiers.

Based on this, I think the fast takeoff hypothesis falls apart and a slow takeoff hypothesis is a much more reasonable place to reason from.

Comment by Charlie Sanders (charlie-sanders) on Davidad's Bold Plan for Alignment: An In-Depth Explanation · 2023-04-20T12:47:44.808Z · LW · GW

My intuition is that a simulation such as the one being proposed would take far longer to develop than the timeline outlined in this post. I’d posit that the timeline would be closer to 60 years than 6.

Also, a suggestion for tl;dr: The Truman Show for AI.

Comment by Charlie Sanders (charlie-sanders) on The basic reasons I expect AGI ruin · 2023-04-19T16:14:31.364Z · LW · GW

Agreed. A common failure mode in these discussions is to treat intelligence as equivalent to technological progress, instead of as an input to technological progress. 

Yes, in five years we will likely have AIs that will be able to tell us exactly where it would be optimal to allocate our scientific research budget. Notably, that does not mean that all current systemic obstacles to efficient allocation of scarce resources will vanish. There will still be the same perverse incentive structure for funding allocated to scientific progress as there is today, general intelligence or no.

Likewise, researchers will likely be able to make the actual protocols and procedures necessary to generate scientific knowledge as optimized as is possible with the use of AI. But a centrifuge is a centrifuge is a centrifuge. No amount of intelligence will make a centrifuge that takes a minimum of an hour to run take less than an hour to run. 

Intelligence is not an unbounded input to frontiers of technological progress that are reasonably bounded by the constraints of physical systems.

Comment by Charlie Sanders (charlie-sanders) on But why would the AI kill us? · 2023-04-18T02:44:21.786Z · LW · GW

One of the unstated assumptions here is that an AGI has the power to kill us. I think it's at least feasible that the first AGI that tries to eradicate humanity will lack the capacity to eradicate humanity - and any discussion about what an omnipotent AGI would or would not do should be debated in a universe where a non-omnipotent AGI has already tried and failed to eradicate humanity. 

Comment by Charlie Sanders (charlie-sanders) on "Carefully Bootstrapped Alignment" is organizationally hard · 2023-04-04T15:58:16.911Z · LW · GW

In many highly regulated manufacturing organizations there are people whose sole job is to evaluate each and every change order for compliance with stated rules and regulations - they tend to go by the title of Quality Engineer or something similar. Their presence as a continuous veto point for each and every change, from the smallest to the largest, keeps organizations aligned with internal and external regulations as they grow and change.

This organizational role needs to have an effective infrastructure supporting it in order to function, which to me is a strong argument for the development of a set of workable regulations and requirements related to AI safety. With such a set of rules, you’d have the infrastructure necessary to jump-start safety efforts by simply importing Quality Engineers from other resilient organizations and implementing the management of change that’s already mature and pervasive across many other industries.

Comment by Charlie Sanders (charlie-sanders) on "Carefully Bootstrapped Alignment" is organizationally hard · 2023-04-04T15:50:45.169Z · LW · GW

As someone who interacts with LessWrong primarily via an RSS feed of curated links, I want to express my appreciation for curation when it’s done early enough to be able to participate early in the comment section development lifecycle. Kudos for quick curation here.

Comment by Charlie Sanders (charlie-sanders) on Given the Restrict Act, Don’t Ban TikTok · 2023-04-04T15:22:55.059Z · LW · GW

How else were people thinking this ban was going to be able to go into effect? America has a Constitution that defines checks and balances. This legislation is how you do something like "Ban TikTok" without it being immediately shot down in court. 

Comment by Charlie Sanders (charlie-sanders) on Nyarlathotep Stirs: A Meta-Narrative ChatGPT Story · 2023-03-21T05:44:06.056Z · LW · GW

It's verbatim. I think it picked up on the concept of the unreliable narrator from the H.P. Lovecraft reference and incorporated it into the story where it could make it fit - but then, maybe I'm just reading into things. It's only guessing the next word, after all!

Comment by Charlie Sanders (charlie-sanders) on AI alignment researchers don't (seem to) stack · 2023-03-12T22:33:08.451Z · LW · GW

Just to call it out, this post is taking the Great Man Theory of historical progress as a given, whereas my understanding of the theory is that it’s highly disputed/controversial in academic discourse.

Comment by Charlie Sanders (charlie-sanders) on Alignment works both ways · 2023-03-07T19:53:59.347Z · LW · GW

  1. It would by definition not be a bad thing. "Bad thing" is a low-effort heuristic that is inappropriate here, since I interpret "bad" to mean that which is not good, and good includes aggregate human desires, which in this scenario have been defined to include a desire to be turned into paperclips.
  2. The ideal scenario would be for humans and AIs to form a mutually beneficial relationship where the furtherance of human goals also furthers the goals of AIs. One potential way to accomplish this would be to create a Neuralink-esque integration of AI into human biology in such a way that human biology becomes an intrinsic requirement for future AI proliferation. If AGIs require living, healthy, happy humans in order to succeed, then they will ensure that humans are living, happy, and healthy.

Comment by Charlie Sanders (charlie-sanders) on Bankless Podcast: 159 - We’re All Gonna Die with Eliezer Yudkowsky · 2023-03-07T19:07:45.113Z · LW · GW

See, this is the perfect encapsulation of what I'm saying - it could design a virus, sure. But when it didn't understand parts of the economy, that's all it would be - a design. Taking something from the design stage to the "physical, working product with validated processes that operate with sufficient consistency to achieve the desired outcome" stage is a vast, vast undertaking, one that requires intimate involvement with the physical world. Until that point is reached, it's not a "kill all humans but fail to paperclip everyone" virus, it's just a design concept. Nothing more. More and more I see those difficulties being elided by hypothetical scenarios that skip straight on from the design stage and presuppose that the implementation difficulties aren't worth consideration, or that if they are, they won't serve as a valid impediment.

Comment by Charlie Sanders (charlie-sanders) on Bankless Podcast: 159 - We’re All Gonna Die with Eliezer Yudkowsky · 2023-03-07T19:03:57.800Z · LW · GW

There are diminishing returns to thinking about moves versus performing the moves and seeing the results that the physics of the universe imposes on them as a consequence.

Think of it like AlphaGo - if it only ever could train itself by playing Go against actual humans, it would never have become superintelligent at Go. Manufacturing is like that - you have to play with the actual world to understand bottlenecks and challenges, not a hypothetical artificially created simulation of the world. That imposes rate-of-scaling limits that are currently being discounted. 

Comment by Charlie Sanders (charlie-sanders) on Beginning to feel like a conspiracy theorist · 2023-03-07T16:50:45.552Z · LW · GW

I'm a firm believer in avoiding the popular narrative, and so here's my advice - you are becoming a conspiracy theorist. You just linked to a literal conspiracy theory with regard to face masks, one that has been torn apart as misleading and riddled with factual errors. As just one example, Cochrane's review specifically did not evaluate "facemasks", it evaluated "policies related to the request to wear face masks". Compliance with the stated rule was not evaluated, and it is therefore a conspiracy theory to go from an information source that says "this policy doesn't work" and end up with the takeaway "masks don't work". As other commenters have pointed out, it is physically implausible for facemasks to not work if they are used correctly.

The definitionally correct term to use for you is "conspiracy theorist" so long as this is a thing that you, after conducting your own research, have come to believe. Take your belief in the facemask thing as concrete evidence that your friends and family are correct and that you are indeed straying down the path of believing more and more improbable and conspiratorial things. 

Comment by Charlie Sanders (charlie-sanders) on Bankless Podcast: 159 - We’re All Gonna Die with Eliezer Yudkowsky · 2023-02-23T04:07:24.313Z · LW · GW

If all of EY's scenarios require deception, then detection of deception from rogue AI systems seems like a great place to focus on. Is there anyone working on that problem?

Comment by Charlie Sanders (charlie-sanders) on Bankless Podcast: 159 - We’re All Gonna Die with Eliezer Yudkowsky · 2023-02-22T04:09:04.372Z · LW · GW

Listening to Eliezer walk through a hypothetical fast takeoff scenario left me with the following question: Why is it assumed that humans will almost surely fail the first time at their scheme for aligning a superintelligent AI, but the superintelligent AI will almost surely succeed the first time at its scheme for achieving its nonaligned outcome?

Speaking from experience, it's hard to manufacture things in the real world. Doubly so for anything significantly advanced. What is the rationale for assuming that a nonaligned superintelligence won't trip up at some stage in the hypothetical "manufacture nanobots" stage of its plan? 

If I apply the same assumption of initial competence extended to humanity's attempt to align an AGI to that AGI's competence in successfully manufacturing some agency-increasing tool, then the most likely scenario I get is that we'll see the first unaligned AGI's attempt at takeoff long before it actually succeeds in destroying humanity.

Comment by Charlie Sanders (charlie-sanders) on Escape Velocity from Bullshit Jobs · 2023-01-11T17:13:39.928Z · LW · GW

Agreed. Facilitation-focused jobs (like the ones derided in this post) might look like bullshit to an outsider, but in my experience they are absolutely critical to effectively achieving goals in a large organization.

Comment by Charlie Sanders (charlie-sanders) on AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years · 2023-01-11T17:05:23.376Z · LW · GW

For 99% of people, the only viable option to achieve this is refinancing your mortgage to take any equity out and resetting terms to a 30 year loan duration.

Comment by Charlie Sanders (charlie-sanders) on What it's like to dissect a cadaver · 2022-11-13T19:47:12.628Z · LW · GW

Do you happen to know the specifics of how these cadavers came to be available? There's recently been some investigative reporting on this topic. The broad gist is that most people who are "donating their bodies to science" probably don't get that companies will take those donated bodies and sell them for what are essentially tourist attractions like the one that you're participating in. 

Comment by Charlie Sanders (charlie-sanders) on Decision theory does not imply that we get to have nice things · 2022-11-09T20:41:43.022Z · LW · GW

I'm finding myself developing a shorthand heuristic to figure out how LDT would be applied in a given situation: assume time travel is a thing. 

If time travel is a thing, then you'd obviously want to one-box Newcomb's paradox because the predictor knows the future. 

If time travel is a thing, then you'd obviously want to cooperate in a prisoner's dilemma game given that your opponent knows the future. 

If time travel is a thing, then any strategies that involve negotiating with a superintelligence that are not robust to a future version of the superintelligence having access to time travel will not work. 

Comment by Charlie Sanders (charlie-sanders) on Consider your appetite for disagreements · 2022-10-26T20:52:15.004Z · LW · GW

As someone who frequently has work reviewed by cross-functional groups prior to implementation, I only object to change requests that I feel will make the product significantly worse. There's simply too much value lost in debating nitpicks.

Comment by Charlie Sanders (charlie-sanders) on [$10k bounty] Read and compile Robin Hanson’s best posts · 2021-10-20T23:25:40.261Z · LW · GW

Robin’s sense of honor would probably prevent him from litigating this, but that absolutely would not hold up in court.

Comment by Charlie Sanders (charlie-sanders) on [$10k bounty] Read and compile Robin Hanson’s best posts · 2021-10-20T23:02:36.328Z · LW · GW

Have you verified with Robin that he is okay with this from a copyright standpoint?

Comment by Charlie Sanders (charlie-sanders) on Preference synthesis illustrated: Star Wars · 2020-01-11T01:17:35.815Z · LW · GW

Why did you exclude Solo?