Posts

Book review: Deep Utopia 2024-04-23T19:55:50.417Z
Exploring Whole Brain Emulation 2024-04-06T02:38:15.631Z
Self-Resolving Prediction Markets 2024-03-03T02:39:42.212Z
Manifold Markets 2024-02-02T17:48:36.630Z
Dark Skies Book Review 2023-12-29T18:28:59.352Z
When Will AIs Develop Long-Term Planning? 2023-11-19T00:08:18.603Z
Disentangling Our Terminal and Instrumental Values 2023-10-14T03:35:23.852Z
Provably Safe AI 2023-10-05T22:18:26.013Z
The Coming Wave 2023-09-28T22:59:58.551Z
Require AGI to be Explainable 2023-09-21T16:11:25.112Z
Near San Francisco - Hike to watch the Blue Angels 2023-09-16T22:00:39.113Z
Will an Overconfident AGI Mistakenly Expect to Conquer the World? 2023-08-25T23:24:32.206Z
Existential Risk Persuasion Tournament 2023-07-17T18:04:02.794Z
Foom Liability 2023-06-30T03:55:22.698Z
How to Slow AI Development 2023-06-07T00:29:38.355Z
Four Battlegrounds: Power in the Age of Artificial Intelligence (Book review) 2023-05-21T21:19:17.228Z
OpenAI's GPT-4 Safety Goals 2023-04-22T19:11:42.945Z
On Caring about our AI Progeny 2023-04-14T19:32:15.727Z
Pause AI Development? 2023-04-06T17:23:50.580Z
Book review: How Social Science Got Better 2023-02-15T19:58:13.611Z
Review of AI Alignment Progress 2023-02-07T18:57:41.329Z
Evidence under Adversarial Conditions 2023-01-09T16:21:07.890Z
Investing for a World Transformed by AI 2023-01-01T02:47:06.004Z
Review: LOVE in a simbox 2022-11-27T17:41:26.067Z
Drexler’s Nanotech Forecast 2022-07-30T00:45:43.169Z
QNR Prospects 2022-07-16T02:03:37.258Z
Cooperation with and between AGI's 2022-07-07T16:45:23.605Z
Agenty AGI – How Tempting? 2022-07-01T23:40:16.610Z
The Amish 2022-04-12T02:54:03.966Z
Book review: The Dawn of Everything 2022-03-11T01:56:13.494Z
AI Fire Alarm Scenarios 2021-12-28T02:20:34.637Z
Book Review: Why Everyone (Else) Is a Hypocrite 2021-10-09T03:31:27.711Z
Book review: Shut Out 2021-09-03T19:07:31.446Z
Book review: The Explanation of Ideology 2021-07-20T03:42:36.250Z
New Dementia Trial Results 2021-07-02T23:04:46.548Z
Avoid News, Part 2: What the Stock Market Taught Me about News 2021-06-14T20:54:05.386Z
Book review: The Geography of Thought 2021-02-09T18:47:35.781Z
The Flynn Effect Clarified 2020-12-12T05:18:53.327Z
Book review: WEIRDest People 2020-11-30T03:33:17.510Z
Book review: Age Later 2020-10-27T04:21:12.428Z
Black Death at the Golden Gate (book review) 2020-06-26T16:09:06.866Z
COVID-19: The Illusion of Stability 2020-06-08T18:46:54.259Z
Book review: Human Compatible 2020-01-19T03:32:04.989Z
Another AI Winter? 2019-12-25T00:58:48.715Z
Book Review: The AI Does Not Hate You 2019-10-28T17:45:26.050Z
Drexler on AI Risk 2019-02-01T05:11:01.008Z
Bundle your Experiments 2019-01-18T23:22:08.660Z
Time Biases 2019-01-12T21:35:54.276Z
Book review: Artificial Intelligence Safety and Security 2018-12-08T03:47:17.098Z
Where is my Flying Car? 2018-10-15T18:39:38.010Z

Comments

Comment by PeterMcCluskey on Increasing IQ by 10 Points is Possible · 2024-03-20T14:46:26.819Z · LW · GW

What evidence do you have about how much time it takes per day to maintain the effect after the end of the 2 weeks?

Comment by PeterMcCluskey on Increase the tax value of donations with high-variance investments? · 2024-03-03T02:59:37.735Z · LW · GW

The "securities with huge variance" part of this is already somewhat widely used. See how much EA charities get from crypto and tech startup stock donations.

It's unclear whether the perfectly anti-correlated pair improves this kind of strategy. I guess you're trying to make the strategy more appealing to risk-averse investors? That sounds like it might work, but it's hard because risk-averse investors don't want to be early adopters of a new strategy.
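For what it's worth, here's a rough Monte Carlo sketch (my own illustration, not anything from the post) of why the anti-correlated pair might matter mainly for risk-aversion. It assumes a deliberately simplified, hypothetical US-style treatment: donating an appreciated security deducts its market value and avoids capital-gains tax, while selling a loser deducts the realized loss; real tax rules have limits and caveats this ignores.

```python
import random
import statistics

MARGINAL_RATE = 0.35   # hypothetical ordinary-income / deduction rate
CAP_GAINS_RATE = 0.20  # hypothetical long-term capital-gains rate

def tax_benefit(r: float) -> float:
    """Tax benefit from one asset bought at 1.0 that now trades at r."""
    if r > 1.0:
        # Donate the winner: deduct market value, avoid tax on the gain.
        return r * MARGINAL_RATE + (r - 1.0) * CAP_GAINS_RATE
    # Sell the loser: deduct the realized loss.
    return (1.0 - r) * MARGINAL_RATE

def simulate(anti_correlated: bool, n: int = 100_000) -> tuple[float, float]:
    benefits = []
    for _ in range(n):
        r1 = random.choice([2.0, 0.0])  # each asset doubles or goes to ~zero
        r2 = 2.0 - r1 if anti_correlated else random.choice([2.0, 0.0])
        benefits.append(tax_benefit(r1) + tax_benefit(r2))
    return statistics.mean(benefits), statistics.pstdev(benefits)

for label, anti in [("independent pair", False), ("anti-correlated pair", True)]:
    mean, std = simulate(anti)
    print(f"{label:22s} mean tax benefit ≈ {mean:.2f}, std ≈ {std:.2f}")
```

In this toy setup the expected tax benefit is identical either way; the anti-correlated pair only removes the variance, which is exactly the feature a risk-averse donor would care about.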

Comment by PeterMcCluskey on Cooperating with aliens and AGIs: An ECL explainer · 2024-02-27T04:55:33.195Z · LW · GW

Doesn't this depend on what we value?

In particular, you appear to assume that we care about events outside of our lightcone in roughly the way we care about events in our near future. I'm guessing a good deal of skepticism of ECL is a result of people not caring much about distant events.

Comment by PeterMcCluskey on Things You're Allowed to Do: At the Dentist · 2024-02-12T06:00:04.236Z · LW · GW

I had nitrous oxide once at a dentist. It is a dissociative anesthetic. It may have caused something like selective amnesia. I remember that the dentist was drilling, but I have no clear memory of pain associated with it. It's a bit hard to evaluate exactly what it does, but it definitely has some benefits. Maybe the pain seemed too distant from me to be worth my attention?

Comment by PeterMcCluskey on Why have insurance markets succeeded where prediction markets have not? · 2024-01-21T01:18:47.925Z · LW · GW

A much higher fraction of the benefits of prediction markets are public goods.

Most forms of insurance took a good deal of time and effort to become widely accepted. It's unclear whether there's a dramatic difference in the rate of adoption of prediction markets compared to insurance.

Comment by PeterMcCluskey on LOVE in a simbox is all you need · 2024-01-14T00:49:59.335Z · LW · GW

I'm reaffirming my relatively extensive review of this post.

The simbox idea seems like a valuable guide for safely testing AIs, even if the rest of the post turns out to be wrong.

Here's my too-terse summary of the post's most important (and more controversial) proposal: have the AI grow up in an artificial society, learning self-empowerment and learning to model other agents. Use something like retargeting the search to convert the AI's goals from self-empowerment to empowering other agents.

Comment by PeterMcCluskey on QNR prospects are important for AI alignment research · 2024-01-14T00:48:41.115Z · LW · GW

I'm reaffirming my relatively long review of Drexler's full QNR paper.

Drexler's QNR proposal seems like it would, if implemented, guide AI toward more comprehensible systems. It might modestly speed up capabilities advances, while doing somewhat more to make alignment easier.

Alas, the full paper is long, and not an easy read. I don't think I've managed to summarize its strengths well enough to persuade many people to read it.

Comment by PeterMcCluskey on You Are Not Measuring What You Think You Are Measuring · 2024-01-11T21:55:08.107Z · LW · GW

This post didn't feel particularly important when I first read it.

Yet I notice that I've been acting on the post's advice since reading it. E.g. being more optimistic about drug companies that measure a wide variety of biomarkers.

I wasn't consciously doing that because I updated due to the post. I'm unsure to what extent the post changed me via subconscious influence, versus to what extent I derived the ideas independently.

Comment by PeterMcCluskey on Prediction markets are consistently underconfident. Why? · 2024-01-11T04:35:09.287Z · LW · GW

Exchanges require more capital to move the price closer to the extremes than to move it closer to 50%.
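To make the asymmetry concrete, here's a minimal sketch assuming an LMSR-style automated market maker (the comment doesn't name a mechanism; order-book exchanges differ in detail, but the capital asymmetry is similar). The cost of buying YES shares to push the price up by ten points grows sharply near the extremes.

```python
import math

def cost_to_move(p_from: float, p_to: float, b: float = 100.0) -> float:
    """Capital needed to push the YES price from p_from up to p_to by buying
    YES shares in a two-outcome LMSR with liquidity parameter b."""
    return b * math.log((1.0 - p_from) / (1.0 - p_to))

for p_from, p_to in [(0.50, 0.60), (0.70, 0.80), (0.85, 0.95)]:
    print(f"{p_from:.0%} -> {p_to:.0%}: cost ≈ {cost_to_move(p_from, p_to):.1f}")
```

With b = 100, moving 50% → 60% costs about 22, while 85% → 95% costs about 110, even though both are ten-point moves.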

Comment by PeterMcCluskey on Inner and outer alignment decompose one hard problem into two extremely hard problems · 2024-01-11T00:36:01.873Z · LW · GW

This post is one of the best available explanations of what has been wrong with the approach used by Eliezer and people associated with him.

I had a pretty favorable recollection of the post from when I first read it. Rereading it convinced me that I still managed to underestimate it.

In my first pass at reviewing posts from 2022, I had some trouble deciding which post best explained shard theory. Now that I've reread this post during my second pass, I've decided this is the most important shard theory post. Not because it explains shard theory best, but because it explains what important implications shard theory has for alignment research.

I keep being tempted to think that the first human-level AGIs will be utility maximizers. This post reminds me that maximization is perilous. So we ought to wait until we've brought greater-than-human wisdom to bear on deciding what to maximize before attempting to implement an entity that maximizes a utility function.

Comment by PeterMcCluskey on AI Impacts Survey: December 2023 Edition · 2024-01-05T20:40:58.509Z · LW · GW

Oops. I misread which questions you were comparing.

Now that I've read the full questions in the actual paper, it looks like some of the difference is due to "within 100 years" versus at any time horizon.

I consider it far-fetched that much of the risk is over 100 years away, but it's logically possible, and Robin Hanson might endorse a similar response.

Comment by PeterMcCluskey on AI Impacts Survey: December 2023 Edition · 2024-01-05T17:49:51.202Z · LW · GW

I don't quite see the logical contradiction that your Twitter poll asks about.

I wouldn't be surprised if the answers reflect framing effects. But the answers seem logically consistent if we assume that some people believe that severe disempowerment is good.

Comment by PeterMcCluskey on Theoretically, could we balance the budget painlessly? · 2024-01-04T17:50:49.357Z · LW · GW

The Fed can stimulate nominal demand at the ZLB (zero lower bound). But (outside of times when it's correcting the results of overly tight monetary conditions) that means mostly more inflation, and has strongly diminishing returns on increased real consumption.

Comment by PeterMcCluskey on Theoretically, could we balance the budget painlessly? · 2024-01-03T18:20:11.676Z · LW · GW

"Eventually the economy would reach a new equilibrium (which presumably would contain the same amount of private consumption as the old equilibrium)."

I expect less consumption in the new equilibrium.

The Fed has limited power to affect real demand. Fed stimulus is only helpful if there's unemployment due to something like deflation.

Comment by PeterMcCluskey on When Will AIs Develop Long-Term Planning? · 2023-12-15T03:11:43.699Z · LW · GW

I realize now that some of this post was influenced by a post that I'd forgotten reading: Causal confusion as an argument against the scaling hypothesis, which does a better job of explaining what I meant by causal modeling being hard.

Comment by PeterMcCluskey on The likely first longevity drug is based on sketchy science. This is bad for science and bad for longevity. · 2023-12-12T18:06:59.768Z · LW · GW

I agree there's something strange about Loyal's strategy.

But it's not like all aging researchers act like they back Loyal's approach. Intervene Immune has been getting good biomarker results in human trials by taking nearly the opposite approach: raising IGF-1 levels for a while.

I wrote a longer discussion about IGF-1 and aging in my review of Morgan Levine's book True Age.

Comment by PeterMcCluskey on Principles For Product Liability (With Application To AI) · 2023-12-11T03:46:48.999Z · LW · GW

"If someone comes into the hospital…"

That's a bad criterion to use.

See Robin Hanson's Buy Health proposal for a better option.

Comment by PeterMcCluskey on The case for aftermarket blind spot mirrors · 2023-10-10T01:29:47.716Z · LW · GW

Is this the post you're looking for?

I've got a Mercedes with an Active Blind Spot Assist that eliminates the need to worry about this.

Comment by PeterMcCluskey on Provably Safe AI · 2023-10-10T01:18:34.736Z · LW · GW

I understand how we can avoid trusting an AI if we've got a specification that the proof checker understands.

Where I expect to need an AI is for generating the right specifications.
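As a toy illustration of what "a specification that the proof checker understands" looks like (my own example, assuming Lean 4 with Mathlib, not anything from the provable-safety literature): a clamping function together with machine-checked bounds on its output.

```lean
import Mathlib

-- A toy output filter with a machine-checkable specification:
-- clamping always yields a value within the stated bounds.
def clamp (lo hi x : Int) : Int := max lo (min hi x)

theorem clamp_le_hi (lo hi x : Int) (h : lo ≤ hi) : clamp lo hi x ≤ hi := by
  unfold clamp
  exact max_le h (min_le_left hi x)

theorem lo_le_clamp (lo hi x : Int) : lo ≤ clamp lo hi x := by
  unfold clamp
  exact le_max_left lo (min hi x)
```

Once the property is stated formally, the checker verifies the proof without having to trust whoever (or whatever) produced it; the hard part, as noted above, is generating the right properties to state.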

Comment by PeterMcCluskey on Provably Safe AI · 2023-10-06T15:33:20.070Z · LW · GW

"Note that effectively we are saying to trust the neural network…"

I expect that we're going to have to rely on some neural networks regardless of how we approach AI. This paper guides us to be more strategic about what reliance to put on which neural networks.

Comment by PeterMcCluskey on "Diamondoid bacteria" nanobots: deadly threat or dead-end? A nanotech investigation · 2023-09-29T21:23:23.710Z · LW · GW

Freitas' paper on ecophagy has a good analysis of these issues.

Comment by PeterMcCluskey on Orthogonal: A new agent foundations alignment organization · 2023-09-29T04:16:53.888Z · LW · GW

I initially dismissed Orthogonal due to a guess that their worldview was too similar to MIRI's, and that they would give up or reach a dead end for reasons similar to why MIRI hasn't made much progress.

Then the gears to ascension prodded me to take a closer look.

Now that I've read their more important posts, I'm more confused.

I still think Orthogonal has a pretty low chance of making a difference, but there's enough that's unique about their ideas to be worth pursuing. I've donated $15k to Orthogonal.

Comment by PeterMcCluskey on Is Bjorn Lomborg roughly right about climate change policy? · 2023-09-28T03:31:32.302Z · LW · GW

See also my review of his book Cool It. He's often right, but is not a reliable source.

Comment by PeterMcCluskey on What is to be done? (About the profit motive) · 2023-09-08T21:16:54.439Z · LW · GW

Eliminating the profit motive would likely mean that militaries develop dangerous AI a few years later.

I'm guessing that most people's main reason is that it looks easier to ban AI research than to sufficiently reduce the profit motive.

Comment by PeterMcCluskey on The Illusion of Universal Morality: A Dynamic Perspective on Genetic Fitness and Ethical Complexity · 2023-09-05T22:59:18.717Z · LW · GW

"The belief in a universal, independent standard for altruism, morality, and right and wrong is deeply ingrained in societal norms."

That's true of the norms in WEIRD cultures. It is far from universal.

Comment by PeterMcCluskey on Will an Overconfident AGI Mistakenly Expect to Conquer the World? · 2023-08-26T00:44:35.010Z · LW · GW

I expect such acausal collaboration to be harder to develop than good calibration, and therefore less likely to happen at the stage I have in mind.

Comment by PeterMcCluskey on Whole Brain Emulation: No Progress on C. elegans After 10 Years · 2023-08-25T01:32:13.943Z · LW · GW

Another report of progress: Mapping the Mind: Worm’s Brain Activity Fully Decoded (full paper).

Comment by PeterMcCluskey on Monthly Roundup #9: August 2023 · 2023-08-07T16:17:04.787Z · LW · GW

"the people choosing this many white cars seem low-level insane"

The increase in white cars seems to follow a 2007 study, An Investigation into the Relationship between Vehicle Colour and Crash Risk, which says light-colored cars are safer. Maybe it's just a coincidence.

Comment by PeterMcCluskey on [deleted post] 2023-07-19T22:56:37.780Z

Thank you for narrowing my confusion over what AI_0 does.

My top question now is: how long does AI_0 need to run, and why is it safe from other AIs during that period?

AI_0 appears to need a nontrivial fraction of our future lightcone to produce a decent approximation of the intended output. Yet keeping it boxed seems to leave the world vulnerable to other AIs.

Comment by PeterMcCluskey on Economic Time Bomb: An Overlooked Employment Bubble Threatening the US Economy · 2023-07-12T15:32:17.975Z · LW · GW

I disagree. The macro environment is good enough that the Fed could easily handle any contraction, provided they focus on forward-looking indicators, such as the TIPS spread, or near-real-time indicators such as the ISM purchasing managers' numbers.

Now seems like a good time for the Fed to start decreasing interest rates.

Comment by PeterMcCluskey on Economic Time Bomb: An Overlooked Employment Bubble Threatening the US Economy · 2023-07-09T16:04:06.356Z · LW · GW

On inflation, see Kevin Erdmann (also here).

Comment by PeterMcCluskey on Economic Time Bomb: An Overlooked Employment Bubble Threatening the US Economy · 2023-07-09T01:42:23.632Z · LW · GW

This is less than half correct.

There's still a widespread labor shortage. A slowdown might mean significant unemployment in Silicon Valley, but it will mean a return to normal in most places.

Inflation is back to normal. It only looks high to people who are focused on lagging indicators such as the CPI.

Comment by PeterMcCluskey on Another medical miracle · 2023-06-26T17:30:44.067Z · LW · GW

Most of the problem with the reference ranges is that they are usually just intended to reflect what 95% of the reference population will have. That's much easier to measure than the range which indicates good health.

There isn't much incentive for any authority to establish guidelines for healthy ranges. So too many people end up equating "normal" results with good results, because normal is what gets quantified, and is usually what is reported on test results.
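As a concrete illustration (my own sketch, with made-up numbers): a typical reference range is just the central 95% of whatever reference population the lab sampled, with no outcome data involved.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical biomarker measurements from 1,000 "reference" adults,
# many of whom may themselves be in mediocre health.
reference_population = rng.normal(loc=100.0, scale=15.0, size=1000)

low, high = np.percentile(reference_population, [2.5, 97.5])
print(f"reference ('normal') range: {low:.0f} - {high:.0f}")
# A healthy range would instead require outcome data (e.g. which values
# predict low mortality), which is rarely established or reported.
```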

Comment by PeterMcCluskey on A way to make solving alignment 10.000 times easier. The shorter case for a massive open source simbox project. · 2023-06-21T15:48:13.297Z · LW · GW

I recommend comparing your ideas to a similar proposal in this post.

Comment by PeterMcCluskey on [deleted post] 2023-06-19T03:12:09.525Z

I see hints that a fair amount of value might be hiding in this post. Here's an attempt at rewriting the parts of this post that I think I understand, with my own opinions shown in {braces}. I likely changed a good deal of the emphasis to reflect my worldview. I presume my comments will reveal some combination of my confused mangling of your ideas, and your cryptic communication style. I erred on the side of rewriting it from scratch to reduce the risk of copying your text without understanding it. I'm posting a partial version of it in order to get feedback on how well I've understood the first half, before deciding how to tackle the harder parts.

Eliezer imagined a rapid AI takeoff via AIs being more logical (symbolic) than humans, enabling them to better compress evidence about reality, and to search more efficiently through the space of possible AI designs to more rapidly find improvements. {It's hard to pin down Eliezer's claims clearly here, since much of what he apparently got wrong was merely implicit, not well articulated. Why do you call it regression? Did Eliezer expect training data to matter?}

This was expected to produce a more coherent, unified mind than biological analogies suggested. Eliezer imagined that such minds are very sensitive to initial conditions. {I'm unclear whether framing this in chaos-theory terms captures Eliezer's intent well. I'd frame it more in terms of first-mover advantage, broadly applied to include the most powerful parts of an AI's goal system stomping out other goals within the AI.}

Recent advances in AI suggest that Eliezer overestimated the power of the kind of rigorous, symbolic thinking associated with math (and/or underestimated the power of connectionist approaches?).

Neural nets provide representations of knowledge that are smooth, in the sense that small changes in evidence / input generate small changes in how the resulting knowledge is encoded. E.g. as a small seedling slowly becomes tall enough to be classified as a tree, the neural net alters its representation from "slightly treelike" to "pretty treelike".

In contrast, symbolic approaches to AI have representations with sharp boundaries. This produces benefits in some conspicuous human interactions (e.g. we want to design the rules of chess so there's no room for a concept like "somewhat checkmated").

It wasn't obvious in advance which approach would work better for having an AI write better versions of its own code. We now have enough evidence to say that the neural net approach can more usefully absorb large amounts of data, while doing a tolerable job of creating sharp boundaries where needed.

One can imagine something involving symbolic AI that embodies knowledge in a form that handles pattern matching so as to provide functionality similar to neural networks. In particular, it would need to encode the symbolic knowledge in a way that improved versions of the symbolic source code are somehow "near" the AI's existing source code. This "nearness" would provide a smoothness that's comparable to what gradient descent exploits.

{Drexler's QNR? Combining symbolic and connectionist AI. Probably not close to what Eliezer had in mind, and doesn't look like it would cause a much faster takeoff than what Deep Learning suggests. QNR leaves much key knowledge in "inscrutable matrices", which I gather is incompatible with Eliezer's model.} {I'm guessing you use the term "short programs" to indicate that in what might be Eliezer's model, code remains separate from the knowledge database, and the important intelligence increases can be accomplished via rewriting the code while leaving the database relatively constant? Unlike neural nets, where intelligence and a database need to be intertwined.} {I have little idea whether you're accurately portraying Eliezer's model here.}

Neural networks work because they are able to represent knowledge so that improved ideas are near existing ideas. That includes source code: when using neural nets to improve source code, that "nearness" enables a smooth, natural search for better source code.

Eliezer freaks out about foom due to the expectation that there's some threshold of intelligence above which a symbolic AI can do something as powerful as gradient descent on its own source code, presumably without the training phases that neural networks need. Existing research does not suggest that's imminent. We're in trouble if it happens before we have good ways to check each step for safety.

Comment by PeterMcCluskey on Updating Drexler's CAIS model · 2023-06-17T02:40:13.383Z · LW · GW

Oops, you're right. Section 36.6 does advocate modularity, in a way that hints at the vibe you describe. And my review of the CAIS paper did say things about modularity that seem less likely now than they did 4 years ago.

Comment by PeterMcCluskey on Updating Drexler's CAIS model · 2023-06-17T01:13:20.061Z · LW · GW

I agree that people have gotten vibes from the paper which have been somewhat discredited.

Yet I don't see how that vibe followed from what he wrote. He tried to clarify that having systems with specialized goals does not imply they have only narrow knowledge. See section 21 of the CAIS paper ("Broad world knowledge can support safe task performance").

Are people collapsing "AI with narrow goals" and "AI with only specialized knowledge" into one concept "narrow AI"?

Comment by PeterMcCluskey on What will GPT-2030 look like? · 2023-06-08T16:37:27.105Z · LW · GW

Verified safe software means the battle shifts to vulnerabilities in any human who has authority over the system.

Comment by PeterMcCluskey on AI #14: A Very Good Sentence · 2023-06-03T21:04:03.244Z · LW · GW

"What I don’t understand is, either in my model or Critch’s, where we find more hope by declining a pivotal act, once one becomes feasible?"

Part of the reason for more hope is that people are more trustworthy if they commit to avoiding the worst forms of unilateralist curses and world conquest. So by having committed to avoiding the pivotal act, leading actors became more likely to cooperate in ways that avoided the need for a pivotal act.

If a single pivotal act becomes possible, then it seems likely that it will also be possible to find friendlier pivotal processes that include persuading most governments to take appropriate actions. An AI that can melt nearly all GPUs will be powerful enough to scare governments into doing lots of things that are currently way outside the Overton window.

Comment by PeterMcCluskey on Book review: WEIRDest People · 2023-06-03T20:15:04.277Z · LW · GW

Cheap printing was likely a nontrivial factor, but was influenced by much more than just the character sets. Printing presses weren't very reliable or affordable until a bunch of component technologies reached certain levels of sophistication. Even after they became practical, most cultures had limited interest in them.

Comment by PeterMcCluskey on The case for removing alignment and ML research from the training dataset · 2023-05-31T23:13:06.324Z · LW · GW

Filtering out entire sites seems too broad and too crude to have much benefit.

I see plenty of room to turn this into a somewhat good proposal by having GPT-4 look through the dataset for a narrow set of topics. Something close to "how we will test AIs for deception".

Comment by PeterMcCluskey on Language Agents Reduce the Risk of Existential Catastrophe · 2023-05-30T20:06:22.818Z · LW · GW

A good deal of this post is correct. But the goals of language models are more complex than you admit, and not fully specified by natural language. LLMs do something that's approximately a simulation of a human. Those simulated quasi-humans are likely to have quasi-human goals that are unstated and tricky to observe, for much the same reasons that humans have such goals.

LLMs also have goals that influence what kind of human they simulate. We'll know approximately what those goals are, due to our knowledge of what generated those goals. But how do we tell whether approximately is good enough?

Comment by PeterMcCluskey on Book Review: How Minds Change · 2023-05-27T16:51:06.398Z · LW · GW

No. I found a claim of good results here. Beyond that I'm relying on vague impressions from very indirect sources, plus fictional evidence such as the movie Latter Days.

Comment by PeterMcCluskey on Book Review: How Minds Change · 2023-05-27T02:19:37.172Z · LW · GW

Many rationalists do follow something resembling the book's advice.

CFAR started out with too much emphasis on lecturing people, but quickly noticed that wasn't working, and pivoted to more emphasis on listening to people and making them feel comfortable. This is somewhat hard to see if you only know the rationalist movement via its online presence.

Eliezer is far from being the world's best listener, and that likely contributed to some failures in promoting rationality. But he did attract and encourage people who overcame his shortcomings for CFAR's in-person promotion of rationality.

I consider it pretty likely that CFAR's influence has caused OpenAI to act more reasonably than it otherwise would act, due to several OpenAI employees having attended CFAR workshops.

It seems premature to conclude that rationalists have failed, or that OpenAI's existence is bad.

"Sorry, it doesn’t look like the conservatives have caught on to this kind of approach yet."

That's not consistent with my experiences interacting with conservatives. (If you're evaluating conservatives via broadcast online messages, I wouldn't expect you to see anything more than tribal signaling).

It may be uncommon for conservatives to use effective approaches to explicitly changing political beliefs. That's partly because politics is less central to conservatives' lives. You'd likely reach a more nuanced conclusion if you compared how Mormons persuade people to join their religion, which incidentally persuades people to become more conservative.

Comment by PeterMcCluskey on Babble on growing trust · 2023-05-21T22:34:29.811Z · LW · GW

Does the literature on the economics of reputation have ideas that are helpful?

Comment by PeterMcCluskey on LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem · 2023-05-09T21:14:59.287Z · LW · GW

I haven't thought this out very carefully. I'm imagining a transformer trained both to predict text, and to predict the next frame of video.

Train it on all available videos that show realistic human body language.

Then ask the transformer to rate on a numeric scale how positively or negatively a human would feel in any particular situation.

This does not seem sufficient for a safe result, but implies that LeCun is less nutty than your model of him suggests.

Comment by PeterMcCluskey on LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem · 2023-05-09T01:49:36.434Z · LW · GW

Why assume LeCun would use only supervised learning to create the IC module?

If I were trying to make this model work, I'd use mainly self-supervised learning that's aimed at getting the module to predict what a typical human would feel. (I'd also pray for a highly multipolar scenario if I were making this module immutable when deployed.)

Comment by PeterMcCluskey on A Case for the Least Forgiving Take On Alignment · 2023-05-06T02:56:32.000Z · LW · GW

Might this paradigm be tested by measuring LLM fluid intelligence?

I predict that a good test would show that current LLMs have modest amounts of fluid intelligence, and that LLM fluid intelligence will increase in ways that look closer to continuous improvement than to a binary transition from nothing to human-level.

I'm unclear whether it's realistic to get a good enough measure of fluid intelligence to resolve this apparent crux, but I'm eager to pursue any available empirical tests of AI risk.

Comment by PeterMcCluskey on A Case for the Least Forgiving Take On Alignment · 2023-05-06T02:55:39.773Z · LW · GW

Upvoted for clarifying a possibly important crux. I still have trouble seeing a coherent theory here.

I can see a binary difference between Turing-complete minds and lesser minds, but only if I focus on the infinite memory and implicitly infinite speed of a genuine Turing machine. But you've made it clear that's not what you mean.

When I try to apply that to actual minds, I see a wide range of abilities at general-purpose modeling of the world.

Some of the differences in what I think of as general intelligence are a function of resources, which implies a fairly continuous scale, not a binary distinction.

Other aspects are a function of accumulated knowledge. That's somewhat lumpier, but still doesn't look close to a binary difference.

Henrich's books The Secret of Our Success and The WEIRDest People in the World suggest that humans have been gradually building up the ability to handle increasingly abstract problems.

Our ancestors of a couple million years ago had language that enabled them to handle a somewhat larger class of mental tasks than other apes.

Tools such as writing, and new concepts such as the Turing machine, enabled them to model ideas that they'd previously failed to find ways to handle.

I see plenty of hints that other mammals have weaker versions of this abstract thought. I'd be surprised if humans have reached the limits of what is possible.

So, when I try to treat general intelligence as a binary, I alternate between doubting that humans have it, and believing that most animals and LLMs have it.

Comment by PeterMcCluskey on A Case for the Least Forgiving Take On Alignment · 2023-05-06T02:54:07.933Z · LW · GW

"In the hypothetical where there’s no general intelligence, there’s no such thing as “smarter”…"

It sure looks like many species of animals can be usefully compared as smarter than others. The same is true of different versions of LLMs. Why shouldn't I conclude that most of those have what you call general intelligence?