LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

Would you be a better RLHF labeler than GPT-4?
kache · 2023-03-27T18:10:04.108Z · comments (1)
LLM Powered LW Search
odraode17 · 2023-03-27T18:09:32.958Z · comments (0)
Announcing the Swiss Existential Risk Initiative (CHERI) 2023 Research Fellowship
Tobias H (clearthis) · 2023-03-27T16:36:25.508Z · comments (0)
Industrialization/Computerization Analogies
Gordon Seidoh Worley (gworley) · 2023-03-27T16:34:21.659Z · comments (2)
Lessons from Convergent Evolution for AI Alignment
Jan_Kulveit · 2023-03-27T16:25:13.571Z · comments (9)
GPT-4 is bad at strategic thinking
Christopher King (christopher-king) · 2023-03-27T15:11:47.448Z · comments (8)
The salt in pasta water fallacy
Thomas Sepulchre · 2023-03-27T14:53:07.718Z · comments (38)
CAIS-inspired approach towards safer and more interpretable AGIs
Peter Hroššo (peter-hrosso) · 2023-03-27T14:36:12.712Z · comments (7)
[link] An Overview of Sparks of Artificial General Intelligence: Early experiments with GPT-4
Annapurna (jorge-velez) · 2023-03-27T13:44:43.805Z · comments (0)
A Hivemind of GPT-4 bots REALLY IS A HIVEMIND!
Erlja Jkdf. (erlja-jkdf) · 2023-03-27T12:44:27.971Z · comments (1)
Duploish Marble Runs
jefftk (jkaufman) · 2023-03-27T12:20:01.370Z · comments (1)
GPT-4 Plugs In
Zvi · 2023-03-27T12:10:00.926Z · comments (47)
Please help me sense-check my assumptions about the needs of the AI Safety community and related career plans
peterslattery · 2023-03-27T08:23:28.359Z · comments (4)
Practical Pitfalls of Causal Scrubbing
Jérémy Scheurer (JerrySch) · 2023-03-27T07:47:31.309Z · comments (17)
[question] What If: An Earthquake in Taiwan?
Sable · 2023-03-27T07:31:34.545Z · answers+comments (2)
What can we learn from Lex Fridman’s interview with Sam Altman?
Karl von Wendt · 2023-03-27T06:27:40.465Z · comments (22)
[question] Steelmanning OpenAI's Short-Timelines Slow-Takeoff Goal
FinalFormal2 · 2023-03-27T02:55:29.439Z · answers+comments (2)
The default outcome for aligned AGI still looks pretty bad
GeneSmith · 2023-03-27T00:02:33.318Z · comments (18)
LLM Modularity: The Separability of Capabilities in Large Language Models
NickyP (Nicky) · 2023-03-26T21:57:03.445Z · comments (3)
Testing ChatGPT for white lies
twkaiser · 2023-03-26T21:32:12.321Z · comments (2)
Don't take bad options away from people
Dumbledore's Army · 2023-03-26T20:12:43.278Z · comments (100)
[link] What would a compute monitoring plan look like? [Linkpost]
Akash (akash-wasil) · 2023-03-26T19:33:46.896Z · comments (9)
[question] GPT-4 Specs: 1 Trillion Parameters?
infinibot27 · 2023-03-26T18:56:21.753Z · answers+comments (8)
[link] Sentience in Machines - How Do We Test for This Objectively?
Mayowa Osibodu (mayowa-osibodu) · 2023-03-26T18:56:02.082Z · comments (0)
If it quacks like a duck...
RationalMindset · 2023-03-26T18:54:59.985Z · comments (0)
Chronostasis: The Time-Capsule Conundrum of Language Models
RationalMindset · 2023-03-26T18:54:59.943Z · comments (0)
[question] What happens with logical induction when...
Donald Hobson (donald-hobson) · 2023-03-26T18:31:19.656Z · answers+comments (2)
Draft: Introduction to optimization
Alex_Altair · 2023-03-26T17:25:55.093Z · comments (8)
[link] Chat bot as CEO at NetDragon Websoft
ChristianKl · 2023-03-26T16:01:35.800Z · comments (2)
Datapoint: median 10% AI x-risk mentioned on Dutch public TV channel
Chris van Merwijk (chrisvm) · 2023-03-26T12:50:11.612Z · comments (1)
[question] How Politics interacts with AI ?
qbolec · 2023-03-26T09:53:43.114Z · answers+comments (4)
Descriptive vs. specifiable values
TsviBT · 2023-03-26T09:10:56.334Z · comments (2)
The alignment stability problem
Seth Herd · 2023-03-26T02:10:13.044Z · comments (10)
Survey on lifeloggers for a research project
Mati_Roy (MathieuRoy) · 2023-03-26T00:02:40.090Z · comments (0)
[link] Manifold: If okay AGI, why?
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2023-03-25T22:43:53.820Z · comments (37)
A stylized dialogue on John Wentworth's claims about markets and optimization
So8res · 2023-03-25T22:32:53.216Z · comments (22)
Reproject on Cropping
jefftk (jkaufman) · 2023-03-25T21:50:01.567Z · comments (5)
[link] Sam Altman on GPT-4, ChatGPT, and the Future of AI | Lex Fridman Podcast #367
Gabe M (gabe-mukobi) · 2023-03-25T19:08:55.249Z · comments (4)
$500 Bounty/Contest: Explain Infra-Bayes In The Language Of Game Theory
johnswentworth · 2023-03-25T17:29:51.498Z · comments (7)
The Patent Clerk
Alex Beyman (alexbeyman) · 2023-03-25T15:58:48.615Z · comments (5)
Aligned AI as a wrapper around an LLM
cousin_it · 2023-03-25T15:58:41.361Z · comments (19)
Good News, Everyone!
jbash · 2023-03-25T13:48:22.499Z · comments (23)
ChatGPT Plugins - The Beginning of the End
Bary Levy (bary-levi) · 2023-03-25T11:45:32.877Z · comments (4)
AI Capabilities vs. AI Products
Darmani · 2023-03-25T01:14:51.867Z · comments (1)
Nudging Polarization
jefftk (jkaufman) · 2023-03-24T23:50:01.476Z · comments (14)
Why There Is No Answer to Your Philosophical Question
Bryan Frances · 2023-03-24T23:22:24.724Z · comments (10)
[link] "Slightly Evil" AI Apps
intellectronica · 2023-03-24T22:52:45.516Z · comments (2)
[question] Seeking Advice on Raising AI X-Risk Awareness on Social Media
MrThink (ViktorThink) · 2023-03-24T22:25:22.528Z · answers+comments (1)
Hutter-Prize for Prompts
rokosbasilisk · 2023-03-24T21:26:41.810Z · comments (10)
How likely do you think worse-than-extinction type fates to be?
span1 · 2023-03-24T21:03:12.184Z · comments (4)