LessWrong 2.0 Reader

[question] Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception?
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2024-09-04T12:40:07.678Z · answers+comments (6)

[link] AlignedCut: Visual Concepts Discovery on Brain-Guided Universal Feature Space
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-14T23:23:26.296Z · comments (1)

[link] CultFrisbee
Gauraventh (aryangauravyadav) · 2024-08-11T21:36:36.550Z · comments (3)

[link] Jonathan Gorard: The territory is isomorphic to an equivalence class of its maps
Daniel C (harper-owen) · 2024-09-07T10:04:47.840Z · comments (18)

The new UK government's stance on AI safety
Elliot_Mckernon (elliot) · 2024-07-31T15:23:59.235Z · comments (0)

Simulation-aware causal decision theory: A case for one-boxing in CDT
kongus_bongus · 2024-08-09T18:09:20.013Z · comments (11)

Determining the power of investors over Frontier AI Labs is strategically important to reduce x-risk
Lucie Philippon (lucie-philippon) · 2024-07-25T01:12:20.518Z · comments (7)

Physical Therapy Sucks (but have you tried hiding it in some peanut butter?)
Declan Molony (declan-molony) · 2024-09-10T05:54:47.000Z · comments (12)

My career exploration: Tools for building confidence
lynettebye · 2024-09-13T11:37:55.843Z · comments (0)

Automating LLM Auditing with Developmental Interpretability
htlou · 2024-09-04T15:50:04.337Z · comments (0)

[link] Holomorphic surjection theorem (Picard's little theorem)
dkl9 · 2024-07-21T13:24:18.300Z · comments (0)

[link] Pronouns are Annoying
ymeskhout · 2024-09-18T13:30:04.620Z · comments (16)

[question] If AI is in a bubble and the bubble bursts, what would you do?
Remmelt (remmelt-ellen) · 2024-08-19T10:56:03.948Z · answers+comments (5)

Announcing the Ultimate Jailbreaking Championship
InnerHufflepuff (grayswan) · 2024-09-04T00:35:31.234Z · comments (1)

Room Available in Boston Group House
NoSignalNoNoise (AspiringRationalist) · 2024-07-23T02:55:59.602Z · comments (1)

AI labs can boost external safety research
Zach Stein-Perlman · 2024-07-31T19:30:16.207Z · comments (0)

Emergence, The Blind Spot of GenAI Interpretability?
Quentin FEUILLADE--MONTIXI (quentin-feuillade-montixi) · 2024-08-10T10:07:53.654Z · comments (5)

Cat Sustenance Fortification
jefftk (jkaufman) · 2024-07-31T02:30:04.898Z · comments (7)

[link] The Ap Distribution
criticalpoints · 2024-08-24T21:45:35.029Z · comments (3)

Are LLMs on the Path to AGI?
Davidmanheim · 2024-08-30T03:14:04.710Z · comments (2)

Against Explosive Growth
c.trout (ctrout) · 2024-09-04T21:45:03.120Z · comments (1)

Longevity: A critical look at "Loss of epigenetic information as a cause of mammalian aging"
Anna Crow · 2024-07-24T01:40:57.634Z · comments (2)

[link] Benefits of Psyllium Dietary Fiber in Particular
Brendan Long (korin43) · 2024-08-28T18:13:23.891Z · comments (6)

[link] AI x Human Flourishing: Introducing the Cosmos Institute
Brendan McCord (brendan-mccord) · 2024-09-05T18:23:32.690Z · comments (5)

[question] Looking to interview AI Safety researchers for a book
jeffreycaruso · 2024-08-24T19:57:33.119Z · answers+comments (0)

[link] Verification methods for international AI agreements
Akash (akash-wasil) · 2024-08-31T14:58:10.986Z · comments (1)

Primary Perceptive Systems
ChristianKl · 2024-08-15T11:26:01.667Z · comments (2)

Funding for work that builds capacity to address risks from transformative AI
abergal · 2024-08-14T23:52:09.922Z · comments (0)

Rabin's Paradox
Charlie Steiner · 2024-08-14T05:40:25.572Z · comments (39)

[link] Does robustness improve with scale?
ChengCheng (ccstan99) · 2024-07-25T20:55:53.359Z · comments (0)

[link] Diffusion Guided NLP: better steering, mostly a good thing
Nathan Helm-Burger (nathan-helm-burger) · 2024-08-10T19:49:50.963Z · comments (0)

[question] Building an Inexpensive, Aesthetic, Private Forum
Aaron Graifman (aaron-graifman) · 2024-09-09T17:10:42.677Z · answers+comments (15)

[question] Looking for intuitions to extend bargaining notions
ProgramCrafter (programcrafter) · 2024-08-24T05:00:13.995Z · answers+comments (0)

[question] How great is the utility of "saving" endangered languages?
SpectrumDT · 2024-08-20T13:14:32.895Z · answers+comments (29)

Something Is Lost When AI Makes Art
utilistrutil · 2024-08-18T22:53:46.951Z · comments (0)

My Experience Using Gamification
Wyatt S (wyatt-s) · 2024-07-26T23:06:53.392Z · comments (4)

Ball Sq Pathways
jefftk (jkaufman) · 2024-07-21T02:20:06.607Z · comments (1)

[link] GPT-2 Sometimes Fails at IOI
Ronak_Mehta · 2024-08-14T23:24:39.268Z · comments (0)

A bet for Samo Burja
Nathan Helm-Burger (nathan-helm-burger) · 2024-09-05T16:01:35.440Z · comments (2)

Avoiding the Bog of Moral Hazard for AI
Nathan Helm-Burger (nathan-helm-burger) · 2024-09-13T21:24:34.137Z · comments (8)

[link] How to Fake Decryption
ohmurphy · 2024-09-05T09:18:41.586Z · comments (0)

[question] What is AI Safety’s line of retreat?
Remmelt (remmelt-ellen) · 2024-07-28T05:43:05.021Z · answers+comments (12)

Apartment Price Map Discontinuity
jefftk (jkaufman) · 2024-08-19T15:30:05.386Z · comments (0)

[Cross-post] Book Review: Bureaucracy, by James Q Wilson
davekasten · 2024-08-19T13:57:10.872Z · comments (0)

[link] Notes on Reading 'Who Gets What and Why' (Part 1): Matching Markets
ohmurphy · 2024-07-27T15:05:28.647Z · comments (0)

How I Wrought a Lesser Scribing Artifact (You Can, Too!)
Lorxus · 2024-08-02T03:35:00.972Z · comments (0)

Critique of 'Many People Fear A.I. They Shouldn't' by David Brooks.
Axel Ahlqvist (axelahlqvist1995@gmail.com) · 2024-08-15T18:38:13.437Z · comments (8)

SYSTEMA ROBOTICA
Ali Ahmed (roboticali) · 2024-08-12T20:34:45.879Z · comments (2)

2/3 Aussie & NZ AI Safety folk often or sometimes feel lonely or disconnected (and 16 other barriers to impact)
yanni kyriacos (yanni) · 2024-08-01T01:15:02.620Z · comments (0)

Can Large Language Models effectively identify cybersecurity risks?
emile delcourt (emile-delcourt) · 2024-08-30T20:20:21.345Z · comments (0)

Let the reader feel free to take the political decision of restricting the subject observation set class that defines "real human values" to sane humans.

The US Army has something like an IQ test. So does the US Postal Service. So does the NFL. I've also personally worked in a fairly large tech company (not one of the top ones, before I moved to the Bay Area) that had ~IQ tests as one of the entrance criteria. AFAIK there has never been any uproar about it.

If we exclude human "System 2" "slow thinking" capabilities for the purpose of this comparison.
