LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

Loose thoughts on AGI risk
Yitz (yitz) · 2022-06-23T01:02:24.938Z · comments (3)
Air Conditioner Test Results & Discussion
johnswentworth · 2022-06-22T22:26:26.643Z · comments (42)
Announcing the LessWrong Curated Podcast
Ben Pace (Benito) · 2022-06-22T22:16:58.170Z · comments (27)
Google's new text-to-image model - Parti, a demonstration of scaling benefits
Kayden (kunvar-thaman) · 2022-06-22T20:00:59.930Z · comments (4)
Building an Epistemic Status Tracker
rcu · 2022-06-22T18:57:34.198Z · comments (6)
[link] Confusion about neuroscience/cognitive science as a danger for AI Alignment
Samuel Nellessen (samuel-nellessen) · 2022-06-22T17:59:31.140Z · comments (1)
[question] How do I use caffeine optimally?
randomstring · 2022-06-22T17:59:18.259Z · answers+comments (31)
Make learning a reality
Dalton Mabery (dalton-mabery) · 2022-06-22T15:58:05.959Z · comments (2)
Reflection Mechanisms as an Alignment target: A survey
Marius Hobbhahn (marius-hobbhahn) · 2022-06-22T15:05:55.703Z · comments (1)
House Phone
jefftk (jkaufman) · 2022-06-22T14:20:06.586Z · comments (2)
How to Visualize Bayesianism
David Udell · 2022-06-22T13:57:09.721Z · comments (2)
[question] Are there spaces for extremely short-form rationality content?
Aleksi Liimatainen (aleksi-liimatainen) · 2022-06-22T10:39:30.259Z · answers+comments (1)
Solstice Movie Review: Summer Wars
JohnBuridan · 2022-06-22T01:09:26.749Z · comments (6)
Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment
elspood · 2022-06-21T23:55:39.918Z · comments (42)
[link] A Quick List of Some Problems in AI Alignment As A Field
NicholasKross · 2022-06-21T23:23:31.719Z · comments (12)
[question] What is the difference between AI misalignment and bad programming?
puzzleGuzzle · 2022-06-21T21:52:57.362Z · answers+comments (2)
[link] What I mean by the phrase “getting intimate with reality”
Luise · 2022-06-21T19:42:56.578Z · comments (0)
[link] What I mean by the phrase "taking ideas seriously"
Luise · 2022-06-21T19:42:56.547Z · comments (2)
Hydrophobic Glasses Coating Review
jefftk (jkaufman) · 2022-06-21T18:00:05.426Z · comments (6)
[link] Progress links and tweets, 2022-06-20
jasoncrawford · 2022-06-21T17:12:44.361Z · comments (2)
[link] Debating Whether AI is Conscious Is A Distraction from Real Problems
sidhe_they · 2022-06-21T16:56:04.474Z · comments (10)
Mitigating the damage from unaligned ASI by cooperating with aliens that don't exist yet
MSRayne · 2022-06-21T16:12:01.753Z · comments (7)
The inordinately slow spread of good AGI conversations in ML
Rob Bensinger (RobbBB) · 2022-06-21T16:09:57.859Z · comments (62)
Getting from an unaligned AGI to an aligned AGI?
Tor Økland Barstad (tor-okland-barstad) · 2022-06-21T12:36:13.928Z · comments (7)
Common but neglected risk factors that may let you get Paxlovid
DirectedEvolution (AllAmericanBreakfast) · 2022-06-21T07:34:02.685Z · comments (8)
Dagger of Detect Evil
lsusr · 2022-06-21T06:23:01.264Z · comments (20)
[question] How easy/fast is it for a AGI to hack computers/a human brain?
Noosphere89 (sharmake-farah) · 2022-06-21T00:34:34.590Z · answers+comments (1)
[question] What is the most probable AI?
Zeruel017 · 2022-06-20T23:26:01.467Z · answers+comments (0)
Evaluating a Corsi-Rosenthal Filter Cube
jefftk (jkaufman) · 2022-06-20T19:40:01.980Z · comments (3)
Survey re AIS/LTism office in NYC
RyanCarey · 2022-06-20T19:21:33.642Z · comments (0)
Is This Thing Sentient, Y/N?
Thane Ruthenis · 2022-06-20T18:37:59.380Z · comments (9)
Steam
abramdemski · 2022-06-20T17:38:58.548Z · comments (13)
Parable: The Bomb that doesn't Explode
Lone Pine (conor-sullivan) · 2022-06-20T16:41:14.611Z · comments (5)
On corrigibility and its basin
Donald Hobson (donald-hobson) · 2022-06-20T16:33:06.286Z · comments (3)
Announcing the DWATV Discord
Zvi · 2022-06-20T15:50:03.051Z · comments (9)
Key Papers in Language Model Safety
aogara (Aidan O'Gara) · 2022-06-20T15:00:59.858Z · comments (1)
Relationship Advice Repository
Ruby · 2022-06-20T14:39:36.548Z · comments (36)
Adaptation Executors and the Telos Margin
Plinthist (Kredo) · 2022-06-20T13:06:29.519Z · comments (8)
Are we there yet?
theflowerpot · 2022-06-20T11:19:56.253Z · comments (2)
Causal confusion as an argument against the scaling hypothesis
RobertKirk · 2022-06-20T10:54:05.623Z · comments (30)
An AI defense-offense symmetry thesis
Chris van Merwijk (chrisvm) · 2022-06-20T10:01:18.968Z · comments (9)
Let's See You Write That Corrigibility Tag
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2022-06-19T21:11:03.505Z · comments (69)
Half-baked alignment idea: training to generalize
Aaron Bergman (aaronb50) · 2022-06-19T20:16:43.735Z · comments (2)
Where I agree and disagree with Eliezer
paulfchristiano · 2022-06-19T19:15:55.698Z · comments (219)
[question] AI misalignment risk from GPT-like systems?
fiso64 (fiso) · 2022-06-19T17:35:41.095Z · answers+comments (8)
[Link-post] On Deference and Yudkowsky's AI Risk Estimates
bmg · 2022-06-19T17:25:14.537Z · comments (8)
Have The Effective Altruists And Rationalists Brainwashed Me?
UtilityMonster (Matt Goldwater) · 2022-06-19T16:05:04.380Z · comments (2)
Hebbian Learning Is More Common Than You Think
Aleksi Liimatainen (aleksi-liimatainen) · 2022-06-19T15:57:08.378Z · comments (2)
[link] The Malthusian Trap: An Extremely Short Introduction
Davis Kedrosky · 2022-06-19T15:25:44.026Z · comments (0)
Parliaments without the Parties
Yair Halberstadt (yair-halberstadt) · 2022-06-19T14:06:23.167Z · comments (18)