LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

AI #38: Let’s Make a Deal
Zvi · 2023-11-16T19:50:05.442Z · comments (2)

[link] Fluent dreaming for language models (AI interpretability method)
tbenthompson (ben-thompson) · 2024-02-06T06:02:59.296Z · comments (4)

ProLU: A Nonlinearity for Sparse Autoencoders
Glen Taggart · 2024-04-23T14:09:21.592Z · comments (4)

On the Contrary, Steelmanning Is Normal; ITT-Passing Is Niche
Zack_M_Davis · 2024-01-09T23:12:20.349Z · comments (31)

Userscript to always show LW comments in context vs at the top
Vlad Sitalo (harcisis) · 2023-11-21T17:53:30.418Z · comments (8)

Incidental polysemanticity
Victor Lecomte (victor-lecomte) · 2023-11-15T04:00:00.000Z · comments (7)

The Case for Predictive Models
Rubi J. Hudson (Rubi) · 2024-04-03T18:22:20.243Z · comments (7)

My intellectual journey to (dis)solve the hard problem of consciousness
Charbel-Raphaël (charbel-raphael-segerie) · 2024-04-06T09:32:41.612Z · comments (41)

[link] EPUBs of MIRI Blog Archives and selected LW Sequences
mesaoptimizer · 2023-10-26T14:17:11.538Z · comments (6)

2023 LessWrong Community Census, Request for Comments
Screwtape · 2023-11-01T16:32:19.102Z · comments (37)

[link] Jacob on the Precipice
Richard_Ngo (ricraz) · 2023-09-26T21:16:39.590Z · comments (8)

AXRP Episode 25 - Cooperative AI with Caspar Oesterheld
DanielFilan · 2023-10-03T21:50:07.552Z · comments (0)

[link] An EPUB of Arbital's AI Alignment section
mesaoptimizer · 2023-10-16T19:36:29.109Z · comments (1)

Job Listing: Managing Editor / Writer
Gretta Duleba (gretta-duleba) · 2024-02-21T23:41:26.818Z · comments (2)

[question] "Deception Genre" What Books are like Project Lawful?
Double · 2024-08-28T17:19:52.172Z · answers+comments (20)

Humanity isn't remotely longtermist, so arguments for AGI x-risk should focus on the near term
Seth Herd · 2024-08-12T18:10:56.543Z · comments (10)

The need for multi-agent experiments
Martín Soto (martinsq) · 2024-08-01T17:14:16.590Z · comments (3)

Another argument against utility-centric alignment paradigms
Fiora from Rosebloom · 2024-09-22T07:28:27.856Z · comments (8)

Understanding Positional Features in Layer 0 SAEs
bilalchughtai (beelal) · 2024-07-29T09:36:40.701Z · comments (0)

Sci-Fi books micro-reviews
Yair Halberstadt (yair-halberstadt) · 2024-06-24T09:49:28.523Z · comments (27)

An Introduction to AI Sandbagging
Teun van der Weij (teun-van-der-weij) · 2024-04-26T13:40:00.126Z · comments (8)

New Executive Team & Board — PIBBSS
Nora_Ammann · 2024-07-01T19:30:45.261Z · comments (1)

[link] rapid growth
Chipmonk · 2024-06-05T00:43:51.501Z · comments (0)

Locating My Eyes (Part 3 of "The Sense of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-02-29T03:09:25.810Z · comments (4)

[question] Where is the Town Square?
Gretta Duleba (gretta-duleba) · 2024-02-13T03:53:18.205Z · answers+comments (8)

[Valence series] 4. Valence & Liking / Admiring
Steven Byrnes (steve2152) · 2024-06-10T14:19:51.194Z · comments (12)

Why does generalization work?
Martín Soto (martinsq) · 2024-02-20T17:51:10.424Z · comments (16)

[link] Why Georgism Lost Its Popularity
Zero Contradictions · 2024-07-20T15:08:41.469Z · comments (50)

[question] Does reducing the amount of RL for a given capability level make AI safer?
Chris_Leong · 2024-05-05T17:04:01.799Z · answers+comments (22)

The Next ChatGPT Moment: AI Avatars
kolmplex (luke-man) · 2024-01-05T20:14:10.074Z · comments (10)

Childhood and Education Roundup #4
Zvi · 2024-01-30T13:50:06.033Z · comments (10)

[link] Non-alignment project ideas for making transformative AI go well
Lukas Finnveden (Lanrian) · 2024-01-04T07:23:13.658Z · comments (1)

[link] How bad is chlorinated water?
bhauth · 2023-12-13T18:00:12.640Z · comments (18)

Taking responsibility and partial derivatives
Ruby · 2023-12-31T04:33:51.419Z · comments (1)

When fine-tuning fails to elicit GPT-3.5's chess abilities
Theodore Chapman · 2024-06-14T18:50:52.855Z · comments (3)

[link] cold aluminum for medicine
bhauth · 2023-12-16T14:38:03.260Z · comments (4)

[link] Project ideas: Epistemics
Lukas Finnveden (Lanrian) · 2024-01-05T23:41:23.721Z · comments (4)

[link] We Need Major, But Not Radical, FDA Reform
Maxwell Tabarrok (maxwell-tabarrok) · 2024-02-24T16:54:33.061Z · comments (12)

MonoPoly Restricted Trust
ymeskhout · 2024-01-02T23:02:55.066Z · comments (37)

[link] Post series on "Liability Law for reducing Existential Risk from AI"
Nora_Ammann · 2024-02-29T04:39:50.557Z · comments (1)

D&D.Sci Long War: Defender of Data-mocracy
aphyer · 2024-04-26T22:30:15.780Z · comments (20)

Trust as a bottleneck to growing teams quickly
benkuhn · 2024-07-13T18:00:04.579Z · comments (3)

[link] Soviet comedy film recommendations
Nina Panickssery (NinaR) · 2024-06-09T23:40:58.536Z · comments (11)

Principled Satisficing To Avoid Goodhart
JenniferRM · 2024-08-16T19:05:27.204Z · comments (2)

US Presidential Election: Tractability, Importance, and Urgency
kuhanj · 2024-05-29T23:52:22.420Z · comments (2)

Deep and obvious points in the gap between your thoughts and your pictures of thought
KatjaGrace · 2024-02-23T07:30:07.461Z · comments (6)

Case studies on social-welfare-based standards in various industries
HoldenKarnofsky · 2024-06-20T13:33:44.780Z · comments (0)

How I internalized my achievements to better deal with negative feelings
Raymond Koopmanschap · 2024-02-27T15:10:24.149Z · comments (7)

Unit economics of LLM APIs
dschwarz · 2024-08-27T16:51:22.692Z · comments (0)

How difficult is AI Alignment?
Sammy Martin (SDM) · 2024-09-13T15:47:10.799Z · comments (6)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

sharmake-farah on A shot at the diamond-alignment problem

For my money, the nice properties that human and AI systems have that matter for alignment is IMO not the properties from Shard Theory, but rather several other properties that mattered:

Alignment generalizes further than capabilities because of verifying being easier to generate, as well as learning values being easier than having a lot of other real world capabilities.
It's looking like the values of humans are far, far simpler than a lot of evopsych literature and Yudkowsky thought, and related to this, values are less fragile than people thought 15-20 years ago, in the sense that values generalize far better OOD than people used to think 15-20 years ago.
The brain and DL AIs, while not the same thing, are doing reasonably similar things such that we can transport a lot of AI insights into neuroscience/human brain insights, and vice versa.
One of those lessons is the bitter lesson from Sutton applies to human values and morals, which cashes out into the fact that the data matter much more than the algorithm when predicting it's values, especially OOD generalization of values, and thus controlling the data is basically equivalent to controlling the values.

trevorone on Lighthaven Sequences Reading Group #3 (Tuesday 09/24)

This is an idea and NOT a recommendation. Unintended consequences abound.

Have you thought about sorting into groups based on carefully-selected categories? For example, econ/social sciences vs quant background with extra whiteboard space, a separate group for new arrivals who didn't do the readings from the other weeks (as their perspectives will have less overlap), a separate group for people who deliberately took a bunch of notes and made a concise list vs a more casual easygoing group, etc?

finalformal2 on Laziness death spirals

I'd be interested in reading much more about this. Energy and akrasia as it's popularly called here continue to be my biggest life challenges. High fiber diet seems to help, and high novelty seems to help.

elityre on Applications of Chaos: Saying No (with Hastings Greer)

Why does the video show up so tiny?

nathan-helm-burger on ektimo's Shortform

So like... Quadratic voting? https://en.m.wikipedia.org/wiki/Quadratic_voting

review-bot on AI doom from an LLM-plateau-ist perspective

The LessWrong Review [? · GW] runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2024. The top fifty or so posts are featured prominently on the site throughout the year.

Hopefully, the review is better than karma at judging enduring value. If we have accurate prediction markets on the review results, maybe we can have better incentives on LessWrong today. Will this post make the top fifty?

jenniferrm on World War Zero

This writeup is great. Very simple. Beat by beat. Motion by motion. The character of the writing makes me feel like anything was possible, and history was a series of accidents, which I think is a "true feeling" about history.

fer32dwt34r3dfsz on Laziness death spirals

So far I've not had this issue (i.e, ...that one thing will come back to bite me a few days later...) with my obligations but have had it with some private projects. Infrequently referencing my obligations in to-do list format allows them to stress me adequately, as they occur first on my mind and aren't detracted from by the gravity of the other tasks I have, which would be easily visible and "present" if on a nearby to-do list.

Having a complex task system containing both job-related obligations and private work tasks somewhat deprioritizes the job-related obligations for me ^[1], relative to how I believe most people might prioritize their job-related obligations, and the near absence of my multiple to-do lists has allowed me, of late, to "calibrate my stress" (the levels of stress I mention here are not severe).

Do you find that you're able to mentally keep track of things better than before...

I am better mentally able to keep track of job-related obligations (which, for the time being and as far as I've surmised, are more important than my private projects / tasks) but less able to remember private project tasks. The magnitude of a task that comes to mind naturally is felt more strongly than if I had proceeded through the same task from a list. I've been tackling tasks that cause me higher levels of stress sooner.

Why? I expect that I have more trouble than is average separating "job" versus "non-job" work, meaning that how much I value one over the other oscillates. ↩︎

gwern on Applications of Chaos: Saying No (with Hastings Greer)

See my comment. The problem with the post is revealed in the fourth sentence:

To demonstrate how chaos theory imposes some limits on the skill of an arbitrary intelligence, I will also look at a game: pinball.

Note that predicting a ball is not at all the same thing as skill in manipulating a ball. It's just a giant non sequitur being slipped in before he begins the math. Which is why he is 100% wrong when he concludes

This is not a problem that is solvable by applying more cognitive effort.

It totally is solvable. The 'cognitive effort' here is 'git gud at pinball, scrub, and stop making excuses for losing', and as he admits in the footnote he didn't include in the LW version, in real life, when adequately incentivized to win rather than find excuses involving 'well, chaos theory shows you can't predict ball bounces more than n bounces out', pinball pros learn how to win and rack up high scores despite 'muh chaos'.

And that is why I don't believe your anecdotal survey responses imply anything good. I think that several or all of those cases, if we were able to investigate them adequately, would turn out to be similar to this pinball essay: a lot of browbeating intimidation-by-math, possibly completely valid insofar as it went, but ultimately, proving an irrelevant claim and the problem in fact soluble.

quila on What does it mean for an event or observation to have probability 0 or 1 in Bayesian terms?

I am not sure whether this is the answer you're looking for, but I think it's true and could be de-confusing, and others have given the standard/practical answer already.

You can try running a program which computes Bayesian updates to determine what happens when this program is passed as input an 'observation' to which it assigns probability 0. Two possible outcomes (of many, dependent on the exact program) that come to mind:

The program returns a 'cannot divide by 0' error upon attempting to compute the observation's update.
The program updates on the observation in a way which rules out the entirety of its probability-space, as it was all premised on the non-0 possibilities. The next time the program tries to update on a new observation, it fails to find priors about that observation.

Bayes' theorem is an algorithm which is used because it happens to help predict the world, rather than something with metaphysical status.

We could also imagine very-different (mathematical)-worlds where prediction is not needed/useful, or, maybe, where the world is so differently-structured that Bayes' theorem is not predictive.