LessWrong 2.0 Reader



[link] Accelerating science through evolvable institutions
jasoncrawford · 2023-12-04T23:21:35.330Z · comments (0)
Speaking to Congressional staffers about AI risk
Akash (akash-wasil) · 2023-12-04T23:08:52.055Z · comments (None)
Open Thread – Winter 2023/2024
habryka (habryka4) · 2023-12-04T22:59:49.957Z · comments (0)
Interview with Vanessa Kosoy on the Value of Theoretical Research for AI
WillPetillo · 2023-12-04T22:58:40.005Z · comments (0)
[link] 2023 Alignment Research Updates from FAR AI
AdamGleave · 2023-12-04T22:32:19.842Z · comments (0)
[link] What's new at FAR AI
AdamGleave · 2023-12-04T21:18:03.951Z · comments (0)
n of m ring signatures
DanielFilan · 2023-12-04T20:00:06.580Z · comments (1)
[question] Why using activation for interpreting GPT-2?
sprout_ust · 2023-12-04T18:49:45.437Z · answers+comments (0)
Mechanistic interpretability through clustering
Alistair Fraser (alistair-fraser) · 2023-12-04T18:49:26.777Z · comments (0)
Agents which are EU-maximizing as a group are not EU-maximizing individually
Mlxa · 2023-12-04T18:49:08.708Z · comments (1)
Planning in LLMs: Insights from AlphaGo
jco · 2023-12-04T18:48:57.508Z · comments (None)
Non-classic stories about scheming (Section 2.3.2 of "Scheming AIs")
Joe Carlsmith (joekc) · 2023-12-04T18:44:32.825Z · comments (None)
5. The Mutable Values Problem in Value Learning and CEV
RogerDearnaley (roger-d-1) · 2023-12-04T18:31:22.080Z · comments (None)
Updates to Open Phil’s career development and transition funding program
abergal · 2023-12-04T18:10:29.394Z · comments (0)
[Valence series] 1. Introduction
Steven Byrnes (steve2152) · 2023-12-04T15:40:21.274Z · comments (2)
South Bay Meetup 12/9
David Friedman (david-friedman) · 2023-12-04T07:32:26.619Z · comments (0)
[link] Hashmarks: Privacy-Preserving Benchmarks for High-Stakes AI Evaluation
Paul Bricman (paulbricman) · 2023-12-04T07:31:48.726Z · comments (4)
A call for a quantitative report card for AI bioterrorism threat models
Juno (translunar) · 2023-12-04T06:35:14.489Z · comments (0)
FTL travel summary
Isaac King (KingSupernova) · 2023-12-04T05:17:21.422Z · comments (1)
Disappointing Table Refinishing
jefftk (jkaufman) · 2023-12-04T02:50:07.914Z · comments (3)
[link] the micro-fulfillment cambrian explosion
bhauth · 2023-12-04T01:15:34.342Z · comments (4)
[link] Nietzsche's Morality in Plain English
Arjun Panickssery (arjun-panickssery) · 2023-12-04T00:57:42.839Z · comments (7)
[link] Meditations on Mot
Richard_Ngo (ricraz) · 2023-12-04T00:19:19.522Z · comments (2)
[link] The Witness
Richard_Ngo (ricraz) · 2023-12-03T22:27:16.248Z · comments (2)
Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs")
Joe Carlsmith (joekc) · 2023-12-03T18:32:42.748Z · comments (None)
[question] How do you do post mortems?
matto · 2023-12-03T14:46:03.521Z · answers+comments (1)
The benefits and risks of optimism (about AI safety)
Karl von Wendt · 2023-12-03T12:45:12.269Z · comments (6)
Book Review: 1948 by Benny Morris
Yair Halberstadt (yair-halberstadt) · 2023-12-03T10:29:16.696Z · comments (8)
Quick takes on "AI is easy to control"
So8res · 2023-12-02T22:31:45.683Z · comments (43)
Sherlockian Abduction Master List
Cole Wyeth (Amyr) · 2023-12-02T22:10:21.848Z · comments (24)
The goal-guarding hypothesis (Section 2.3.1.1 of "Scheming AIs")
Joe Carlsmith (joekc) · 2023-12-02T15:20:28.152Z · comments (None)
The Method of Loci: With some brief remarks, including transformers and evaluating AIs
Bill Benzon (bill-benzon) · 2023-12-02T14:36:47.077Z · comments (0)
Taking Into Account Sentient Non-Humans in AI Ambitious Value Learning: Sentientist Coherent Extrapolated Volition
Adrià R. Moret · 2023-12-02T14:07:29.992Z · comments (13)
Out-of-distribution Bioattacks
jefftk (jkaufman) · 2023-12-02T12:20:05.626Z · comments (9)
List of strategies for mitigating deceptive alignment
joshc (joshua-clymer) · 2023-12-02T05:56:50.867Z · comments (1)
[question] What is known about invariants in self-modifying systems?
mishka · 2023-12-02T05:04:19.299Z · answers+comments (2)
2023 Unofficial LessWrong Census/Survey
Screwtape · 2023-12-02T04:41:51.418Z · comments (56)
Protecting against sudden capability jumps during training
nikola (nikolaisalreadytaken) · 2023-12-02T04:22:21.315Z · comments (0)
South Bay Pre-Holiday Gathering
IS (is) · 2023-12-02T03:21:12.904Z · comments (0)
MATS Summer 2023 Retrospective
Rocket (utilistrutil) · 2023-12-01T23:29:47.958Z · comments (31)
Complex systems research as a field (and its relevance to AI Alignment)
Nora_Ammann · 2023-12-01T22:10:25.801Z · comments (7)
[question] Could there be "natural impact regularization" or "impact regularization by default"?
tailcalled · 2023-12-01T22:01:46.062Z · answers+comments (5)
Benchmarking Bowtie2 Threading
jefftk (jkaufman) · 2023-12-01T20:20:05.593Z · comments (0)
Using Prediction Platforms to Select Quantified Self Experiments
niplav · 2023-12-01T20:07:38.284Z · comments (0)
[link] Specification Gaming: How AI Can Turn Your Wishes Against You [RA Video]
Writer · 2023-12-01T19:30:58.304Z · comments (0)
[link] Carving up problems at their joints
Jakub Smékal (jakub-smekal) · 2023-12-01T18:48:46.510Z · comments (0)
[link] Queuing theory: Benefits of operating at 70% capacity
ampdot · 2023-12-01T18:48:01.426Z · comments (4)
[link] Researchers and writers can apply for proxy access to the GPT-3.5 base model (code-davinci-002)
ampdot · 2023-12-01T18:48:01.406Z · comments (None)
Kolmogorov Complexity Lays Bare the Soul
jakej (jake-jenks) · 2023-12-01T18:29:57.379Z · comments (8)
Thoughts on “AI is easy to control” by Pope & Belrose
Steven Byrnes (steve2152) · 2023-12-01T17:30:52.720Z · comments (43)
next page (older posts) →