LessWrong 2.0 Reader

[link] AI Impacts 2023 Expert Survey on Progress in AI
habryka (habryka4) · 2024-01-05T19:42:17.226Z · comments (1)

Updates to Open Phil’s career development and transition funding program
abergal · 2023-12-04T18:10:29.394Z · comments (0)

[question] How did you integrate voice-to-text AI into your workflow?
ChristianKl · 2023-11-20T12:01:37.696Z · answers+comments (12)

Can quantised autoencoders find and interpret circuits in language models?
charlieoneill (kingchucky211) · 2024-03-24T20:05:50.125Z · comments (4)

Ackshually, many worlds is wrong
tailcalled · 2024-04-11T20:23:59.416Z · comments (42)

Cryonics p(success) estimates are only weakly associated with interest in pursuing cryonics in the LW 2023 Survey
Andy_McKenzie · 2024-02-29T14:47:28.613Z · comments (6)

Towards Quantitative AI Risk Management
Henry Papadatos (henry) · 2024-10-16T19:26:48.817Z · comments (1)

[link] A new process for mapping discussions
Nathan Young · 2024-09-30T08:57:20.029Z · comments (7)

Monthly Roundup #19: June 2024
Zvi · 2024-06-25T12:00:03.333Z · comments (9)

Childhood and Education Roundup #6: College Edition
Zvi · 2024-06-26T11:40:03.990Z · comments (8)

[link] AI Safety at the Frontier: Paper Highlights, August '24
gasteigerjo · 2024-09-03T19:17:24.850Z · comments (0)

[link] Liquid vs Illiquid Careers
vaishnav92 · 2024-10-20T23:03:49.725Z · comments (5)

Domain-specific SAEs
jacob_drori (jacobcd52) · 2024-10-07T20:15:38.584Z · comments (0)

Superintelligence Can't Solve the Problem of Deciding What You'll Do
Vladimir_Nesov · 2024-09-15T21:03:28.077Z · comments (11)

An AI crash is our best bet for restricting AI
Remmelt (remmelt-ellen) · 2024-10-11T02:12:03.491Z · comments (1)

[link] Evaluating Synthetic Activations composed of SAE Latents in GPT-2
Giorgi Giglemiani (Rakh) · 2024-09-25T20:37:48.227Z · comments (0)

[link] If-Then Commitments for AI Risk Reduction [by Holden Karnofsky]
habryka (habryka4) · 2024-09-13T19:38:53.194Z · comments (0)

[link] Predicting Influenza Abundance in Wastewater Metagenomic Sequencing Data
jefftk (jkaufman) · 2024-09-23T17:25:58.380Z · comments (0)

Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs
Daniel Lee (daniel-lee) · 2024-09-06T02:28:41.954Z · comments (0)

[link] Goodhart's Law Example: Training Verifiers to Solve Math Word Problems
Chris_Leong · 2023-11-25T00:53:26.841Z · comments (2)

[link] [Linkpost] Concept Alignment as a Prerequisite for Value Alignment
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2023-11-04T17:34:36.563Z · comments (0)

When and why should you use the Kelly criterion?
Garrett Baker (D0TheMath) · 2023-11-05T23:26:38.952Z · comments (25)

Survey on the acceleration risks of our new RFPs to study LLM capabilities
Ajeya Cotra (ajeya-cotra) · 2023-11-10T23:59:52.515Z · comments (1)

AISC Project: Modelling Trajectories of Language Models
NickyP (Nicky) · 2023-11-13T14:33:56.407Z · comments (0)

[link] Found Paper: "FDT in an evolutionary environment"
the gears to ascension (lahwran) · 2023-11-27T05:27:50.709Z · comments (47)

EA Infrastructure Fund's Plan to Focus on Principles-First EA
Linch · 2023-12-06T03:24:55.844Z · comments (0)

A short dialogue on comparability of values
cousin_it · 2023-12-20T14:08:29.650Z · comments (7)

[link] align your latent spaces
bhauth · 2023-12-24T16:30:09.138Z · comments (8)

How to develop a photographic memory 2/3
PhilosophicalSoul (LiamLaw) · 2023-12-30T20:18:14.255Z · comments (7)

Uncertainty in all its flavours
Cleo Nardo (strawberry calm) · 2024-01-09T16:21:07.915Z · comments (6)

Reprograming the Mind: Meditation as a Tool for Cognitive Optimization
Jonas Hallgren · 2024-01-11T12:03:41.763Z · comments (3)

Without Fundamental Advances, Rebellion and Coup d'État are the Inevitable Outcomes of Dictators & Monarchs Trying to Control Large, Capable Countries
Roko · 2024-01-31T10:14:02.042Z · comments (34)

A Strange ACH Corner Case
jefftk (jkaufman) · 2024-02-10T03:00:05.930Z · comments (2)

Weak vs Quantitative Extinction-level Goodhart's Law
VojtaKovarik · 2024-02-21T17:38:15.375Z · comments (1)

An Affordable CO2 Monitor
Pretentious Penguin (dylan-mahoney) · 2024-03-21T03:06:53.255Z · comments (1)

On the 2nd CWT with Jonathan Haidt
Zvi · 2024-04-05T17:30:05.223Z · comments (3)

Tackling Moloch: How YouCongress Offers a Novel Coordination Mechanism
Hector Perez Arenas (hector-perez-arenas) · 2024-05-15T23:13:48.501Z · comments (9)

Scientific Notation Options
jefftk (jkaufman) · 2024-05-18T15:10:02.181Z · comments (13)

Probably Not a Ghost Story
George Ingebretsen (george-ingebretsen) · 2024-06-12T22:55:26.264Z · comments (4)

Sparse autoencoders find composed features in small toy models
Evan Anders (evan-anders) · 2024-03-14T18:00:43.339Z · comments (12)

[link] ML Safety Research Advice - GabeM
Gabe M (gabe-mukobi) · 2024-07-23T01:45:42.288Z · comments (2)

[link] David Burns Thinks Psychotherapy Is a Learnable Skill. Git Gud.
Morpheus · 2024-01-27T13:21:05.068Z · comments (20)

NYU Code Debates Update/Postmortem
David Rein (david-rein) · 2024-05-24T16:08:06.151Z · comments (4)

[link] Link Collection: Impact Markets
Saul Munn (saul-munn) · 2023-12-26T09:01:48.815Z · comments (0)

[question] Me & My Clone
SimonBaars (simonbaars) · 2024-07-18T16:25:40.770Z · answers+comments (22)

[link] Solving alignment isn't enough for a flourishing future
mic (michael-chen) · 2024-02-02T18:23:00.643Z · comments (0)

Response to Dileep George: AGI safety warrants planning ahead
Steven Byrnes (steve2152) · 2024-07-08T15:27:07.402Z · comments (7)

The economy is mostly newbs (strat predictions)
lukehmiles (lcmgcd) · 2024-02-01T19:15:49.420Z · comments (6)

Deceptive agents can collude to hide dangerous features in SAEs
Simon Lermen (dalasnoin) · 2024-07-15T17:07:33.283Z · comments (0)

[question] Supposing the 1bit LLM paper pans out
O O (o-o) · 2024-02-29T05:31:24.158Z · answers+comments (11)

Recent comments

elizabeth-1 on Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)

Reading this makes me feel really sad because I’d like to believe it, but I can’t, for all the reasons outlined in the OP.

I could get into more details, but it would be pretty costly for me for (I think) no benefit. The only reason I came back to EA criticism was that talking to Timothy feels wholesome and good, as opposed to the battery acid feeling I get from most discussions of EA.

austin-chen on Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)

Mm I basically agree that:

there are real value differences between EA folks and rationalists
good intentions do not substitute for good outcomes

However:

I don't think differences in values explain much of the differences in results - sure, truthseeking vs impact can hypothetically lead one in different directions, but in practice I think most EAs and rationalists are extremely value aligned
I'm pushing back against Tsvi's claims that "some people don't care" or "EA recruiters would consciously choose 2 zombies over 1 agent" - I think ascribing bad intentions to individuals ends up pretty mindkilly

Basically, insofar as EA is screwed up, it's mostly caused by bad systems, not bad people, as far as I can tell.

drake-thomas on Drake Thomas's Shortform

I think my original comment was ambiguous - I also consider myself to have mostly figured it out, in that I thought through these considerations pretty extensively before joining and am in a "monitoring for new considerations or evidence or events that might affect my assessment" state rather than a "just now orienting to the question" state. I'd expect to be most useful to people in shoes similar to my past self (deciding whether to apply or accept an offer) but am pretty happy to talk to anyone, including eg people who are confident I'm wrong and want to convince me otherwise.

elizabeth-1 on Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)

There were ~20 in round 2, and I've gotten reports of other people being inspired by the post to get tested themselves, which I estimate at least doubles that.

eggsyntax on LLM Generality is a Timeline Crux

Hi, apologies for having failed to respond; I went out of town and lost track of this thread. Reading back through what you've said. Thank you!

eggsyntax on The Mask Comes Off: At What Price?

I agree that that's presumably the underlying reality. I should have made that clearer.

But it seems like the board would still need to create some justification for public consumption, and for avoiding accusations of violating their charter & fiduciary duty. And it's really unclear to me what that justification is.

benito on Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)

I haven't grokked the notion of "an addiction to steam" yet, so I'm not sure whether I agree with that account, but I have a feeling that when you write "I'd encourage y'all to extend somewhat more charity to these folks, who I generally find to be very kind and well-intentioned" you are papering over real values differences.

Tons of EAs will tell you that honesty and integrity and truth-seeking are of course 'important', but if you observe their behavior they'll trade them off pretty harshly with PR concerns or QALYs bought or plan-changes. I think there's a difference in the culture and values between (on one hand) people around rationalist circles who worry a lot about how to give honest answers to things like 'How are you doing today?', who hold themselves to the standards of intent to inform rather than simply whether they out and out lied, who will show up and have long arguments with people who have moral critiques of them, and (on the other hand) most of the people in the EA culture and positions of power who don't do this, and so the latter can much more easily deceive and take advantage of people by funneling them into career paths which basically boil down to 'devoting yourself to whatever whoever is powerful in EA thinks is a maybe-good idea this month'. Paths that people wouldn't go down if they candidly were told up front what was going on.

I think it's fair to say that many/most EAs (including those involved with student groups) don't care about integrity and truth-seeking things very much, or at least not enough to bend them off the path of reward and momentum by the standards of the EA ideology / EA leaders & grantmakers when the path is going wrong, and I think this is a key reason why EA student groups are able to be like ponzi schemes. 'Well-intentioned' does not get you 'has good values' and it is not a moral defense of ponzi schemes to argue that everyone involved was "kind and well-intentioned".

I would guess that the feedback loop from EA college recruiting is super long and is weakly aligned. Those in charge of setting recruiting strategy (eg CEA Groups team, and then university organizers) don't see the downstream impacts of their choices, unlike in a startup where you work directly with your hires, and quickly see whether your choices were good or bad.

I agree it is hard to get feedback, but this doesn't mean one cannot have good standards. A ton of my work involves maintaining boundaries where I'm not quite sure what the concrete outputs will look like. I kind of think this is one of the main things people are talking about when we talk about values: what heuristics do you operate by in the world for most of the time when you're mostly not going to get feedback?

johnswentworth on johnswentworth's Shortform

Two responses:

It's a pretty large part - somewhere between a third and half - just not a majority.
I was also tracking that specific hypothesis, which was why I specifically flagged "about 25% of IQ variability (using a method which does not require identifying all the relevant SNPs, though I don't know the details of that method)". Again, I don't know the method, but it sounds like it wasn't dependent on details of the regression methods.

christiankl on Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)

I liked Zach's recent talk/Forum post about EA's commitment to principles first. I hope this is at least a bit hope-inspiring, since I get the sense that a big part of your critique is that EA has lost its principles.

The problem is that Zach does not mention being truth-aligned as one of the core principles that he wants to uphold.

He writes "CEA focuses on scope sensitivity, scout mindset, impartiality, and the recognition of tradeoffs".

If we take an act like deleting inconvenient information, such as the phrase "Leverage Research" from a photo on the CEA website, it does violate the principle of being truth-aligned but not any of the ones that Zach mentioned.

If I asked Zach whether he would release the people that CEA has bound with nondisclosure agreements about that one episode with Leverage (about which we unfortunately know little more than that nondisclosure agreements exist), I don't think he would release them. A sign of being truth-aligned would be to release the information, but none of the principles Zach names points in the direction of releasing people from the nondisclosure agreements.

Saying that your principle is "impartiality" instead of saying that it is "understanding conflicts of interest and managing them effectively" seems to me like a bad sign.

When talking about kidney donation at the start, he celebrates self-chosen sacrifice as an example of great ethics. Kidney donation is extreme virtue signaling. I would rather have EA value honesty and accountability than celebrate self-sacrifice. Instead of celebrating people for taking actions nobody would object to, he could have celebrated Ben Hoffman for the courage to speak out about problems at GiveWell and face social rejection for it.

crissman on Join a LessWrong Team for the Unaging System Challenge

Hmm... "Join a LessWrong team..."? Changing from "the" to "a" should make it clear that these aren't the honorable folks who run this website.