LessWrong 2.0 Reader



[link] Memo on some neglected topics
Lukas Finnveden (Lanrian) · 2023-11-11T02:01:55.834Z · comments (2)
Evaporation of improvements
Viliam · 2024-06-20T18:34:40.969Z · comments (27)
Online Dialogues Party — Sunday 5th November
Ben Pace (Benito) · 2023-10-27T02:41:00.506Z · comments (1)
{Book Summary} The Art of Gathering
Tristan Williams (tristan-williams) · 2024-04-16T10:48:41.528Z · comments (0)
[link] Conversation Visualizer
ethanmorse · 2023-12-31T01:18:01.424Z · comments (4)
3. Premise three & Conclusion: AI systems can affect value change trajectories & the Value Change Problem
Nora_Ammann · 2023-10-26T14:38:14.916Z · comments (4)
Cryonics p(success) estimates are only weakly associated with interest in pursuing cryonics in the LW 2023 Survey
Andy_McKenzie · 2024-02-29T14:47:28.613Z · comments (6)
Experiments with an alternative method to promote sparsity in sparse autoencoders
Eoin Farrell · 2024-04-15T18:21:48.771Z · comments (7)
Deconfusing “ontology” in AI alignment
Dylan Bowman (dylan-bowman) · 2023-11-08T20:03:43.205Z · comments (3)
[link] A new process for mapping discussions
Nathan Young · 2024-09-30T08:57:20.029Z · comments (7)
Towards Quantitative AI Risk Management
Henry Papadatos (henry) · 2024-10-16T19:26:48.817Z · comments (1)
[link] AI Safety at the Frontier: Paper Highlights, August '24
gasteigerjo · 2024-09-03T19:17:24.850Z · comments (0)
Domain-specific SAEs
jacob_drori (jacobcd52) · 2024-10-07T20:15:38.584Z · comments (0)
[question] Any real toeholds for making practical decisions regarding AI safety?
lukehmiles (lcmgcd) · 2024-09-29T12:03:08.084Z · answers+comments (6)
Interpretability of SAE Features Representing Check in ChessGPT
Jonathan Kutasov (jonathan-kutasov) · 2024-10-05T20:43:36.679Z · comments (2)
[question] What prevents SB-1047 from triggering on deep fake porn/voice cloning fraud?
ChristianKl · 2024-09-26T09:17:39.088Z · answers+comments (21)
[link] Liquid vs Illiquid Careers
vaishnav92 · 2024-10-20T23:03:49.725Z · comments (5)
There aren't enough smart people in biology doing something boring
Abhishaike Mahajan (abhishaike-mahajan) · 2024-10-21T15:52:04.482Z · comments (13)
Distinguishing ways AI can be "concentrated"
Matthew Barnett (matthew-barnett) · 2024-10-21T22:21:13.666Z · comments (2)
[link] If-Then Commitments for AI Risk Reduction [by Holden Karnofsky]
habryka (habryka4) · 2024-09-13T19:38:53.194Z · comments (0)
[link] Predicting Influenza Abundance in Wastewater Metagenomic Sequencing Data
jefftk (jkaufman) · 2024-09-23T17:25:58.380Z · comments (0)
Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs
Daniel Lee (daniel-lee) · 2024-09-06T02:28:41.954Z · comments (0)
An AI crash is our best bet for restricting AI
Remmelt (remmelt-ellen) · 2024-10-11T02:12:03.491Z · comments (1)
European Progress Conference
Martin Sustrik (sustrik) · 2024-10-06T11:10:03.819Z · comments (11)
Superintelligence Can't Solve the Problem of Deciding What You'll Do
Vladimir_Nesov · 2024-09-15T21:03:28.077Z · comments (11)
[link] Evaluating Synthetic Activations composed of SAE Latents in GPT-2
Giorgi Giglemiani (Rakh) · 2024-09-25T20:37:48.227Z · comments (0)
Fifteen Lawsuits against OpenAI
Remmelt (remmelt-ellen) · 2024-03-09T12:22:09.715Z · comments (4)
When and why should you use the Kelly criterion?
Garrett Baker (D0TheMath) · 2023-11-05T23:26:38.952Z · comments (25)
An Affordable CO2 Monitor
Pretentious Penguin (dylan-mahoney) · 2024-03-21T03:06:53.255Z · comments (1)
On the 2nd CWT with Jonathan Haidt
Zvi · 2024-04-05T17:30:05.223Z · comments (3)
AISC Project: Modelling Trajectories of Language Models
NickyP (Nicky) · 2023-11-13T14:33:56.407Z · comments (0)
Scientific Notation Options
jefftk (jkaufman) · 2024-05-18T15:10:02.181Z · comments (13)
A short dialogue on comparability of values
cousin_it · 2023-12-20T14:08:29.650Z · comments (7)
Reprograming the Mind: Meditation as a Tool for Cognitive Optimization
Jonas Hallgren · 2024-01-11T12:03:41.763Z · comments (3)
[link] Solving alignment isn't enough for a flourishing future
mic (michael-chen) · 2024-02-02T18:23:00.643Z · comments (0)
[link] Found Paper: "FDT in an evolutionary environment"
the gears to ascension (lahwran) · 2023-11-27T05:27:50.709Z · comments (47)
How to develop a photographic memory 2/3
PhilosophicalSoul (LiamLaw) · 2023-12-30T20:18:14.255Z · comments (7)
[link] [Linkpost] Concept Alignment as a Prerequisite for Value Alignment
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2023-11-04T17:34:36.563Z · comments (0)
Uncertainty in all its flavours
Cleo Nardo (strawberry calm) · 2024-01-09T16:21:07.915Z · comments (6)
Probably Not a Ghost Story
George Ingebretsen (george-ingebretsen) · 2024-06-12T22:55:26.264Z · comments (4)
[link] align your latent spaces
bhauth · 2023-12-24T16:30:09.138Z · comments (8)
A Strange ACH Corner Case
jefftk (jkaufman) · 2024-02-10T03:00:05.930Z · comments (2)
EA Infrastructure Fund's Plan to Focus on Principles-First EA
Linch · 2023-12-06T03:24:55.844Z · comments (0)
Appraising aggregativism and utilitarianism
Cleo Nardo (strawberry calm) · 2024-06-21T23:10:37.014Z · comments (10)
Weak vs Quantitative Extinction-level Goodhart's Law
VojtaKovarik · 2024-02-21T17:38:15.375Z · comments (1)
Tackling Moloch: How YouCongress Offers a Novel Coordination Mechanism
Hector Perez Arenas (hector-perez-arenas) · 2024-05-15T23:13:48.501Z · comments (9)
flowing like water; hard like stone
lsusr · 2024-02-20T03:20:46.531Z · comments (4)
The economy is mostly newbs (strat predictions)
lukehmiles (lcmgcd) · 2024-02-01T19:15:49.420Z · comments (6)
Survey on the acceleration risks of our new RFPs to study LLM capabilities
Ajeya Cotra (ajeya-cotra) · 2023-11-10T23:59:52.515Z · comments (1)
[link] Goodhart's Law Example: Training Verifiers to Solve Math Word Problems
Chris_Leong · 2023-11-25T00:53:26.841Z · comments (2)