LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

AI #80: Never Have I Ever
Zvi · 2024-09-10T17:50:08.074Z · comments (20)

[link] AI Rights for Human Safety
Simon Goldstein (simon-goldstein) · 2024-08-01T23:01:07.252Z · comments (6)

In defense of technological unemployment as the main AI concern
tailcalled · 2024-08-27T17:58:01.992Z · comments (36)

Economics Roundup #3
Zvi · 2024-09-10T13:50:06.955Z · comments (9)

[Intuitive self-models] 5. Dissociative Identity (Multiple Personality) Disorder
Steven Byrnes (steve2152) · 2024-10-15T13:31:46.157Z · comments (6)

Case Study: Interpreting, Manipulating, and Controlling CLIP With Sparse Autoencoders
Gytis Daujotas (gytis-daujotas) · 2024-08-01T21:08:38.800Z · comments (6)

[question] "Deception Genre" What Books are like Project Lawful?
Double · 2024-08-28T17:19:52.172Z · answers+comments (20)

How difficult is AI Alignment?
Sammy Martin (SDM) · 2024-09-13T15:47:10.799Z · comments (6)

Ambiguity in Prediction Market Resolution is Still Harmful
aphyer · 2024-07-31T20:32:40.217Z · comments (17)

Which LessWrong/Alignment topics would you like to be tutored in? [Poll]
Ruby · 2024-09-19T01:35:02.999Z · comments (11)

Understanding Positional Features in Layer 0 SAEs
bilalchughtai (beelal) · 2024-07-29T09:36:40.701Z · comments (0)

Minimal Motivation of Natural Latents
johnswentworth · 2024-10-14T22:51:58.125Z · comments (14)

The need for multi-agent experiments
Martín Soto (martinsq) · 2024-08-01T17:14:16.590Z · comments (3)

Humanity isn't remotely longtermist, so arguments for AGI x-risk should focus on the near term
Seth Herd · 2024-08-12T18:10:56.543Z · comments (10)

[link] you should probably eat oatmeal sometimes
bhauth · 2024-08-25T14:50:37.570Z · comments (31)

[link] Rowing vs steering
Saul Munn (saul-munn) · 2024-08-10T07:00:17.594Z · comments (2)

Unit economics of LLM APIs
dschwarz · 2024-08-27T16:51:22.692Z · comments (0)

MATS AI Safety Strategy Curriculum v2
DanielFilan · 2024-10-07T22:44:06.396Z · comments (6)

A Robust Natural Latent Over A Mixed Distribution Is Natural Over The Distributions Which Were Mixed
johnswentworth · 2024-08-22T19:19:28.940Z · comments (4)

Formalizing the Informal (event invite)
abramdemski · 2024-09-10T19:22:53.564Z · comments (0)

Debate: Get a college degree?
Ben Pace (Benito) · 2024-08-12T22:23:34.744Z · comments (14)

[link] An Interactive Shapley Value Explainer
James Stephen Brown (james-brown) · 2024-09-28T05:01:21.169Z · comments (9)

Time Efficient Resistance Training
romeostevensit · 2024-10-07T15:15:44.950Z · comments (8)

[link] [Paper] Programming Refusal with Conditional Activation Steering
Bruce W. Lee (bruce-lee) · 2024-09-11T20:57:08.714Z · comments (0)

[link] cancer rates after gene therapy
bhauth · 2024-10-16T15:32:53.949Z · comments (0)

[question] If I wanted to spend WAY more on AI, what would I spend it on?
Logan Zoellner (logan-zoellner) · 2024-09-15T21:24:46.742Z · answers+comments (13)

[link] What's important in "AI for epistemics"?
Lukas Finnveden (Lanrian) · 2024-08-24T01:27:06.771Z · comments (0)

Startup Success Rates Are So Low Because the Rewards Are So Large
AppliedDivinityStudies (kohaku-none) · 2024-10-10T20:22:01.557Z · comments (6)

[link] Adverse Selection by Life-Saving Charities
vaishnav92 · 2024-08-14T20:46:23.662Z · comments (15)

[link] Things I learned talking to the new breed of scientific institution
Abhishaike Mahajan (abhishaike-mahajan) · 2024-08-29T14:00:14.844Z · comments (6)

[link] Point of Failure: Semiconductor-Grade Quartz
Annapurna (jorge-velez) · 2024-09-30T15:57:40.495Z · comments (8)

Superintelligent AI is possible in the 2020s
HunterJay · 2024-08-13T06:03:26.990Z · comments (3)

2025 Color Trends
sarahconstantin · 2024-10-07T21:20:03.962Z · comments (7)

[Linkpost] Play with SAEs on Llama 3
Tom McGrath · 2024-09-25T22:35:44.824Z · comments (2)

instruction tuning and autoregressive distribution shift
nostalgebraist · 2024-09-05T16:53:41.497Z · comments (5)

Why did ChatGPT say that? Prompt engineering and more, with PIZZA.
Jessica Rumbelow (jessica-cooper) · 2024-08-03T12:07:46.302Z · comments (2)

[question] Implications of China's recession on AGI development?
Eric Neyman (UnexpectedValues) · 2024-09-28T01:12:36.443Z · answers+comments (3)

Californians, tell your reps to vote yes on SB 1047!
Holly_Elmore · 2024-08-12T19:50:09.817Z · comments (24)

Low Probability Estimation in Language Models
Gabriel Wu (gabriel-wu) · 2024-10-18T15:50:05.947Z · comments (0)

Anthropic rewrote its RSP
Zach Stein-Perlman · 2024-10-15T14:25:12.518Z · comments (17)

Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs
Kola Ayonrinde (kola-ayonrinde) · 2024-08-23T18:52:31.019Z · comments (5)

Monthly Roundup #23: October 2024
Zvi · 2024-10-16T13:50:05.869Z · comments (11)

[link] Generative ML in chemistry is bottlenecked by synthesis
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-16T16:31:34.801Z · comments (2)

[link] AISafety.info: What is the "natural abstractions hypothesis"?
Algon · 2024-10-05T12:31:14.195Z · comments (2)

Book Review: On the Edge: The Business
Zvi · 2024-09-25T12:20:06.230Z · comments (0)

Australian AI Safety Forum 2024
Liam Carroll (liam-carroll) · 2024-09-27T00:40:11.451Z · comments (0)

0.202 Bits of Evidence In Favor of Futarchy
niplav · 2024-09-29T21:57:59.896Z · comments (0)

Evaluating Sparse Autoencoders with Board Game Models
Adam Karvonen (karvonenadam) · 2024-08-02T19:50:21.525Z · comments (1)

You're a Space Wizard, Luke
lsusr · 2024-08-18T05:35:39.238Z · comments (6)

Compelling Villains and Coherent Values
Cole Wyeth (Amyr) · 2024-10-06T19:53:47.891Z · comments (4)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

yonatan-cale-1 on Bitter lessons about lucid dreaming

I find lucid dreams to be effective "against" nightmares (for 10+ years already).

AMA if you want

lesswronguser123 on is there a big dictionary somewhere with all your jargon and acronyms and whatnot?

https://www.lesswrong.com/tag/r-a-z-glossary [? · GW]

I found this by mistake and luckily I remembered glancing over your question

christiankl on Interest in Leetcode, but for Rationality?

The goal of this problem type would be to train the ability to recognize bias to the point where it becomes second nature, with the hope that this same developed skill would also trigger in your own thought processes.

Part of what rationality is about is that you don't just hope for beneficial things to happen.

Cognitive bias is a term that comes out of the psychology literature and there were plenty of studies in the domain. It's my understanding that in academia nobody found that you get very far by teaching people to recognize biases.

Outside of academia, we have CFAR that did think about whether you can get people to be more rational by giving them exercises and came to the conclusion that those exercises should be different.

In a case like this, asking yourself "What evidence do I have that what I hope will actually happen?" and "What sources, be it academic people or experts I might interview, could give me more evidence?" would be much more productive questions than "What things in my thought process might be labeled as biases?"

abstractapplic on What's a good book for a technically-minded 11-year old?

Math textbooks. Did you know that you can just buy math textbooks which are "several years too advanced for you"? And that due to economies of scale and the objectivity of their subject matter, they tend to be of both high and consistent quality? Not getting my parents to do this at that age is something I still regret decades later.

Or did you specifically mean fiction? If so, you're asking for fiction recommendations on the grew-up-reading-HPMOR website, we're obviously going to recommend HPMOR (especially if they've already read Harry Potter, but it's still good if you only know the broad strokes).

david-johnston on The Hidden Complexity of Wishes

Algorithmic complexity is precisely analogous to difficulty-of-learning-to-predict, so saying "it's not about learning to predict, it's about algorithmic complexity" doesn't make sense. One read of the original is: learning to respect common sense moral side constraints is tricky, but AI systems will learn how to do it in the end. I'd be happy to call this read correct, and is consistent with the observation that today's AI systems do respect common sense moral side constraints given straightforward requests, and that it took a few years to figure out how to do it. That read doesn't really jive with your commentary.

Your commentary seems to situate this post within a larger argument: teaching a system to "act" is different to teaching it to "predict" because in the former case a sufficiently capable learner's behaviour can collapse to a pathological policy, whereas teaching a capable learner to predict does not risk such collapse. Thus "prediction" is distinguished from "algorithmic complexity". Furthermore, commonsense moral side constraints are complex enough to risk such collapse when we train an "actor" but not a "predictor". This seems confused.

First, all we need to turn a language model prediction into an action is a means of turning text into action, and we have many such means. So the distinction between text predictor and actor is suspect. We could consider an alternative knows/cares distinction: does a system act properly when properly incentivised ("knows") vs does it act properly when presented with whatever context we are practically able to give it ("""cares""")? Language models usually act properly given simple prompts, so in this sense they "care". So rejecting evidence from language models does not seem well justified.

Second, there's no need to claim that commonsense moral side constraints in particular are so hard that trying to develop AI systems that respect them leads to policy collapse. It need only be the case that one of the things we try to teach them to do leads to policy collapse. Teaching values is not particularly notable among all the things we might want AI systems to do; it certainly does not seem to be among the hardest. Focussing on values makes the argument unnecessarily weak.

Third, algorithmic complexity is measured with respect to a prior. The post invokes (but does not justify) an "English speaking evil genie" prior. I don't think anyone thinks this is a serious prior for reasoning about advanced AI system behaviour. But the post is (according to your commentary, if not the post itself) making a quantitative point - values are sufficiently complex to induce policy collapse - but it's measuring this quantity using a nonsense prior. If the quantitative argument was indeed the original point, it is mystifying why a nonsense prior was chosen to make it, and also why no effort was made to justify the prior.

christiankl on Start an Upper-Room UV Installation Company?

If you want to do this as a successful company, you essentially have to get your customers to trust you that you are installing it in a way where UVC up does not produce any negative effects.

People have been doing it for decades is not something that would convince me that there are not long-term side-effects.

christiankl on Start an Upper-Room UV Installation Company?

Quick Googling gives me https://northshorefuel.com/products-services/indoor-air-quality/uv-germicidal-lights.php . They seem near enough to install in Boston.

Using Yelp to find a company that likely does B2B sales when you don't know the exact keywords they use, is not an effective strategy to find installers.

jkaufman on Start an Upper-Room UV Installation Company?

You do need to pay attention to what paint is on the ceiling and measure to verify that levels are low in the places people are, but pointing UVC up is something we've done safely for a long time in many places.

michael-roe on Bitter lessons about lucid dreaming

Discussing sleep paralysis might be an infohazard…

The times I’ve entered sleep paralysis it hasn’t bothered me, as I knew what it was.

charlie-steiner on Start an Upper-Room UV Installation Company?

Pointing UVC LEDs at your ceiling seems sketchy. White paint will likely scatter ~5% of UVC, and shiny metal surfaces will scatter more. Try to go below 250nm for reduced reflection (and reduced penetration into human skin) and (more) unwanted chemistry will start happening to the air.

I guess an important question is whether UVC is more harmful than UVB. If it's not any more harmful, then as long as nobody's getting sunburned from being in that room all day, it's probably fine - that 5% scattering is just another name for SPF 20. But if it is more harmful, then sunburn might not be an adequate signal for when it's bad for you.