LessWrong 2.0 Reader

Why I think it's net harmful to do technical safety research at AGI labs
Remmelt (remmelt-ellen) · 2024-02-07T04:17:15.246Z · comments (24)
What is the best argument that LLMs are shoggoths?
JoshuaFox · 2024-03-17T11:36:23.636Z · comments (22)
Geometric Utilitarianism (And Why It Matters)
StrivingForLegibility · 2024-05-12T03:41:21.342Z · comments (2)
AI debate: test yourself against chess 'AIs'
Richard Willis · 2023-11-22T14:58:10.847Z · comments (35)
Smartphone Etiquette: Suggestions for Social Interactions
Declan Molony (declan-molony) · 2024-06-04T06:01:03.336Z · comments (4)
The Overkill Conspiracy Hypothesis
ymeskhout · 2023-10-20T16:51:20.308Z · comments (8)
Bayesian inference without priors
DanielFilan · 2024-04-24T23:50:08.312Z · comments (8)
AI #57: All the AI News That’s Fit to Print
Zvi · 2024-03-28T11:40:05.435Z · comments (14)
Causality is Everywhere
silentbob · 2024-02-13T13:44:49.952Z · comments (12)
Ideas for Next-Generation Writing Platforms, using LLMs
ozziegooen · 2024-06-04T18:40:24.636Z · comments (4)
The Sequences on YouTube
Neil (neil-warren) · 2024-01-07T01:44:39.663Z · comments (9)
Consequentialism is a compass, not a judge
Neil (neil-warren) · 2024-04-13T10:47:44.980Z · comments (6)
Just because an LLM said it doesn't mean it's true: an illustrative example
dirk (abandon) · 2024-08-21T21:05:59.691Z · comments (12)
Open Thread Fall 2024
habryka (habryka4) · 2024-10-05T22:28:50.398Z · comments (64)
[question] Seeking AI Alignment Tutor/Advisor: $100–150/hr
MrThink (ViktorThink) · 2024-10-05T21:28:16.491Z · answers+comments (3)
Optimizing Repeated Correlations
SatvikBeri · 2024-08-01T17:33:23.823Z · comments (1)
The causal backbone conjecture
tailcalled · 2024-08-17T18:50:14.577Z · comments (0)
[link] Evaluating Synthetic Activations composed of SAE Latents in GPT-2
Giorgi Giglemiani (Rakh) · 2024-09-25T20:37:48.227Z · comments (0)
5 ways to improve CoT faithfulness
CBiddulph (caleb-biddulph) · 2024-10-05T20:17:12.637Z · comments (8)
[link] Positive visions for AI
L Rudolf L (LRudL) · 2024-07-23T20:15:26.064Z · comments (4)
LessWrong email subscriptions?
Raemon · 2024-08-27T21:59:56.855Z · comments (6)
Do Sparse Autoencoders (SAEs) transfer across base and finetuned language models?
Taras Kutsyk · 2024-09-29T19:37:30.465Z · comments (7)
Housing Roundup #9: Restricting Supply
Zvi · 2024-07-17T12:50:05.321Z · comments (8)
Proving the Geometric Utilitarian Theorem
StrivingForLegibility · 2024-08-07T01:39:10.920Z · comments (0)
[link] Fictional parasites very different from our own
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-08T14:59:39.080Z · comments (0)
[link] Conventional footnotes considered harmful
dkl9 · 2024-10-01T14:54:01.732Z · comments (16)
[link] what becoming more secure did for me
Chipmonk · 2024-08-22T17:44:48.525Z · comments (5)
[link] An Intuitive Explanation of Sparse Autoencoders for Mechanistic Interpretability of LLMs
Adam Karvonen (karvonenadam) · 2024-06-25T15:57:16.872Z · comments (0)
[link] SB 1047 gets vetoed
ryan_b · 2024-09-30T15:49:38.609Z · comments (1)
[link] Announcing Open Philanthropy's AI governance and policy RFP
Julian Hazell (julian-hazell) · 2024-07-17T02:02:39.933Z · comments (0)
SAE features for refusal and sycophancy steering vectors
neverix · 2024-10-12T14:54:48.022Z · comments (4)
You're Playing a Rough Game
jefftk (jkaufman) · 2024-10-17T19:20:06.251Z · comments (2)
[link] In defence of Helen Toner, Adam D'Angelo, and Tasha McCauley
mrtreasure · 2023-12-06T02:02:32.004Z · comments (3)
[link] Structured Transparency: a framework for addressing use/mis-use trade-offs when sharing information
habryka (habryka4) · 2024-04-11T18:35:44.824Z · comments (0)
[link] The Best Essay (Paul Graham)
Chris_Leong · 2024-03-11T19:25:42.176Z · comments (2)
AXRP Episode 30 - AI Security with Jeffrey Ladish
DanielFilan · 2024-05-01T02:50:04.621Z · comments (0)
[question] What ML gears do you like?
Ulisse Mini (ulisse-mini) · 2023-11-11T19:10:11.964Z · answers+comments (4)
Clipboard Filtering
jefftk (jkaufman) · 2024-04-14T20:50:02.256Z · comments (1)
Economics Roundup #1
Zvi · 2024-03-26T14:00:06.332Z · comments (4)
[link] Arrogance and People Pleasing
Jonathan Moregård (JonathanMoregard) · 2024-02-06T18:43:09.120Z · comments (7)
[link] Was a Subway in New York City Inevitable?
Jeffrey Heninger (jeffrey-heninger) · 2024-03-30T00:53:21.314Z · comments (4)
Control Symmetry: why we might want to start investigating asymmetric alignment interventions
domenicrosati · 2023-11-11T17:27:10.636Z · comments (1)
$250K in Prizes: SafeBench Competition Announcement
ozhang (oliver-zhang) · 2024-04-03T22:07:41.171Z · comments (0)
Decent plan prize announcement (1 paragraph, $1k)
lukehmiles (lcmgcd) · 2024-01-12T06:27:44.495Z · comments (19)
Improving SAE's by Sqrt()-ing L1 & Removing Lowest Activating Features
Logan Riggs (elriggs) · 2024-03-15T16:30:00.744Z · comments (5)
Is Yann LeCun strawmanning AI x-risks?
Chris_Leong · 2023-10-19T11:35:08.167Z · comments (4)
[link] Report: Evaluating an AI Chip Registration Policy
Deric Cheng (deric-cheng) · 2024-04-12T04:39:45.671Z · comments (0)
Changing Contra Dialects
jefftk (jkaufman) · 2023-10-26T17:30:10.387Z · comments (2)
[question] How to Model the Future of Open-Source LLMs?
Joel Burget (joel-burget) · 2024-04-19T14:28:00.175Z · answers+comments (9)
Testing for consequence-blindness in LLMs using the HI-ADS unit test.
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2023-11-24T23:35:29.560Z · comments (2)