LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Apply to MATS 7.0!
Ryan Kidd (ryankidd44) · 2024-09-21T00:23:49.778Z · comments (0)

Resolving von Neumann-Morgenstern Inconsistent Preferences
niplav · 2024-10-22T11:45:20.915Z · comments (5)

Mapping the semantic void II: Above, below and between token embeddings
mwatkins · 2024-02-15T23:00:09.010Z · comments (4)

On "Geeks, MOPs, and Sociopaths"
alkjash · 2024-01-19T21:04:48.525Z · comments (35)

The Math of Suspicious Coincidences
Roko · 2024-02-07T13:32:35.513Z · comments (3)

Differential Optimization Reframes and Generalizes Utility-Maximization
J Bostock (Jemist) · 2023-12-27T01:54:22.731Z · comments (2)

[link] How "Pause AI" advocacy could be net harmful
Tamsin Leake (carado-1) · 2023-12-26T16:19:20.724Z · comments (10)

AI #62: Too Soon to Tell
Zvi · 2024-05-02T15:40:04.364Z · comments (8)

A Case for Superhuman Governance, using AI
ozziegooen · 2024-06-07T00:10:10.902Z · comments (0)

Adversarial Robustness Could Help Prevent Catastrophic Misuse
aogara (Aidan O'Gara) · 2023-12-11T19:12:26.956Z · comments (18)

AI Constitutions are a tool to reduce societal scale risk
Sammy Martin (SDM) · 2024-07-25T11:18:17.826Z · comments (2)

The Third Gemini
Zvi · 2024-02-20T19:50:05.195Z · comments (2)

Running the Numbers on a Heat Pump
jefftk (jkaufman) · 2024-02-09T03:00:04.920Z · comments (12)

[link] Anthropic: Reflections on our Responsible Scaling Policy
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2024-05-20T04:14:44.435Z · comments (21)

Information-Theoretic Boxing of Superintelligences
JustinShovelain · 2023-11-30T14:31:11.798Z · comments (0)

Some comments on intelligence
Viliam · 2024-08-01T15:17:07.215Z · comments (5)

[link] The origins of the steam engine: An essay with interactive animated diagrams
jasoncrawford · 2023-11-29T18:30:36.315Z · comments (1)

AI #74: GPT-4o Mini Me and Llama 3
Zvi · 2024-07-25T13:50:06.528Z · comments (6)

[link] There is no IQ for AI
Gabriel Alfour (gabriel-alfour-1) · 2023-11-27T18:21:26.196Z · comments (10)

Interpreting Quantum Mechanics in Infra-Bayesian Physicalism
Yegreg · 2024-02-12T18:56:03.967Z · comments (6)

"Full Automation" is a Slippery Metric
ozziegooen · 2024-06-11T19:56:49.855Z · comments (1)

Glomarization FAQ
Zane · 2023-11-15T20:20:49.488Z · comments (5)

[link] When scientists consider whether their research will end the world
Harlan · 2023-12-19T03:47:06.645Z · comments (4)

I played the AI box game as the Gatekeeper — and lost
datawitch · 2024-02-12T18:39:35.777Z · comments (52)

Understanding Subjective Probabilities
Isaac King (KingSupernova) · 2023-12-10T06:03:27.958Z · comments (16)

[link] Baking vs Patissing vs Cooking, the HPS explanation
adamShimi · 2024-07-17T20:29:09.645Z · comments (16)

Interpreting the Learning of Deceit
RogerDearnaley (roger-d-1) · 2023-12-18T08:12:39.682Z · comments (14)

AI #85: AI Wins the Nobel Prize
Zvi · 2024-10-10T13:40:07.286Z · comments (6)

[link] Safety tax functions
owencb · 2024-10-20T14:08:38.099Z · comments (0)

[link] [Paper] Hidden in Plain Text: Emergence and Mitigation of Steganographic Collusion in LLMs
Yohan Mathew (ymath) · 2024-09-25T14:52:48.263Z · comments (2)

AIS terminology proposal: standardize terms for probability ranges
eggsyntax · 2024-08-30T15:43:39.857Z · comments (12)

Fun With CellxGene
sarahconstantin · 2024-09-06T22:00:03.461Z · comments (2)

Some additional SAE thoughts
Hoagy · 2024-01-13T19:31:40.089Z · comments (4)

[question] What are things you're allowed to do as a startup?
Elizabeth (pktechgirl) · 2024-06-20T00:01:59.257Z · answers+comments (9)

AI Safety 101 : Reward Misspecification
markov (markovial) · 2023-10-18T20:39:34.538Z · comments (4)

Sparse MLP Distillation
slavachalnev · 2024-01-15T19:39:02.926Z · comments (3)

[question] What's your standard for good work performance?
Chi Nguyen · 2023-09-27T16:58:16.114Z · answers+comments (3)

Announcing SPAR Summer 2024!
laurenmarie12 · 2024-04-16T08:30:31.339Z · comments (2)

Against "argument from overhang risk"
RobertM (T3t) · 2024-05-16T04:44:00.318Z · comments (11)

The Intentional Stance, LLMs Edition
Eleni Angelou (ea-1) · 2024-04-30T17:12:29.005Z · comments (3)

AI Alignment Breakthroughs this week (10/08/23)
Logan Zoellner (logan-zoellner) · 2023-10-08T23:30:54.924Z · comments (14)

[link] 2024 State of the AI Regulatory Landscape
Deric Cheng (deric-cheng) · 2024-05-28T11:59:06.582Z · comments (0)

AI #59: Model Updates
Zvi · 2024-04-11T14:20:06.339Z · comments (2)

[link] AISN #28: Center for AI Safety 2023 Year in Review
aogara (Aidan O'Gara) · 2023-12-23T21:31:40.767Z · comments (1)

Two Tales of AI Takeover: My Doubts
Violet Hour · 2024-03-05T15:51:05.558Z · comments (8)

[question] Current AI safety techniques?
Zach Stein-Perlman · 2023-10-03T19:30:54.481Z · answers+comments (2)

Taking features out of superposition with sparse autoencoders more quickly with informed initialization
Pierre Peigné (pierre-peigne) · 2023-09-23T16:21:42.799Z · comments (8)

[link] One: a story
Richard_Ngo (ricraz) · 2023-10-10T00:18:31.604Z · comments (0)

RA Bounty: Looking for feedback on screenplay about AI Risk
Writer · 2023-10-26T13:23:02.806Z · comments (6)

[link] Managing AI Risks in an Era of Rapid Progress
Algon · 2023-10-28T15:48:25.029Z · comments (3)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

benito on Reformative Hypocrisy, and Paying Close Enough Attention to Selectively Reward It.

Thanks for making this point.

I note Musk was the first one to start a competitor, which seems to me to be very costly.

I think that founding OpenAI could be right if the non-profit structure was likely to work out. I don't know if that made sense at the time. Altman has overpowered getting fired by the board, removed parts of the board, and rumor has it he is moving to a for-profit, which is strong evidence against the non-profit being able to withstand the pressures that were coming, but even without Altman I suspect it would still involve billions of $ of funding, partnerships like the one with Microsoft, and other for-profit pressures to be the sort of player it is today. So I don't know that Musk's plan was viable at all.

lorec on Project Adequate: Seeking Cofounders/Funders

This is one of those subject areas that'd be unfortunately bad to get into publicly. If you or any other individual wants to grill me on this, feel free to DM me or contact me by any of the above methods and I will take disclosure case by case.

meme-marine on Project Adequate: Seeking Cofounders/Funders

Kudos to you for actually trying to solve the problem, but I must remind you that the history of symbolic AI is pretty much nothing but failure after failure; what do you intend to do differently, and how do you intend to overcome the challenges that halted progress in this area for the past ~40 years?

mako-yass on Trying Bluesky

For a while I just stuck to that, but eventually it occurred to me that the rules of following mode favor whoever tweets the most, which is a similar social problem as when meetups end up favoring whoever talks the loudest and interrupts the most, and so I came to really prefer bsky's "Quiet Posters" mode.

seth-herd on OpenAI Email Archives (from Musk v. Altman)

That makes sense under certain assumptions - I find them so foreign I wasn't thinking in those terms. I find this move strange if you worry about either alignment or misuse. If you hand AGI to a bunch of people, one of them is prone to either screw up and release a misaligned AGI, or deliberately use their AGI to self-improve and either take over or cause mayhem.

To me these problems both seem highly likely. That's why the move of responding to concern over AGI by making more AGIs makes no sense to me. I think a singleton in responsible hands is our best chance at survival.

If you think alignment is so easy nobody will screw it up, or if you strongly believe that an offense-defense balance will strongly hold so that many good AGIs safely counter a few misaligned/misused ones, then sure. I just don't think either of those are very plausible views once you've thought back and forth through things.

Cruxes of disagreement on alignment difficulty [LW(p) · GW(p)] explains why I think anybody who thinks alignment is super easy is overestimating their confidence (as is anyone who's sure it's really really hard) - we just haven't done enough analysis or experimentation yet.

If we solve alignment, do we die anyway? [LW · GW] addresses why I think offense-defense balance is almost guaranteed to shift to offense with self-improving AGI, meaning a massively multipolar scenario means we're doomed to misuse.

My best guess is that people who think open-sourcing AGI is a good idea either are thinking only of weak "AGI" and not the next step to autonomously self-improving AGI, or they've taken an optimistic guess at the offense-defense balance with many human-controlled real AGIs.

cubefox on Trying Bluesky

The algorithm has been horrific for a while

After Musk took over, they implemented a mode which doesn't use an algorithm on the timeline at all. It's the "following" tab.

cronodas on The Online Sports Gambling Experiment Has Failed

I've heard that, in Las Vegas, if you put yourself on the government's "compulsive gambler" list, you can still walk into any casino, give them your money, and place a bet - the only difference being that, if you happen to win, the casino keeps your money as if you had lost.

I think it should work the other way around, making it the casino's responsibility to avoid accepting bets from self-proclaimed problem gamblers - if you're on the list and the casino doesn't stop you from betting, the casino has to give you back any money you lose.

benito on Lao Mein's Shortform

Maybe there's a hope there, but I'll point out that many of the people needed to run a business (finance, legal, product, etc) are not idealistic scientists who would be willing to have their equity become worthless.

mako-yass on Trying Bluesky

Markets put bsky exceeding twitter at 44%, 4x higher than mastodon.
My P would be around 80%. I don't think most people (who use social media much in the first place) are proud to be on twitter. The algorithm has been horrific for a while and bsky at least offers algorithmic choice (but only one feed right now is a sophisticated algorithm, and though that algorithm isn't impressive, it at least isn't repellent)

For me, I decided I had to move over (@makoConstruct) when twitter blocked links to rival systems, which included substack. They seem to have made the algorithm demote any tweet with links, which makes it basically useless as a news curation/discovery system.

I also tentatively endorse the underlying protocol. Due to its use of content-addressed datastructures, an atproto server is usually much lighter to run than an activitypub server, it makes nomadic identity/personal data host transfer much easier to implement, and it makes it much more likely that atproto is going to dovetail cleanly with verifiable computing, upon which much more consequential social technologies than microblogging could be built.

zack_m_davis on Lao Mein's Shortform

I don't think Vance is e/acc. He has said positive things about open source, but consider that the context was specifically about censorship and political bias in contemporary LLMs (bolding mine):

There are undoubtedly risks related to AI. One of the biggest:

A partisan group of crazy people use AI to infect every part of the information economy with left wing bias. Gemini can't produce accurate history. ChatGPT promotes genocidal concepts.

The solution is open source

If Vinod really believes AI is as dangerous as a nuclear weapon, why does ChatGPT have such an insane political bias? If you wanted to promote bipartisan efforts to regulate for safety, it's entirely counterproductive.

Any moderate or conservative who goes along with this obvious effort to entrench insane left-wing businesses is a useful idiot.

I'm not handing out favors to industrial-scale DEI bullshit because tech people are complaining about safety.

The words I've bolded indicate that Vance is at least peripherally aware that the "tech people [...] complaining about safety" are a different constituency than the "DEI bullshit" he deplores. If future developments or rhetorical innovations [LW · GW] persuade him that extinction risk is a serious concern, it seems likely that he'd be on board with "bipartisan efforts to regulate for safety."