LessWrong 2.0 Reader

[question] Could orcas be (trained to be) smarter than humans? 
Towards_Keeperhood (Simon Skade) · 2024-11-04T23:29:26.677Z · answers+comments (20)

[link] How much I'm paying for AI productivity software (and the future of AI use)
jacquesthibs (jacques-thibodeau) · 2024-10-11T17:11:27.025Z · comments (16)

[link] Zen and The Art of Semiconductor Manufacturing
Recurrented (rachel-farley) · 2024-12-09T17:19:35.236Z · comments (2)

[link] The Alignment Trap: AI Safety as Path to Power
crispweed · 2024-10-29T15:21:26.545Z · comments (17)

[link] Making Eggs Without Ovaries
Niko_McCarty (niko-2) · 2024-09-22T17:44:46.733Z · comments (3)

Intricacies of Feature Geometry in Large Language Models
7vik (satvik-golechha) · 2024-12-07T18:10:51.375Z · comments (0)

AI #84: Better Than a Podcast
Zvi · 2024-10-03T15:00:07.128Z · comments (7)

Evidence against Learned Search in a Chess-Playing Neural Network
p.b. · 2024-09-13T11:59:55.634Z · comments (3)

Reading RFK Jr so that you don’t have to
braces · 2024-11-22T00:59:19.583Z · comments (1)

U.S.-China Economic and Security Review Commission pushes Manhattan Project-style AI initiative
Phib · 2024-11-19T18:42:43.296Z · comments (7)

Safe Predictive Agents with Joint Scoring Rules
Rubi J. Hudson (Rubi) · 2024-10-09T16:38:16.535Z · comments (10)

[link] The Evals Gap
Marius Hobbhahn (marius-hobbhahn) · 2024-11-11T16:42:46.287Z · comments (7)

Neuroscience of human social instincts: a sketch
Steven Byrnes (steve2152) · 2024-11-22T16:16:52.552Z · comments (0)

Toward Safety Case Inspired Basic Research
Lucas Teixeira · 2024-10-31T23:06:32.854Z · comments (2)

Secret Collusion: Will We Know When to Unplug AI?
schroederdewitt · 2024-09-16T16:07:01.119Z · comments (7)

Cognitive Work and AI Safety: A Thermodynamic Perspective
Daniel Murfet (dmurfet) · 2024-12-08T21:42:17.023Z · comments (7)

A Path out of Insufficient Views
Unreal · 2024-09-24T20:00:27.332Z · comments (46)

[link] On the Role of Proto-Languages
adamShimi · 2024-09-22T16:50:34.720Z · comments (1)

[link] How Likely Are Various Precursors of Existential Risk?
NunoSempere (Radamantis) · 2024-10-28T13:27:31.620Z · comments (4)

Win/continue/lose scenarios and execute/replace/audit protocols
Buck · 2024-11-15T15:47:24.868Z · comments (2)

[link] a space habitat design
bhauth · 2024-11-25T17:28:48.481Z · comments (13)

How to Give in to Threats (without incentivizing them)
Mikhail Samin (mikhail-samin) · 2024-09-12T15:55:50.384Z · comments (26)

Parental Writing Selection Bias
jefftk (jkaufman) · 2024-10-13T14:00:03.225Z · comments (3)

[question] If I wanted to spend WAY more on AI, what would I spend it on?
Logan Zoellner (logan-zoellner) · 2024-09-15T21:24:46.742Z · answers+comments (16)

[link] The Mysterious Trump Buyers on Polymarket
Annapurna (jorge-velez) · 2024-10-18T13:26:25.565Z · comments (10)

A Conflicted Linkspost
Screwtape · 2024-11-21T00:37:54.035Z · comments (0)

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)
Joe Carlsmith (joekc) · 2024-10-28T21:57:12.063Z · comments (5)

Luck Based Medicine: No Good Very Bad Winter Cured My Hypothyroidism
Elizabeth (pktechgirl) · 2024-12-08T20:10:02.651Z · comments (3)

Estimates of GPU or equivalent resources of large AI players for 2024/5
CharlesD · 2024-11-28T23:01:58.522Z · comments (7)

Model evals for dangerous capabilities
Zach Stein-Perlman · 2024-09-23T11:00:00.866Z · comments (11)

[link] Prices are Bounties
Maxwell Tabarrok (maxwell-tabarrok) · 2024-10-12T14:51:40.689Z · comments (13)

I Finally Worked Through Bayes' Theorem (Personal Achievement)
keltan · 2024-12-05T02:04:16.547Z · comments (6)

Claude Sonnet 3.5.1 and Haiku 3.5
Zvi · 2024-10-24T14:50:06.286Z · comments (9)

[link] Anthropic's updated Responsible Scaling Policy
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2024-10-15T16:46:48.727Z · comments (3)

[Intuitive self-models] 7. Hearing Voices, and Other Hallucinations
Steven Byrnes (steve2152) · 2024-10-29T13:36:16.325Z · comments (2)

Applications of Chaos: Saying No (with Hastings Greer)
Elizabeth (pktechgirl) · 2024-09-21T16:30:07.415Z · comments (16)

[link] [Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF
Leon Lang (leon-lang) · 2024-10-22T13:57:41.125Z · comments (1)

[link] Can AI Outpredict Humans? Results From Metaculus's Q3 AI Forecasting Benchmark
ChristianWilliams · 2024-10-10T18:58:46.041Z · comments (2)

[link] A toy evaluation of inference code tampering
Fabien Roger (Fabien) · 2024-12-09T17:43:40.910Z · comments (0)

AI #82: The Governor Ponders
Zvi · 2024-09-19T13:30:04.863Z · comments (8)

Metastatic Cancer Treatment Since 2010: The Success Stories
sarahconstantin · 2024-11-04T22:50:09.386Z · comments (2)

Low Probability Estimation in Language Models
Gabriel Wu (gabriel-wu) · 2024-10-18T15:50:05.947Z · comments (0)

[link] Book review: Xenosystems
jessicata (jessica.liu.taylor) · 2024-09-16T20:17:56.670Z · comments (18)

[link] SAEBench: A Comprehensive Benchmark for Sparse Autoencoders
Can (Can Rager) · 2024-12-11T06:30:37.076Z · comments (0)

[link] cancer rates after gene therapy
bhauth · 2024-10-16T15:32:53.949Z · comments (0)

[link] Active Recall and Spaced Repetition are Different Things
Saul Munn (saul-munn) · 2024-11-08T20:14:56.092Z · comments (2)

Demis Hassabis and Geoffrey Hinton Awarded Nobel Prizes
Anna Gajdova (anna-gajdova) · 2024-10-09T12:56:24.856Z · comments (14)

An alternative approach to superbabies
Towards_Keeperhood (Simon Skade) · 2024-11-05T22:56:15.740Z · comments (19)

Evaluating the truth of statements in a world of ambiguous language.
Hastings (hastings-greer) · 2024-10-07T18:08:09.920Z · comments (19)

Interested in Cognitive Bootcamp?
Raemon · 2024-09-19T22:12:13.348Z · comments (0)

Recent comments

adamzerner on adamzerner's Shortform

I would like to see people write high-effort summaries, analyses and distillations of the posts in The Sequences.

When Eliezer wrote the original posts, he was writing one blog post a day for two years. Surely you could do a better job presenting the content that he produced in one day if you, say, took four months applying principles of pedagogy and iterating on it as a side project. I get the sense that more is possible.

This seems like a particularly good project for people who want to write but don't know what to write about. I've talked with a variety of people who are in that boat.

One issue with such distillation posts is discoverability. Maybe you write the post, it receives some upvotes, some people see it, and then it disappears into the ether. Ideally when someone in the future goes to read the corresponding sequence post they would be aware that your distillation post is available as a sort of sister content to the original content. LessWrong does have the "Mentioned in" section at the bottom of posts, but that doesn't feel like it is sufficient.

yams on yams's Shortform

What text analogizing LLMs to human brains have you found most compelling?

adamzerner on adamzerner's Shortform

I recently started going through some of Rationality: From AI to Zombies again. A big reason why is the fact that there are audio recordings of the posts. It's easy to listen to a post or two as I walk my dog, or a handful of posts instead of some random hour-long podcast that I would otherwise listen to.

I originally read (most of) The Sequences maybe 13 or 14 years ago when I was in college. At various times since then I've made somewhat deliberate efforts to revisit them. Other times I've re-read random posts as opposed to larger collections of posts. Anyway, the point I want to make is that it's been a while.

I've been a little surprised in my feelings as I re-read them. Some of them feel notably less good than what I remember. Others blow my mind and are incredible.

The Mysterious Answers sequence is one that I felt disappointed by. I felt like the posts weren't very clear and that there wasn't much substance. I think the main overarching point of the sequence is that an explanation can't say that all outcomes are equally probable. It has to say that some outcomes are more probable than others. But that just seems kinda obvious.

I think it's quite plausible that there are "good" reasons why I felt disappointed as I re-read this and other sequences. Maybe there are important things that are going over my head. Or maybe I actually understand things too well now after hanging around this community for so long.

One post that hit me kinda hard that I really enjoyed after re-reading it was Rationality and the English Language, and then the follow-up post, Human Evil and Muddled Thinking. The posts helped me grok how powerful language can be.

If you really want an artist’s perspective on rationality, then read Orwell; he is mandatory reading for rationalists as well as authors. Orwell was not a scientist, but a writer; his tools were not numbers, but words; his adversary was not Nature, but human evil. If you wish to imprison people for years without trial, you must think of some other way to say it than “I’m going to imprison Mr. Jennings for years without trial.” You must muddy the listener’s thinking, prevent clear images from outraging conscience. You say, “Unreliable elements were subjected to an alternative justice process.”

I'm pretty sure that I read those posts before, along with a bunch of related posts and stuff, but for whatever reason the re-read still meaningfully improved my understanding of the concept.

alex-k-chen-parrot on Which skincare products are evidence-based?

Has anyone tried OneSkin? Does it actually do what it claims? It acts on a mechanism independent from tretinoin.

atillayasar on AtillaYasar's Shortform

Editability and findability --> higher quality over time

Editability

Code being easier to find and easier to edit (for example, if it's in the same live environment where you're working, or a simple hotkey away, or an alt-tab away in a config file which updates your settings without having to restart) makes it more likely to be edited, more subject to "evolutionary pressures" and to feedback loop dynamics.

Same applies to writing, or anything where you have connected objects that influence each other, where the "influencer node" is editable and visible.

configs : program layout / behavior
informal rules about your relationship with a friend : the dynamics of the relationship
layout of your desk : the way you work
underlying philosophy of ideas : writing ideas
ideas "going viral" in social media : people discussing them (think about Luigi Mangione triggering notions of killing people you don't like (this is bad!!), or Elon x Trump's doge thing having all sorts of people discussing efficiency of organizations and bureaucracy (this is amazing) )

(not sure if the last one is a good example)

Imagine if, when writing this Quick Take™, I had a side panel that on every keystroke pulled up related paragraphs from all my existing writings!
Seeing past writings would be cool, but I could also edit them way more easily (assuming a "jump to" feature). In the long term this yields many more edits, and a more polished and readable total volume of work.
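A minimal sketch of the side-panel lookup described above, assuming plain word overlap is a good enough notion of "related". The function names `tokenize` and `related_paragraphs` and the `top_k` parameter are hypothetical, and a real tool would likely want better ranking than this.

```python
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its words (letters, digits, apostrophes)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def related_paragraphs(draft: str, archive: list[str], top_k: int = 3) -> list[str]:
    """Return up to top_k archive paragraphs that share the most words with the draft.

    Assumption: shared word count stands in for relatedness.
    Paragraphs with no shared words are dropped.
    """
    draft_words = tokenize(draft)
    scored = []
    for paragraph in archive:
        # Counter intersection keeps the minimum count of each shared word.
        overlap = sum((draft_words & tokenize(paragraph)).values())
        if overlap > 0:
            scored.append((overlap, paragraph))
    # The sort is stable, so ties keep their original archive order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [paragraph for _, paragraph in scored[:top_k]]
```

Calling `related_paragraphs` with the current draft text after each keystroke would give the panel its contents; for a large archive the tokenized paragraphs could be cached rather than recomputed every time.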

Findability

If you can easily see the contents of something and go, "wait, this is dumb", then even if it's "far away" (you have to find it in the browser, do mouse clicks and scrolls), you'll still do it. What in fact determined whether you edited it is that the threshold for loading its contents into your mind had been lowered.
When you load it, the opinion is instantly triggered.

g-1 on Post-Quantum Investing: Dump Crypto for Index Funds and Real Estate?

I extrapolate faster, because experts were wrong about AGI "after 2050" and they were wrong about predicting explosive growth of Bitcoin. In general they are usually too conservative, so odds are experts will be wrong about quantum supremacy as well.

davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn

Good point. The problem I have with that is that in every listed example, the mapping either requires the execution of the conscious mind and a readout of its output and process in order to build it, or it stipulates that it is well enough understood that it can be mapped to an arbitrary process, thereby implicitly also requiring that it was run elsewhere.

nathan-helm-burger on A shortcoming of concrete demonstrations as AGI risk advocacy

Sure. At this point I agree that some people will be so foolish and stubborn that no demo will concern them. Indeed, some people fail to update even on actual events.

So now we are, as Zvi likes to say, 'talking price'. What proportion of key government decision-makers would be influenced by how persuasive (and costly) a demo?

We both agree that the correct amount of effort to put towards demos is somewhere between nearly all of our AI safety effort-resources, and nearly none. I think it's a good point that we should try to estimate how effective a demo is likely to be on some particular individual or group, and aim to neither over- nor under-invest in it.

ape-in-the-coat on Zombies! Substance Dualist Zombies?

I think we can use the same method Eliezer applied to the regular epiphenomenalist Zombie argument to deal with this weaker one.

Whether your mind interprets a certain colour in a certain way actually has causal effects on the world. Namely, things that appear beautiful to you in our world may not appear beautiful to your qualia-inverted counterpart. This naturally affects your behaviour: whether you look at a certain object more, whether you buy a certain object, and so on.

This is even more obvious for people with selective colour blindness. Suppose your mind is unable to distinguish between qualia of blueness and redness. And suppose there are three objects: A is red, B is blue and C is green. In our world you can't distinguish between objects A and B. But in the qualia-inverted world you wouldn't be able to distinguish between objects B and C.
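The three-object example can be checked mechanically. The sketch below assumes the inversion swaps red and green while leaving blue alone, which is one mapping that reproduces the stated result; the comment does not specify the mapping, and all names here are hypothetical.

```python
from itertools import combinations

# Physical colours of the three objects from the example.
OBJECT_COLOURS = {"A": "red", "B": "blue", "C": "green"}

# Colour -> quale mappings. INVERTED is an assumption: red and green swap.
NORMAL = {"red": "red", "blue": "blue", "green": "green"}
INVERTED = {"red": "green", "blue": "blue", "green": "red"}

# Qualia this particular mind cannot tell apart.
CONFUSED = frozenset({"red", "blue"})


def indistinguishable_pairs(colour_to_quale: dict[str, str]) -> list[tuple[str, str]]:
    """List the object pairs this mind cannot tell apart under a given mapping."""

    def perceived(obj: str) -> str:
        quale = colour_to_quale[OBJECT_COLOURS[obj]]
        # Confused qualia collapse into a single perceptual category.
        return "red-or-blue" if quale in CONFUSED else quale

    return [
        (x, y)
        for x, y in combinations(sorted(OBJECT_COLOURS), 2)
        if perceived(x) == perceived(y)
    ]
```

Under `NORMAL` the only confusable pair is A and B; under `INVERTED` it is B and C, so the two worlds differ in observable behaviour.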

And if you try to switch to the substance dualist version, all the reasoning from this post still stands.

eukaryote on Is being sexy for your homies?

Didn't like the post then, still don't like it in 2024. I think there are defensible points interwoven with assumptions and stereotypes.

First: generalizes from personal experiences that are not universal. I think a lot of people don't have this or don't struggle with this or find it worth it, and the piece assumes everyone feels the way the author feels.

Second: the thing it describes is a bias, and I don't think the essay realizes this.

Okay, part of the thing is that this doesn't make a case or acknowledge this romantic factor as being different from, like, friendship. Like, in the people-at-work case, you might also do someone a favor at work because you like them as a buddy, which is not necessarily the same as whether they're a good worker or it's a strategic thing for you to do, or whatever - you're inclined to give your friends special treatment. Even in straight same-gender groups, people will end up being friends and having outgroups.

Anyway, you have to be careful reasoning out of "what your in-built stereotypes say". This is sometimes relevant information, totally. But A) your in-built stereotypes are not everyone else's in-built stereotypes, even within your culture, and B) this is reasoning from the map, not the territory. Are they true? In some of the cases given in this piece, it matters if they're true.

Like, the thing being described here is a bias, a flaw in the lens. "Having to navigate around possible sexual dynamics with other people makes it harder to do regular communication with them" is a thing that'll make you less able to reason and less effective. (Especially if it still fires strongly in cases like "this woman is at this event about an unrelated topic, with a partner, and so is probably not available for dating.") I don't begrudge the author for having it. I think it's really common. God knows my own best judgment has failed me before in the face of very pretty people.

But I like this community for usually not giving up on matters of self-improvement and epistemics. Even if you don't prioritize it, you're at least recognizing it and not throwing it out. It's very disconcerting to read "I notice my brain does extra work when I talk with women... wouldn't it be easier if society were radically altered so that I didn't have to talk with women?" Like, what? And there's no way you or anyone else can become more rational about this? This barrier to ideal communication with 50% of people is insurmountable? It's worth giving up on this one? Hello?

I get that the author views this as sort of a series of tenuous hypotheticals and doesn't necessarily stand by these stances and was just putting it out there, which is respectable. I think it's wrong and so tenuous as to be unhelpful.

Overall: bad takes, did have a solid 20 seconds of mixed fun and horror imagining this totally-unsexist society where straight men and women are kept in polite segregated groups, and 10% of people are in fringe situations - stable lesbian gay-male duos who must rely on each other, the bisexuals and the nonbinary people wandering the earth alone, the asexuals reigning supreme; incorruptible, masters of all domains.