LessWrong 2.0 Reader

Bandwagon effect: Bias in Evaluating AGI X-Risks
Remmelt (remmelt-ellen) · 2022-12-28T07:54:50.669Z · comments (0)

Getting up to Speed on the Speed Prior in 2022
robertzk (Technoguyrob) · 2022-12-28T07:49:22.948Z · comments (5)

[question] [link] World superpowers, particularly the United States, still maintain large conventional militaries despite nuclear deterrence. Why?
niederman · 2022-12-28T05:38:15.585Z · answers+comments (8)

[question] What does "probability" really mean?
sisyphus (benj) · 2022-12-28T03:20:45.651Z · answers+comments (20)

Zooming the Chrome Audio Player
jefftk (jkaufman) · 2022-12-28T02:30:01.244Z · comments (0)

What AI Safety Materials Do ML Researchers Find Compelling?
Vael Gates · 2022-12-28T02:03:31.894Z · comments (34)

South Bay ACX/LW Meetup
IS (is) · 2022-12-28T01:59:12.389Z · comments (0)

Regarding Blake Lemoine's claim that LaMDA is 'sentient', he might be right (sorta), but perhaps not for the reasons he thinks
philosophybear · 2022-12-28T01:55:40.565Z · comments (1)

Fundamental Uncertainty: Chapter 5 - How do we know what we know?
Gordon Seidoh Worley (gworley) · 2022-12-28T01:28:50.605Z · comments (2)

Is checking that a state of the world is not dystopian easier than constructing a non-dystopian state?
No77e (no77e-noi) · 2022-12-27T20:57:27.663Z · comments (3)

Crypto-currency as pro-alignment mechanism
False Name (False Name, Esq.) · 2022-12-27T17:45:54.474Z · comments (2)

[link] My Reservations about Discovering Latent Knowledge (Burns, Ye, et al)
Robert_AIZI · 2022-12-27T17:27:02.225Z · comments (0)

[link] Things that can kill you quickly: What everyone should know about first aid
jasoncrawford · 2022-12-27T16:23:24.831Z · comments (21)

[question] Why The Focus on Expected Utility Maximisers?
DragonGod · 2022-12-27T15:49:36.536Z · answers+comments (84)

[link] Presumptive Listening: sticking to familiar concepts and missing the outer reasoning paths
Remmelt (remmelt-ellen) · 2022-12-27T15:40:23.698Z · comments (8)

Mere exposure effect: Bias in Evaluating AGI X-Risks
Remmelt (remmelt-ellen) · 2022-12-27T14:05:29.563Z · comments (2)

Housing and Transportation Roundup #2
Zvi · 2022-12-27T13:10:00.979Z · comments (0)

[question] Are tulpas moral patients?
ChristianKl · 2022-12-27T11:30:29.923Z · answers+comments (28)

Reflections on my 5-month alignment upskilling grant
Jay Bailey · 2022-12-27T10:51:49.872Z · comments (4)

[link] Institutions Cannot Restrain Dark-Triad AI Exploitation
Remmelt (remmelt-ellen) · 2022-12-27T10:34:34.698Z · comments (0)

Introduction: Bias in Evaluating AGI X-Risks
Remmelt (remmelt-ellen) · 2022-12-27T10:27:30.646Z · comments (0)

MDPs and the Bellman Equation, Intuitively Explained
Jack O'Brien (jack-o-brien) · 2022-12-27T05:50:23.633Z · comments (3)

[link] How 'Human-Human' dynamics give way to 'Human-AI' and then 'AI-AI' dynamics
Remmelt (remmelt-ellen) · 2022-12-27T03:16:17.377Z · comments (5)

[link] Nine Points of Collective Insanity
Remmelt (remmelt-ellen) · 2022-12-27T03:14:11.426Z · comments (3)

Fractional Resignation
jefftk (jkaufman) · 2022-12-27T02:30:01.240Z · comments (6)

[question] What policies have most thoroughly crippled (otherwise-promising) industries or technologies?
benwr · 2022-12-27T02:25:44.376Z · answers+comments (4)

Recent advances in Natural Language Processing—Some Woolly speculations (2019 essay on semantics and language models)
philosophybear · 2022-12-27T02:11:36.960Z · comments (0)

Against Agents as an Approach to Aligned Transformative AI
DragonGod · 2022-12-27T00:47:03.706Z · comments (9)

Can we efficiently distinguish different mechanisms?
paulfchristiano · 2022-12-27T00:20:01.728Z · comments (30)

Air-gapping evaluation and support
Ryan Kidd (ryankidd44) · 2022-12-26T22:52:29.881Z · comments (1)

Slightly against aligning with neo-luddites
Matthew Barnett (matthew-barnett) · 2022-12-26T22:46:42.693Z · comments (31)

Avoiding perpetual risk from TAI
scasper · 2022-12-26T22:34:48.565Z · comments (6)

Announcing: The Independent AI Safety Registry
Shoshannah Tekofsky (DarkSym) · 2022-12-26T21:22:18.381Z · comments (9)

Are men harder to help?
braces · 2022-12-26T21:11:08.036Z · comments (1)

[question] How much should I update on the fact that my dentist is named Dennis?
MichaelDickens · 2022-12-26T19:11:07.918Z · answers+comments (3)

[link] Theodicy and the simulation hypothesis, or: The problem of simulator evil
philosophybear · 2022-12-26T18:55:15.872Z · comments (12)

[link] Safety of Self-Assembled Neuromorphic Hardware
Can (Can Rager) · 2022-12-26T18:51:26.163Z · comments (2)

Coherent extrapolated dreaming
Alex Flint (alexflint) · 2022-12-26T17:29:14.420Z · comments (10)

An overview of some promising work by junior alignment researchers
Akash (akash-wasil) · 2022-12-26T17:23:58.991Z · comments (0)

Solstice song: Here Lies the Dragon
jchan · 2022-12-26T16:08:34.740Z · comments (1)

The Usefulness Paradigm
Aprillion (Peter Hozák) (Aprillion) · 2022-12-26T13:23:58.722Z · comments (4)

Looking Back on Posts From 2022
Zvi · 2022-12-26T13:20:00.745Z · comments (8)

[link] Analogies between Software Reverse Engineering and Mechanistic Interpretability
Neel Nanda (neel-nanda-1) · 2022-12-26T12:26:57.880Z · comments (6)

Mlyyrczo
lsusr · 2022-12-26T07:58:57.920Z · comments (14)

Causal abstractions vs infradistributions
Pablo Villalobos (pvs) · 2022-12-26T00:21:16.179Z · comments (0)

[link] Concrete Steps to Get Started in Transformer Mechanistic Interpretability
Neel Nanda (neel-nanda-1) · 2022-12-25T22:21:49.686Z · comments (7)

It's time to worry about online privacy again
Malmesbury (Elmer of Malmesbury) · 2022-12-25T21:05:30.977Z · comments (23)

[link] [Hebbian Natural Abstractions] Mathematical Foundations
Samuel Nellessen (samuel-nellessen) · 2022-12-25T20:58:03.423Z · comments (2)

[question] Oracle AGI - How can it escape, other than security issues? (Steganography?)
RationalSieve · 2022-12-25T20:14:09.834Z · answers+comments (6)

YCombinator fraud rates
Xodarap · 2022-12-25T19:21:52.829Z · comments (3)

Recent comments

david-cato on Thoughts on seed oil

I wish you the best and look forward to hearing how it goes.

david-hornbein on The first future and the best future

What is the mechanism, specifically, by which going slower will yield more "care"? What is the mechanism by which "care" will yield a better outcome? I see this model asserted pretty often, but no one ever spells out the details.

I've studied the history of technological development in some depth, and I haven't seen anything to convince me that there's a tradeoff between development speed on the one hand, and good outcomes on the other.

chriswaterguy on "You're the most beautiful girl in the world" and Wittgensteinian Language Games

I suspect that many people who use such a phrase would endorse an interpretation such as "The most beautiful... to me."

chriswaterguy on "You're the most beautiful girl in the world" and Wittgensteinian Language Games

Could you say more, especially about "non-verbal signs"? I can guess what you're gesturing at, but I'm interested to hear your thoughts.

nathan-young on The Inner Ring by C. S. Lewis

I wish there were a clear unifying place for all commentary on this topic. I could create a wiki page I suppose.

the-gears-to-ascension on When is a mind me?

Update: a friend convinced me that I really should separate my intuitions about locating patterns that are exactly myself from my intuitions about the moral value of ensuring I don't contribute to a decrease in realityfluid of the mindlike experiences I morally value, in which case the reason that I selfishly value causal history is actually that it's an overwhelmingly predictive proxy for where my self-pattern gets instantiated, and my moral values - an overwhelmingly larger portion of what I care about - care immensely about avoiding waste, because it appears to me to be by far the largest impact any agent can have on what the future is made of.

Also, I now think that eating is a form of incremental uploading.

ebenezer-dukakis on Is being a trans woman +20 IQ?

Why is nobody in San Francisco pretty? Hormones make you pretty but dumb (pretty faces don't usually pay rent in SF). Why is nobody in Los Angeles smart? Hormones make you pretty but dumb. (Sincere apologies to all residents of SF & LA.)

Some other possibilities:

Pretty people self-select towards interests and occupations that reward beauty. If you're pretty, you're more likely to be popular in high school, which interferes with the dedication necessary to become a great programmer.
A big reason people are prettier in LA is they put significant effort into their appearance -- hair, makeup, orthodontics, weight loss, etc.

Then why didn't evolution give women big muscles? I think because if you are in the same strength range as men then you are much more plausibly murderable. It is hard for a male to say that he killed a female in self-defense in unarmed combat. No reason historically to conscript women into battle. Their weakness protects them. (Maybe someone else has a better explanation.)

Perhaps hunter/gatherer tribes had gender-based specialization of labor. If men are handling the hunting and tribe defense which requires the big muscles, there's less need for women to pay the big-muscle metabolic cost.

nathan-young on This is Water by David Foster Wallace

Can I check that I've understood it?

Roughly, the essay urges one to be conscious of each passing thought, to see it and kind of head it off at the tracks - "feeling angry?" "don't!". But the comment argues this is against what CBT says about feeling our feelings.

What about Sam Harris' practice of meditation, which seems focused on seeing and noticing thoughts, turning attention back on itself? I had a period last night of sort of "intense consciousness" where I felt very focused on the fact I was conscious. It wasn't super pleasant, but it was profound. I can see why one would want to focus on that but also why it might be a bad idea.

dr_s on Examples of Highly Counterfactual Discoveries?

Maybe it's the other way around, and it's the Chinese elite who were unusually and stubbornly conservative on this, trusting the wisdom of their ancestors over foreign devilry (would be a pretty Confucian thing to do). The Greeks realised the Earth was round from things like seeing sails appear over the horizon. Any sailing people thinking about this would have noticed sooner or later.

Kind of a long shot, but did Polynesian people have ideas on this, for example?

neil-warren on Neil Warren's Shortform

Poetry and practicality

I was staring up at the moon a few days ago and thought about how deeply I loved my family, and wished to one day start my own (I'm just over 18 now). It was a nice moment.

Then, I whipped out my laptop and felt constrained to get back to work; i.e. read papers for my AI governance course, write up LW posts, and trade emails with EA France. (These I believe to be my best shots at increasing everyone's odds of survival.)

It felt almost like sacrilege to wrench myself away from the moon and my wonder. Like I was ruining a moment of poetry and stillwatered peace by slamming against reality and its mundane things again.

But... The reason I wrenched myself away is directly downstream from the spirit that animated me in the first place. Whether I feel the poetry now that I felt then is irrelevant: it's still there, and its value and truth persist. Pulling away from the moon was evidence I cared about my musings enough to act on them.

The poetic is not a separate magisterium from the practical; rather the practical is a particular facet of the poetic. Feeling "something to protect" in my bones naturally extends to acting it out. In other words, poetry doesn't just stop. Feel no guilt in pulling away. Because, you're not.