LessWrong 2.0 Reader

A brief theory of why we think things are good or bad
David Johnston (david-johnston) · 2024-10-20T20:31:26.309Z · comments (10)

[link] October 2024 Progress in Guaranteed Safe AI
Quinn (quinn-dougherty) · 2024-10-28T23:34:51.689Z · comments (0)

Enhancing Mathematical Modeling with LLMs: Goals, Challenges, and Evaluations
ozziegooen · 2024-10-28T21:44:42.352Z · comments (0)

[question] somebody explain the word "epistemic" to me
KvmanThinking (avery-liu) · 2024-10-28T16:40:24.275Z · answers+comments (8)

Quantitative Trading Bootcamp [Nov 6-10]
Ricki Heicklen (bayesshammai) · 2024-10-28T18:39:58.480Z · comments (0)

Piling bounded arguments
momom2 (amaury-lorin) · 2024-09-19T22:27:41.534Z · comments (0)

[link] AISN #44: The Trump Circle on AI Safety Plus, Chinese researchers used Llama to create a military tool for the PLA, a Google AI system discovered a zero-day cybersecurity vulnerability, and Complex Systems
Corin Katzke (corin-katzke) · 2024-11-19T16:36:40.501Z · comments (0)

[link] Redundant Attention Heads in Large Language Models For In Context Learning
skunnavakkam · 2024-09-01T20:08:48.963Z · comments (1)

[question] On the subject of in-house large language models versus implementing frontier models
Annapurna (jorge-velez) · 2024-09-23T15:00:32.811Z · answers+comments (1)

Fake Blog Posts as a Problem Solving Device
silentbob · 2024-08-31T09:22:54.513Z · comments (0)

Proactive 'If-Then' Safety Cases
Nathan Helm-Burger (nathan-helm-burger) · 2024-11-18T21:16:37.237Z · comments (0)

AirBnB Baking
jefftk (jkaufman) · 2024-07-10T12:50:03.381Z · comments (1)

Ethical Implications of the Quantum Multiverse
Jonah Wilberg (jrwilb@googlemail.com) · 2024-11-18T16:00:20.645Z · comments (7)

[question] Does a time-reversible physical law/Cellular Automaton always imply the First Law of Thermodynamics?
Noosphere89 (sharmake-farah) · 2024-08-30T15:12:28.823Z · answers+comments (11)

Thoughts to niplav on lie-detection, truthfwl mechanisms, and wealth-inequality
Emrik (Emrik North) · 2024-07-11T18:55:46.687Z · comments (8)

[link] Kinds of Motivation
Sable · 2024-07-13T15:52:44.432Z · comments (2)

Chat Bankman-Fried: an Exploration of LLM Alignment in Finance
claudia.biancotti · 2024-11-18T09:38:35.723Z · comments (2)

[link] Consciousness As Recursive Reflections
Gunnar_Zarncke · 2024-10-05T20:00:53.053Z · comments (3)

Do Deep Neural Networks Have Brain-like Representations?: A Summary of Disagreements
Joseph Emerson (joseph-emerson) · 2024-11-18T00:07:15.155Z · comments (0)

[question] What makes one a "rationalist"?
mathyouf · 2024-10-08T20:25:21.812Z · answers+comments (5)

Denver USA - ACX Meetups Everywhere Fall 2024
Eneasz · 2024-08-29T18:40:53.332Z · comments (0)

[link] Boons and banes
dkl9 · 2024-09-23T06:18:38.335Z · comments (0)

Value/Utility: A History
Lorec · 2024-11-19T23:01:39.167Z · comments (0)

Join my new subscriber chat
sarahconstantin · 2024-11-06T02:30:11.059Z · comments (0)

Utilitarianism and the replaceability of desires and attachments
MichaelStJules · 2024-07-27T01:57:42.419Z · comments (2)

Making Beliefs Pay Rent
Screwtape · 2024-07-28T17:59:52.101Z · comments (2)

Relativity Theory for What the Future 'You' Is and Isn't
FlorianH (florian-habermacher) · 2024-07-29T02:01:17.736Z · comments (48)

Moral Trade, Impact Distributions and Large Worlds
Larks · 2024-09-20T03:45:56.273Z · comments (0)

[link] [Linkpost] Hawkish nationalism vs international AI power and benefit sharing
jakub_krys (kryjak) · 2024-10-18T18:13:19.425Z · comments (5)

[question] If I ask an LLM to think step by step, how big are the steps?
ryan_b · 2024-09-13T20:30:50.558Z · answers+comments (1)

[question] What actual bad outcome has "ethics-based" RLHF AI Alignment already prevented?
Roko · 2024-10-19T06:11:12.602Z · answers+comments (16)

The Personal Implications of AGI Realism
xizneb · 2024-10-20T16:43:37.870Z · comments (7)

A Brief Explanation of AI Control
Aaron_Scher · 2024-10-22T07:00:56.954Z · comments (1)

[link] Validating / finding alignment-relevant concepts using neural data
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-20T21:12:49.267Z · comments (0)

[link] Is Redistributive Taxation Justifiable? Part 1: Do the Rich Deserve their Wealth?
Alexander de Vries (alexander-de-vries) · 2024-09-05T10:23:08.958Z · comments (20)

Sequence overview: Welfare and moral weights
MichaelStJules · 2024-08-15T04:22:32.567Z · comments (0)

What does a Gambler's Verity world look like?
ErioirE (erioire) · 2024-07-25T22:03:56.447Z · comments (6)

Deception and Jailbreak Sequence: 2. Iterative Refinement Stages of Jailbreaks in LLM
Winnie Yang (winnie-yang) · 2024-08-28T08:41:38.967Z · comments (2)

[link] Taking nonlogical concepts seriously
Kris Brown (kris-brown) · 2024-10-15T18:16:01.226Z · comments (5)

Of Birds and Bees
RussellThor · 2024-09-30T10:52:15.069Z · comments (9)

[link] Spherical cow
dkl9 · 2024-11-11T03:10:27.788Z · comments (0)

[link] Thinking LLMs: General Instruction Following with Thought Generation
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-10-15T09:21:22.583Z · comments (0)

[link] Checking public figures on whether they "answered the question" quick analysis from Harris/Trump debate, and a proposal
david reinstein (david-reinstein) · 2024-09-11T20:25:27.845Z · comments (4)

Not all biases are equal - a study of sycophancy and bias in fine-tuned LLMs
jakub_krys (kryjak) · 2024-11-11T23:11:15.233Z · comments (0)

[link] The AI regulator’s toolbox: A list of concrete AI governance practices
Adam Jones (domdomegg) · 2024-08-10T21:15:09.265Z · comments (1)

The Great Bootstrap
KristianRonn · 2024-10-11T19:46:51.752Z · comments (0)

Quantum Immortality: A Perspective if AI Doomers are Probably Right
avturchin · 2024-11-07T16:06:08.106Z · comments (40)

Funding for programs and events on global catastrophic risk, effective altruism, and other topics
abergal · 2024-08-14T23:59:48.146Z · comments (0)

Broadly human level, cognitively complete AGI
p.b. · 2024-08-06T09:26:13.220Z · comments (0)

[question] why won't this alignment plan work?
KvmanThinking (avery-liu) · 2024-10-10T15:44:59.450Z · answers+comments (7)

Recent comments

christiankl on What changes should happen in the HHS?

To the extent that our needs are "actively shoot ourselves in the foot slightly less often", there's the question of why we currently shoot ourselves in the foot. I suspect it's because of the incentives that are produced by the current policies.

dakara on If we solve alignment, do we die anyway?

That does indeed answer my 3 concerns (and Seth's answer does as well). Overnight, I came up with 1 more concern.

What if AGI, somewhere down the line, undergoes value drift? After all, looking at evolution, it seems like our evolutionary goal was supposed to be "produce as many offspring as possible". And in recent years, we have strayed from this goal (and are currently much worse at it than our ancestors). Now, humans seem to have goals like "design a video game" or "settle in France" or "climb Everest". What if AGI similarly changes its goals and values over time? Is there a way to prevent that, or at least to be safeguarded against it?

I am afraid that if that happens, humans would, metaphorically speaking, stand in AGI's way of climbing Everest.

viliam on Neutrality

yet we still don't have anything close to a unified theory of human mating, relationships, and child-rearing that's better.

We even seem to have a collective taboo against developing such a theory, or even making relatively obvious observations.

anthonyc on What changes should happen in the HHS?

I think our collective HHS needs are less "clever policy ideas" and more "actively shoot ourselves in the foot slightly less often."

christiankl on What changes should happen in the HHS?

Saying "whatever ways are reasonable" is ignoring the key issues.

Robert F. Kennedy Jr. believes that all vaccines should require placebo-blind trials to be licensed, the way most other drugs do.

Beyond that, a major health problem is obesity and here semaglutide seems like it would help a lot.

Do you believe that Medicaid/Medicare should just pay the sticker price for everyone who wants it?

dr_s on Intrinsic Power-Seeking: AI Might Seek Power for Power’s Sake

This still feels like instrumentality. I guess maybe the addition is that it's a sort of "when all you have is a hammer" situation; as in, even when the optimal strategy for a problem does not involve seeking power (assuming such a problem exists; really I'd say the question is what the optimal power seeking vs using that power trade-off is), the AI would be more liable to err on the side of seeking too much power because that just happens to be such a common successful strategy that it's sort of biased towards it.

christiankl on Don't Dismiss on Epistemics

Surely the state of the science has advanced since this lawsuit took place.

Yes, it does. We now have meta reviews which were not common back in 1990.

Cochrane is one of the best sources for metastudies and their read of the scientific evidence for chiropractics is: "The review shows that while combined chiropractic interventions slightly improved pain and disability in the short term and pain in the medium term for acute and subacute low-back pain, there is currently no evidence to support or refute that combined chiropractic interventions provide a clinically meaningful advantage over other treatments for pain or disability in people with low-back pain."

While it's not shown to be superior to conventional treatment, it's also not shown to be without effect. Given that insurance covers a variety of treatments for back pain that are just as effective as chiropractics, the AMA has essentially been shown wrong.

To me, it's quite strange to advocate "Don't Dismiss on Epistemics" while at the same time ignoring scientific meta reviews on the topic.

viliam on Making a conservative case for alignment

I approve of the militant atheism, because there are just too many religious people out there, so without making a strong line we would have an Eternal September of people joining Less Wrong just to say "but have you considered that an AI can never have a soul?" or something similar.

And if being religious is strongly correlated with some political tribe, I guess it can't be avoided.

But I think that going further than that is unnecessary and harmful.

Actually, we should probably show some resistance to the stupid ideas of other political tribes, just to make our independence clear. Otherwise, people would hesitate to call out bullshit when it comes from those who seem associated with us. (Quick test: Can you say three things the average Democrat believes that are wrong and stupid? What reaction would you expect if you posted your answer on LW?)

Specifically on trans issues:

I am generally in favor of niceness and civilization, therefore:

If someone calls themselves "he" or "she", I will use that pronoun without thinking twice about it.
I disapprove of doxing in general, which extends to all speculations about someone's biological sex.

But I also value rationality and free speech, therefore:

I insist on keeping an "I don't know, really" attitude to trans issues. I don't know, really. The fact that you are yelling at me does not make your arguments any more logically convincing.
No, I am not literally murdering you by disagreeing with you. Let's tone down the hysteria.
There are people who feel strongly that they are Napoleon. If you want to convince me, you need to make a stronger case than that.
I specifically disagree on the point that if someone changes their gender, it retroactively changes their entire past. If someone presented as male for 50 years, then changed to female, it makes sense to use "he" to refer to their first 50 years, especially if this is the pronoun everyone used at that time. Also, I will refer to them using the name they actually used at that time. (If I talk about Ancient Rome, I don't call it the Italian Republic either.) Anything else feels like magical thinking to me. I won't correct you if you do that, but please do not correct me, or I will be super annoyed.

leon-lang on Leon Lang's Shortform

After the US election, the Twitter competitor Bluesky suddenly gets a surge of new users:

https://x.com/robertwiblin/status/1858991765942137227

atillayasar on AtillaYasar's Shortform

Twitter doesn't incentivize truth-seeking

Twitter is designed for writing things off the top of your head, and things that others will share or reply to. There are almost no mechanisms to reward good ideas or punish bad ones, none that reward consistency in your views, and none for even seeing whether someone updates their beliefs, or whether a comment pointed out that they're wrong.

(The fact that there are comments is really really good, and it's part of what makes Twitter so much better than mainstream media. Community Notes are great too.)

The solution to Twitter sucking is not to follow different people, and DEFINITELY not to correct every wrong statement (oops); it's to just leave. Even smart people, people who are way smarter and more interesting and knowledgeable and funny than me, simply don't care that much about their posts. If a post is thought-provoking, you can't even do anything with that fact, because nothing about the website is designed for deeper conversations. Though I've had a couple of nice moments where I went deep into a topic with someone in the replies.

Shortforms are better

The above is also a danger with Shortforms, but to a lesser extent, because things are easier to find, and it's much more likely that I'll see something I've written, see that I'm wrong, and delete it or edit it. Posts on Twitter are not editable and are harder to find; there's no preview-on-hover and no hyperlinked text.