LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[link] Anthropic teams up with Palantir and AWS to sell AI to defense customers
Matrice Jacobine · 2024-11-09T11:50:34.050Z · comments (0)

A Dialogue on Deceptive Alignment Risks
Rauno Arike (rauno-arike) · 2024-09-25T16:10:12.294Z · comments (0)

[link] Markets Are Information - Beating the Sportsbooks at Their Own Game
JJXW · 2024-11-07T20:58:43.389Z · comments (1)

The Bayesian Conspiracy Live Recording
Eneasz · 2024-11-06T16:25:13.380Z · comments (0)

Testing "True" Language Understanding in LLMs: A Simple Proposal
MtryaSam · 2024-11-02T19:12:34.710Z · comments (2)

Prediction markets and Taxes
Edmund Nelson (edmund-nelson) · 2024-11-01T17:39:35.191Z · comments (7)

[link] In Praise of the Beatitudes
robotelvis · 2024-09-24T05:08:21.133Z · comments (7)

Fundamental Uncertainty: Epilogue
Gordon Seidoh Worley (gworley) · 2024-11-16T00:57:48.823Z · comments (0)

Becket First
jefftk (jkaufman) · 2024-09-22T17:10:04.304Z · comments (0)

[question] A Different Perspective on Rationality - Would This Be Valuable?
Gabriel Brito (gabriel-brito) · 2024-10-26T18:47:46.416Z · answers+comments (4)

Derivative AT a discontinuity
Alok Singh (OldManNick) · 2024-10-24T02:48:24.573Z · comments (5)

Thinking About a Pedalboard
jefftk (jkaufman) · 2024-10-08T11:50:02.054Z · comments (2)

Electric Mandola
jefftk (jkaufman) · 2024-09-21T13:40:04.772Z · comments (0)

The Other Existential Crisis
James Stephen Brown (james-brown) · 2024-09-21T01:16:38.011Z · comments (24)

how to rapidly assimilate new information
dhruvmethi · 2024-10-24T02:18:00.648Z · comments (3)

[link] Testing Genetic Engineering Detection with Spike-Ins
jefftk (jkaufman) · 2024-10-22T17:20:54.947Z · comments (0)

[question] Are UV-C Air purifiers so useful?
JohnBuridan · 2024-09-04T14:16:01.310Z · answers+comments (0)

Toy Models of Superposition: Simplified by Hand
Axel Sorensen (axel-sorensen) · 2024-09-29T21:19:52.475Z · comments (3)

[link] Species as Canonical Referents of Super-Organisms
Yudhister Kumar (randomwalks) · 2024-10-18T07:49:52.944Z · comments (8)

Keeping it (less than) real: Against ℶ₂ possible people or worlds
quiet_NaN · 2024-09-13T17:29:44.915Z · comments (0)

Rationalist Gnosticism
tailcalled · 2024-10-10T09:06:34.149Z · comments (10)

[link] Physics of Language models (part 2.1)
Nathan Helm-Burger (nathan-helm-burger) · 2024-09-19T16:48:32.301Z · comments (2)

[question] Is this a Pivotal Weak Act? Creating bacteria that decompose metal
doomyeser · 2024-09-11T18:07:19.385Z · answers+comments (9)

Dario Amodei's "Machines of Loving Grace" sound incredibly dangerous, for Humans
Super AGI (super-agi) · 2024-10-27T05:05:13.763Z · comments (1)

[link] [Linkpost] Automated Design of Agentic Systems
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-08-19T23:06:06.669Z · comments (1)

[link] Triangulating My Interpretation of Methods: Black Boxes by Marco J. Nathan
adamShimi · 2024-10-09T19:13:26.631Z · comments (0)

[link] Jailbreaking language models with user roleplay
loops (smitop) · 2024-09-28T23:43:10.870Z · comments (0)

[question] What are some positive developments in AI safety in 2024?
Satron · 2024-11-15T10:32:39.541Z · answers+comments (0)

Meta AI (FAIR) latest paper integrates system-1 and system-2 thinking into reasoning models.
happy friday (happy-friday) · 2024-10-24T16:54:15.721Z · comments (0)

[link] In-Context Learning: An Alignment Survey
alamerton · 2024-09-30T18:44:28.589Z · comments (0)

[link] Contagious Beliefs—Simulating Political Alignment
James Stephen Brown (james-brown) · 2024-10-13T00:27:08.084Z · comments (0)

[question] Change My Mind: Thirders in "Sleeping Beauty" are Just Doing Epistemology Wrong
DragonGod · 2024-10-16T10:20:22.133Z · answers+comments (67)

Interpreting the effects of Jailbreak Prompts in LLMs
Harsh Raj (harsh-raj-ep-037) · 2024-09-29T19:01:10.113Z · comments (0)

Steering LLMs' Behavior with Concept Activation Vectors
Ruixuan Huang (sprout_ust) · 2024-09-28T09:53:19.658Z · comments (0)

[link] Universal dimensions of visual representation
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-08-28T10:38:58.396Z · comments (0)

LLMs are likely not conscious
research_prime_space · 2024-09-29T20:57:26.111Z · comments (8)

[link] AI Safety Newsletter #42: Newsom Vetoes SB 1047 Plus, OpenAI’s o1, and AI Governance Summary
Corin Katzke (corin-katzke) · 2024-10-01T20:35:32.399Z · comments (0)

[link] Approval-Seeking ⇒ Playful Evaluation
Jonathan Moregård (JonathanMoregard) · 2024-08-28T21:03:51.244Z · comments (0)

Three main arguments that AI will save humans and one meta-argument
avturchin · 2024-10-02T11:39:08.910Z · comments (8)

Thoughts On the Nature of Capability Elicitation via Fine-tuning
Theodore Chapman · 2024-10-15T08:39:19.909Z · comments (0)

[link] What is autonomy? Why boundaries are necessary.
Chipmonk · 2024-10-21T17:56:33.722Z · comments (1)

[link] Can AI agents learn to be good?
Ram Rachum (ram@rachum.com) · 2024-08-29T14:20:04.336Z · comments (0)

New UChicago Rationality Group
Noah Birnbaum (daniel-birnbaum) · 2024-11-08T21:20:34.485Z · comments (0)

[link] An Uncanny Moat
Adam Newgas (BorisTheBrave) · 2024-11-15T11:39:15.165Z · comments (0)

[link] Models of life
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-29T19:24:40.060Z · comments (0)

Project Adequate: Seeking Cofounders/Funders
Lorec · 2024-11-17T03:12:12.995Z · comments (4)

[question] Is it Legal to Maintain Turing Tests using Data Poisoning, and would it work?
Double · 2024-09-05T00:35:39.504Z · answers+comments (9)

[link] Catastrophic Cyber Capabilities Benchmark (3CB): Robustly Evaluating LLM Agent Cyber Offense Capabilities
Jonathan N (derpyplops) · 2024-11-05T01:01:08.083Z · comments (0)

Two new datasets for evaluating political sycophancy in LLMs
alma.liezenga · 2024-09-28T18:29:49.088Z · comments (0)

An open response to Wittkotter and Yampolskiy
Donald Hobson (donald-hobson) · 2024-09-24T22:27:21.987Z · comments (0)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

cousin_it on OpenAI Email Archives (from Musk v. Altman)

"Things that cause people to stumble are bound to come, but woe to anyone through whom they come." Or to translate from New Testament into something describing the current situation: you should accept that AI will come, but you shouldn't be the one who hastens its coming.

Yes, this approach sounds very simple and naive. The people in this email exchange rejected it and went for a more sophisticated one: join the arms race and try to steer it. By now we all see that these ultra-smart and ultra-rich people made things much worse than if they'd followed the "do no evil" approach. If this doesn't prove the "do no evil" approach, I'm not sure what will.

startattheend on Breaking beliefs about saving the world

I like this post, but I have some problems with it. Don't take it too hard, as I'm not the average LW reader. I think your post is quite in line with what most people here believe. I'm just an anomaly who happened to read your post.

By bringing attention to tactical/emotionally pulling patterns of suffering, people will recognize it in their own life, and we will create an unfulfilled desire that only we have the solution for.

I think this might make suffering worse. Suffering is subjective, so if you make people believe that they should be suffering, or that suffering is justified, they may suffer needlessly. For example, poverty doesn't make people as dissatisfied with life as relative poverty does. It's when people compare themselves to others and realize that they could have it better, that they start disliking what they have at the moment. If you create ideals, then people will work towards archiving them, but they will also suffer from the gap between the current state and what's ideal. You may argue "the reward redeems the suffering and makes it bearable", and yes, but only as long as people believe that they're getting closer to the goal. Most positive emotion we experience is a result of feeling ourselves moving towards our goals.

Personal concurrent life-satisfaction is possible in-spite of punishment/suffering when punishment/suffering is perceived as a necessary sacrifice for an impending reward.

Yes, which is why one should not reduce "suffering" but "the causes of unproductive suffering". Just like one shouldn't avoid "pain", but "actions which are painful and without benefit". The conclusions of "mans search for meaning" was that suffering is bearable as long as it as meaning, that only meaningless suffering is unbearable. I've personally felt this as well. One of the times I was the most happy, I was also the most depressed. But that might just have been a mixed episode as is known from bipolar disorder.
I'm nitpicking, but I believe it's important to state that "suffering" isn't a fundamental issue. If I touch a flame and burn my hand, then the flame is the issue, not the pain. In fact, the pain is protecting me from touching the flame again. Suffering is good for survival, for the same reason that pain is good for survival. The proof is that evolution made us suffer, that those who didn't suffer didn't pass on their genes.

We are products of EA

I'm not sure this is true? EA seems to be the opposite of darwinism, and survival of the fittest has been the standard until recent (everyone suddenly cares about reducing negative emotions and unfairness, to an almost pathological degree). But even if various forces helped me avoid suffering, would that really be a good thing?

I personally grew the most as a person as a result of suffering. You're probably right that you were the least productive when you didn't eat, but suffering is merely a signal that change is necessary, and when you experience great suffering, you become open to the idea of change. It's not uncommon that somebody hits rock bottom and turns their lives around for the better as a result. But while suffering is bearable, we can continue enduring, until we suffer the death of a thousand papercuts (or the death of the boiling frog, by our own hands)
That said, growth is usually a result of internal pressure, in which an inconsistency inside oneself finally snaps, so that one can focus on a single direction with determination. It's like a fever - the body almost kills itself, so that something harmful to it can die sooner.

We are still in trouble if the average human is as stupid as I am.

Are you sure suffering is caused by a lack of intelligence, and not by too much intelligence? ('Forbidden fruit' argument) And that we suffer from a lack of tech rather than from an abundance of tech? (As Ted Kaczynski and the Amish seem to think)
Many animals are thriving despite their lack of intelligence. Any problem more complicated than "Get water, food and shelter. Find a mate, and reproduce" is a fabricated problem. It's because we're more intelligent than animals that we fabricate more difficult problems. And if something was within out ability, we'd not consider it a problem, which is why we always fabricate problems which are beyond our current capacities, which is how we trick ourselves into growth and improvement. Growth and improvement which somehow resulted in us being so powerful that we can destroy ourselves. Horseshoe crabs seem content with themselves, and even after 400 million years they just do their own thing. Some of them seem endangered now, but that's because of us?

Bureaucracy

Caused by too much centralization, I think. Merging structures into fewer, bigger structures causes an overhead which doesn't seem to be worth it. Decentralizing everything may actually save the world, or at least decrease the feedback loop which causes a few entities to hog all the resources.

Moloch

Caused by too much information and optimization, and therefore unlikely to be solved with information and optimization. My take here is the same as with intelligence and tech. Why hasn't moloch killed us sooner? I believe it's because the conditions for moloch weren't yet reached (optimal strategies weren't visible, as the world wasn't legible and transparent enough), in which case, going back might be better than going forwards.

The tools you wish to use to solve human extinction are, from my perspective, what is currently leading us towards extinction. You can add AGI to this list of things if you want.

gerardus-mercator on Claude seems to be smarter than LessWrong community

So, if I understand you correctly, you now agree that a paperclip-maximizing agent won't utterly disregard paperclips relative to survival, because that would be suboptimal for its utility function.
However, if a paperclip-maximizing agent utterly disregarded paperclips relative to investigating the possibility of an objective goal, that would also be suboptimal for its utility function.
It sounds to me like you're saying that the intelligent agent will just disregard optimization of its utility function and instead investigate the possibility of an objective goal.
However, I don't agree with that. I don't see why an intelligent agent would do that if its utility function didn't already include a term for objective goals.
Again, I think a toy example might help to illustrate your position.

christiankl on What are Emotions?

I simply find it interesting that people feel the need to justify their terminal goals (unless they are emotions), and that the only way they can seem to do it is by associating it with an emotion.

I don't find it surprising that when you ask people "Why do you want that?", they feel pressure to justify themselves. That seems to me the basic way normal human beings reject to social inquiries. If you ask "Why X" normal people feel pressured to provide a justification.

christiankl on The Online Sports Gambling Experiment Has Failed

Yes, most people are generally bad at updating. That has nothing to do with whether or not someone is a libertarian.

The reason Zvi is surprised comes downstream of numerical literacy and not downstream of him being libertarian-leaning.

28% percent increase suggests that more than one in five go bankrupt because of sports online gambling (which would be a subset of gambling in general).

If I ask Claude for the top five reasons people go bankrupt in the US in 2023 I get:

1. Medical Debt
2. Job Loss/Income Reduction
3. Credit Card/Consumer Debt
4. Divorce/Family Issues
5. Housing Market Issues/Mortgage Debt

ChatGPT gives me:

Loss of Income
Medical Expenses
Unaffordable Mortgages or Foreclosures
Overspending and Credit Card Debt
Providing Financial Assistance to Others

I think Claude and ChatGPT do summarize the common wisdom about what normal people believe about what the most important factors for bankruptcy happen to be and that does not include gambling (let alone sports online gambling specifically). If you ask a normal person for the top five reasons they would likely come up with a similar list that does not mention sports online betting.

leon-lang on OpenAI Email Archives (from Musk v. Altman)

Do we know anything about why they were concerned about an AGI dictatorship created by Demis?

peterh on OpenAI Email Archives (from Musk v. Altman)

There is no meeting where an actual rational discussion of considerations and theories of change happens, everything really is people flying by the seat of their pants even at highest level. Talk of ethics usually just gets you excluded from the power talk.

This seems overstated. E.g. Musk and Altman both read Superintelligence, and they both met Bostrom at least once in 2015. Sam published reflective blog posts on AGI in 2015, and it's clear that the OpenAI founders had lengthy, reflective discussions from the YC Research days onwards.

peterh on OpenAI Email Archives (from Musk v. Altman)

That's probably not what Page meant. On consideration, he would probably have clarified […]

[…]

A little more careful conversation would've prevented this whole thing.

The passage you quote earlier suggests that they had multiple lengthy conversations:

Larry Page and I used to be close friends and I would stay at his house in Paolo Alto and I would talk to him late into the night about AI safety […]

Quick discussions via email are not strong evidence of a lack of careful discussion and reflection in other contexts.

startattheend on Alexander Gietelink Oldenziel's Shortform

Great post!

It's a habit of mine to think in very high levels of abstraction (I haven't looked much into category theory though, admittedly), and while it's fun, it's rarely very useful. I think it's because of a width-depth trade-off. Concrete real-world problems have a lot of information specific to that problem, you might even say that the unique information is the problem. An abstract idea which applies to all of mathematics is way too general to help much with a specific problem, it can just help a tiny bit with a million different problems.

I also doubt the need for things which are so complicated that you need a team of people to make sense of them. I think it's likely a result of bad design. If a beginner programmer made a slot machine game, the code would likely be convoluted and unintuitive, but you could probably design the program in a way that all of it fits in your working memory at once. Something like "A slot machine is a function from the cartesian product of wheels to a set of rewards". An understanding which would simply the problem so that you could write it much shorter and simpler than the beginner. What I mean is that there may exist simple designs for most problems in the world, with complicated designs being due to a lack of understanding.

The real world values the practical way more than the theoretical, and the practical is often quite sloppy and imperfect, and made to fit with other sloppy and imperfect things.

The best things in society are obscure by statistical necessity, and it's painful to see people at the tail ends doubt themselves at the inevitable lack of recognition and reward.

alexander-gietelink-oldenziel on Alexander Gietelink Oldenziel's Shortform

Yes thats worded too strongly and a result of me putting in some key phrases into Claude and not proofreading. :p

I agree with you that most modern math is within-paradigm work.