LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[link] An "Observatory" For a Shy Super AI?
Sherrinford · 2024-09-27T21:22:40.296Z · comments (0)

[link] Linkpost: Hypocrisy standoff
Chris_Leong · 2024-09-29T14:27:19.175Z · comments (1)

Grass Valley USA - ACX Meetups Everywhere Fall 2024
Raelifin · 2024-08-29T18:39:57.229Z · comments (0)

[link] Formalize the Hashiness Model of AGI Uncontainability
Remmelt (remmelt-ellen) · 2024-11-09T16:10:05.032Z · comments (0)

[question] How do you follow AI (safety) news?
PeterH · 2024-09-24T13:58:48.916Z · answers+comments (2)

Jailbreaking ChatGPT and Claude using Web API Context Injection
Jaehyuk Lim (jason-l) · 2024-10-21T21:34:37.579Z · comments (0)

Tokyo (日本語) Japan - ACX Meetups Everywhere Fall 2024
Emi (emi-2) · 2024-08-29T18:35:28.013Z · comments (0)

Building Safer AI from the Ground Up: Steering Model Behavior via Pre-Training Data Curation
Antonio Clarke (antonio-clarke) · 2024-09-29T18:48:23.308Z · comments (0)

[link] Proposing the Conditional AI Safety Treaty (linkpost TIME)
otto.barten (otto-barten) · 2024-11-15T13:59:01.050Z · comments (4)

[question] Is OpenAI net negative for AI Safety?
Lysandre Terrisse · 2024-11-02T16:18:02.859Z · answers+comments (0)

[link] 2024 Election Forecasting Contest
mike20731 · 2024-10-05T20:43:16.203Z · comments (0)

[question] Noticing the World
EvolutionByDesign (bioluminescent-darkness) · 2024-11-04T16:41:44.696Z · answers+comments (1)

[question] What are the primary drivers that caused selection pressure for intelligence in humans?
Towards_Keeperhood (Simon Skade) · 2024-11-07T09:40:20.275Z · answers+comments (15)

Developmental Stages in Multi-Problem Grokking
James Sullivan · 2024-09-29T18:58:22.954Z · comments (0)

Likelihood calculation with duobels
Martin Gerdes (martin-gerdes) · 2024-10-01T16:21:01.268Z · comments (0)

Towards a Clever Hans Test: Unmasking Sentience Biases in Chatbot Interactions
glykokalyx · 2024-11-10T22:34:58.956Z · comments (0)

[link] A Logical Proof for the Emergence and Substrate Independence of Sentience
rife (edgar-muniz) · 2024-10-24T21:08:09.398Z · comments (31)

Bellevue-Redmond USA - ACX Meetups Everywhere Fall 2024
Cedar (xida-ren) · 2024-08-29T18:43:57.014Z · comments (8)

The great Enigma in the sky: The universe as an encryption machine
Alex_Shleizer · 2024-08-14T13:21:58.713Z · comments (1)

[link] The ELYSIUM Proposal - Extrapolated voLitions Yielding Separate Individualized Utopias for Mankind
Roko · 2024-10-16T01:24:51.102Z · comments (18)

Effects of Non-Uniform Sparsity on Superposition in Toy Models
Shreyans Jain (shreyans-jain) · 2024-11-14T16:59:43.234Z · comments (3)

Can Current LLMs be Trusted To Produce Paperclips Safely?
Rohit Chatterjee (rohit-c) · 2024-08-19T17:17:07.530Z · comments (0)

[question] How do you finish your tasks faster?
Cipolla · 2024-08-21T20:01:41.306Z · answers+comments (2)

Tbilisi Georgia - ACX Meetups Everywhere Fall 2024
Dmitrii (dmitrii) · 2024-08-29T18:36:43.223Z · comments (4)

[link] Predictions as Public Works Project — What Metaculus Is Building Next
ChristianWilliams · 2024-10-22T16:35:13.999Z · comments (0)

Some Comments on Recent AI Safety Developments
testingthewaters · 2024-11-09T16:44:58.936Z · comments (0)

Ways to think about alignment
Abhimanyu Pallavi Sudhir (abhimanyu-pallavi-sudhir) · 2024-10-27T01:40:50.762Z · comments (0)

Interest poll: A time-waster blocker for desktop Linux programs
nahoj · 2024-08-22T20:44:04.479Z · comments (5)

It is time to start war gaming for AGI
yanni kyriacos (yanni) · 2024-10-17T05:14:17.932Z · comments (1)

[question] Is there a known method to find others who came across the same potential infohazard without spoiling it to the public?
hive · 2024-10-17T10:47:05.099Z · answers+comments (6)

[question] is there a big dictionary somewhere with all your jargon and acronyms and whatnot?
KvmanThinking (avery-liu) · 2024-10-17T11:30:50.937Z · answers+comments (7)

Methodology: Contagious Beliefs
James Stephen Brown (james-brown) · 2024-10-19T03:58:17.966Z · comments (0)

Personal Philosophy
Xor · 2024-10-13T03:01:59.324Z · comments (0)

AI Compute governance: Verifying AI chip location
Farhan · 2024-10-12T17:36:45.942Z · comments (0)

[question] EndeavorOTC legit?
FinalFormal2 · 2024-10-17T01:33:12.606Z · answers+comments (0)

Playing Minecraft with a Superintelligence
Johannes C. Mayer (johannes-c-mayer) · 2024-08-17T22:47:42.767Z · comments (0)

Your memory eventually drives confidence in each hypothesis to 1 or 0
Crazy philosopher (commissar Yarrick) · 2024-10-28T09:00:27.084Z · comments (6)

San Francisco ACX Meetup “First Saturday”
Nate Sternberg (nate-sternberg) · 2024-10-28T05:05:36.757Z · comments (0)

Dallas USA - ACX Meetups Everywhere Fall 2024
ethanmorse · 2024-08-29T18:43:37.972Z · comments (0)

LDT (and everything else) can be irrational
Christopher King (christopher-king) · 2024-11-06T04:05:36.932Z · comments (6)

Leverage points for a pause
Remmelt (remmelt-ellen) · 2024-08-28T09:21:17.593Z · comments (0)

Computational irreducibility challenges the simulation hypothesis
Clément L · 2024-08-11T16:14:29.655Z · comments (15)

Distributed espionage
margetmagenta · 2024-11-04T19:43:33.316Z · comments (0)

Cape Town South Africa - ACX Meetups Everywhere Fall 2024
moyamo · 2024-08-29T18:28:24.579Z · comments (0)

[question] Should LW suggest standard metaprompts?
Dagon · 2024-08-21T16:41:07.757Z · answers+comments (6)

Quantum Immortality: A Perspective if AI Doomers are Probably Right
avturchin · 2024-11-07T16:06:08.106Z · comments (30)

[link] Podcast discussing Hanson's Cultural Drift Argument
vaishnav92 · 2024-10-20T17:58:41.416Z · comments (0)

[link] Higher Order Signs, Hallucination and Schizophrenia
Nicolas Villarreal (nicolas-villarreal) · 2024-11-02T16:33:10.574Z · comments (0)

[link] Both-Sidesism—When Fair & Balanced Goes Wrong
James Stephen Brown (james-brown) · 2024-11-02T03:04:03.820Z · comments (15)

Hamiltonian Dynamics in AI: A Novel Approach to Optimizing Reasoning in Language Models
Javier Marin Valenzuela (javier-marin-valenzuela) · 2024-10-09T19:14:56.162Z · comments (0)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

omnizoid on The Case For Giving To The Shrimp Welfare Project

As they describe in the report, the philosophical assumptions are mostly inconsequential and assumed for simplicity. The rest of your critique is just describing what they did, not an objection to it. It's not precise and they admit quite high uncertainty, but it's definitely better than alternatives (E.g. neuron counts).

silentbob on The Third Fundamental Question

I'm a bit torn regarding the "predicting how others react to what you say or do, and adjust accordingly" part. On the one hand this is very normal and human and makes sense. It's kind of predictive empathy in a way. On the other hand, thinking so very explicitly about it and trying to steer your behavior in a way so as to get the desired reaction out of another person also feels a bit manipulative and inauthentic. If I knew another person would think that way and plan exactly how they interacted with me, I would find that quite off-putting. But maybe the solution is just "don't overdo it", and/or "only use it in ways the other person would likely consent to" (such as avoiding to accidentally say something hurtful).

habryka4 on OpenAI Email Archives (from Musk v. Altman)

Fixed! That specific response had a very weird thread structure, so makes sense the AI I used got confused. Plausible something else was missing, though I think I've now read through all the original PDFs and didn't see anything new.

satron on Sabotage Evaluations for Frontier Models

I haven't heard of any such corrupt deals with OpenAI or Anthropic concerning governmental oversight over AI technology on the scale that would make me worried. Do you have any links to articles about government employees (who are responsible for oversight) recently signing secret contracts with OpenAI or Anthropic that would prohibit them from giving real feedback on a big enough scale to make it concerning?

satron on Sabotage Evaluations for Frontier Models

I will try to provide another similar analogy. Let's say that a King got tired of his people dying from diseases, so he decided to try a novel method of vaccination.

However, some people were really concerned about that. As far as they were concerned, the default outcome of injecting viruses into the bodies of people is death. And the King wants to vaccinate everyone, so these people create a council of great and worthy thinkers who after thinking for a while come up with a list of reasons why vaccines are going to doom everyone.

However, some other great and worthy thinkers (let's call them "hopefuls") come to the council and give reasons to think that aforementioned reasons are mistaken. Maybe they have done their own research, which seems to vindicate King's plan or at least undermine council's arguments.

And now imagine that King comes down from the castle points his finger at hopefuls' giving arguments for why the arguments proposed by the council is wrong and says "yeah, basically this" and then turns around and goes back to the castle. To me it seems like King's official endorsement of the arguments proposed by hopefuls doesn't really change the ethicality of the situation, as long as King is acting according to hopefuls' plan.

Furthermore, imagine if the one of the hopefuls who come to argue with the council was actually an undercover King. And he gave exactly the same arguments as people before him. This still IMO doesn't change the ethicality of the situation.

chaosmage on What Ketamine Therapy Is Like

First I heard of it was from an anesthesiologist who was very happy with how it is the only way to get to full anesthesia without depressing the patient's heart rate, so for senior patients it was really the only option. In retrospect, his enthusiasm about it does seem suspicious, but we were surrounded by professors and I don't think he was lying.

sam-marks on OpenAI Email Archives (from Musk v. Altman)

FYI it seems like this (important-seeming) email is missing, though the surrounding emails in the exchange seem to be present. (So maybe some other ones are missing too.)

benito on Sabotage Evaluations for Frontier Models

I'm having a hard time following this argument. To be clear, I'm saying that while certain people were in regulatory bodies in the US & UK govts, they actively had secret legal contracts to not criticize the leading industry player, else (prseumably) they could be sued for damages. This is not a past shady deals, this is about current people during their current tenure having been corrupted.

quetzal_rainbow on D0TheMath's Shortform

Completeness theorem states that consistent countable FO theory has a model. Compactness theorem states that FO theory has a model iff every finite subset of FO theory has a model. Both theorems are provable in ZFC.

Therefore:

Consistent(ZFC) <-> all finite subsets of ZFC have a model ->

not Consistent(ZFC) <-> some finite subsets of ZFC don't have a model ->

some finite subsets of ZFC + not Consistent(ZFC) don't have a model <->

not Consistent(ZFC + not Consistent(ZFC)),

proven in ZFC + not Consistent(ZFC)

ape-in-the-coat on Quantum Immortality: A Perspective if AI Doomers are Probably Right

If my π-guess is wrong, my only chance to survive is getting all-heads.

Your other chance for survival is that whatever means are used to kill you somehow does not succeed due to quantuum effects. And this is what QI+path-based identity approach actually predicts. The universe isn't going to reotroactively change the digit of pi, but neither it's going to influence the probability of the coin tosses due to the fact that someone may die. QI influence will trigger only at the moment of your death, turning it into near death. And then for the next attempt. And for the next one. Potentially locking you in a state of eternal torture.

However, abandoning SSSA also has a serious theoretical cost:
If observed probabilities have a hidden subjective dimension (because of path-dependency), all hell breaks loose. If we agree that probabilities of being a copy are distributed not in a state-dependent way, but in a path-dependent way, we agree that there is a 'hidden variable' in self-locating probabilities. This hidden variable does not play a role in our π experiment but appears in other thought experiments where the order of making copies is defined.

I fail to see this cost. Yes, we agree that there is an additional variable. Namely, my causal history. It's not necessary hidden but can as well be. So what? What is so hellbreaking about it? This is exactly how probability theory works in every other case. Why should it have a special case for conscious experience?

If there are two bags one with 1 red ball and another with 1000 blue balls and then the coin is tossed and based on the outcome I'm either getting a ball from the first or the second bag, I'm expecting to receive red ball with 50% chance. I'm not supposed to assume out of nowhere that every ball have to have equal probabilities to be given, therefore postulate a ball-shadow that will modify the fairness of the coin.