LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Thoughts on Evo-Bio Math and Mesa-Optimization: Maybe We Need To Think Harder About "Relative" Fitness?
Lorec · 2024-09-28T14:07:42.412Z · comments (6)

Retrieval Augmented Genesis
João Ribeiro Medeiros (joao-ribeiro-medeiros) · 2024-10-01T20:18:01.836Z · comments (0)

Inquisitive vs. adversarial rationality
gb (ghb) · 2024-09-18T13:50:09.198Z · comments (9)

Avoiding jailbreaks by discouraging their representation in activation space
Guido Bergman · 2024-09-27T17:49:20.785Z · comments (2)

Increasing the Span of the Set of Ideas
Jeffrey Heninger (jeffrey-heninger) · 2024-09-13T15:52:39.132Z · comments (1)

Does “Ultimate Neartermism” via Eternal Inflation dominate Longtermism in expectation?
Jordan Arel · 2024-08-17T22:28:21.849Z · comments (1)

Thirty random thoughts about AI alignment
Lysandre Terrisse · 2024-09-15T16:24:10.572Z · comments (1)

[link] SCP Foundation - Anti memetic Division Hub
landscape_kiwi · 2024-09-15T13:40:52.691Z · comments (1)

[question] How do we know dreams aren't real?
Logan Zoellner (logan-zoellner) · 2024-08-22T12:41:57.380Z · answers+comments (31)

[link] Join the $10K AutoHack 2024 Tournament
Paul Bricman (paulbricman) · 2024-09-25T11:54:20.112Z · comments (0)

[link] Internal music player: phenomenology of earworms
dkl9 · 2024-11-14T23:29:48.383Z · comments (1)

Using LLM's for AI Foundation research and the Simple Solution assumption
Donald Hobson (donald-hobson) · 2024-09-24T11:00:53.658Z · comments (0)

[link] Should we abstain from voting? (In nondeterministic elections)
B Jacobs (Bob Jacobs) · 2024-10-02T10:07:43.167Z · comments (6)

Some reasons to start a project to stop harmful AI
Remmelt (remmelt-ellen) · 2024-08-22T16:23:34.132Z · comments (0)

Educational CAI: Aligning a Language Model with Pedagogical Theories
Bharath Puranam (bharath-puranam) · 2024-11-01T18:55:26.993Z · comments (1)

Grass Valley USA - ACX Meetups Everywhere Fall 2024
Raelifin · 2024-08-29T18:39:57.229Z · comments (0)

Seeking mentorship
Kevin Afachao (kevin-afachao) · 2024-09-21T16:54:58.353Z · comments (0)

[link] Exposure can’t rule out disasters
Chipmonk · 2024-08-15T17:03:37.259Z · comments (19)

[question] If the DoJ goes through with the Google breakup,where does Deepmind end up?
O O (o-o) · 2024-10-12T05:06:50.996Z · answers+comments (1)

Differential knowledge interconnection
Roman Leventov · 2024-10-12T12:52:36.267Z · comments (0)

Biasing VLM Response with Visual Stimuli
Jaehyuk Lim (jason-l) · 2024-10-03T18:04:31.474Z · comments (0)

New Capabilities, New Risks? - Evaluating Agentic General Assistants using Elements of GAIA & METR Frameworks
Tej Lander (tej-lander) · 2024-09-29T18:58:56.253Z · comments (0)

[link] How long should political (and other) terms be?
ohmurphy · 2024-10-14T21:38:43.050Z · comments (0)

Longevity and the Mind
George3d6 · 2024-09-16T09:43:09.700Z · comments (2)

Reinforcement Learning from Information Bazaar Feedback, and other uses of information markets
Abhimanyu Pallavi Sudhir (abhimanyu-pallavi-sudhir) · 2024-09-16T01:04:32.953Z · comments (1)

[link] Is P(Doom) Meaningful? Bayesian vs. Popperian Epistemology Debate
Liron · 2024-11-09T23:39:30.039Z · comments (0)

[link] Linkpost: Hypocrisy standoff
Chris_Leong · 2024-09-29T14:27:19.175Z · comments (1)

Bellevue Library Meetup - Nov 23
Cedar (xida-ren) · 2024-11-09T23:05:02.452Z · comments (1)

[link] Formalize the Hashiness Model of AGI Uncontainability
Remmelt (remmelt-ellen) · 2024-11-09T16:10:05.032Z · comments (0)

[link] AI Safety Newsletter #41: The Next Generation of Compute Scale Plus, Ranking Models by Susceptibility to Jailbreaking, and Machine Ethics
Corin Katzke (corin-katzke) · 2024-09-11T19:14:08.274Z · comments (1)

[link] Could Things Be Very Different?—How Historical Inertia Might Blind Us To Optimal Solutions
James Stephen Brown (james-brown) · 2024-09-11T09:53:07.474Z · comments (0)

Meta: On viewing the latest LW posts
quiet_NaN · 2024-08-25T19:31:39.008Z · comments (2)

[link] Universal basic income isn’t always AGI-proof
Kevin Kohler (KevinKohler) · 2024-09-05T15:39:18.389Z · comments (3)

Agency overhang as a proxy for Sharp left turn
Eris (anton-zheltoukhov) · 2024-11-07T12:14:24.333Z · comments (0)

Democracy beyond majoritarianism
Arturo Macias (arturo-macias) · 2024-09-03T15:10:56.284Z · comments (2)

Scattered thoughts on what it means for an LLM to believe
TheManxLoiner · 2024-11-06T22:10:29.429Z · comments (3)

[question] Artificial V/S Organoid Intelligence
10xyz (10xyz-coder) · 2024-10-23T14:31:46.385Z · answers+comments (0)

[question] AMA: International School Student in China
Novice · 2024-10-01T06:00:16.282Z · answers+comments (0)

[link] An "Observatory" For a Shy Super AI?
Sherrinford · 2024-09-27T21:22:40.296Z · comments (0)

Apply to be a mentor in SPAR!
agucova · 2024-11-05T21:32:45.797Z · comments (0)

If I care about measure, choices have additional burden (+AI generated LW-comments)
avturchin · 2024-11-15T10:27:15.212Z · comments (9)

Using Narrative Prompting to Extract Policy Forecasts from LLMs
Max Ghenis (MaxGhenis) · 2024-11-05T04:37:52.004Z · comments (0)

Interest poll: A time-waster blocker for desktop Linux programs
nahoj · 2024-08-22T20:44:04.479Z · comments (5)

Some Comments on Recent AI Safety Developments
testingthewaters · 2024-11-09T16:44:58.936Z · comments (0)

[link] A Logical Proof for the Emergence and Substrate Independence of Sentience
rife (edgar-muniz) · 2024-10-24T21:08:09.398Z · comments (31)

Building Safer AI from the Ground Up: Steering Model Behavior via Pre-Training Data Curation
Antonio Clarke (antonio-clarke) · 2024-09-29T18:48:23.308Z · comments (0)

Towards a Clever Hans Test: Unmasking Sentience Biases in Chatbot Interactions
glykokalyx · 2024-11-10T22:34:58.956Z · comments (0)

[link] Proposing the Conditional AI Safety Treaty (linkpost TIME)
otto.barten (otto-barten) · 2024-11-15T13:59:01.050Z · comments (4)

[question] is there a big dictionary somewhere with all your jargon and acronyms and whatnot?
KvmanThinking (avery-liu) · 2024-10-17T11:30:50.937Z · answers+comments (7)

[question] How do you follow AI (safety) news?
PeterH · 2024-09-24T13:58:48.916Z · answers+comments (2)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

rotatingpaguro on AI #90: The Wall

I somewhat disagree with Tenobrus' commentary about Wolfram.

I watched the full podcast, and my impression was that Wolfram uses a "scientific hat", of which he is well aware of, which comes with a certain ritual and method for looking at new things and learning them. Wolfram is doing the ritual of understanding what Yudkowsky says, which involves picking at the details of everything.

Wolfram often recognizes that maybe he feels like agreeing with something, but "scientifically" he has a duty to pick it apart. I think this has to be understood as a learning process rather than as a state of belief.

nisan on Habryka's Shortform Feed

check out exhibit 13...

omnizoid on The Case For Giving To The Shrimp Welfare Project

As they describe in the report, the philosophical assumptions are mostly inconsequential and assumed for simplicity. The rest of your critique is just describing what they did, not an objection to it. It's not precise and they admit quite high uncertainty, but it's definitely better than alternatives (E.g. neuron counts).

silentbob on The Third Fundamental Question

I'm a bit torn regarding the "predicting how others react to what you say or do, and adjust accordingly" part. On the one hand this is very normal and human and makes sense. It's kind of predictive empathy in a way. On the other hand, thinking so very explicitly about it and trying to steer your behavior in a way so as to get the desired reaction out of another person also feels a bit manipulative and inauthentic. If I knew another person would think that way and plan exactly how they interacted with me, I would find that quite off-putting. But maybe the solution is just "don't overdo it", and/or "only use it in ways the other person would likely consent to" (such as avoiding to accidentally say something hurtful).

habryka4 on OpenAI Email Archives (from Musk v. Altman)

Fixed! That specific response had a very weird thread structure, so makes sense the AI I used got confused. Plausible something else was missing, though I think I've now read through all the original PDFs and didn't see anything new.

satron on Sabotage Evaluations for Frontier Models

I haven't heard of any such corrupt deals with OpenAI or Anthropic concerning governmental oversight over AI technology on the scale that would make me worried. Do you have any links to articles about government employees (who are responsible for oversight) recently signing secret contracts with OpenAI or Anthropic that would prohibit them from giving real feedback on a big enough scale to make it concerning?

satron on Sabotage Evaluations for Frontier Models

I will try to provide another similar analogy. Let's say that a King got tired of his people dying from diseases, so he decided to try a novel method of vaccination.

However, some people were really concerned about that. As far as they were concerned, the default outcome of injecting viruses into the bodies of people is death. And the King wants to vaccinate everyone, so these people create a council of great and worthy thinkers who after thinking for a while come up with a list of reasons why vaccines are going to doom everyone.

However, some other great and worthy thinkers (let's call them "hopefuls") come to the council and give reasons to think that aforementioned reasons are mistaken. Maybe they have done their own research, which seems to vindicate King's plan or at least undermine council's arguments.

And now imagine that King comes down from the castle points his finger at hopefuls' giving arguments for why the arguments proposed by the council is wrong and says "yeah, basically this" and then turns around and goes back to the castle. To me it seems like King's official endorsement of the arguments proposed by hopefuls doesn't really change the ethicality of the situation, as long as King is acting according to hopefuls' plan.

Furthermore, imagine if the one of the hopefuls who come to argue with the council was actually an undercover King. And he gave exactly the same arguments as people before him. This still IMO doesn't change the ethicality of the situation.

chaosmage on What Ketamine Therapy Is Like

First I heard of it was from an anesthesiologist who was very happy with how it is the only way to get to full anesthesia without depressing the patient's heart rate, so for senior patients it was really the only option. In retrospect, his enthusiasm about it does seem suspicious, but we were surrounded by professors and I don't think he was lying.

sam-marks on OpenAI Email Archives (from Musk v. Altman)

FYI it seems like this (important-seeming) email is missing, though the surrounding emails in the exchange seem to be present. (So maybe some other ones are missing too.)

benito on Sabotage Evaluations for Frontier Models

I'm having a hard time following this argument. To be clear, I'm saying that while certain people were in regulatory bodies in the US & UK govts, they actively had secret legal contracts to not criticize the leading industry player, else (prseumably) they could be sued for damages. This is not a past shady deals, this is about current people during their current tenure having been corrupted.