LessWrong 2.0 Reader


Agency overhang as a proxy for Sharp left turn
Eris (anton-zheltoukhov) · 2024-11-07T12:14:24.333Z · comments (0)
Project Adequate: Seeking Cofounders/Funders
Lorec · 2024-11-17T03:12:12.995Z · comments (7)
Theories With Mentalistic Atoms Are As Validly Called Theories As Theories With Only Non-Mentalistic Atoms
Lorec · 2024-11-12T06:45:26.039Z · comments (5)
[link] The Problem with Reasoners by Aidan McLaughlin
t14n (tommy-nguyen-1) · 2024-11-25T20:24:26.021Z · comments (1)
Using Narrative Prompting to Extract Policy Forecasts from LLMs
Max Ghenis (MaxGhenis) · 2024-11-05T04:37:52.004Z · comments (0)
Apply to be a mentor in SPAR!
agucova · 2024-11-05T21:32:45.797Z · comments (0)
If I care about measure, choices have additional burden (+AI generated LW-comments)
avturchin · 2024-11-15T10:27:15.212Z · comments (11)
[link] Is P(Doom) Meaningful? Bayesian vs. Popperian Epistemology Debate
Liron · 2024-11-09T23:39:30.039Z · comments (0)
Educational CAI: Aligning a Language Model with Pedagogical Theories
Bharath Puranam (bharath-puranam) · 2024-11-01T18:55:26.993Z · comments (1)
[link] Formalize the Hashiness Model of AGI Uncontainability
Remmelt (remmelt-ellen) · 2024-11-09T16:10:05.032Z · comments (0)
[question] What are the primary drivers that caused selection pressure for intelligence in humans?
Towards_Keeperhood (Simon Skade) · 2024-11-07T09:40:20.275Z · answers+comments (15)
[question] Is OpenAI net negative for AI Safety?
Lysandre Terrisse · 2024-11-02T16:18:02.859Z · answers+comments (0)
[question] What (if anything) made your p(doom) go down in 2024?
Satron · 2024-11-16T16:46:43.865Z · answers+comments (6)
Effects of Non-Uniform Sparsity on Superposition in Toy Models
Shreyans Jain (shreyans-jain) · 2024-11-14T16:59:43.234Z · comments (3)
[link] Entropic strategy in Two Truths and a Lie
dkl9 · 2024-11-21T22:03:28.986Z · comments (2)
Some Comments on Recent AI Safety Developments
testingthewaters · 2024-11-09T16:44:58.936Z · comments (0)
[question] Noticing the World
EvolutionByDesign (bioluminescent-darkness) · 2024-11-04T16:41:44.696Z · answers+comments (1)
What are Emotions?
Myles H (zarsou9) · 2024-11-15T04:20:27.388Z · comments (13)
Germany-wide ACX Meetup
Fernand0 · 2024-11-17T10:08:54.584Z · comments (0)
Ways to think about alignment
Abhimanyu Pallavi Sudhir (abhimanyu-pallavi-sudhir) · 2024-10-27T01:40:50.762Z · comments (0)
Visualizing small Attention-only Transformers
WCargo (Wcargo) · 2024-11-19T09:37:42.213Z · comments (0)
Towards a Clever Hans Test: Unmasking Sentience Biases in Chatbot Interactions
glykokalyx · 2024-11-10T22:34:58.956Z · comments (0)
San Francisco ACX Meetup “First Saturday”
Nate Sternberg (nate-sternberg) · 2024-10-28T05:05:36.757Z · comments (0)
Your memory eventually drives confidence in each hypothesis to 1 or 0
Crazy philosopher (commissar Yarrick) · 2024-10-28T09:00:27.084Z · comments (6)
The boat
RomanS · 2024-11-22T12:56:45.050Z · comments (0)
Beyond Gaussian: Language Model Representations and Distributions
Matt Levinson · 2024-11-24T01:53:38.156Z · comments (0)
Distributed espionage
margetmagenta · 2024-11-04T19:43:33.316Z · comments (0)
(draft) Cyborg software should be open (?)
AtillaYasar (atillayasar) · 2024-11-01T07:24:51.966Z · comments (5)
LDT (and everything else) can be irrational
Christopher King (christopher-king) · 2024-11-06T04:05:36.932Z · comments (6)
Interview with Bill O’Rourke - Russian Corruption, Putin, Applied Ethics, and More
JohnGreer · 2024-10-27T17:11:28.891Z · comments (0)
[link] Both-Sidesism—When Fair & Balanced Goes Wrong
James Stephen Brown (james-brown) · 2024-11-02T03:04:03.820Z · comments (15)
Antonym Heads Predict Semantic Opposites in Language Models
Jake Ward (jake-ward) · 2024-11-15T15:32:14.102Z · comments (0)
[link] Decorated pedestrian tunnels
dkl9 · 2024-11-24T22:16:03.794Z · comments (3)
Reducing x-risk might be actively harmful
MountainPath · 2024-11-18T14:25:07.127Z · comments (5)
[question] How might language influence how an AI "thinks"?
bodry (plosique) · 2024-10-30T17:41:04.460Z · answers+comments (0)
[link] Higher Order Signs, Hallucination and Schizophrenia
Nicolas Villarreal (nicolas-villarreal) · 2024-11-02T16:33:10.574Z · comments (0)
[link] AI Safety at the Frontier: Paper Highlights, October '24
gasteigerjo · 2024-10-31T00:09:33.522Z · comments (0)
Ultralearning in 80 days
aproteinengine · 2024-11-26T00:01:23.679Z · comments (6)
[link] Paradigm Shifts—change everything... except almost everything
James Stephen Brown (james-brown) · 2024-11-23T18:34:13.088Z · comments (0)
A better “Statement on AI Risk?”
Knight Lee (Max Lee) · 2024-11-25T04:50:29.399Z · comments (4)
MIT FutureTech are hiring a Product and Data Visualization Designer
peterslattery · 2024-11-13T14:48:06.167Z · comments (0)
[link] Sparks of Consciousness
Charlie Sanders (charlie-sanders) · 2024-11-13T04:58:27.222Z · comments (0)
[link] AI & Liability Ideathon
Kabir Kumar (kabir-kumar) · 2024-11-26T13:54:01.820Z · comments (0)
[link] Some Preliminary Notes on the Promise of a Wisdom Explosion
Chris_Leong · 2024-10-31T09:21:11.623Z · comments (0)
Which AI Safety Benchmark Do We Need Most in 2025?
Loïc Cabannes (loic-cabannes) · 2024-11-17T23:50:56.337Z · comments (2)
Gothenburg LW/ACX meetup
Stefan (stefan-1) · 2024-10-29T20:40:22.754Z · comments (0)
Root node of my posts
AtillaYasar (atillayasar) · 2024-11-19T20:09:02.973Z · comments (0)
aspirational leadership
dhruvmethi · 2024-11-20T16:07:43.507Z · comments (0)
[question] Poll: what’s your impression of altruism?
David Gross (David_Gross) · 2024-11-09T20:28:15.418Z · answers+comments (4)
Agenda Manipulation
Pazzaz · 2024-11-09T14:13:33.729Z · comments (0)