LessWrong 2.0 Reader

[link] MIRI's September 2024 newsletter
Harlan · 2024-09-16T18:15:40.785Z · comments (0)

All The Latest Human tFUS Studies
sarahconstantin · 2024-08-09T22:20:04.561Z · comments (2)

AI #68: Remarkably Reasonable Reactions
Zvi · 2024-06-13T16:30:02.969Z · comments (11)

Thoughts on "The Offense-Defense Balance Rarely Changes"
Cullen (Cullen_OKeefe) · 2024-02-12T03:26:50.662Z · comments (4)

[link] Michael Dickens' Caffeine Tolerance Research
niplav · 2024-09-04T15:41:53.343Z · comments (3)

I'm open for projects (sort of)
cousin_it · 2024-04-18T18:05:01.395Z · comments (13)

AI doing philosophy = AI generating hands?
Wei Dai (Wei_Dai) · 2024-01-15T09:04:39.659Z · comments (22)

AI #75: Math is Easier
Zvi · 2024-08-01T13:40:05.539Z · comments (25)

Startup Roundup #2
Zvi · 2024-08-06T13:30:06.554Z · comments (0)

AI #80: Never Have I Ever
Zvi · 2024-09-10T17:50:08.074Z · comments (20)

[link] Loneliness and suicide mitigation for students using GPT3-enabled chatbots (survey of Replika users in Nature)
Kaj_Sotala · 2024-01-23T14:05:40.986Z · comments (2)

We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap
johnswentworth · 2024-09-19T22:22:05.307Z · comments (47)

Work with me on agent foundations: independent fellowship
Alex_Altair · 2024-09-21T13:59:16.706Z · comments (5)

AI #72: Denying the Future
Zvi · 2024-07-11T15:00:05.865Z · comments (8)

Quick thoughts on the implications of multi-agent views of mind on AI takeover
Kaj_Sotala · 2023-12-11T06:34:06.395Z · comments (14)

~80 Interesting Questions about Foundation Model Agent Safety
RohanS · 2024-10-28T16:37:04.713Z · comments (4)

[link] Rational Animations' intro to mechanistic interpretability
Writer · 2024-06-14T16:10:57.015Z · comments (1)

The Gemini Incident Continues
Zvi · 2024-02-27T16:00:05.648Z · comments (6)

Principled Satisficing To Avoid Goodhart
JenniferRM · 2024-08-16T19:05:27.204Z · comments (2)

[link] Book review: Everything Is Predictable
PeterMcCluskey · 2024-05-27T03:33:53.857Z · comments (0)

[link] Book review: Deep Utopia
PeterMcCluskey · 2024-04-23T19:55:50.417Z · comments (14)

[link] AlphaGeometry: An Olympiad-level AI system for geometry
alyssavance · 2024-01-17T17:17:30.913Z · comments (9)

We ran an AI safety conference in Tokyo. It went really well. Come next year!
Blaine (blaine-rogers) · 2024-07-17T06:55:39.620Z · comments (1)

[link] AI Rights for Human Safety
Simon Goldstein (simon-goldstein) · 2024-08-01T23:01:07.252Z · comments (6)

[link] Paper: Tell, Don't Show- Declarative facts influence how LLMs generalize
Owain_Evans · 2023-12-19T19:14:26.423Z · comments (4)

Monthly Roundup #18: May 2024
Zvi · 2024-05-13T12:30:04.863Z · comments (10)

[link] What Ketamine Therapy Is Like
Sable · 2024-11-11T11:09:08.602Z · comments (6)

[link] Fluent dreaming for language models (AI interpretability method)
tbenthompson (ben-thompson) · 2024-02-06T06:02:59.296Z · comments (5)

Atlantis: Berkeley event venue available for rent
Jonas V (Jonas Vollmer) · 2023-11-22T01:47:12.026Z · comments (0)

On Tapping Out
Screwtape · 2023-11-17T03:23:55.880Z · comments (13)

[link] I'd also take $7 trillion
bhauth · 2024-02-19T03:31:45.552Z · comments (12)

[link] Towards Evaluating AI Systems for Moral Status Using Self-Reports
Ethan Perez (ethan-perez) · 2023-11-16T20:18:51.730Z · comments (3)

AI #53: One More Leap
Zvi · 2024-02-29T16:10:04.049Z · comments (0)

Things Solenoid Narrates
Solenoid_Entity · 2024-04-12T23:57:16.169Z · comments (2)

AI #54: Clauding Along
Zvi · 2024-03-07T16:00:05.066Z · comments (11)

Dating Roundup #3: Third Time’s the Charm
Zvi · 2024-05-08T13:30:03.232Z · comments (27)

[link] NYT on the Manifest forecasting conference
Austin Chen (austin-chen) · 2023-10-09T21:40:16.732Z · comments (14)

Some open-source dictionaries and dictionary learning infrastructure
Sam Marks (samuel-marks) · 2023-12-05T06:05:21.903Z · comments (7)

[link] How people stopped dying from diarrhea so much (& other life-saving decisions)
Writer · 2024-03-16T16:00:47.830Z · comments (0)

A starting point for making sense of task structure (in machine learning)
Kaarel (kh) · 2024-02-24T01:51:49.227Z · comments (2)

AI #32: Lie Detector
Zvi · 2023-10-05T13:50:05.030Z · comments (19)

AI #36: In the Background
Zvi · 2023-11-02T18:00:01.803Z · comments (5)

Userscript to always show LW comments in context vs at the top
Vlad Sitalo (harcisis) · 2023-11-21T17:53:30.418Z · comments (8)

Announcing Atlas Computing
miyazono · 2024-04-11T15:56:31.241Z · comments (4)

[link] LLM Evaluators Recognize and Favor Their Own Generations
Arjun Panickssery (arjun-panickssery) · 2024-04-17T21:09:12.007Z · comments (1)

[link] EPUBs of MIRI Blog Archives and selected LW Sequences
mesaoptimizer · 2023-10-26T14:17:11.538Z · comments (6)

[link] Non-alignment project ideas for making transformative AI go well
Lukas Finnveden (Lanrian) · 2024-01-04T07:23:13.658Z · comments (1)

Truthseeking, EA, Simulacra levels, and other stuff
Elizabeth (pktechgirl) · 2023-10-27T23:56:49.198Z · comments (12)

[link] Chinese scientists acknowledge xrisk & call for international regulatory body [Linkpost]
Akash (akash-wasil) · 2023-11-01T13:28:43.723Z · comments (4)

Laying the Foundations for Vision and Multimodal Mechanistic Interpretability & Open Problems
Sonia Joseph (redhat) · 2024-03-13T17:09:17.027Z · comments (13)

Recent comments

cstinesublime on Ayn Rand’s model of “living money”; and an upside of burnout

As always, I may not be the intended audience, so please excuse my questions that might be patently obvious to the intended audience.

Am I right in understanding a very simplified version of this model is that if you use willpower too much without deriving any net benefits, eventually you'll suffer 'burnout' which really is just a mistrust of using willpower ever, which may have negative effects on other aspects of your life even where willpower is needed like, say, cleaning your house?

Willpower, as I understand it, is another word for 'patience' or 'discipline', variously described as the ability to choose to endure pain (physical or emotional). Whether willpower actually exists is a question I won't get into here; let's assume for the sake of this model that it does, and that it fits the description of the ability to choose to endure pain.

I find this sentence especially alien:

your psyche’s conscious verbal planner “earns” willpower (earns trust with the rest of your psyche) by choosing actions that nourish your fundamental, bottom-up processes in the long run.

What is the "psyche's conscious verbal planner"? I don't know what this is, or what part of my mind, person, identity, totality as an organism, or anything really, I can equate this label to. Also, without examples of which actions nourish (again, would cleaning the house or cooking healthy meals be examples?), which are fundamental and which aren't, it's even harder to pin down what this is and why you attribute willpower to it.

It appears to have the ability to force oneself to go on a date, which really makes the "verbal" descriptor confusing, since a lot of the processes that are involved in going on a date don't feel like they are verbal, lexical, or take the form of the speaker's native language written or spoken. At least in my experience, a lot of the thoughts, feelings, and motivations behind going on a date are not innately verbal for me, and if you asked me "why did you agree to see this person?" - even if I felt no fear of embarrassment explaining my reasons - I'd have a hard time putting that into words. Or the words I'd use would be so impossibly vague ("they seem cool") as to suggest that there was a nonverbal reasoning or motivation.

Would this 'conscious verbal planner' also be the part of my mind and body that searches an online store a week later to see if those shoes I want are on special? Or would you attribute that to a different entity?

Is there an unconscious verbal planner?

When I am thinking very carefully about what I'm saying, but not so minutely that I'm thinking about the correct grammatical use, would the grammar I use be my unconscious verbal planner, while the content of my speech be the conscious verbal planner?

For me, a lot of examples of willpower are nonverbal and come from guilt. Guilt felt as a somatic or bodily thing. I can't verbalize why I feel guilty, although it verbally equates to the words "should", "must", and even "ought" when used as imperatives, not as modals.

ricraz on Why I’m not a Bayesian

Ty for the link but these seem like both clearly bad semantics (e.g. under either of these the second-best hypothesis under consideration might score arbitrarily badly).

gunnar_zarncke on Gunnar_Zarncke's Shortform

Instrumental power-seeking might be less dangerous if the self-model of the agent is large and includes individual humans, groups, or even all of humanity and if we can reliably shape it that way.

It is natural for humans to form a self-model that is bounded by the body, though it is also common for it to be only the brain or the mind, and there are other self-models. See also Intuitive Self-Models.

It is not clear what the self-model of an LLM agent would be. It could be

the temporary state of the execution of the model (or models),
the persistently running model and its memory state,
the compute resources (CPU/GPU/RAM) allocated to run the model and its collection of support programs,
the physical compute resources in some compute center(s),
the compute center as an organizational structure that includes the staff to maintain and operate not only the machines but also the formal organization (after all, without that, the machines will eventually fail), or
ditto, but including all the utilities and suppliers needed to continue to operate it.

There is not as clear a physical boundary as in the human case. But even in the human case, esp. babies depend on caregivers to a large degree.

There are indications that we can shape the self-model of LLMs: see "Self-Other Overlap: A Neglected Approach to AI Alignment".

noah-birnbaum on The Case For Giving To The Shrimp Welfare Project

I really don't like when people downvote so heavily without giving reasons - think this is nicely argued!

One issue I do have is that Bob Fischer, the conductor of the Rethink study, warned about exactly what you are sorta doing here: saying "ah, now we can take x amount of shrimp" and that we can trolley-problem a human for that many. This is just one contention, but I think the point is important, and people willing to take weird/controversial ideas seriously (especially here!) should take it more seriously!

lukehmiles on lukehmiles's Shortform

Yeah I just wanted to check that nobody is giving away money before I go do the exact opposite thing I've been doing

gunnar_zarncke on Alexander Gietelink Oldenziel's Shortform

This sounds related to my complaint about the YUDKOWSKY + WOLFRAM ON AI RISK debate:

I wish there had been some effort to quantify @stephen_wolfram's "pockets of irreducibility" (section 1.2 & 4.2) because if we can prove that there aren't many or they are hard to find & exploit by ASI, then the risk might be lower.

I got this tweet wrong. I meant that if pockets of irreducibility are common and non-pockets are rare and hard to find, then the risk from superhuman AI might be lower. I think Stephen Wolfram's intuition has merit but needs more analysis to be convincing.

ebenezer-dukakis on Alexander Gietelink Oldenziel's Shortform

China has alienated virtually all its neighbours

That sounds like an exaggeration? My impression is that China has OK/good relations with countries such as Vietnam, Cambodia, Pakistan, Indonesia, North Korea, factions in Myanmar. And Russia, of course. If you're serious about this claim, I think you should look at a map, make a list of countries which qualify as "neighbors" based purely on geographic distance, then look up relations for each one.

o-o on O O's Shortform

O1 probably scales to superhuman reasoning:

O1, given maximal compute, solves most AIME questions (one of the hardest benchmarks in existence). If this isn’t gamed by having the solution somewhere in the corpus, then:

- you can make the base model more efficient at thinking

- you can implement the base model more efficiently on hardware

- you can simply wait for hardware to get better

- you can create custom inference chips

Anything wrong with this view? I think agents are unlocked shortly along with or after this too.

ustice on Ayn Rand’s model of “living money”; and an upside of burnout

Thanks for clarifying! Willpower is a tricky concept.

I’ve suffered from depression at times, where getting out of bed felt like a huge exertion of emotional energy. Due to my tenuous control over my focus with ADHD, I often have to repeat in my head what I’m doing so I don’t forget in the middle of it. I’ve also put in 60-hour weeks writing code, both because I’ve had serious deadlines and because time disappeared as I got so wrapped up in it. I’ve stayed on healthy diets for years without problem, and had times where I slipped back to high-sugar foods.

All of these are examples of what people refer to as willpower (or lack thereof). Most of them are from times in my life where I haven’t felt really in control. This is especially true regarding memory. It’s not uncommon for me to realize as I am putting my groceries away that I didn’t get the one item I really needed (and have to go back).

That said, I’m pretty good at grit: I’m willing to put in the work, despite hardships and obstacles. I’m also good at leading by example. I’ll fight the good fight when needed.

All of these different features of me and my brain are wrapped up in the concept of willpower. Each of them is a mixture of conscious and unconscious patterns of behavior (including cognitive).

It’s this distinction that makes me look askance at the concept of willpower. It’s too wrapped up in moral judgement.

I wasn’t diagnosed with ADHD until after my son was. I lived with a lot of guilt and shame because I interpreted the things I struggled with as moral failings, because I just lacked the willpower.

Then I saw how many people struggled with the same sorts of things I did. It was really weird learning that so many things I previously would have described as negative personality traits of mine turned out to be what happens when someone has this quirk in their brain that my son and I have.

Now, I don’t carry that guilt. Now, I know that despite my best efforts, tools, and practices, there are things I’m just going to always struggle with that neurotypical people find easy, and that’s okay. Now, I don’t see myself as having low willpower because of them. Now, I better understand the quirks of my brain, and I am better equipped to mitigate my weaknesses and play into my strengths.

Now, I’m a lot happier and more confident. I wish it hadn’t taken 40 years for me to figure things out, but I’m glad my son is free of that shame and guilt.

I feel pretty lucky: when I was a kid, I had a knack for patterns and abstraction, a fascination with computers, a family that could actually afford one, and people who could help me when I was stuck. I managed to make my hobby into my profession, and still enjoy it as a hobby.

I totally agree that joy and meaning are a balm to burnout. That and vacations; take more vacations.

I guess what I’m saying is be careful to not stretch your metaphors too far, as the details are messy; however, if it helps you to remember to take care of yourself, find joy, and seek meaning, I’m all for it.

rogerdearnaley on Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever's Recent Claims

If these rumors are true, it sounds like we’re already starting to hit the issue I predicted in LLMs May Find It Hard to FOOM. The majority of content on the Internet isn’t written by geniuses with post-doctoral experience, so we’re starting to run out of the highest-quality training material for getting LLMs past doctoral student performance levels. However, as I describe there, this isn’t a wall, it’s just a slowdown: we need to start using AI to generate a lot more high-quality training data. As o1 shows, that’s entirely possible, using inference-time compute scaling and then training on the results.

However, this might be enough to render fast takeoff unlikely, which from an alignment point of view would be an excellent thing.

Now we just need to make sure all that synthetic training data we’re having the AI generate is well aligned.