LessWrong 2.0 Reader

Altman firing retaliation incoming?
trevor (TrevorWiesinger) · 2023-11-19T00:10:15.645Z · comments (23)

Transfer learning and generalization-qua-capability in Babbage and Davinci (or, why division is better than Spanish)
RP (Complex Bubble Tea) · 2024-02-09T07:00:45.825Z · comments (6)

AI #52: Oops
Zvi · 2024-02-22T21:50:07.393Z · comments (9)

Gemini 1.0
Zvi · 2023-12-07T14:40:05.243Z · comments (7)

Vipassana Meditation and Active Inference: A Framework for Understanding Suffering and its Cessation
Benjamin Sturgeon (benjamin-sturgeon) · 2024-03-21T12:32:22.475Z · comments (8)

GPT-2030 and Catastrophic Drives: Four Vignettes
jsteinhardt · 2023-11-10T07:30:06.480Z · comments (5)

[link] Finding Backward Chaining Circuits in Transformers Trained on Tree Search
abhayesian · 2024-05-28T05:29:46.777Z · comments (1)

Paper in Science: Managing extreme AI risks amid rapid progress
JanB (JanBrauner) · 2024-05-23T08:40:40.678Z · comments (2)

Observations on Teaching for Four Weeks
ClareChiaraVincent · 2024-05-06T16:55:59.315Z · comments (14)

[link] on the dollar-yen exchange rate
bhauth · 2024-04-07T04:49:53.920Z · comments (21)

Unlearning via RMU is mostly shallow
Andy Arditi (andy-arditi) · 2024-07-23T16:07:52.223Z · comments (3)

Consent across power differentials
Ramana Kumar (ramana-kumar) · 2024-07-09T11:42:03.177Z · comments (12)

Changes in College Admissions
Zvi · 2024-04-24T13:50:03.487Z · comments (11)

Why you should learn a musical instrument
cata · 2024-05-15T20:36:16.034Z · comments (23)

[link] Announcing Human-aligned AI Summer School
Jan_Kulveit · 2024-05-22T08:55:10.839Z · comments (0)

On Complexity Science
Garrett Baker (D0TheMath) · 2024-04-05T02:24:32.039Z · comments (19)

So you want to work on technical AI safety
gw · 2024-06-24T14:29:57.481Z · comments (3)

Sherlockian Abduction Master List
Cole Wyeth (Amyr) · 2024-07-11T20:27:00.000Z · comments (60)

[link] DM Parenting
Shoshannah Tekofsky (DarkSym) · 2024-07-16T08:50:08.144Z · comments (4)

Interoperable High Level Structures: Early Thoughts on Adjectives
johnswentworth · 2024-08-22T21:12:38.223Z · comments (1)

[link] in defense of Linus Pauling
bhauth · 2024-06-03T21:27:43.962Z · comments (8)

[link] Anthropic announces interpretability advances. How much does this advance alignment?
Seth Herd · 2024-05-21T22:30:52.638Z · comments (4)

An issue with training schemers with supervised fine-tuning
Fabien Roger (Fabien) · 2024-06-27T15:37:56.020Z · comments (12)

Book Review: Righteous Victims - A History of the Zionist-Arab Conflict
Yair Halberstadt (yair-halberstadt) · 2024-06-24T11:02:03.490Z · comments (8)

[LDSL#0] Some epistemological conundrums
tailcalled · 2024-08-07T19:52:55.688Z · comments (10)

AI #58: Stargate AGI
Zvi · 2024-04-04T13:10:06.342Z · comments (9)

Please do not use AI to write for you
Richard_Kennaway · 2024-08-21T09:53:34.425Z · comments (34)

AI #67: Brief Strange Trip
Zvi · 2024-06-06T18:50:03.514Z · comments (6)

Low Probability Estimation in Language Models
Gabriel Wu (gabriel-wu) · 2024-10-18T15:50:05.947Z · comments (0)

Claude Sonnet 3.5.1 and Haiku 3.5
Zvi · 2024-10-24T14:50:06.286Z · comments (9)

Notes on control evaluations for safety cases
ryan_greenblatt · 2024-02-28T16:15:17.799Z · comments (0)

Bounty: Diverse hard tasks for LLM agents
Beth Barnes (beth-barnes) · 2023-12-17T01:04:05.460Z · comments (31)

Public Weights?
jefftk (jkaufman) · 2023-11-02T02:50:18.095Z · comments (19)

[question] why did OpenAI employees sign
bhauth · 2023-11-27T05:21:28.612Z · answers+comments (23)

Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models
Felix Hofstätter · 2023-11-08T11:37:43.997Z · comments (0)

[link] Chapter 1 of How to Win Friends and Influence People
gull · 2024-01-28T00:32:52.865Z · comments (5)

Job listing: Communications Generalist / Project Manager
Gretta Duleba (gretta-duleba) · 2023-11-06T20:21:03.721Z · comments (7)

They are made of repeating patterns
quetzal_rainbow · 2023-11-13T18:17:43.189Z · comments (4)

The Broken Screwdriver and other parables
bhauth · 2024-03-04T03:34:38.807Z · comments (1)

Wrong answer bias
lukehmiles (lcmgcd) · 2024-02-01T20:05:38.573Z · comments (24)

Should rationalists be spiritual / Spirituality as overcoming delusion
Kaj_Sotala · 2024-03-25T16:48:08.397Z · comments (57)

Experiments as a Third Alternative
Adam Zerner (adamzerner) · 2023-10-29T00:39:31.399Z · comments (21)

Safety First: safety before full alignment. The deontic sufficiency hypothesis.
Chipmonk · 2024-01-03T17:55:19.825Z · comments (3)

On ‘Responsible Scaling Policies’ (RSPs)
Zvi · 2023-12-05T16:10:06.310Z · comments (3)

AISC 2024 - Project Summaries
NickyP (Nicky) · 2023-11-27T22:32:23.555Z · comments (3)

[link] Urging an International AI Treaty: An Open Letter
Olli Järviniemi (jarviniemi) · 2023-10-31T11:26:25.864Z · comments (2)

On the lethality of biased human reward ratings
Eli Tyre (elityre) · 2023-11-17T18:59:02.303Z · comments (10)

Making Bad Decisions On Purpose
Screwtape · 2023-11-09T03:36:59.611Z · comments (8)

What is the next level of rationality?
lsusr · 2023-12-12T08:14:14.846Z · comments (24)

Highlights from Lex Fridman’s interview of Yann LeCun
Joel Burget (joel-burget) · 2024-03-13T20:58:13.052Z · comments (15)

Recent comments

notfnofn on What are some good ways to form opinions on controversial subjects in the current and upcoming era?

It seems to me that you might not have read the question/premise carefully. If you did and stand by this answer/comment, let me know and I'll respond when I have time

cubefox on avturchin's Shortform

Thanks, this was an interesting article. The irony of course being that I, not knowing Russian, read it using Google Translate.

davekasten on davekasten's Shortform

Oh, to be clear I'm not sure this is at all actually likely, but I was curious if anyone had explored the possibility conditional on it being likely

seth-herd on A Case for Conscious Significance rather than Free Will.

I don't know what people mean by "free will" and I don't think they usually do either. Clarifying that is the first project, so I applaud this post.

I substitute the term "self-determination" for "free will", in hopes that it captures more of what people tend to actually care about in this topic: do I control my own future? Framed this way, I think the answer is more interesting: it's "sort of" and "sometimes", rather than a simple yes or no.

This is something I used to think about a lot. I haven't written about it much outside of comments because it's interesting but less pressing than alignment. I think in sum my take agrees with yours, but I don't think the terminology and focus on consciousness here is the right way to convey it.

I think someone who's really concerned that "free will isn't real" would say: sure, they help determine outcomes, but the contents of my consciousness were also determined by previous processes. I didn't pick them. I'm an observer, not a cause. My conscious awareness is an observer. It causes the future, but it doesn't choose; it just predicts outcomes.

So your terminological move:

"You are not really in control of your behaviour."
... becomes...
"The continuation of deterministic forces via genetics and experience are not really in control of your behaviour."

I completely agree that this transformation is correct, but I'm not sure it's fully satisfying. I'm afraid someone who's really bothered by determinism and their lack of "free will" wouldn't find this comforting at all. They'd say, "Well, exactly! Me (or my mind or my brain) being just a continuation of deterministic forces is exactly what bothers me!"

So here I think it's important to break it down further, and ask how someone would want their choices to work in an ideal world (this move is essentially borrowed from Daniel Dennett's "all the varieties of free will worth wanting").

I think the most people would ask for is to have their decisions and therefore their outcomes controlled by their beliefs, their knowledge, their values, and importantly, their efforts at making decisions.

I think these are all perfectly valid labels for important aspects of cognition (with lots of overlap among knowledge, beliefs, and values). Effort at making a decision also plays a huge role, and I think that's a central concern: it seems like I'm working so hard at my decisions, but is that an illusion? I think what we perceive as effort involves more of the conscious predictions you describe (incidentally, I did a whole bunch of work on exactly how the brain does exactly that process of conscious predictions to choose outcomes, best written up in Neural mechanisms of human decision-making, but that's still barely worth reading because it's so specialist-oriented). It also involves more different types of multi-step cognition, like analyzing progress so far and choosing new strategies for decisions or intermediate conclusions for complex decisions.

So my response to people being bothered by being "just" the "continuation of deterministic forces via genetics and experience" is that those are condensed as beliefs, values, knowledge, and skills, and the effort with which those are applied is what determines outcomes and therefore your results and your future.

This leaves intact some concerns about forces you're not conscious of playing a role. Did I decide to do this because it's the best decision, or because an advertiser or a friend put an association or belief in my head in a way I wouldn't endorse on reflection? I think those are valid concerns.

So my answer to "am I really in control of my behavior?" is: sometimes, in some ways - and the exceptions are worth figuring out, so we can have more self-determination in the future.

startattheend on What are some good ways to form opinions on controversial subjects in the current and upcoming era?

I think it's often the case that neither A nor B is true. Common opinions are shallow, often simplified and exaggerated, or even entirely beside the point.
You're asking what a good way to form opinions is. Well, it depends on what you want.

Do you want to know which side you should vote for to bring the future towards the state that you want?
Do you want to figure out which side is the most correct?
Do you want to figure out the actual truth behind the political issue?
Do you want to hold an opinion which won't disrupt your social life too much or make you unpopular?

I expect that these four will bring you to different answers.

(While I think I understand the problem well, I can't promise that I have a good solution. Besides, it's subjective. Since the topic is controversial, any answer I give will be influenced by the very biases that we're potentially interested in avoiding)

By the way, personally, I don't care much what foreign actors (or team A and B) have to say about anything, so it's not a factor which makes a difference to me.

Edit: I should probably have submitted this as a comment and not an answer. Oh well, I will think up an answer if you respond.

richard-horvath on A superficially plausible promising alternate Earth without lockstep

You lost me at "Bywayeans generally save up enough to move out of their parents' houses around age 9". Likely you will lose most people at "ancap".

But these are only minor things, and I can imagine that it would be possible to come up with some plausible explanation, or just to revise one or the other, and keep the core message. I think the main issue with your description of Byway is in these:

"...even though Bywayeans are smarter than Earthlings..."

"Bywayeans have a lot of energy..."

"...they love working and innovating..."

These imply that they are already better than us, and would run the world better even in the structure we have on our real Earth.

The message of Dath Ilan is that they and we are the same people, with the same genetic heritage, hence the same intelligence and vices. But they can still do better, just by having better institutions/methods for cooperation; hence, we can get better too! If you give special powers to the inhabitants of your alternate Earth, it may explain why they are not stuck in the same mire as we are, and imply that we cannot learn from their success.

alexander-gietelink-oldenziel on New intro textbook on AIXI

Thanks a lot!

A few follow-up questions:

By computability level, do you mean Turing degree?

Why can't the universal distribution be constructed for most levels?

What exactly is the coding theorem?

What do you mean by conditioning and planning damaging the computability level, and why is that not so bad?

ruby on A bird's eye view of ARC's research

Curated! I think it's generally great when people explain what they're doing and why in a way legible to those not working on it. Great because it lets others potentially get involved, build on it, expose flaws or omissions, etc. This one seems particularly clear and well written. While I haven't read all of the research, nor am I particularly qualified to comment on it, I like the idea of a principled/systematic approach behind it, in comparison to a lot of work that isn't coming from a deeper, bigger framework.

(While I'm here though, I'll add a link to Dmitry Vaintrob's comment that Jacob Hilton described as "best critique of ARC's research agenda that I have read since we started working on heuristic explanations". Eliciting such feedback is the kind of good thing that comes out of writing up agendas. It's possible or likely Dmitry was already tracking the work and already had these critiques, but a post like this seems like a good way to propagate them and have a public back and forth.)

Roughly speaking, if the scalability of an algorithm depends on unknown empirical contingencies (such as how advanced AI systems generalize), then we try to make worst-case assumptions instead of attempting to extrapolate from today's systems.

I like this attitude. The human default, I think often in alignment work too, is to argue for why one's plan will work and to find stories that support it. Adopting the opposite methodology, especially given the unknowns, is much needed in alignment work.

Overall, this is neat. Kudos to Jacob (and the rest of the team) for taking the time to put this all together. It doesn't seem all that quick to write, and I think it'd be easy to think they ought not to take time off from further object-level research to write it. Thanks!

towards_keeperhood on johnswentworth's Shortform

(There might be a sorta annoying analysis one could do to test my hypothesis: On my hypothesis, the correlation between the intelligence of very intelligent parents and their children would be even a bit less than on the just-independent-mutations hypothesis, because very intelligent people likely also got lucky in how their gene variants work together, but those properties would be unlikely to all be passed along and end up dominant.)

chipmonk on The hostile telepaths problem

This reminds me… maybe muscle tension is a frequent solution to this problem?

Some context: Lately I've been wondering, Why do we often experience feelings as things in the body? For example, why do I feel anxiety in my chest rather than just “knowing” I'm anxious?

For example, my previous chronic neck pain seemed to be related to information that manifested in my neck:

I suspect the feeling in my neck represented the information "I have the choice to leave the social situation I'm in right now" and/or "I am disliking/suppressing myself."

Why might this feeling have manifested in my neck?

What if feelings use the body as a screen to communicate information with others? If you have a certain feeling in your chest, maybe others can see that.

BUT: What if a feeling represents information that your system doesn't want other people to know? Hostile telepaths problem.

In my case:

The feeling represented the awareness that I was insecure, and there were probably situations (probably social situations) in which it partially benefited me to be partially unaware of the fact that I was insecure.

Well, in that case, your system could create muscle tension to "jam the signal"…

If the muscles are stiff, maybe they can't be used as a screen anymore.