LessWrong 2.0 Reader

Monthly Roundup #16: March 2024
Zvi · 2024-03-19T13:10:05.529Z · comments (4)

Rational Animations offers animation production and writing services!
Writer · 2024-03-15T17:26:07.976Z · comments (0)

We have promising alignment plans with low taxes
Seth Herd · 2023-11-10T18:51:38.604Z · comments (9)

Sparse autoencoders find composed features in small toy models
Evan Anders (evan-anders) · 2024-03-14T18:00:43.339Z · comments (12)

Monthly Roundup #20: July 2024
Zvi · 2024-07-23T12:50:07.991Z · comments (9)

[link] The Cancer Resolution?
PeterMcCluskey · 2024-07-24T00:25:17.322Z · comments (24)

2024 ACX Predictions: Blind/Buy/Sell/Hold
Zvi · 2024-01-09T19:30:06.388Z · comments (2)

Disentangling four motivations for acting in accordance with UDT
Julian Stastny · 2023-11-05T21:26:22.514Z · comments (3)

[link] AI Safety Memes Wiki
plex (ete) · 2024-07-24T18:53:04.977Z · comments (1)

[link] On Lies and Liars
Gabriel Alfour (gabriel-alfour-1) · 2023-11-17T17:13:03.726Z · comments (4)

Mech Interp Lacks Good Paradigms
Daniel Tan (dtch1997) · 2024-07-16T15:47:32.171Z · comments (0)

Update #2 to "Dominant Assurance Contract Platform": EnsureDone
moyamo · 2023-11-28T18:02:50.367Z · comments (2)

Important open problems in voting
Closed Limelike Curves · 2024-07-01T02:53:44.690Z · comments (1)

Helpful examples to get a sense of modern automated manipulation
trevor (TrevorWiesinger) · 2023-11-12T20:49:57.422Z · comments (3)

Computational Approaches to Pathogen Detection
jefftk (jkaufman) · 2023-11-01T00:30:13.012Z · comments (5)

[link] A computational complexity argument for many worlds
jessicata (jessica.liu.taylor) · 2024-08-13T19:35:10.116Z · comments (15)

5 Reasons Why Governments/Militaries Already Want AI for Information Warfare
trevor (TrevorWiesinger) · 2023-10-30T16:30:38.020Z · comments (0)

[link] Talking With People Who Speak to Congressional Staffers about AI risk
Eneasz · 2023-12-14T17:55:50.606Z · comments (0)

Learning Math in Time for Alignment
Nicholas / Heather Kross (NicholasKross) · 2024-01-09T01:02:37.446Z · comments (3)

In Defense of Lawyers Playing Their Part
Isaac King (KingSupernova) · 2024-07-01T01:32:58.695Z · comments (9)

[link] Manifund: 2023 in Review
Austin Chen (austin-chen) · 2024-01-18T23:50:13.557Z · comments (0)

Preface to the Sequence on LLM Psychology
Quentin FEUILLADE--MONTIXI (quentin-feuillade-montixi) · 2023-11-07T16:12:07.742Z · comments (0)

Is suffering like shit?
KatjaGrace · 2024-05-31T01:20:03.855Z · comments (5)

Being good at the basics
dominicq · 2023-11-04T14:18:50.976Z · comments (1)

[link] Why you, personally, should want a larger human population
jasoncrawford · 2024-02-23T19:48:10.526Z · comments (32)

0. The Value Change Problem: introduction, overview and motivations
Nora_Ammann · 2023-10-26T14:36:15.466Z · comments (0)

An argument that consequentialism is incomplete
cousin_it · 2024-10-07T09:45:12.754Z · comments (27)

[link] An X-Ray is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation
hugofry · 2024-10-07T08:53:14.658Z · comments (0)

DunCon @Lighthaven
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2024-09-29T04:56:27.205Z · comments (0)

[link] NAO Updates, Fall 2024
jefftk (jkaufman) · 2024-10-18T00:00:04.142Z · comments (2)

[link] Big tech transitions are slow (with implications for AI)
jasoncrawford · 2024-10-24T14:25:06.873Z · comments (16)

[link] [Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF
Leon Lang (leon-lang) · 2024-10-22T13:57:41.125Z · comments (0)

Comparing Quantized Performance in Llama Models
NickyP (Nicky) · 2024-07-15T16:01:24.960Z · comments (2)

[question] How unusual is the fact that there is no AI monopoly?
Viliam · 2024-08-16T20:21:51.012Z · answers+comments (15)

Investigating the Ability of LLMs to Recognize Their Own Writing
Christopher Ackerman (christopher-ackerman) · 2024-07-30T15:41:44.017Z · comments (0)

[link] End Single Family Zoning by Overturning Euclid V Ambler
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-26T14:08:45.046Z · comments (1)

An Introduction to Representation Engineering - an activation-based paradigm for controlling LLMs
Jan Wehner · 2024-07-14T10:37:21.544Z · comments (4)

[link] OpenAI, DeepMind, Anthropic, etc. should shut down.
Tamsin Leake (carado-1) · 2023-12-17T20:01:22.332Z · comments (48)

Monthly Roundup #13: December 2023
Zvi · 2023-12-19T15:10:08.293Z · comments (5)

[link] How "Pause AI" advocacy could be net harmful
Tamsin Leake (carado-1) · 2023-12-26T16:19:20.724Z · comments (9)

[link] the subreddit size threshold
bhauth · 2024-01-23T00:38:13.747Z · comments (3)

Being against involuntary death and being open to change are compatible
Andy_McKenzie · 2024-05-27T06:37:27.644Z · comments (5)

If you are also the worst at politics
lukehmiles (lcmgcd) · 2024-05-26T20:07:49.201Z · comments (8)

How I build and run behavioral interviews
benkuhn · 2024-02-26T05:50:05.328Z · comments (6)

A quick experiment on LMs’ inductive biases in performing search
Alex Mallen (alex-mallen) · 2024-04-14T03:41:08.671Z · comments (2)

Video and transcript of presentation on Scheming AIs
Joe Carlsmith (joekc) · 2024-03-22T15:52:03.311Z · comments (1)

Why wasn't preservation with the goal of potential future revival started earlier in history?
Andy_McKenzie · 2024-01-16T16:15:08.550Z · comments (1)

Mapping the semantic void II: Above, below and between token embeddings
mwatkins · 2024-02-15T23:00:09.010Z · comments (4)

D&D.Sci (Easy Mode): On The Construction Of Impossible Structures [Evaluation and Ruleset]
abstractapplic · 2024-05-20T09:38:55.228Z · comments (2)

On "Geeks, MOPs, and Sociopaths"
alkjash · 2024-01-19T21:04:48.525Z · comments (35)

Recent comments

notfnofn on What are some good ways to form opinions on controversial subjects in the current and upcoming era?

It seems to me that you might not have read the question/premise carefully. If you did and stand by this answer/comment, let me know and I'll respond when I have time.

cubefox on avturchin's Shortform

Thanks, this was an interesting article. The irony of course being that I, not knowing Russian, read it using Google Translate.

davekasten on davekasten's Shortform

Oh, to be clear, I'm not sure this is at all actually likely, but I was curious whether anyone had explored the possibility conditional on it being likely.

seth-herd on A Case for Conscious Significance rather than Free Will.

I don't know what people mean by "free will" and I don't think they usually do either. Clarifying that is the first project, so I applaud this post.

I substitute the term "self-determination" for "free will", in hopes that that term captures more of what people tend to actually care about in this topic: do I control my own future? Framed this way, I think the answer is more interesting: it's "sort of" and "sometimes", rather than a simple yes or no.

This is something I used to think about a lot. I haven't written about it much outside of comments because it's interesting but less pressing than alignment. I think in sum my take agrees with yours, but I don't think the terminology and focus on consciousness here is the right way to convey it.

I think someone who's really concerned that "free will isn't real" would say sure they help determine outcomes, but the contents of my consciousness were also determined by previous processes. I didn't pick them. I'm an observer, not a cause. My conscious awareness is an observer. It causes the future, but it doesn't choose, it just predicts outcomes.

So your terminological move:

"You are not really in control of your behaviour."
... becomes...
"The continuation of deterministic forces via genetics and experience are not really in control of your behaviour."

I completely agree that this transformation is correct, but I'm not sure it's fully satisfying. I'm afraid someone who's really bothered by determinism and their lack of "free will" wouldn't find this comforting at all. They'd say "well exactly! Me (or my mind or my brain) being just a continuation of deterministic forces is exactly what bothers me!"

So here I think it's important to break it down further, and ask how someone would want their choices to work in an ideal world (this move is essentially borrowed from Daniel Dennett's "all the varieties of free will worth wanting").

I think the most people would ask for is to have their decisions and therefore their outcomes controlled by their beliefs, their knowledge, their values, and importantly, their efforts at making decisions.

I think these are all perfectly valid labels for important aspects of cognition (with lots of overlap among knowledge, beliefs, and values). Effort at making a decision also plays a huge role, and I think that's a central concern: it seems like I'm working so hard at my decisions, but is that an illusion? I think what we perceive as effort involves more of the conscious predictions you describe (incidentally, I did a whole bunch of work on exactly how the brain does exactly that process of conscious predictions to choose outcomes, best written up in Neural mechanisms of human decision-making, but that's still barely worth reading because it's so specialist-oriented). It also involves more different types of multi-step cognition, like analyzing progress so far and choosing new strategies for decisions or intermediate conclusions for complex decisions.

So my response to people being bothered by being "just" the "continuation of deterministic forces via genetics and experience" is that those are condensed as beliefs, values, knowledge, and skills, and the effort with which those are applied is what determines outcomes and therefore your results and your future.

This leaves intact some concerns about forces you're not conscious of playing a role. Did I decide to do this because it's the best decision, or because an advertiser or a friend put an association or belief in my head in a way I wouldn't endorse on reflection? I think those are valid concerns.

So my answer to "am I really in control of my behavior?" is: sometimes, in some ways - and the exceptions are worth figuring out, so we can have more self-determination in the future.

startattheend on What are some good ways to form opinions on controversial subjects in the current and upcoming era?

I think it's often the case that neither A nor B is true. Common opinions are shallow, often simplified and exaggerated, or even entirely beside the point.
You're asking what a good way to form opinions is. Well, it depends on what you want.

Do you want to know which side you should vote for to bring the future towards the state that you want?
Do you want to figure out which side is the most correct?
Do you want to figure out the actual truth behind the political issue?
Do you want to hold an opinion which won't disrupt your social life too much or make you unpopular?

I expect that these four will bring you to different answers.

(While I think I understand the problem well, I can't promise that I have a good solution. Besides, it's subjective. Since the topic is controversial, any answer I give will be influenced by the very biases that we're potentially interested in avoiding)

By the way, personally, I don't care much what foreign actors (or team A and B) have to say about anything, so it's not a factor which makes a difference to me.

Edit: I should probably have submitted this as a comment and not an answer. Oh well, I will think up an answer if you respond.

richard-horvath on A superficially plausible promising alternate Earth without lockstep

You lost me at "Bywayeans generally save up enough to move out of their parents' houses around age 9". Likely you will lose most people at "ancap".

But these are only minor things, and I can imagine that it would be possible to come up with some plausible explanation, or just to revise one or the other, and keep the core message. I think the main issues with your description of Byway are in these:

"...even though Bywayeans are smarter than Earthlings..."

"Bywayeans have a lot of energy..."

"...they love working and innovating..."

These imply that they are already better than us, and would run the world better even in the structure we have on our real Earth.

The message of Dath Ilan is that they and we are the same people, with the same genetic heritage, and hence the same intelligence and vices. But they can still do better, just by having better institutions/methods for cooperation; hence, we can get better too! If you give special powers to the inhabitants of your alternate Earth, it may explain why they are not stuck in the same mire as we are, and imply we cannot learn from their success.

alexander-gietelink-oldenziel on New intro textbook on AIXI

Thanks a lot!

A few follow-up questions:

By computability level, do you mean Turing degree?

Why can't the universal distribution be constructed for most levels?

What exactly is the coding theorem?

What do you mean by conditioning and planning damaging the computability level, and why is it not so bad?

ruby on A bird's eye view of ARC's research

Curated! I think it's generally great when people explain what they're doing and why in a way legible to those not working on it. Great because it lets others potentially get involved, build on it, expose flaws or omissions, etc. This one seems particularly clear and well written. While I haven't read all of the research, nor am I particularly qualified to comment on it, I like the idea of a principled/systematic approach behind it, in comparison to a lot of work that isn't built on a deeper, bigger framework.

(While I'm here though, I'll add a link to Dmitry Vaintrob's comment that Jacob Hilton described as "best critique of ARC's research agenda that I have read since we started working on heuristic explanations". Eliciting such feedback is the kind of good thing that comes out of writing up agendas: it's possible or likely Dmitry was already tracking the work and already had these critiques, but a post like this seems like a good way to propagate them and have a public back and forth.)

Roughly speaking, if the scalability of an algorithm depends on unknown empirical contingencies (such as how advanced AI systems generalize), then we try to make worst-case assumptions instead of attempting to extrapolate from today's systems.

I like this attitude. The human standard, I think often in alignment work too, is to argue why one's plan will work and find stories for that. Adopting the opposite methodology, especially given the unknowns, is much needed in alignment work.

Overall, this is neat. Kudos to Jacob (and the rest of the team) for taking the time to put this all together. It doesn't seem all that quick to write, and I think it'd be easy to think they ought not to take time off from further object-level research to write it. Thanks!

towards_keeperhood on johnswentworth's Shortform

(There might be a sorta annoying analysis one could do to test my hypothesis: On my hypothesis, the correlation between the intelligence of very intelligent parents and their children would be even a bit less than on the just-independent-mutations hypothesis, because very intelligent people likely also got lucky in how their gene variants work together, but those properties would be unlikely to all be passed along and end up dominant.)

chipmonk on The hostile telepaths problem

This reminds me… maybe muscle tension is a frequent solution to this problem?

Some context: Lately I've been wondering, Why do we often experience feelings as things in the body? For example, why do I feel anxiety in my chest rather than just “knowing” I'm anxious?

For example, my previous chronic neck pain seemed to be related to information that manifested in my neck:

I suspect the feeling in my neck represented the information "I have the choice to leave the social situation I'm in right now" and/or "I am disliking/suppressing myself."

Why might this feeling have manifested in my neck?

What if feelings use the body as a screen to communicate information with others? If you have a certain feeling in your chest, maybe others can see that.

BUT: What if a feeling represents information that your system doesn't want other people to know? Hostile telepaths problem.

In my case:

The feeling represented the awareness that I was insecure, and there were probably situations (probably social situations) in which it partially benefited me to be partially unaware of the fact that I was insecure.

Well, in that case, your system could create muscle tension to "jam the signal"…

If the muscles are stiff, maybe they can't be used as a screen anymore.