LessWrong 2.0 Reader

When fine-tuning fails to elicit GPT-3.5's chess abilities
Theodore Chapman · 2024-06-14T18:50:52.855Z · comments (3)

How I internalized my achievements to better deal with negative feelings
Raymond Koopmanschap · 2024-02-27T15:10:24.149Z · comments (7)

[link] Rowing vs steering
Saul Munn (saul-munn) · 2024-08-10T07:00:17.594Z · comments (2)

Take SCIFs, it’s dangerous to go alone
latterframe · 2024-05-01T08:02:38.067Z · comments (1)

Concrete empirical research projects in mechanistic anomaly detection
Erik Jenner (ejenner) · 2024-04-03T23:07:21.502Z · comments (0)

[link] Soviet comedy film recommendations
Nina Panickssery (NinaR) · 2024-06-09T23:40:58.536Z · comments (11)

Paper Summary: The Effects of Communicating Uncertainty on Public Trust in Facts and Numbers
Jeffrey Heninger (jeffrey-heninger) · 2024-07-09T16:50:05.776Z · comments (2)

Examining Language Model Performance with Reconstructed Activations using Sparse Autoencoders
Evan Anders (evan-anders) · 2024-02-27T02:43:22.446Z · comments (16)

Housing Roundup #7
Zvi · 2024-03-04T15:00:08.192Z · comments (1)

D&D.Sci Long War: Defender of Data-mocracy
aphyer · 2024-04-26T22:30:15.780Z · comments (20)

Was Releasing Claude-3 Net-Negative?
Logan Riggs (elriggs) · 2024-03-27T17:41:56.245Z · comments (5)

Koan: divining alien datastructures from RAM activations
TsviBT · 2024-04-05T18:04:57.280Z · comments (10)

[link] Post series on "Liability Law for reducing Existential Risk from AI"
Nora_Ammann · 2024-02-29T04:39:50.557Z · comments (1)

Wholesomeness and Effective Altruism
owencb · 2024-02-28T20:28:22.175Z · comments (3)

Evidential Cooperation in Large Worlds: Potential Objections & FAQ
Chi Nguyen · 2024-02-28T18:58:25.688Z · comments (5)

US Presidential Election: Tractability, Importance, and Urgency
kuhanj · 2024-05-29T23:52:22.420Z · comments (2)

D&D.Sci Long War: Defender of Data-mocracy Evaluation & Ruleset
aphyer · 2024-05-14T03:35:10.586Z · comments (3)

Formalizing the Informal (event invite)
abramdemski · 2024-09-10T19:22:53.564Z · comments (0)

Principled Satisficing To Avoid Goodhart
JenniferRM · 2024-08-16T19:05:27.204Z · comments (2)

Debate: Get a college degree?
Ben Pace (Benito) · 2024-08-12T22:23:34.744Z · comments (14)

Work with me on agent foundations: independent fellowship
Alex_Altair · 2024-09-21T13:59:16.706Z · comments (3)

A Robust Natural Latent Over A Mixed Distribution Is Natural Over The Distributions Which Were Mixed
johnswentworth · 2024-08-22T19:19:28.940Z · comments (4)

How difficult is AI Alignment?
Sammy Martin (SDM) · 2024-09-13T15:47:10.799Z · comments (6)

Unit economics of LLM APIs
dschwarz · 2024-08-27T16:51:22.692Z · comments (0)

[link] Surgery Works Well Without The FDA
Maxwell Tabarrok (maxwell-tabarrok) · 2024-01-26T13:31:29.968Z · comments (28)

[question] What rationality failure modes are there?
Ulisse Mini (ulisse-mini) · 2024-01-19T09:12:57.924Z · answers+comments (11)

Taking responsibility and partial derivatives
Ruby · 2023-12-31T04:33:51.419Z · comments (1)

Navigating emotions in an uncertain & confusing world
Akash (akash-wasil) · 2023-11-20T18:16:09.492Z · comments (1)

Estimating efficiency improvements in LLM pre-training
Daan · 2024-01-19T19:32:45.124Z · comments (3)

[link] AI Girlfriends Won't Matter Much
Maxwell Tabarrok (maxwell-tabarrok) · 2023-12-23T15:58:30.308Z · comments (22)

[link] We Need Major, But Not Radical, FDA Reform
Maxwell Tabarrok (maxwell-tabarrok) · 2024-02-24T16:54:33.061Z · comments (12)

Are humans misaligned with evolution?
TekhneMakre · 2023-10-19T03:14:14.759Z · comments (13)

[link] cold aluminum for medicine
bhauth · 2023-12-16T14:38:03.260Z · comments (4)

Apply to the Constellation Visiting Researcher Program and Astra Fellowship, in Berkeley this Winter
Nate Thomas (nate-thomas) · 2023-10-26T03:07:34.118Z · comments (10)

MonoPoly Restricted Trust
ymeskhout · 2024-01-02T23:02:55.066Z · comments (37)

[link] Project ideas: Epistemics
Lukas Finnveden (Lanrian) · 2024-01-05T23:41:23.721Z · comments (4)

Deep and obvious points in the gap between your thoughts and your pictures of thought
KatjaGrace · 2024-02-23T07:30:07.461Z · comments (6)

Monthly Roundup #11: October 2023
Zvi · 2023-10-03T14:10:01.686Z · comments (12)

How toy models of ontology changes can be misleading
Stuart_Armstrong · 2023-10-21T21:13:56.384Z · comments (0)

AI Risk and the US Presidential Candidates
Zane · 2024-01-06T20:18:04.945Z · comments (22)

[question] What did you change your mind about in the last year?
mike_hawke · 2023-11-23T20:53:45.664Z · answers+comments (16)

How to partition teams to move fast? Debating "low-dimensional cuts"
jacobjacob · 2023-10-13T21:43:53.067Z · comments (2)

NYT is suing OpenAI&Microsoft for alleged copyright infringement; some quick thoughts
Mikhail Samin (mikhail-samin) · 2023-12-27T18:44:33.976Z · comments (17)

How Emergency Medicine Solves the Alignment Problem
StrivingForLegibility · 2023-12-26T05:24:35.579Z · comments (4)

Estimating effective dimensionality of MNIST models
Arjun Panickssery (arjun-panickssery) · 2023-11-02T14:13:09.012Z · comments (3)

Matrix completion prize results
paulfchristiano · 2023-12-20T15:40:04.281Z · comments (0)

[link] energy landscapes of experts
bhauth · 2023-10-02T14:08:32.370Z · comments (2)

Pivotal Acts might Not be what You Think they are
Johannes C. Mayer (johannes-c-mayer) · 2023-11-05T17:23:50.464Z · comments (13)

What makes teaching math special
Viliam · 2023-12-17T14:15:01.136Z · comments (27)

The Perils of Professionalism
Screwtape · 2023-11-07T00:07:33.213Z · comments (1)

Archive

Recent comments

cfoster0 on What prevents SB-1047 from triggering on deep fake porn/voice cloning fraud?

I’m not sure if you intended the allusion to “the tendentious assumption in the other comment thread that courts are maximally adversarial processes bent on misreading legislation to achieve their perverted ends”, but if it was aimed at the thread I commented on… what? IMO it is fair game to call out as false the claim that

It only counts if the $500m comes from "cyber attacks on critical infrastructure" or "with limited human oversight, intervention, or supervision....results in death, great bodily injury, property damage, or property loss."

even if deepfake harms wouldn’t fall under this condition. Local validity matters.

I agree with you that deepfake harms are unlikely to be direct triggers for the bill’s provisions, for similar reasons as you mentioned.

dagon on Is it rational to modify one's utility function?

so basically, we could make an AI that wants to maximize a variable called Utility

Oh, maybe this is the confusion. It's not a variable called Utility. It's the actual true goal of the agent. We call it "utility" when analyzing decisions, and VNM-rational agents act as if they have a utility function over states of the world, but it doesn't have to be external or programmable.

I'd taken your pseudocode as a shorthand for "design the rational agent such that what it wants is ...". It's not literally a variable, nor a simple piece of code that non-simple code could change.

t3t on What prevents SB-1047 from triggering on deep fake porn/voice cloning fraud?

Notwithstanding the tendentious assumption in the other comment thread that courts are maximally adversarial processes bent on misreading legislation to achieve their perverted ends, I would bet that the relevant courts would not in fact rule that a bunch of deepfaked child porn counted as "Other grave harms to public safety and security that are of comparable severity to the harms described in subparagraphs (A) to (C), inclusive", where those other things are "CBRN > mass casualties", "cyberattack on critical infra", and "autonomous action > mass casualties". Happy to take such a bet at 2:1 odds.

But there are some simpler reasons that particular hypothetical fails:

Image models are just not nearly as expensive to train, so it's unlikely that they'd fall under the definition of a covered model to begin with.
Even if someone used a covered multimodal model, existing models can already do this.

See:

(2) “Critical harm” does not include any of the following:
(A) Harms caused or materially enabled by information that a covered model or covered model derivative outputs if the information is otherwise reasonably publicly accessible by an ordinary person from sources other than a covered model or covered model derivative.

benito on Petrov Day Ceremony (TODAY)

If you’re late call me at 510 998 4771

elityre on A Path out of Insufficient Views

If you're saying we should work on enlightenment before working on AGI x-risk, I disagree.
We may well not have the time.

I am very aware that we may not have time.

But sometimes people will make an argument "unless we figure out X our attempts at resolving x-risk are really really doomed."

Lots of different people, for lots of different versions of X, actually:

solving agent foundations,
scaling human enlightenment (so that we stop flailing, making things worse all the time),
building a culture that can talk about the fact that we can't see or talk about conflict (and so our efforts to do good in the world get predictably coopted by extractive forces).
etc.

I definitely want to know if any of those statements are true, for any particular version of X, even if we don't have time to do X before the deadline.

Not having time doesn't have any b

cronodas on Predictive Processing, Heterosexuality and Delusions of Grandeur

If things go wrong[1] then our neural net will conclude that it has high status despite all evidence to the contrary. We have programmed schizophrenia.

No, you've programmed grandiose delusions - a lot more goes wrong with schizophrenia than just that.

cubefox on [Intuitive self-models] 2. Conscious Awareness

I don’t need a “theory” to explain how a “hypothetical” learning algorithm can build a generative model that can represent this kind of information in its latent variables, and draw appropriate inferences.

Sure, but we would still need a separate explanation if we want to understand how representation/reference works in a model (or in the brain) itself. If we are interested in that, of course. It could be interesting from the standpoint of philosophy of mind, philosophy of language, linguistics, cognitive psychology, and of course machine learning interpretability.

If, when you run these algorithms, you wind up with all kinds of edge cases where it’s unclear what is “about” what, (and you do), then that’s a sign that you should not be treating “aboutness” as a bedrock principle in the first place.

I don't think we did run into any edge cases of representation so far where something partially represents or is partially represented, like chess is partially sport-like. Representation/reference/aboutness doesn't seem a very vague concept. Apparently the difficulty of finding an adequate definition isn't due to vagueness.

That being said, it's clearly not necessary for your theory to cover this topic if you don't find it very interesting and/or you have other objectives.

elityre on A Path out of Insufficient Views

The "one weird trick" to getting the right answers is to discard all stuck, fixed points. Discard all priors and posteriors. Discard all aliefs and beliefs. Discard worldview after worldview. Discard perspective. Discard unity. Discard separation. Discard conceptuality. Discard map, discard territory. Discard past, present, and future. Discard a sense of you. Discard a sense of world. Discard dichotomy and trichotomy. Discard vague senses of wishy-washy flip floppiness. Discard something vs nothing. Discard one vs all. Discard symbols, discard signs, discard waves, discard particles.
All of these things are Ignorance. Discard Ignorance.

They don't seem like ignorance to me! Many of them seem distinctly like knowledge!

You probably don't understand what I just said.

It does seem that way. : P

That's fine.

Ok then, what would be a polite and appropriate way to respond to speech acts like these? Should I state that they sound wrong to me? Should I ignore their content and treat them as your artistic expression?

archimedes on What Depression Is Like

Yes, but with a very different description of the subjective experience -- kind of like getting a sunburn on your back feels very different than most other types of back pain.

elityre on A Path out of Insufficient Views

No worldview will be able to output the best answer in every circumstance. This is not a matter of compute.

Why not? Or why can't you have a worldview that computes the best answer to any given "what should I do" question, to arbitrary but not infinite precision?

Is this something that you think you know mainly because of your personal experience with previous worldviews failing you? Some other way?

Is it something that your reader should be able to infer from this post, or from their own experience of life (assuming they're paying attention)?