LessWrong 2.0 Reader

[link] Come to Manifest 2024 (June 7-9 in Berkeley)
Saul Munn (saul-munn) · 2024-03-27T21:30:17.306Z · comments (2)

[link] the micro-fulfillment cambrian explosion
bhauth · 2023-12-04T01:15:34.342Z · comments (5)

[Intuitive self-models] 5. Dissociative Identity (Multiple Personality) Disorder
Steven Byrnes (steve2152) · 2024-10-15T13:31:46.157Z · comments (7)

Dating Roundup #2: If At First You Don’t Succeed
Zvi · 2024-01-02T16:00:04.955Z · comments (29)

[link] On the Role of Proto-Languages
adamShimi · 2024-09-22T16:50:34.720Z · comments (1)

Ten Modes of Culture War Discourse
jchan · 2024-01-31T13:58:20.572Z · comments (15)

AI #44: Copyright Confrontation
Zvi · 2023-12-28T14:30:10.237Z · comments (13)

[link] OpenAI releases GPT-4o, natively interfacing with text, voice and vision
Martín Soto (martinsq) · 2024-05-13T18:50:52.337Z · comments (23)

Safe Stasis Fallacy
Davidmanheim · 2024-02-05T10:54:44.061Z · comments (2)

Complexity of value but not disvalue implies more focus on s-risk. Moral uncertainty and preference utilitarianism also do.
Chi Nguyen · 2024-02-23T06:10:05.881Z · comments (18)

[link] Unlocking Solutions—By Understanding Coordination Problems
James Stephen Brown (james-brown) · 2024-07-27T04:52:13.435Z · comments (4)

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
leogao · 2023-12-16T05:39:10.558Z · comments (5)

On Anthropic’s Sleeper Agents Paper
Zvi · 2024-01-17T16:10:05.145Z · comments (5)

[Closed] PIBBSS is hiring in a variety of roles (alignment research and incubation program)
Nora_Ammann · 2024-04-09T08:12:59.241Z · comments (0)

[link] How Likely Are Various Precursors of Existential Risk?
NunoSempere (Radamantis) · 2024-10-28T13:27:31.620Z · comments (4)

[link] Breaking Circuit Breakers
mikes · 2024-07-14T18:57:20.251Z · comments (13)

Causal Graphs of GPT-2-Small's Residual Stream
David Udell · 2024-07-09T22:06:55.775Z · comments (7)

AI #50: The Most Dangerous Thing
Zvi · 2024-02-08T14:30:13.168Z · comments (4)

Zvi's Manifold Markets House Rules
Zvi · 2023-11-13T00:28:02.147Z · comments (6)

[link] S-Risks: Fates Worse Than Extinction
aggliu · 2024-05-04T15:30:36.666Z · comments (2)

Be More Katja
Nathan Young · 2024-03-11T21:12:14.249Z · comments (0)

[link] LLMs seem (relatively) safe
JustisMills · 2024-04-25T22:13:06.221Z · comments (24)

AI #76: Six Shorts Stories About OpenAI
Zvi · 2024-08-08T13:50:04.659Z · comments (10)

Fat Tails Discourage Compromise
niplav · 2024-06-17T09:39:16.489Z · comments (5)

AMA: Earning to Give
jefftk (jkaufman) · 2023-11-07T16:20:10.972Z · comments (8)

Trading off Lives
jefftk (jkaufman) · 2024-01-03T03:40:05.603Z · comments (12)

AI #71: Farewell to Chevron
Zvi · 2024-07-04T13:40:05.905Z · comments (9)

AI #40: A Vision from Vitalik
Zvi · 2023-11-30T17:30:08.350Z · comments (12)

Acting Wholesomely
owencb · 2024-02-26T21:49:16.526Z · comments (64)

[link] Open Phil releases RFPs on LLM Benchmarks and Forecasting
LawrenceC (LawChan) · 2023-11-11T03:01:09.526Z · comments (0)

[question] Can we get an AI to "do our alignment homework for us"?
Chris_Leong · 2024-02-26T07:56:22.320Z · answers+comments (33)

Calendar feature geometry in GPT-2 layer 8 residual stream SAEs
Patrick Leask (patrickleask) · 2024-08-17T01:16:53.764Z · comments (0)

Per protocol analysis as medical malpractice
braces · 2024-01-31T16:22:21.367Z · comments (8)

We are headed into an extreme compute overhang
devrandom · 2024-04-26T21:38:21.694Z · comments (33)

2022 (and All Time) Posts by Pingback Count
Raemon · 2023-12-16T21:17:00.572Z · comments (14)

AI #37: Moving Too Fast
Zvi · 2023-11-09T17:50:04.324Z · comments (5)

Schelling points in the AGI policy space
mesaoptimizer · 2024-06-26T13:19:25.186Z · comments (2)

Can we build a better Public Doublecrux?
Raemon · 2024-05-11T19:21:53.326Z · comments (6)

Anthropical Paradoxes are Paradoxes of Probability Theory
Ape in the coat · 2023-12-06T08:16:26.846Z · comments (18)

[link] OpenAI Staff (including Sutskever) Threaten to Quit Unless Board Resigns
Seth Herd · 2023-11-20T14:20:33.539Z · comments (28)

Announcing the Double Crux Bot
sanyer (santeri-koivula) · 2024-01-09T18:54:15.361Z · comments (8)

Parental Writing Selection Bias
jefftk (jkaufman) · 2024-10-13T14:00:03.225Z · comments (3)

Reflections on my first year of AI safety research
Jay Bailey · 2024-01-08T07:49:08.147Z · comments (3)

AI #45: To Be Determined
Zvi · 2024-01-04T15:00:05.936Z · comments (4)

Was Releasing Claude-3 Net-Negative?
Logan Riggs (elriggs) · 2024-03-27T17:41:56.245Z · comments (5)

The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs
Quentin FEUILLADE--MONTIXI (quentin-feuillade-montixi) · 2023-11-07T16:12:20.031Z · comments (20)

[link] Slightly More Than You Wanted To Know: Pregnancy Length Effects
JustisMills · 2024-10-21T01:26:02.030Z · comments (4)

Gradient Descent on the Human Brain
Jozdien · 2024-04-01T22:39:24.862Z · comments (5)

AI #87: Staying in Character
Zvi · 2024-10-29T07:10:08.212Z · comments (3)

Pseudonymity and Accusations
jefftk (jkaufman) · 2023-12-21T19:20:19.944Z · comments (20)

Recent comments

quila on quila's Shortform

something i'd be interested in reading: writings about the authors' alignment ontologies over time, i.e. from when they first heard of AI till now

unexpectedvalues on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

An interesting thing about this proposal is that it would make every state besides CA, TX, OK, and LA pretty much irrelevant for the outcome of the presidential election. E.g. in this election, whichever candidate won CATXOKLA would have enough electoral votes to win the election, even if the other candidate won every swing state.

...which of course would be unfair to the non-CATXOKLA states, but like, not any more unfair than the current system?
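A rough check of the arithmetic behind this claim, assuming "this election" means 2024. The electoral vote counts are the 2020-census apportionment and the 312 to 226 totals are the 2024 result; the choice of seven swing states and the idea of holding every other state fixed at its 2024 outcome are assumptions made for illustration, not part of the original comment.

```python
# Electoral votes under the 2020-census apportionment (in force for 2024).
CATXOKLA = {"CA": 54, "TX": 40, "OK": 7, "LA": 8}

# Assumption: the seven states most often labeled swing states in 2024.
SWING = {"PA": 19, "GA": 16, "NC": 16, "MI": 15, "AZ": 11, "WI": 10, "NV": 6}

MAJORITY = 270
R_2024, D_2024 = 312, 226  # 2024 totals; R carried all seven swing states

bloc = sum(CATXOKLA.values())   # 109
swing = sum(SWING.values())     # 93

# Votes each side holds outside both the merged bloc and the swing states,
# assuming every other state votes as it did in 2024.
r_base = R_2024 - swing - (CATXOKLA["TX"] + CATXOKLA["OK"] + CATXOKLA["LA"])
d_base = D_2024 - CATXOKLA["CA"]


def wins_with_bloc_only(base: int) -> bool:
    """True if base + bloc reaches a majority with zero swing states."""
    return base + bloc >= MAJORITY


print(bloc, swing, r_base, d_base)      # 109 93 164 172
print(wins_with_bloc_only(r_base))      # True (164 + 109 = 273)
print(wins_with_bloc_only(d_base))      # True (172 + 109 = 281)
```

Under these assumptions, either side reaches 270 with the merged bloc alone, while the side without it tops out at 265 or 257 even after sweeping all seven swing states.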

rich-d on Graceful Degradation

Graceful degradation was something that I had originally heard of in a computing context, but that I find has real application in the legal field (which is my field of work). When giving legal advice, whenever possible, you want to give guidance that will work even if only part of it is followed. (Because even as an in-house lawyer, I can pretty much count on my clients ignoring or reinterpreting my advice pretty regularly.)

This is especially important when some action or behavior becomes critical, but only in certain circumstances. For instance, advising my engineering clients NOT to do their own research into our competitors' proprietary technology is very important advice (because if they do, it leads to higher damages if they are found to infringe patents on said technology, and can also put the company at risk for misappropriating another company's trade secrets). On the other hand, if they do learn something proprietary about a competitor, it is critical that they let the legal team know about it, since the consequences of mishandling that information are so high.

So attempting to give gracefully degrading instructions in this space becomes a little self-contradictory if half the instructions are forgotten. "Don't try to reverse engineer competitors' code. But if you do, make sure to tell me about it." This usually results in clients remembering either: "Don't tell the lawyers if you learn competitor information" (resulting in not warning us they have competitor info) or "Be sure to tell the lawyers about any reverse engineering you do" (resulting in teams going out to try to specifically research competitor information).

This type of "Don't ..., But if you do ..." situation resists degrading gracefully, but comes up more than I'd like.

On an unrelated note, when I first learned this term, I just had an image of a very refined woman at a fancy dinner party, taking a sip of her wine and then turning to her husband and saying, "Darling, I love you, but this is simply the worst affair you've ever dragged me out to."

gabriel-brito on A Different Perspective on Rationality - Would This Be Valuable?

Haha, sorry and thank you! Maybe now:
https://www.lesswrong.com/posts/WbQRxeCCmypgKrT7R/when-x-negotiatiates-with-y

gabriel-brito on A Different Perspective on Rationality - Would This Be Valuable?

Thank you so much for this insightful comment! Your words gave me just the encouragement I needed to go ahead with my post, and the references you mentioned were truly inspiring. Knowing about the tradition of using dialogues to explore complex ideas, from Gödel, Escher, Bach to Galileo’s Dialogue, helped me see the potential in this approach to reach different types of readers.

Thanks to your encouragement, I’ve now published my first post! I’d be thrilled to hear any feedback you have, as you so kindly offered. Here’s the link: When X Negotiates with Y. I hope you enjoy it, and thank you again for your support 😊.

austin-chen on 5 homegrown EA projects, seeking small donors

@Matt Putz thanks for supporting Gavin's work and letting us know; I'm very happy to hear that my post helped you find this!

I also encourage others to check out OP's RFPs. I don't know about Gavin, but I was peripherally aware of this RFP, and it wasn't obvious to me that Gavin should have considered applying, for these reasons:

Gavin's work seems aimed internally towards existing EA folks, while this RFP's media/comms examples (at a glance) seem to be aimed externally at public-facing outreach
I'm not sure what typical grant size the OP RFP is targeting, but my cached heuristic is that OP tends to fund projects looking for $100k+ and that smaller projects should look elsewhere (eg through EAIF or LTFF), due to grantmaker capacity constraints on OP's side
Relatedly, the idea of filling out an OP RFP seems somewhat time-consuming and burdensome (eg somewhere between 3 hours and 2 days), so I think many grantees might not consider doing so unless asking for large amounts
Also, the RFP form seems to indicate a turnaround time of 3 months, which might have seemed too slow for a project like Gavin's

I'm evidently wrong on all these points given that OP is going to fund Gavin's project, which is great! So I'm listing these in the spirit of feedback. Some easy wins to encourage smaller projects to apply might be to update the RFP page to 1. list some example grants and grant sizes that were sourced through this, and 2. describe how much time you expect an applicant to take to fill out the form (something EA Funds does, which I appreciate, even if I invariably take much more time than they state).

cstinesublime on keltan's Shortform

I'm curious why you opted for Aristotle (albeit "modern") as the prompt pre-load. Most of those responses seem not directly tethered to Aristotelian concepts/books or even what he directly posits as being the most important skills and faculties of human cognition. Take cold reading, for example: I don't recall anything of the sort anywhere in any Aristotle I've read.

While we're not sure Aristotle himself designed the layout of the corpus, we do know that the Nicomachean Ethics lists the faculties "whereby the soul attains Truth":

Techne (Τέχνη) - which refers to conventional ways of achieving goals, i.e. without deliberation
Episteme (Ἐπιστήμη) - which is apodeiktike, the faculty of arguing from proofs
Phronesis (Φρόνησις) - confusingly translated as "practical wisdom", this refers to the ability to attain goals by means of deliberation. Excellence in phronesis is translated by the Latinate word 'prudence'.
Sophia (Σοφία) - often translated as 'wisdom' - Aristotle calls this the investigation of causes.
Nous (Νοῦς) - which refers to the archai, or the 'first principles'

According to Diogenes Laertius, the corpus (at least as it has come down to us) divides into the practical books and the theoretical. The practical would itself be subdivided between the books on Techne (say, Rhetoric and Poetics) and those on Phronesis (Ethics and Politics); the theoretical is then covered in works like the Metaphysics (which is probably not even a cohesive book, but a hodge-podge), the Categories, etc.

This seems to me a better guide to a timeless education in the Aristotelian tradition, and to how we should approach a modern adaptation.

adam_scholl on adam_scholl's Shortform

It seems the pro-Trump Polymarket whale may have had a real edge after all. Wall Street Journal reports (paywalled link, screenshot) that he’s a former professional trader who commissioned his own polls from a major polling firm using an alternate methodology that he thought would be less biased by preference falsification: the neighbor method, i.e. asking respondents who they expect their neighbors will vote for.

I didn't bet against him, though I strongly considered it; feeling glad this morning that I didn't.

anthony-digiovanni on Winning isn't enough

Without a clear definition of "winning,"

This is part of the problem we're pointing out in the post. We've encountered claims of this "winning" flavor that haven't been made precise, so we survey different things "winning" could mean more precisely, and argue that they're inadequate for figuring out which norms of rationality to adopt.

ricraz on Against Almost Every Theory of Impact of Interpretability

The former can be sufficient—e.g. there are good theoretical researchers who have never done empirical work themselves.

In hindsight I think "close conjunction" was too strong—it's more about picking up the ontologies and key insights from empirical work, which can be possible without following it very closely.