LessWrong 2.0 Reader

[link] Why Swiss watches and Taylor Swift are AGI-proof
Kevin Kohler (KevinKohler) · 2024-09-05T13:23:27.033Z · comments (11)

Automating LLM Auditing with Developmental Interpretability
htlou · 2024-09-04T15:50:04.337Z · comments (0)

[question] Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception?
David Scott Krueger (formerly: capybaralet) (capybaralet) · 2024-09-04T12:40:07.678Z · answers+comments (7)

[link] Update on the Mysterious Trump Buyers on Polymarket
Annapurna (jorge-velez) · 2024-11-04T19:22:06.540Z · comments (9)

OpenAI defected, but we can take honest actions
Remmelt (remmelt-ellen) · 2024-10-21T08:41:25.728Z · comments (15)

Is Text Watermarking a lost cause?
egor.timatkov · 2024-10-01T16:20:51.113Z · comments (13)

Training a Sparse Autoencoder in < 30 minutes on 16GB of VRAM using an S3 cache
Louka Ewington-Pitsos (louka-ewington-pitsos) · 2024-08-24T07:39:00.057Z · comments (0)

Invitation to lead a project at AI Safety Camp (Virtual Edition, 2025)
Linda Linsefors · 2024-08-23T14:18:24.327Z · comments (2)

[link] some questionable space launch guns
bhauth · 2024-10-13T22:52:26.418Z · comments (0)

Bridging the VLM and mech interp communities for multimodal interpretability
Sonia Joseph (redhat) · 2024-10-28T14:41:41.969Z · comments (5)

[link] Four Levels of Voting Methods
hive · 2024-09-26T18:15:00.565Z · comments (3)

[link] AlignedCut: Visual Concepts Discovery on Brain-Guided Universal Feature Space
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-14T23:23:26.296Z · comments (1)

[link] Instruction Following without Instruction Tuning
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-24T13:49:09.078Z · comments (0)

My career exploration: Tools for building confidence
lynettebye · 2024-09-13T11:37:55.843Z · comments (0)

[link] Will we ever run out of new jobs?
Kevin Kohler (KevinKohler) · 2024-08-19T15:04:03.849Z · comments (7)

Physical Therapy Sucks (but have you tried hiding it in some peanut butter?)
Declan Molony (declan-molony) · 2024-09-10T05:54:47.000Z · comments (12)

Appealing to the Public
jefftk (jkaufman) · 2024-10-23T19:00:07.669Z · comments (0)

Reducing global AI competition through the Commerce Control List and Immigration reform: a dual-pronged approach
Ben Smith (ben-smith) · 2024-09-03T05:28:24.549Z · comments (2)

Slave Morality: A place for every man and every man in his place
Martin Sustrik (sustrik) · 2024-09-19T04:20:04.491Z · comments (7)

[link] My lukewarm take on GLP-1 agonists
George3d6 · 2024-08-26T12:34:27.929Z · comments (0)

Interview with Robert Kralisch on Simulators
WillPetillo · 2024-08-26T05:49:15.543Z · comments (0)

[link] CultFrisbee
Gauraventh (aryangauravyadav) · 2024-08-11T21:36:36.550Z · comments (3)

Hiring a writer to co-author with me (Spencer Greenberg for ClearerThinking.org)
spencerg · 2024-10-27T17:34:50.479Z · comments (0)

[question] Is there a CFAR handbook audio option?
FinalFormal2 · 2024-10-26T17:08:36.480Z · answers+comments (0)

[link] Non-Transactional Compliments
Jonathan Moregård (JonathanMoregard) · 2024-08-09T13:42:16.471Z · comments (0)

[question] Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?
SpectrumDT · 2024-11-04T15:20:14.822Z · answers+comments (11)

Review: Dr Stone
ProgramCrafter (programcrafter) · 2024-09-29T10:35:53.175Z · comments (5)

Thinking in 2D
sarahconstantin · 2024-10-20T19:30:05.842Z · comments (0)

[link] Why good things often don’t lead to better outcomes
DMMF · 2024-09-19T16:37:07.778Z · comments (1)

Announcing the Ultimate Jailbreaking Championship
InnerHufflepuff (grayswan) · 2024-09-04T00:35:31.234Z · comments (1)

Simulation-aware causal decision theory: A case for one-boxing in CDT
kongus_bongus · 2024-08-09T18:09:20.013Z · comments (11)

Join a LessWrong Team for the Unaging System Challenge
Crissman · 2024-10-23T06:01:08.018Z · comments (5)

Advisors for Smaller Major Donors?
jefftk (jkaufman) · 2024-11-06T14:30:06.187Z · comments (2)

[link] Where is the Learn Everything System?
Shoshannah Tekofsky (DarkSym) · 2024-09-27T21:30:16.379Z · comments (8)

[link] Fragile, Robust, and Antifragile Preference Satisfaction
adamShimi · 2024-11-02T17:25:55.986Z · comments (0)

Meme Talking Points
ymeskhout · 2024-11-06T15:27:54.024Z · comments (0)

[link] Pronouns are Annoying
ymeskhout · 2024-09-18T13:30:04.620Z · comments (21)

[link] Benefits of Psyllium Dietary Fiber in Particular
Brendan Long (korin43) · 2024-08-28T18:13:23.891Z · comments (7)

[link] AI x Human Flourishing: Introducing the Cosmos Institute
Brendan McCord (brendan-mccord) · 2024-09-05T18:23:32.690Z · comments (5)

Against Explosive Growth
c.trout (ctrout) · 2024-09-04T21:45:03.120Z · comments (1)

[link] Verification methods for international AI agreements
Akash (akash-wasil) · 2024-08-31T14:58:10.986Z · comments (1)

[link] AI & wisdom 2: growth and amortised optimisation
L Rudolf L (LRudL) · 2024-10-28T21:07:39.449Z · comments (0)

[link] AI & wisdom 3: AI effects on amortised optimisation
L Rudolf L (LRudL) · 2024-10-28T21:08:56.604Z · comments (0)

Are LLMs on the Path to AGI?
Davidmanheim · 2024-08-30T03:14:04.710Z · comments (2)

[link] The Ap Distribution
criticalpoints · 2024-08-24T21:45:35.029Z · comments (3)

[question] Looking to interview AI Safety researchers for a book
jeffreycaruso · 2024-08-24T19:57:33.119Z · answers+comments (0)

Pomodoro Method Randomized Self Experiment
niplav · 2024-09-29T21:55:04.740Z · comments (2)

[link] Runner's High On Demand: A Story of Luck & Persistence
Shoshannah Tekofsky (DarkSym) · 2024-09-29T17:15:29.494Z · comments (6)

Inverse Problems In Everyday Life
silentbob · 2024-10-15T11:42:30.276Z · comments (1)

The deepest atheist: Sam Altman
Trey Edwin (Paolo Vivaldi) · 2024-10-10T03:27:34.465Z · comments (2)

Recent comments

alexander-gietelink-oldenziel on adam_scholl's Shortform

Yes. https://www.lesswrong.com/posts/tDkYdyJSqe3DddtK4/alexander-gietelink-oldenziel-s-shortform?commentId=JqDaYkRyw2WSAZLDg

lc on Shortform

I regret that both factions couldn't lose.

anthony-digiovanni on Winning isn't enough

Adding to Jesse's comment, the "We’ve often heard things along the lines of..." line refers both to personal communications and to various comments we've seen, e.g.:

[link] [EA(p) · GW(p)]: "Since this intuition leads to the (surely false) conclusion that a rational beneficent agent might just as well support the For Malaria Foundation as the Against Malaria Foundation, it seems to me that we have very good reason to reject that theoretical intuition"
[link] [EA(p) · GW(p)]: "including a few mildly stubborn credence functions in some judiciously chosen representors can entail effective altruism from the longtermist perspective is a fool’s errand. Yet this seems false"
[link] [LW(p) · GW(p)]: "I think that if you try to get any meaningful mileage out of the maximality rule ... basically everything becomes permissible, which seems highly undesirable"
- (Also, as we point out in the post, this is only true insofar as you only use maximality, applied to total consequences. You can still regard obviously evil things as unacceptable on non-consequentialist grounds, for example.)

quila on quila's Shortform

something i'd be interested in reading: writings about the authors' alignment ontologies over time, i.e. from when they first heard of AI till now

unexpectedvalues on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

An interesting thing about this proposal is that it would make every state besides CA, TX, OK, and LA pretty much irrelevant for the outcome of the presidential election. E.g. in this election, whichever candidate won CATXOKLA would have enough electoral votes to win the election, even if the other candidate won every swing state.

...which of course would be unfair to the non-CATXOKLA states, but like, not any more unfair than the current system?
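The arithmetic behind that claim can be checked with a rough sketch. The electoral-vote counts below are assumed from the 2024 apportionment, the choice of seven swing states is an assumption, and the "safe" totals for each party outside the bloc are an illustrative split that the comment itself does not state.

```python
# Electoral votes under the 2024 apportionment (assumed; not given in the comment).
BLOC = {"CA": 54, "TX": 40, "OK": 7, "LA": 8}
SWING = {"PA": 19, "GA": 16, "NC": 16, "MI": 15, "AZ": 11, "WI": 10, "NV": 6}
MAJORITY = 270

# Assumed "safe" electoral votes per party, excluding both the bloc and the
# swing states. This split is an illustrative assumption.
SAFE_OUTSIDE = {"D": 172, "R": 164}


def bloc_winner_total(party: str) -> int:
    """Votes for a party that wins the bloc but loses every swing state."""
    return SAFE_OUTSIDE[party] + sum(BLOC.values())


def bloc_decides() -> bool:
    """True if winning the bloc alone reaches a majority for either party."""
    return all(bloc_winner_total(p) >= MAJORITY for p in SAFE_OUTSIDE)


if __name__ == "__main__":
    print("bloc votes:", sum(BLOC.values()))
    for party in SAFE_OUTSIDE:
        print(party, "wins bloc, loses all swing states:", bloc_winner_total(party))
    print("bloc decides the election:", bloc_decides())
```

Under these assumptions the bloc carries 109 votes, so either party would clear 270 without any swing state. A different safe-state split could change that conclusion.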

rich-d on Graceful Degradation

Graceful degradation was something that I had originally heard of in a computing context, but that I find has real application in the legal field (which is my field of work). When giving legal advice, whenever possible, you want to give guidance that will work even if only part of it is followed. (Because even as an in-house lawyer, I can pretty much count on my clients ignoring or reinterpreting my advice pretty regularly.)

This is especially important when some action or behavior becomes critical, but only in certain circumstances. For instance, advising my engineering clients NOT to do their own research into our competitors' proprietary technology is very important advice (because if they do, it leads to higher damages if they are found to infringe patents on said technology, and can also put the company at risk for misappropriating another company's trade secrets). On the other hand, if they are to learn something proprietary about a competitor, it is critical that they let the legal team know about it, since the consequences of mishandling that information are so high.

So attempting to give gracefully degrading instructions in this space becomes a little self-contradictory if half the instructions are forgotten. "Don't try to reverse engineer competitors' code. But if you do, make sure to tell me about it." This usually results in clients remembering either: "Don't tell the lawyers if you learn competitor information" (resulting in not warning us they have competitor info) or "Be sure to tell the lawyers about any reverse engineering you do" (resulting in teams going out to try to specifically research competitor information).

This type of "Don't ..., But if you do ..." situation resists degrading gracefully, but comes up more than I'd like.

On an unrelated note, when I first learned this term, I just had an image of a very refined woman at a fancy dinner party, taking a sip of her wine and then turning to her husband and saying, "Darling, I love you, but this is simply the worst affair you've ever dragged me out to."

gabriel-brito on A Different Perspective on Rationality - Would This Be Valuable?

Haha, sorry and thank you! Maybe now:
https://www.lesswrong.com/posts/WbQRxeCCmypgKrT7R/when-x-negotiatiates-with-y

gabriel-brito on A Different Perspective on Rationality - Would This Be Valuable?

Thank you so much for this insightful comment! Your words gave me just the encouragement I needed to go ahead with my post, and the references you mentioned were truly inspiring. Knowing about the tradition of using dialogues to explore complex ideas, from Gödel, Escher, Bach to Galileo’s Dialogue, helped me see the potential in this approach to reach different types of readers.

Thanks to your encouragement, I’ve now published my first post! I’d be thrilled to hear any feedback you have, as you so kindly offered. Here’s the link: When X Negotiates with Y. I hope you enjoy it, and thank you again for your support 😊.

austin-chen on 5 homegrown EA projects, seeking small donors

@Matt Putz thanks for supporting Gavin's work and letting us know; I'm very happy to hear that my post helped you find this!

I also encourage others to check out OP's RFPs. I don't know about Gavin, but I was peripherally aware of this RFP, and it wasn't obvious to me that Gavin should have considered applying, for these reasons:

- Gavin's work seems aimed internally towards existing EA folks, while this RFP's media/comms examples (at a glance) seem to be aimed externally at public-facing outreach
- I'm not sure what typical grant size the OP RFP is targeting, but my cached heuristic is that OP tends to fund projects looking for $100k+ and that smaller projects should look elsewhere (eg through EAIF or LTFF), due to grantmaker capacity constraints on OP's side
- Relatedly, filling out an OP RFP seems somewhat time-consuming and burdensome (eg somewhere between 3 hours and 2 days), so I think many grantees might not consider doing so unless asking for large amounts
- Also, the RFP form seems to indicate a turnaround time of 3 months, which might have seemed too slow for a project like Gavin's

I'm evidently wrong on all these points given that OP is going to fund Gavin's project, which is great! So I'm listing these in the spirit of feedback. Some easy wins to encourage smaller projects to apply might be to update the RFP page to 1. list some example grants and grant sizes that were sourced through this, and 2. describe how much time you expect an applicant to take to fill out the form (something EA Funds does, which I appreciate, even if I invariably take much more time than they state).

cstinesublime on keltan's Shortform

I'm curious why you opted for Aristotle (albeit "modern") as the prompt pre-load? Most of those responses don't seem directly tethered to Aristotelian concepts or books, or even to what he directly posits as the most important skills and faculties of human cognition. Take cold reading, for example: I don't recall anything of the sort anywhere in any Aristotle I've read.

While we're not sure Aristotle himself designed the layout of the corpus, we do know that the Nicomachean Ethics lists the faculties "whereby the soul attains Truth":

- Techne (Τέχνη) - which refers to conventional ways of achieving goals, i.e. without deliberation
- Episteme (Επιστήμη) - which is apodeiktike, or the faculty of arguing from proofs
- Phronesis (Φρόνησις) - confusingly translated as "practical wisdom", this refers to the ability to attain goals by means of deliberation. Excellence in phronesis is rendered by the Latinate word 'prudence'.
- Sophia (Σοφία) - often translated as 'wisdom' - Aristotle calls this the investigation of causes.
- Nous (Νους) - which refers to the archai, or the 'first principles'

According to Diogenes Laertius, the corpus (at least as it has come to us) divides into the practical books and the theoretical. The practical would itself be subdivided between the books on Techne (say Rhetoric and Poetics) and those on Phronesis (Ethics and Politics). The theoretical is then covered in works like the Metaphysics (which is probably not even a cohesive book, but a hodge-podge), the Categories, etc.

This seems to me a better guide to timeless education in the Aristotelian tradition, and to how we should shape a modern adaptation.