LessWrong 2.0 Reader

The first future and the best future
KatjaGrace · 2024-04-25T06:40:04.510Z · comments (12)

Demystifying "Alignment" through a Comic
milanrosko · 2024-06-09T08:24:22.454Z · comments (19)

Skills I'd like my collaborators to have
Raemon · 2024-02-09T08:20:37.686Z · comments (9)

One Day Sooner
Screwtape · 2023-11-02T19:00:58.427Z · comments (7)

Picking Mentors For Research Programmes
Raymond D · 2023-11-10T13:01:14.197Z · comments (8)

Why I'm doing PauseAI
Joseph Miller (Josephm) · 2024-04-30T16:21:54.156Z · comments (16)

New LessWrong feature: Dialogue Matching
jacobjacob · 2023-11-16T21:27:16.763Z · comments (22)

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks
Lucius Bushnaq (Lblack) · 2024-05-20T17:53:25.985Z · comments (4)

[link] A case for AI alignment being difficult
jessicata (jessica.liu.taylor) · 2023-12-31T19:55:26.130Z · comments (56)

New LessWrong review winner UI ("The LeastWrong" section and full-art post pages)
kave · 2024-02-28T02:42:05.801Z · comments (64)

On the future of language models
owencb · 2023-12-20T16:58:28.433Z · comments (17)

Scaling and evaluating sparse autoencoders
leogao · 2024-06-06T22:50:39.440Z · comments (6)

A bird's eye view of ARC's research
Jacob_Hilton · 2024-10-23T15:50:06.123Z · comments (12)

[link] My techno-optimism [By Vitalik Buterin]
habryka (habryka4) · 2023-11-27T23:53:35.859Z · comments (17)

In favour of exploring nagging doubts about x-risk
owencb · 2024-06-25T23:52:01.322Z · comments (2)

[question] What convincing warning shot could help prevent extinction from AI?
Charbel-Raphaël (charbel-raphael-segerie) · 2024-04-13T18:09:29.096Z · answers+comments (18)

Nonlinear’s Evidence: Debunking False and Misleading Claims
KatWoods (ea247) · 2023-12-12T13:16:12.008Z · comments (171)

Deception Chess: Game #1
Zane · 2023-11-03T21:13:55.777Z · comments (19)

Backdoors as an analogy for deceptive alignment
Jacob_Hilton · 2024-09-06T15:30:06.172Z · comments (2)

[link] Transformer Circuit Faithfulness Metrics Are Not Robust
Joseph Miller (Josephm) · 2024-07-12T03:47:30.077Z · comments (5)

SAE reconstruction errors are (empirically) pathological
wesg (wes-gurnee) · 2024-03-29T16:37:29.608Z · comments (16)

[link] Carl Sagan, nuking the moon, and not nuking the moon
eukaryote · 2024-04-13T04:08:50.166Z · comments (8)

Dreams of AI alignment: The danger of suggestive names
TurnTrout · 2024-02-10T01:22:51.715Z · comments (59)

[link] The Witness
Richard_Ngo (ricraz) · 2023-12-03T22:27:16.248Z · comments (4)

LLMs can learn about themselves by introspection
Felix J Binder (fjb) · 2024-10-18T16:12:51.231Z · comments (38)

[link] Poker is a bad game for teaching epistemics. Figgie is a better one.
rossry · 2024-07-08T06:05:20.459Z · comments (47)

What happens if you present 500 people with an argument that AI is risky?
KatjaGrace · 2024-09-04T16:40:03.562Z · comments (7)

[link] Notes from a Prompt Factory
Richard_Ngo (ricraz) · 2024-03-10T05:13:39.384Z · comments (19)

Lsusr's Rationality Dojo
lsusr · 2024-02-13T05:52:03.757Z · comments (17)

[link] A Chess-GPT Linear Emergent World Representation
Adam Karvonen (karvonenadam) · 2024-02-08T04:25:15.222Z · comments (14)

Response to nostalgebraist: proudly waving my moral-antirealist battle flag
Steven Byrnes (steve2152) · 2024-05-29T16:48:29.408Z · comments (29)

LLM Applications I Want To See
sarahconstantin · 2024-08-19T21:10:03.101Z · comments (5)

Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs
L Rudolf L (LRudL) · 2024-07-08T22:24:38.441Z · comments (28)

On Dwarksh’s Podcast with Leopold Aschenbrenner
Zvi · 2024-06-10T12:40:03.348Z · comments (7)

General Thoughts on Secular Solstice
Jeffrey Heninger (jeffrey-heninger) · 2024-03-23T18:48:43.940Z · comments (60)

[link] LessOnline (May 31—June 2, Berkeley, CA)
Ben Pace (Benito) · 2024-03-26T02:34:00.000Z · comments (24)

A simple model of math skill
Alex_Altair · 2024-07-21T18:57:33.697Z · comments (16)

Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren't scheming
Buck · 2024-10-10T13:36:53.810Z · comments (2)

On the Executive Order
Zvi · 2023-11-01T14:20:01.657Z · comments (4)

[link] Advice for journalists
Nathan Young · 2024-10-07T16:46:40.929Z · comments (53)

[link] Advice for Activists from the History of Environmentalism
Jeffrey Heninger (jeffrey-heninger) · 2024-05-16T18:40:02.064Z · comments (8)

Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small
Joseph Bloom (Jbloom) · 2024-02-02T06:54:53.392Z · comments (37)

[link] The Minority Faction
Richard_Ngo (ricraz) · 2024-06-24T20:01:27.436Z · comments (6)

[link] CIV: a story
Richard_Ngo (ricraz) · 2024-06-15T22:36:50.415Z · comments (6)

[link] My cover story in Jacobin on AI capitalism and the x-risk debates
garrison · 2024-02-12T23:34:16.526Z · comments (5)

Announcing the London Initiative for Safe AI (LISA)
James Fox · 2024-02-02T23:17:47.011Z · comments (0)

[link] "Deep Learning" Is Function Approximation
Zack_M_Davis · 2024-03-21T17:50:36.254Z · comments (28)

[Valence series] 1. Introduction
Steven Byrnes (steve2152) · 2023-12-04T15:40:21.274Z · comments (14)

Why comparative advantage does not help horses
Sherrinford · 2024-09-30T22:27:57.450Z · comments (10)

Learning-theoretic agenda reading list
Vanessa Kosoy (vanessa-kosoy) · 2023-11-09T17:25:35.046Z · comments (0)


Recent comments

linda-linsefors on AI Safety Camp 10

You can find their preferred contact info in each document in the Team section.

mikkel-wilson on The hostile telepaths problem

For the record, this sentence popped into my head while reading this: "Wait, but what if I'm Omega-V, and [Valentine] is a two boxer?"

abramdemski on Three Notions of "Power"

It seems to me that the importance and interaction of these different types of power in the future depend a lot on our choices now, i.e., what kind of future we shape. Hierarchies could get smashed in one way or another, making John's prediction correct, or we could engineer a future that evolves from the present more smoothly, in which case you'd be correct.

sarahconstantin on sarahconstantin's Shortform

links 10/30/2024: https://roamresearch.com/#/app/srcpublic/page/10-30-2024

https://pmc.ncbi.nlm.nih.gov/articles/PMC10136898/ FRET is a biosensor modality.
- "FRET is a non-radiative transfer of energy from an excited donor fluorophore molecule to a nearby acceptor fluorophore molecule...When the biomolecule of interest is present, it can cause a change in the distance between the donor and acceptor, leading to a change in the efficiency of FRET and a corresponding change in the fluorescence intensity of the acceptor. This change in fluorescence can be used to detect and quantify the biomolecule of interest."
- advantages:
  - real-time
  - non-destructive
  - sensitive to very low concentrations (picomolar and nanomolar)
  - highly specific because it detects conformational changes in biological molecules
- this article is from a not-great journal and the author clearly does not have English as a first language... at some point i will need a more reputable source, this was from googling FRET quickly
https://www.astralcodexten.com/p/the-case-against-proposition-36 Clara Collier gives the narrow, evidence-based case that shorter jail sentences didn't cause California's property crime wave or drug overdose death epidemic, and longer jail sentences won't fix those problems
- I'm pretty convinced but I don't follow this topic in great detail
metastatic malignant peripheral nerve sheath tumor is pretty bad -- median survival is only 8 months after metastases are detected. but one M.O. that seems to help in several case studies is "sequence the tumor, find a mutation, use a drug that's approved for other cancer types with the same mutation."
- PD-L1 overexpression? use a PD-1 inhibitor! checkpoint immunotherapy stays winning.
- BRAF V600E mutation? try a BRAF inhibitor!
  - https://jnccn.org/view/journals/jnccn/11/12/article-p1466.xml vemurafenib
- other Raf stuff: maybe sorafenib?
  - https://www.tandfonline.com/doi/abs/10.4161/cbt.7.6.5932
- shit that doesn't work:
  - sirolimus https://onlinelibrary.wiley.com/doi/full/10.1155/2020/5784876
- chemo is...not great but better than nothing. some partial responses, no complete responses, survival extended by maybe a few months. mostly it seems best to have doxorubicin in the mix.
  - https://onlinelibrary.wiley.com/doi/full/10.1155/2017/8685638
  - https://ascopubs.org/doi/abs/10.1200/JCO.2024.42.16_suppl.11583
  - https://www.sciencedirect.com/science/article/pii/S0923753419377907
  - https://ascopubs.org/doi/abs/10.1200/jco.2010.28.15_suppl.e20512
  - https://onlinelibrary.wiley.com/doi/full/10.1155/2011/705345 ok here's a complete response to chemo + surgery. it can ever happen.
  - https://ar.iiarjournals.org/content/40/3/1619.short case of long-term survival after keeping chemotherapy going a *really long time* at gradually decreasing dose and widening inter-treatment interval.
- https://onlinelibrary.wiley.com/doi/abs/10.1002/ijc.33201 pazopanib, an angiogenesis inhibitor, similarly has a low response rate but can extend survival a bit
https://proof-scaling-meeting.vercel.app/ formal verification conference
https://chalmermagne.substack.com/p/death-by-a-thousand-roundtables what it's actually like to work in UK policy. sounds dismal.
https://www.972mag.com/lavender-ai-israeli-army-gaza/ AI bombing. critical perspective on Israel.
https://goingon.org/ a timeline-based, "citizen journalism" news site.
https://statistics.berkeley.edu/about/news/steinhardt-announces-co-founding-transluce-non-profit-ai-research-lab AI interpretability nonprofit, Jacob Steinhardt
- mech-interp seems like straightforwardly real and good work from a variety of perspectives on AI. helps with many risk scenarios including some x-risk scenarios; helps make the technology stronger & more reliable, which is good for the industry in the long run.
https://blog.benjaminreinhardt.com/young-people-technical-training this is straightforwardly true, yes, you should learn technical stuff.
https://www.washingtonpost.com/opinions/2024/10/28/jeff-bezos-washington-post-trust/ Jeff Bezos on why the Washington Post isn't endorsing a Presidential candidate. this is a solidly written persuasive essay; it seemed legit to me, but I could be persuaded otherwise.

romeostevensit on The hostile telepaths problem

It's worth noting that many therapists break therapeutic alliance for ideological or liability reasons, and this is one of the reasons that self therapy, peer therapy, LLMs, and workbooks can sometimes be better.

anthonyc on Occupational Licensing Roundup #1

The obvious follow up is indeed ‘why only military’? License portability should be universal.

I assume the short answer is: This is a group that is small enough not to credibly threaten anyone's wages, sympathetic/virtuous enough that people feel uncomfortable or expect to pay a price for arguing against them, and not actually in control of where they live. If nothing else it opens up surface area for other attack vectors, once people realize with data that those dastardly New Mexican plumbers can also safely fix toilets in Denver.

On music therapists - something like 15-20 states require this, yes. My wife is a therapeutic musician, not a music therapist (trained to play live music for patients in clinical settings, not interactive, not part of an ongoing treatment plan) - not a whole lot of overlap despite the similar names. There have been times where MTs have tried to get legislation passed to prevent TMs from practicing.

Personally I'm amazed that this hasn't come under fire over the decades by more litigation under the Interstate Commerce and Full Faith and Credit clauses. Somehow it applies to driver's licenses and vehicle registrations but not (again in CO because I have the list open) landscape architects and... boxers? Really? Especially for things like telemedicine, where it's illegal for my therapist to talk to me on the phone or send me a text if I'm not physically in their state. Theoretically if they're part of a larger practice it's illegal for their own office to talk to them if they're traveling. I'm honestly baffled how this would stand up to real legal/constitutional scrutiny. Say I'm traveling in NH, and call my doctor's office in MA to ask them to refill a prescription at a pharmacy near me, and the office calls my doctor, who happens to be at a conference in CT, to confirm what they should do. On what grounds could NH and CT claim this is accomplishing anything whatsoever? As the patient I can move (most) prescriptions around between pharmacies and states whenever I want, I can see doctors in whatever state I want. As a doctor, you can see patients from whatever state you want as long as they travel to you and not the other way around.

romeostevensit on The hostile telepaths problem

Agree with the approach, with the caveat that some people in group 2 are naive cooperators and therefore second-order defectors, since they are suckers for group 1. E.g., the person who will tell the truth to the Nazis out of mistaken theories of ethics or just behavioral conditioning.

romeostevensit on The hostile telepaths problem

I was reading this earlier and it dovetails very well with this post. Framing defending yourself against hostile people and processes as primarily selfish itself serves the hostile.

steve2152 on The Alignment Trap: AI Safety as Path to Power

When we develop mechanisms to control AI systems, we are essentially creating tools that could be used by any sufficiently powerful entity - whether that's a government, corporation, or other organization. The very features that make an AI system "safe" in terms of human control could make it a more effective instrument of power consolidation.

…And if we fail to develop such mechanisms, AI systems will still be an “instrument of power consolidation”, but the power being consolidated will be the AI’s own power, right?

I mean, 90% of this article—the discussion of offense-defense balance, and limits on human power and coordination—applies equally to “humans using AI to get power” versus “AI getting power for its own purposes”, right?

E.g. out-of-control misaligned AI is still an “enabler of coherent entities”, because it can coordinate with copies of itself.

I guess you’re not explicitly arguing against “open publication of safety advances” but just raising a point of consideration? Anyway, a more balanced discussion of the pros and cons of “open publication of safety advances” would include:

- Is “humans using AI to get power” less bad versus more bad than “AI getting power for its own purposes”? (I lean towards “probably less bad but it sure depends on the humans and the AI”)
- If AI obedience is an unsolved technical problem to such-and-such degree, to what extent does that lead to people not developing ever-more-powerful AI anyway? (I lean towards “not much”, cf. Meta / LeCun today [LW · GW], or the entire history of AI)
- Is the sentence “in reality we should expect combined human-AI entities to reach dangerous capabilities before pure artificial intelligence” really true, and if so how much earlier and does it matter? (I lean towards “not necessarily true in the first place, and if true, probably not by much, and it’s not all that important”)

It’s probably a question that needs to be considered on a case-by-case basis anyway. ¯\_(ツ)_/¯

daphne_w on The Summoned Heroine's Prediction Markets Keep Providing Financial Services To The Demon King!

That's probably the only "military secret" that really matters.

The soldiers guarding the outer wall and the Citadel treasurer that pays their overtime wages would beg to differ.