LessWrong 2.0 Reader

My take on Jacob Cannell’s take on AGI safety
Steven Byrnes (steve2152) · 2022-11-28T14:01:15.584Z · comments (15)
K-types vs T-types — what priors do you have?
Cleo Nardo (strawberry calm) · 2022-11-03T11:29:00.809Z · comments (25)
[link] Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm)
Davidmanheim · 2022-11-02T12:57:23.445Z · comments (27)
Don't design agents which exploit adversarial inputs
TurnTrout · 2022-11-18T01:48:38.372Z · comments (64)
Why Would AI "Aim" To Defeat Humanity?
HoldenKarnofsky · 2022-11-29T19:30:07.828Z · comments (10)
[link] Real-Time Research Recording: Can a Transformer Re-Derive Positional Info?
Neel Nanda (neel-nanda-1) · 2022-11-01T23:56:06.215Z · comments (16)
All AGI Safety questions welcome (especially basic ones) [~monthly thread]
Robert Miles (robert-miles) · 2022-11-01T23:23:04.146Z · comments (105)
Against "Classic Style"
Cleo Nardo (strawberry calm) · 2022-11-23T22:10:50.422Z · comments (30)
2022 LessWrong Census?
SurfingOrca · 2022-11-07T05:16:33.207Z · comments (13)
The First Filter
adamShimi · 2022-11-26T19:37:04.607Z · comments (5)
[link] Career Scouting: Dentistry
koratkar · 2022-11-20T15:55:12.431Z · comments (5)
Clarifying wireheading terminology
leogao · 2022-11-24T04:53:23.925Z · comments (6)
Deontology and virtue ethics as "effective theories" of consequentialist ethics
Jan_Kulveit · 2022-11-17T14:11:49.087Z · comments (9)
Announcing AI safety Mentors and Mentees
Marius Hobbhahn (marius-hobbhahn) · 2022-11-23T15:21:12.636Z · comments (7)
Against a General Factor of Doom
Jeffrey Heninger (jeffrey-heninger) · 2022-11-23T16:50:04.229Z · comments (19)
Here's the exit.
Valentine · 2022-11-21T18:07:23.607Z · comments (178)
Alignment allows "nonrobust" decision-influences and doesn't require robust grading
TurnTrout · 2022-11-29T06:23:00.394Z · comments (42)
What’s the Deal with Elon Musk and Twitter?
Zvi · 2022-11-07T13:50:00.991Z · comments (11)
[link] New Frontiers in Mojibake
Adam Scherlis (adam-scherlis) · 2022-11-26T02:37:27.290Z · comments (7)
The Least Controversial Application of Geometric Rationality
Scott Garrabrant · 2022-11-25T16:50:56.497Z · comments (22)
FTX will probably be sold at a steep discount. What we know and some forecasts on what will happen next
Nathan Young · 2022-11-09T02:14:19.623Z · comments (21)
[link] Could a single alien message destroy us?
Writer · 2022-11-25T07:32:24.889Z · comments (23)
Open technical problem: A Quinean proof of Löb's theorem, for an easier cartoon guide
Andrew_Critch · 2022-11-24T21:16:43.879Z · comments (35)
Humans do acausal coordination all the time
Adam Jermyn (adam-jermyn) · 2022-11-02T14:40:39.730Z · comments (35)
A philosopher's critique of RLHF
ThomasW (ThomasWoodside) · 2022-11-07T02:42:51.234Z · comments (8)
Announcing Nonlinear Emergency Funding
KatWoods (ea247) · 2022-11-13T19:02:57.803Z · comments (0)
Human-level Diplomacy was my fire alarm
Lao Mein (derpherpize) · 2022-11-23T10:05:36.127Z · comments (15)
Some advice on independent research
Marius Hobbhahn (marius-hobbhahn) · 2022-11-08T14:46:19.134Z · comments (5)
Kelsey Piper's recent interview of SBF
agucova · 2022-11-16T20:30:35.901Z · comments (29)
What's the Alternative to Independence?
jefftk (jkaufman) · 2022-11-13T15:30:01.186Z · comments (3)
Human-level Full-Press Diplomacy (some bare facts).
Cleo Nardo (strawberry calm) · 2022-11-22T20:59:18.155Z · comments (7)
Noting an unsubstantiated communal belief about the FTX disaster
Yitz (yitz) · 2022-11-13T05:37:03.087Z · comments (52)
Developer experience for the motivation
Adam Zerner (adamzerner) · 2022-11-16T07:12:19.893Z · comments (7)
"Rudeness", a useful coordination mechanic
Raemon · 2022-11-11T22:27:35.023Z · comments (20)
A Mystery About High Dimensional Concept Encoding
Fabien Roger (Fabien) · 2022-11-03T17:05:56.034Z · comments (13)
Information Markets
eva_ · 2022-11-02T01:24:11.639Z · comments (6)
A Short Dialogue on the Meaning of Reward Functions
Leon Lang (leon-lang) · 2022-11-19T21:04:30.076Z · comments (0)
For ELK truth is mostly a distraction
c.trout (ctrout) · 2022-11-04T21:14:52.279Z · comments (0)
[link] The FTX Saga - Simplified
Annapurna (jorge-velez) · 2022-11-16T02:42:55.739Z · comments (10)
Spectrum of Independence
jefftk (jkaufman) · 2022-11-05T02:40:03.822Z · comments (7)
The biological function of love for non-kin is to gain the trust of people we cannot deceive
chaosmage · 2022-11-07T20:26:29.876Z · comments (3)
Rationalist Town Hall: FTX Fallout Edition (RSVP Required)
Ben Pace (Benito) · 2022-11-23T01:38:25.516Z · comments (13)
The optimal angle for a solar boiler is different than for a solar panel
Yair Halberstadt (yair-halberstadt) · 2022-11-10T10:32:47.187Z · comments (4)
Weekly Roundup #4
Zvi · 2022-11-04T15:00:01.096Z · comments (1)
A newcomer’s guide to the technical AI safety field
zeshen · 2022-11-04T14:29:46.873Z · comments (3)
We must be very clear: fraud in the service of effective altruism is unacceptable
evhub · 2022-11-10T23:31:06.422Z · comments (56)
Don't align agents to evaluations of plans
TurnTrout · 2022-11-26T21:16:23.425Z · comments (49)
Why square errors?
Aprillion (Peter Hozák) (Aprillion) · 2022-11-26T13:40:37.318Z · comments (11)
Counterfactability
Scott Garrabrant · 2022-11-07T05:39:05.668Z · comments (4)
[link] Scott Aaronson on "Reform AI Alignment"
shminux · 2022-11-20T22:20:23.895Z · comments (17)