LessWrong 2.0 Reader

Educational CAI: Aligning a Language Model with Pedagogical Theories
Bharath Puranam (bharath-puranam) · 2024-11-01T18:55:26.993Z · comments (1)
[link] Join the $10K AutoHack 2024 Tournament
Paul Bricman (paulbricman) · 2024-09-25T11:54:20.112Z · comments (0)
Using LLMs for AI Foundation research and the Simple Solution assumption
Donald Hobson (donald-hobson) · 2024-09-24T11:00:53.658Z · comments (0)
Some reasons to start a project to stop harmful AI
Remmelt (remmelt-ellen) · 2024-08-22T16:23:34.132Z · comments (0)
[link] Universal basic income isn’t always AGI-proof
Kevin Kohler (KevinKohler) · 2024-09-05T15:39:18.389Z · comments (3)
Seeking mentorship
Kevin Afachao (kevin-afachao) · 2024-09-21T16:54:58.353Z · comments (0)
[question] Artificial vs. Organoid Intelligence
10xyz (10xyz-coder) · 2024-10-23T14:31:46.385Z · answers+comments (0)
[link] Could Things Be Very Different?—How Historical Inertia Might Blind Us To Optimal Solutions
James Stephen Brown (james-brown) · 2024-09-11T09:53:07.474Z · comments (0)
[link] AI Safety Newsletter #41: The Next Generation of Compute Scale Plus, Ranking Models by Susceptibility to Jailbreaking, and Machine Ethics
Corin Katzke (corin-katzke) · 2024-09-11T19:14:08.274Z · comments (1)
If I care about measure, choices have additional burden (+AI generated LW-comments)
avturchin · 2024-11-15T10:27:15.212Z · comments (10)
Can Current LLMs be Trusted To Produce Paperclips Safely?
Rohit Chatterjee (rohit-c) · 2024-08-19T17:17:07.530Z · comments (0)
[link] 2024 Election Forecasting Contest
mike20731 · 2024-10-05T20:43:16.203Z · comments (0)
Tbilisi Georgia - ACX Meetups Everywhere Fall 2024
Dmitrii (dmitrii) · 2024-08-29T18:36:43.223Z · comments (4)
Bellevue-Redmond USA - ACX Meetups Everywhere Fall 2024
Cedar (xida-ren) · 2024-08-29T18:43:57.014Z · comments (8)
Ways to think about alignment
Abhimanyu Pallavi Sudhir (abhimanyu-pallavi-sudhir) · 2024-10-27T01:40:50.762Z · comments (0)
Some Comments on Recent AI Safety Developments
testingthewaters · 2024-11-09T16:44:58.936Z · comments (0)
[question] How do you follow AI (safety) news?
PeterH · 2024-09-24T13:58:48.916Z · answers+comments (2)
Jailbreaking ChatGPT and Claude using Web API Context Injection
Jaehyuk Lim (jason-l) · 2024-10-21T21:34:37.579Z · comments (0)
[link] The ELYSIUM Proposal - Extrapolated voLitions Yielding Separate Individualized Utopias for Mankind
Roko · 2024-10-16T01:24:51.102Z · comments (18)
Tokyo (Japanese-language) Japan - ACX Meetups Everywhere Fall 2024
Emi (emi-2) · 2024-08-29T18:35:28.013Z · comments (0)
Interest poll: A time-waster blocker for desktop Linux programs
nahoj · 2024-08-22T20:44:04.479Z · comments (5)
Towards a Clever Hans Test: Unmasking Sentience Biases in Chatbot Interactions
glykokalyx · 2024-11-10T22:34:58.956Z · comments (0)
[question] is there a big dictionary somewhere with all your jargon and acronyms and whatnot?
KvmanThinking (avery-liu) · 2024-10-17T11:30:50.937Z · answers+comments (7)
[question] Is there a known method to find others who came across the same potential infohazard without spoiling it to the public?
hive · 2024-10-17T10:47:05.099Z · answers+comments (6)
[link] A Logical Proof for the Emergence and Substrate Independence of Sentience
rife (edgar-muniz) · 2024-10-24T21:08:09.398Z · comments (31)
It is time to start war gaming for AGI
yanni kyriacos (yanni) · 2024-10-17T05:14:17.932Z · comments (1)
Likelihood calculation with duobels
Martin Gerdes (martin-gerdes) · 2024-10-01T16:21:01.268Z · comments (0)
[question] Noticing the World
EvolutionByDesign (bioluminescent-darkness) · 2024-11-04T16:41:44.696Z · answers+comments (1)
[question] How do you finish your tasks faster?
Cipolla · 2024-08-21T20:01:41.306Z · answers+comments (2)
Effects of Non-Uniform Sparsity on Superposition in Toy Models
Shreyans Jain (shreyans-jain) · 2024-11-14T16:59:43.234Z · comments (3)
[link] Predictions as Public Works Project — What Metaculus Is Building Next
ChristianWilliams · 2024-10-22T16:35:13.999Z · comments (0)
Building Safer AI from the Ground Up: Steering Model Behavior via Pre-Training Data Curation
Antonio Clarke (antonio-clarke) · 2024-09-29T18:48:23.308Z · comments (0)
[question] What are the primary drivers that caused selection pressure for intelligence in humans?
Towards_Keeperhood (Simon Skade) · 2024-11-07T09:40:20.275Z · answers+comments (15)
[question] Is OpenAI net negative for AI Safety?
Lysandre Terrisse · 2024-11-02T16:18:02.859Z · answers+comments (0)
What are Emotions?
Myles H (zarsou9) · 2024-11-15T04:20:27.388Z · comments (8)
Developmental Stages in Multi-Problem Grokking
James Sullivan · 2024-09-29T18:58:22.954Z · comments (0)
Transformers Explained (Again)
RohanS · 2024-10-22T04:06:33.646Z · comments (0)
Hamiltonian Dynamics in AI: A Novel Approach to Optimizing Reasoning in Language Models
Javier Marin Valenzuela (javier-marin-valenzuela) · 2024-10-09T19:14:56.162Z · comments (0)
Near-death experiences
Declan Molony (declan-molony) · 2024-10-08T06:34:04.107Z · comments (1)
On the Practical Applications of Interpretability
Nick Jiang (nick-jiang) · 2024-10-15T17:18:25.280Z · comments (0)
Bellevue Meetup
Cedar (xida-ren) · 2024-10-16T01:07:58.761Z · comments (0)
San Francisco ACX Meetup “First Saturday”
Nate Sternberg (nate-sternberg) · 2024-09-29T03:13:34.615Z · comments (0)
[question] EndeavorOTC legit?
FinalFormal2 · 2024-10-17T01:33:12.606Z · answers+comments (0)
Methodology: Contagious Beliefs
James Stephen Brown (james-brown) · 2024-10-19T03:58:17.966Z · comments (0)
[link] Podcast discussing Hanson's Cultural Drift Argument
vaishnav92 · 2024-10-20T17:58:41.416Z · comments (0)
On Measuring Intellectual Performance - personal experience and several thoughts
Alexander Gufan (alexander-gufan) · 2024-09-20T17:21:19.747Z · comments (2)
Collapsing “Collapsing the Belief/Knowledge Distinction”
Jeremias (jeremias-sur) · 2024-09-20T16:11:33.558Z · comments (0)
Endogenous Growth and Human Intelligence
Nicholas D. (nicholas-d) · 2024-09-18T14:05:54.567Z · comments (0)
Personal Philosophy
Xor · 2024-10-13T03:01:59.324Z · comments (0)
For Limited Superintelligences, Epistemic Exclusion is Harder than Robustness to Logical Exploitation
Lorec · 2024-09-15T20:49:06.370Z · comments (9)