LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Contextual Constitutional AI
aksh-n · 2024-09-28T23:24:43.529Z · comments (1)

Shutting down all competing AI projects might not buy a lot of time due to Internal Time Pressure
ThomasCederborg · 2024-10-03T00:01:34.011Z · comments (7)

How I'd like alignment to get done (as of 2024-10-18)
TristanTrim · 2024-10-18T23:39:03.107Z · comments (2)

The current state of RSPs
Zach Stein-Perlman · 2024-11-04T16:00:42.630Z · comments (0)

[link] AI Prejudices: Practical Implications
PeterMcCluskey · 2024-10-19T02:19:58.695Z · comments (0)

Editing at the Take Level
jefftk (jkaufman) · 2024-09-24T11:30:04.914Z · comments (1)

AXRP Episode 38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems
DanielFilan · 2024-11-14T07:00:06.977Z · comments (0)

Source Control for Prototyping and Analysis
jefftk (jkaufman) · 2024-09-26T01:50:04.145Z · comments (0)

Do you want to do a debate on youtube? I'm looking for polite, truth-seeking participants.
Nathan Young · 2024-10-10T09:32:59.162Z · comments (0)

Can startups be impactful in AI safety?
Esben Kran (esben-kran) · 2024-09-13T19:00:33.306Z · comments (0)

Goal: Understand Intelligence
Johannes C. Mayer (johannes-c-mayer) · 2024-11-03T21:20:02.900Z · comments (19)

Lenses of Control
WillPetillo · 2024-10-22T07:51:06.355Z · comments (0)

Amoeba roles in tech
Sindhu Shivaprasad (sindhu-shivaprasad) · 2024-10-04T17:25:46.568Z · comments (0)

GPT-4o Can In Some Cases Solve Moderately Complicated Captchas
dirk (abandon) · 2024-11-09T04:04:37.782Z · comments (2)

Clarifying Alignment Fundamentals Through the Lens of Ontology
eternal/ephemera · 2024-10-07T20:57:33.238Z · comments (4)

ML4Good (AI Safety Bootcamp) - Experience report
JanEbbing · 2024-11-05T01:18:43.554Z · comments (0)

Evolutionary prompt optimization for SAE feature visualization
neverix · 2024-11-14T13:06:49.728Z · comments (0)

A Poem Is All You Need: Jailbreaking ChatGPT, Meta & More
Sharat Jacob Jacob (sharat-jacob-jacob) · 2024-10-29T12:41:30.337Z · comments (0)

LLM Psychometrics and Prompt-Induced Psychopathy
Korbinian K. (korbinian-koch) · 2024-10-18T18:11:24.256Z · comments (2)

Binary encoding as a simple explicit construction for superposition
tailcalled · 2024-10-12T21:18:31.731Z · comments (0)

Substituting Talkbox for Breath Controller
jefftk (jkaufman) · 2024-10-27T19:10:03.768Z · comments (0)

[link] Comparing Forecasting Track Records for AI Benchmarking and Beyond
ChristianWilliams · 2024-09-25T21:01:15.975Z · comments (0)

Updating the NAO Simulator
jefftk (jkaufman) · 2024-10-30T13:50:06.908Z · comments (0)

Beyond Defensive Technology
ejk64 · 2024-10-14T11:34:24.595Z · comments (1)

[link] The Computational Complexity of Circuit Discovery for Inner Interpretability
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-10-17T13:18:46.378Z · comments (2)

Self location for LLMs by LLMs: Self-Assessment Checklist.
weightt an (weightt-an) · 2024-09-26T19:57:31.707Z · comments (0)

What is Randomness?
martinkunev · 2024-09-27T17:49:42.704Z · comments (2)

[link] A primer on ML in antibody engineering
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-23T17:03:07.628Z · comments (0)

Switching to a 4GB SD
jefftk (jkaufman) · 2024-09-23T11:20:05.432Z · comments (1)

Switching to a Yamaha P-121 Keyboard
jefftk (jkaufman) · 2024-10-02T02:20:02.284Z · comments (0)

Palisade is hiring: Exec Assistant, Content Lead, Ops Lead, and Policy Lead
Charlie Rogers-Smith (charlie.rs) · 2024-10-09T00:04:03.837Z · comments (0)

[link] Anthropic - The case for targeted regulation
anaguma · 2024-11-05T07:07:48.174Z · comments (0)

Conversational Signposts—An Antidote to Dull Social Interactions
Declan Molony (declan-molony) · 2024-10-22T05:37:56.175Z · comments (6)

[link] [Linkpost] Building Altruistic and Moral AI Agent with Brain-inspired Affective Empathy Mechanisms
Gunnar_Zarncke · 2024-11-04T10:15:35.550Z · comments (0)

Spooky Recommendation System Scaling
phdead · 2024-10-31T22:00:51.728Z · comments (0)

[link] Intention-to-Treat (Re: How harmful is music, really?)
kqr · 2024-09-18T18:44:41.128Z · comments (0)

[link] OpenAI’s cybersecurity is probably regulated by NIS Regulations
Adam Jones (domdomegg) · 2024-10-25T11:06:38.392Z · comments (2)

Motte-and-Bailey: a Short Explanation
Lorec · 2024-10-23T22:29:55.074Z · comments (0)

[link] AISafety.info: What are Inductive Biases?
Algon · 2024-09-19T17:26:24.581Z · comments (4)

[question] LW resources on childhood experiences?
nahir91595 · 2024-10-14T17:04:07.810Z · answers+comments (7)

Just How Good Are Modern Chess Computers?
nem · 2024-09-19T18:57:21.254Z · comments (1)

Request for advice: Research for Conversational Game Theory for LLMs
Rome Viharo (rome-viharo) · 2024-10-16T17:53:30.243Z · comments (0)

[question] Using hex to get murder advice from GPT-4o
Laurence Freeman (laurence-freeman) · 2024-11-13T18:30:23.475Z · answers+comments (5)

[question] What's a good book for a technically-minded 11-year old?
Martin Sustrik (sustrik) · 2024-10-19T06:05:12.178Z · answers+comments (32)

[link] How harmful is music, really?
dkl9 · 2024-09-17T14:53:25.426Z · comments (6)

Making a Pedalboard
jefftk (jkaufman) · 2024-10-25T00:10:09.149Z · comments (0)

[question] Does life actually locally *increase* entropy?
tailcalled · 2024-09-16T20:30:33.148Z · answers+comments (27)

[question] How Should We Use Limited Time to Maximize Long-Term Impact?
queelius · 2024-10-12T20:02:46.801Z · answers+comments (3)

Festival Stats 2024
jefftk (jkaufman) · 2024-11-12T02:00:04.831Z · comments (0)

A Policy Proposal
phdead · 2024-09-29T20:45:34.745Z · comments (4)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

viliam on Why would ASI share any resources with us?

In this scenario, why would ASI not do either one of the following things: 1) Exploit humans in pursuit of its own goals, while giving us the barest minimum to survive (effectively making us slaves) or 2) Take over the resources of the entire solar system for itself and leave us starving without any resources?

The ASI will do what it is programmed to do. If it means helping humans, it will help humans. If there is a bug in the program, it will... do something that is difficult to predict (and that sounds scary, because most random things are not good).

Make us slaves? We probably wouldn't be useful slaves, compared to alternatives, such as robots, or human bodies with brains replaced by computers.

Taking over the resources probably means killing us in the process, if those resources include e.g. water or oxygen of Earth.

atillayasar on AtillaYasar's Shortform

When is philosophy useful?

Meta

This post is useful to me because 1) it helped me think more clearly about whether and how exactly philosophy is useful, 2) I can read it later and again get the benefits of (1).

The problem

Doing philosophy and reading/writing is often so impractical, people do it just for the sake of doing it. When you write or read a bunch of stuff about X, get categories and lists and definitions, it feels like you're making progress on X, but are you really?

Joseph Goldstein (meditation teacher) at the beginning of his lecture about Mindfulness, jokes that they'll do an hour and a half of meditation, then after pausing for laughter points out that that would actually be more useful than anything he could say on the subject.

Criteria

The way to tell if philosophy is useful is, if it actually influences the future, if you:
- directly use the information for an action or decision
- use the material OR the wordless intuitions gained from the material in your future thinking (times the probability that you'll use it for a future action or decision)
- refute the bad ideas that you read or write (pruning is progress too!)

(slight caveat: reading and writing and thinking, makes you better at those things and it creates positive habits, even if it's not "object level useful". But still! I want to train my skills while attempting to be productive -- I don't take my mental energy for granted.)

Lost in time / deeper mental models / information bottlenecks

One really simplified way to measure utility, is whether you can remember philosophy that you did. But there's something way deeper to this. If you force yourself to do useful philosophy, given that you can't remember a lot, a solution that arises naturally is that you create deeper, or higher level, representations of things that you know.

The more abstract they get, the more information they can capture.

A way to view it (basically compression and indexing lol):
you simply replace the "main table" that was previously storing the information, with an index that point to the information (because the "main table" runs out of space, and then later an index to the indices, etc., where the main table gets increasingly abstract as the number of "leaves" at the end of the node (meaning explicit ideas and pieces of content and pieces of reasoning), keep increasing. But the "main table" is what you're mostly working with, in terms of what you perceive as your thoughts. So from your perspective, as you learn, you are simply getting increasingly abstract representations, and it takes longer to retrieve information or to put into words what you think you know, because you are spending more cycles on traversing and indexing the graph and translating between forms of representation. (Memory being reconstructive is loosely related to this but I haven't dug into that topic at all. Also isn't this analogous to how auto-encoders work?)

(Balaji Srinivasan about memory management, paraphrase: "if you just have a giant mental tree that you attach concepts to over time, you can have compounding learning and you can remember everything because it is all connected" -- similar to memory palace)

Format / readability / retrievability

This is subtle and hard to put into words but is in practice very impactful and keeps surprising me, which is that, how easy things are to find and how easy it is to read them once you find them, matters a lot. If you have a diary in Notepad++, it's essentially a flat list, which is super annoying to retrieve things from, since you can only scroll or do ctrl-f. Fancier systems can make retrieval easier and they allow formatting, but require more initial energy to start writing notes in.

Youtube comments contain almost zero back-and-forth, Twitter has more, Reddit has a lot more, 4chan has longer chains than Reddit but is weaker in terms of structure.

LessWrong is actually really good for this, because you can read a preview on-hover, and do it recursively for preview panels -- Gwern-style. It has great formatting and is pleasant to edit. Ease of discussion via comments is better than all the above, except Reddit.

About retrieval on LW: this is my first time seeing it, but this is actually pretty cool [? · GW] -- though it's still not super great. Because retrieval is very much a UI/UX problem that is super limited by this existing on the browser. Notice how you can't use the keyboard to affect how results are presented, affect filter/sort or change page -- this goes for any search engine I've ever seen.

It's hard to make dynamically changing and editable UIs. You can design nice UIs, but you can't really design a superset of UIs that is traversable with hotkeys. Because it has to work on, like, 9000 different browsers and devices and screen sizes.

Retrievability transformed programming

I believe that Substack+Google deeply transformed what it means to be a programmer. Think of the meme about expectation vs reality, that programmers spend almost all their time Googling things. It's because the search engine is so powerful, that it outsourced almost all of the memorization requirement to, Googling skills + ability to parse Substack posts + trying out suggestions + figuring out whether a suggestion will work for you and how to reshape it to your codebase.

clone-of-saturn on Heresies in the Shadow of the Sequences

Any agent that makes decisions has an implicit decision theory, it just might not be a very good one. I don't think anyone ever said advanced decision theory was required for AGI, only for robust alignment.

rauno-arike on Rauno's Shortform

[Link] Something weird is happening with LLMs and chess by dynomight [LW · GW]

dynomight stacked up 13 LLMs against Stockfish on the lowest difficulty setting and found a huge difference between the performance of GPT-3.5 Turbo Instruct and any other model:

all

People noticed already last year that RLHF-tuned models are much worse at chess than base/instruct models, so this isn't a completely new result. The gap between models from the GPT family could also perhaps be (partially) closed through better prompting: Adam Karvonen has created a repo for evaluating LLMs' chess-playing abilities and found that many of GPT-4's losses against 3.5 Instruct were caused by GPT-4 proposing illegal moves. However, dynomight notes that there isn't nearly as big of a gap between base and chat models from other model families:

instruct comparison

This is a surprising result to me—I had assumed that base models are now generally decent at chess after seeing the news about 3.5 Instruct playing at 1800 ELO level last year. dynomight proposes the following four explanations for the results:

1. Base models at sufficient scale can play chess, but instruction tuning destroys it.
2. GPT-3.5-instruct was trained on more chess games.
3. There’s something particular about different transformer architectures.
4. There’s “competition” between different types of data.

anthonyc on The Case Against Moral Realism

If I'm understanding you correctly, then I strongly disagree about what ethics and meta-ethics are for, as well as what "individual selfishness" means. The questions I care about flow from "What do I care about, and why?" and "How much do I think others should or will care about these things, and why?" Moral realism and amoral nihilism are far from the only options, and neither are ones I'm interested in accepting.

anthonyc on If I care about measure, choices have additional burden (+AI generated LW-comments)

I'm not saying it improves decision making. I'm saying it's an argument for improving our decision making in general, if mundane decisions we wouldn't normally think are all that important have much larger and long-lasting consequences. Each mundane decision affects a large number of lives that parts of me will experience, in addition to the effects on others.

avturchin on If I care about measure, choices have additional burden (+AI generated LW-comments)

My point was that only 3 is relevant. How it improves average decision making?

neil-warren on Politics are not serious by default

I've been at Sciences Po for a few months now. Do you have any general advice? I seem to have trouble taking the subjects seriously enough to any real effort in them, which you seem to point out as a failure mode you skirted. Asking as many people I can for this, as I'm going through a minor existential crisis. Thanks!

harfe on D0TheMath's Shortform

insofar as the simplest & best internal logical-induction market traders have strong beliefs on the subject, they may very well be picking up on something metaphysically fundamental. Its simply the simplest explanation consistent with the facts.

Theorem 4.6.2 in logical induction says that the "probability" of independent statements does not converge to or $0$ , but to something in-between. So even if a mathematician says that some independent statement feels true (eg some objects are "really out there"), logical induction will tell him to feel uncertain about that.

shreyans-jain on Effects of Non-Uniform Sparsity on Superposition in Toy Models

Hey, i'm controlling the sparsity when I'm creating the batch of the data, so during that time, i sample according to the probability i'm assigning for that feature.

re: features getting baked into the bias: yeah, that might be one of the intuitions we can develop but to me the interesting part is that that kind of behaviour didn't happen in any of the other cases when the importance was varying and just happened when the feature importance for all of them is equal. I don't have a concrete intuition on why that might be the case, still trying to think on it.