LessWrong 2.0 Reader

[link] Conflict in Posthuman Literature
Martín Soto (martinsq) · 2024-04-06T22:26:04.051Z · comments (1)

Planning to build a cryptographic box with perfect secrecy
Lysandre Terrisse · 2023-12-31T09:31:47.941Z · comments (6)

Long-Term Future Fund: May 2023 to March 2024 Payout recommendations
Linch · 2024-06-12T13:46:29.535Z · comments (0)

Quantopian contest, but for food intake and weight
Lucent · 2023-11-08T05:41:35.050Z · comments (9)

List your AI X-Risk cruxes!
Aryeh Englander (alenglander) · 2024-04-28T18:26:19.327Z · comments (7)

[link] Progress links digest, 2023-11-24: Bottlenecks of aging, Starship launches, and much more
jasoncrawford · 2023-11-24T15:25:07.721Z · comments (1)

D&D.Sci(-fi): Colonizing the SuperHyperSphere [Evaluation and Ruleset]
abstractapplic · 2024-01-22T19:20:05.001Z · comments (7)

Movie posters
KatjaGrace · 2024-03-06T06:20:03.034Z · comments (0)

[link] Forecasting: the way I think about it
Molly (hickman-santini) · 2024-05-09T00:49:01.768Z · comments (4)

[link] AI Regulation is Unsafe
Maxwell Tabarrok (maxwell-tabarrok) · 2024-04-22T16:37:55.431Z · comments (41)

[link] Queuing theory: Benefits of operating at 60% capacity
ampdot · 2023-12-01T18:48:01.426Z · comments (4)

[question] Implications of China's recession on AGI development?
Eric Neyman (UnexpectedValues) · 2024-09-28T01:12:36.443Z · answers+comments (3)

instruction tuning and autoregressive distribution shift
nostalgebraist · 2024-09-05T16:53:41.497Z · comments (5)

Winners of the Essay competition on the Automation of Wisdom and Philosophy
AI Impacts (AI Imacts) · 2024-10-28T17:10:04.272Z · comments (3)

2025 Color Trends
sarahconstantin · 2024-10-07T21:20:03.962Z · comments (7)

[Linkpost] Play with SAEs on Llama 3
Tom McGrath · 2024-09-25T22:35:44.824Z · comments (2)

Neuroscience and Alignment
Garrett Baker (D0TheMath) · 2024-03-18T21:09:52.004Z · comments (25)

[link] Book review: Cuisine and Empire
eukaryote · 2024-01-21T06:15:12.969Z · comments (2)

What's up with all the non-Mormons? Weirdly specific universalities across LLMs
mwatkins · 2024-04-19T13:43:24.568Z · comments (13)

Debate, Oracles, and Obfuscated Arguments
Jonah Brown-Cohen (jonah-brown-cohen) · 2024-06-20T23:14:57.340Z · comments (2)

Applying Force to the Wrong End of a Causal Chain
silentbob · 2024-06-22T18:06:32.364Z · comments (0)

[link] "What if we could redesign society from scratch? The promise of charter cities." [Rational Animations video]
Jackson Wagner · 2024-02-18T00:57:50.444Z · comments (7)

Scaling of AI training runs will slow down after GPT-5
Maxime Riché (maxime-riche) · 2024-04-26T16:05:59.957Z · comments (5)

[link] Dequantifying first-order theories
jessicata (jessica.liu.taylor) · 2024-04-23T19:04:49.000Z · comments (9)

Californians, tell your reps to vote yes on SB 1047!
Holly_Elmore · 2024-08-12T19:50:09.817Z · comments (24)

Why did ChatGPT say that? Prompt engineering and more, with PIZZA.
Jessica Rumbelow (jessica-cooper) · 2024-08-03T12:07:46.302Z · comments (2)

"Does your paradigm beget new, good, paradigms?"
Raemon · 2024-01-25T18:23:15.497Z · comments (6)

Choosing My Quest (Part 2 of "The Sense Of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-02-24T21:31:45.377Z · comments (7)

Manifund Q1 Retro: Learnings from impact certs
Austin Chen (austin-chen) · 2024-05-01T16:48:33.140Z · comments (1)

Stitching SAEs of different sizes
Bart Bussmann (Stuckwork) · 2024-07-13T17:19:20.506Z · comments (12)

[link] Understanding Gödel’s completeness theorem
jessicata (jessica.liu.taylor) · 2024-05-27T18:55:02.079Z · comments (0)

You're a Space Wizard, Luke
lsusr · 2024-08-18T05:35:39.238Z · comments (6)

Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs
Kola Ayonrinde (kola-ayonrinde) · 2024-08-23T18:52:31.019Z · comments (5)

Apply to the PIBBSS Summer Research Fellowship
Nora_Ammann · 2024-01-12T04:06:58.328Z · comments (1)

Individually incentivized safe Pareto improvements in open-source bargaining
Nicolas Macé (NicolasMace) · 2024-07-17T18:26:43.619Z · comments (2)

[Interim research report] Evaluating the Goal-Directedness of Language Models
Rauno Arike (rauno-arike) · 2024-07-18T18:19:04.260Z · comments (4)

Anthropic rewrote its RSP
Zach Stein-Perlman · 2024-10-15T14:25:12.518Z · comments (19)

Are we dropping the ball on Recommendation AIs?
Charbel-Raphaël (charbel-raphael-segerie) · 2024-10-23T17:48:00.000Z · comments (14)

[link] [Paper Blogpost] When Your AIs Deceive You: Challenges with Partial Observability in RLHF
Leon Lang (leon-lang) · 2024-10-22T13:57:41.125Z · comments (0)

Nitric oxide for covid and other viral infections
Elizabeth (pktechgirl) · 2024-02-07T21:30:03.774Z · comments (6)

How To Do Patching Fast
Joseph Miller (Josephm) · 2024-05-11T20:13:52.424Z · comments (6)

[link] Legalize butanol?
bhauth · 2023-12-20T14:24:33.849Z · comments (20)

[link] Linear infra-Bayesian Bandits
Vanessa Kosoy (vanessa-kosoy) · 2024-05-10T06:41:09.206Z · comments (5)

Instrumental deception and manipulation in LLMs - a case study
Olli Järviniemi (jarviniemi) · 2024-02-24T02:07:01.769Z · comments (13)

[link] [Paper] Language Models Don't Learn the Physical Manifestation of Language
Bruce W. Lee (bruce-lee) · 2024-02-22T18:52:32.237Z · comments (23)

Natural abstractions are observer-dependent: a conversation with John Wentworth
Martín Soto (martinsq) · 2024-02-12T17:28:38.889Z · comments (13)

Logical Line-Of-Sight Makes Games Sequential or Loopy
StrivingForLegibility · 2024-01-19T04:05:44.782Z · comments (0)

Prepsgiving, A Convergently Instrumental Human Practice
JenniferRM · 2023-11-23T17:24:56.784Z · comments (0)

Simple distribution approximation: When sampled 100 times, can language models yield 80% A and 20% B?
Teun van der Weij (teun-van-der-weij) · 2024-01-29T00:24:27.706Z · comments (5)

Medical Roundup #3
Zvi · 2024-07-09T13:10:06.862Z · comments (4)

Recent comments

anders-lindstroem on Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

I understood why you asked, I am also interested in general why people up or down vote something. It could be really good information and food for thought.

Yeah, who doesn't want capital T truth... But I have come to appreciate the subjective experience more and more. I like science and rational thinking, it has gotten us pretty far, but who am I to question someone's experience? If someone met 'the creator' on an ayahuasca journey or thinks that love is the essence of the universe, who am I to judge? When I see the statistics on the massive use of anti-depressants it is obvious to me that we can't use rational and logical thinking to think our way out of our feelings. What are rationality and logical thinking good for if in the end they can't make us feel good?

kola-ayonrinde on Targeted Manipulation and Deception Emerge when Optimizing LLMs for User Feedback

Ahh sorry, I think I made this comment on an early draft of this post and didn't realise it would make it into the published version! I totally agree with you and made the above comment in the hope that this point would be made clearer in later drafts, which I think it has!

Loved the post!

I'm going to delete these comments, if that's okay.

dagon on Quantum Immortality: A Perspective if AI Doomers are Probably Right

If quantum immortality is true

This is a big if. It may be true (though it also implies that events as unlikely as Boltzmann Brains are true as well), but it's not true in a way that has causal impact on my current predicted experiences. If so, then the VAST VAST MAJORITY of universes don't contain me in the first place, and the also-extreme majority of those that do will have me die.

Assume quantum uncertainty affects how the coins land. I survive the night only if I correctly guess the 10th digit of π and/or all seven coins land heads, otherwise I will be killed in my sleep.

In a literal experiment, where a human researcher kills you based on their observations of coins and calculation of pi, I don't think you should be confident of surviving the night. If you DO survive, you don't learn much about uncorrelated probabilities - there's a near-infinite number of worlds, and fewer and fewer of them will contain you.

I guess this is a variant of option (1) - Deny that QI is meaningful. You don't give up on probability - you can estimate a (1/2)^7 * 1/10 = 0.00078 chance of surviving.
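The arithmetic above can be checked with a short sketch. It assumes the seven coin flips and the digit guess are independent and that a blind guess at a decimal digit succeeds with probability 1/10; the variable names are illustrative only. Because the quoted setup says "and/or", the sketch also computes the probability under the looser reading where either event alone is enough to survive.

```python
from fractions import Fraction

# Assumption: seven fair, independent coin flips all landing heads.
p_all_heads = Fraction(1, 2) ** 7

# Assumption: a blind guess at one decimal digit is right 1 time in 10.
p_digit = Fraction(1, 10)

# Both events must happen (the reading used in the estimate above).
p_both = p_all_heads * p_digit

# Either event is enough (the "or" reading), by inclusion-exclusion.
p_either = p_all_heads + p_digit - p_both

print(f"both:   {p_both} = {float(p_both):.5f}")
print(f"either: {p_either} = {float(p_either):.5f}")
```

Under the "both" reading this reproduces the roughly 0.00078 figure; under the "either" reading the chance is a little under 11%.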

dagon on The Case Against Moral Realism

I think there's a much simpler case against it: show me the instrument readings, or at least tell me the unit of measure.

thomas-kwa on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

I mention exactly this in paragraph 3.

startattheend on Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

I worded that a bit badly, I meant I had a hard time thinking of better (meaning kinder) explanations, not better (meaning more likely) explanations. Across all websites I've been on in my life, I have posted more than 100000 comments (resulting in many interactions), so while things like psychoanalyzing people, assuming intentions, and making stereotypes are "bad", I simply have too much training data, and too few incorrect guesses, not to do this. I do, however, intentionally overestimate people (since I want to talk to intelligent people, I give people the benefit of the doubt for as long as possible) but this means that mistakes are attributed to their intentions, personality or values, rather than to carelessness or superficial heuristics. In this situation, I've assumed that they're offended by the idea that traditional societies rival the scientific method in some situations. But it may be something more superficial like "I find short comments to be effortless", "somebody else already said that" or "I didn't understand your explanation and I consider it your fault". But like I said in another comment, I remember the first downvotes being disagreements (red X) rather than regular downvotes, so I took it as meaning "this is wrong" rather than "I don't like this comment". Not that any of this matters very much, admittedly.

micahcarroll on Targeted Manipulation and Deception Emerge when Optimizing LLMs for User Feedback

User feedback training reliably leads to emergent manipulation in our experimental scenarios, suggesting that it can lead to it in real user feedback settings too.

kenoubi on Requirements for a Basin of Attraction to Alignment

Sorry, I think it's entirely possible that this is just me not knowing or understanding some of the background material, but where exactly does this diverge from justifying the AI pursuing a goal of maximizing the inclusive genetic fitness of its creators? Which clearly either isn't what humans actually want (there are things humans can do to make themselves have more descendants that no humans, including the specific ones who could take those actions, want to take, because of godshatter) or is just circular (who knows what will maximize inclusive genetic fitness in an environment that is being created, in large part, by the decision of how to promote inclusive genetic fitness?). At some point, your writing started talking about "design goals", but I don't understand why tools / artifacts constructed by evolved creatures, that happen to increase the inclusive genetic fitness of the evolved creatures who constructed them by means other than the design goals of those who constructed them, wouldn't be favored by evolution, and thus part of the "purpose" of the evolved creatures in constructing them; and this doesn't seem like an "error" even in the limit of optimal pursuit of inclusive genetic fitness, this seems to be just what optimal pursuit of IGF would actually do. In other words, I don't want a very powerful human-constructed optimizer to pursue the maximization of human IGF, and I think hardly any other humans do either; but I don't understand in detail why your argument doesn't justify AI pursuit of maximizing human IGF, to the detriment of what humans actually value.

saul-munn on Saul Munn's Shortform

Active Recall and Spaced Repetition are Different Things

Epistemic status: splitting hairs.

There’s been a lot of recent work on memory. This is great, but popular communication of that progress consistently mixes up active recall and spaced repetition. That consistently bugged me — hence this piece.

If you already have a good understanding of active recall and spaced repetition, skim sections I and II, then skip to section III.

Note: this piece doesn’t meticulously cite sources, and will probably be slightly out of date in a few years. I link some great posts that have far more technical substance at the end, if you’re interested in learning more & actually reading the literature.

I. Active Recall

When you want to learn some new topic, or review something you’ve previously learned, you have different strategies at your disposal. Some examples:

Watch a YouTube video on the topic.
Do practice problems.
Review notes you’d previously taken.
Try to explain the topic to a friend.
etc

Some of these boil down to “stuff the information into your head” (YouTube video, reviewing notes) and others boil down to “do stuff that requires you to use/remember the information” (doing practice problems, explaining to a friend). Broadly speaking, the second category — doing stuff that requires you to actively recall the information — is way, way more effective.

That’s called “active recall.”

II. (Efficiently) Spaced Repetition

After you learn something, you’re likely to forget it pretty quickly:

Fortunately, reviewing the thing you learned pushes you back up to 100% retention, and this happens each time you “repeat” a review:

That’s a lot better!

…but that’s also a lot of work. You have to review the thing you learned in intervals, which takes time/effort. So, how can you do the fewest repetitions while keeping your retention as high as possible? In other words: what should the size of the intervals be? Should you space them out every day? Every week? Should you change the size of the spaces between repetitions? How?

As it turns out, efficiently spacing out repetitions of reviews is a pretty well-studied problem. The answer is “riiiight before you’re about to forget it:”

Generally speaking, you should do a review right before it crosses some threshold for retention. What that threshold actually is depends on some fiddly details, but the central idea remains the same: repeating a review riiight before you hit that threshold is the most efficient spacing possible.

This is called (efficiently) spaced repetition. Systems that use spaced repetitions — software, methods, etc — are called “spaced repetition systems” or “SRS.”
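A minimal sketch of the "review right before you cross the threshold" idea, under assumptions that are not from this piece: retention decays exponentially as exp(-t / stability), every review resets retention to 100%, and each review multiplies the stability by a fixed growth factor. The function names and parameters are hypothetical, and real SRS software uses more elaborate models.

```python
import math


def time_until_threshold(stability: float, threshold: float) -> float:
    """Time until retention exp(-t / stability) decays to `threshold`."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    if stability <= 0.0:
        raise ValueError("stability must be positive")
    return -stability * math.log(threshold)


def review_schedule(initial_stability: float, threshold: float,
                    growth: float, n_reviews: int) -> list[float]:
    """Cumulative review times, each placed exactly where retention hits the threshold.

    Assumption: every review resets retention to 100% and multiplies
    the stability by `growth`, so the memory decays more slowly afterwards.
    """
    times = []
    now = 0.0
    stability = initial_stability
    for _ in range(n_reviews):
        now += time_until_threshold(stability, threshold)
        times.append(now)
        stability *= growth
    return times


# Example: stability of 1 day, review at 90% retention, stability doubles per review.
schedule = review_schedule(initial_stability=1.0, threshold=0.9,
                           growth=2.0, n_reviews=4)
print([round(t, 3) for t in schedule])
```

With these made-up parameters the gaps between reviews double each time, which is the qualitative pattern described above: the better you know something, the longer you can wait before the next review.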

III. The difference

Active recall and spaced repetition are independent strategies. One of them (active recall) is a method for reviewing material; the other (efficiently spaced repetition) is a method for how to best time reviews. You can use one, the other, or both:

Examples of their independence:

You could listen to a lecture on a topic once now, and again a year from now (not active recall, very inefficiently spaced repetition)
You could watch YouTube videos on a topic in efficiently spaced intervals (not active recall, yes spaced repetition)
You could quiz yourself with flashcards once, then never again (yes active recall, no spaced repetition)
You could do flashcards on something in efficiently spaced intervals (both spaced repetition and active recall).

IV. Implications

Why does this matter?

Mostly, it doesn’t, and I’m just splitting hairs. But occasionally, it’s prohibitively difficult to use one method, but still quite possible to use the other. In these cases, the right thing to do isn’t to give up on both — it’s to use the one that works!

For example, you can do a bit of efficiently spaced repetition when learning people’s names, by saying their name aloud:

immediately after learning it (“hi, my name’s Alice” “nice to meet you, Alice!”)
partway through the conversation (“but i’m still not sure of the proposal. what do you think, Alice?”)
at the end of the conversation (“thanks for chatting, Alice!”)
that night (“who did I meet today? oh yeah, Alice!”)

…but it’s a lot more difficult to use active recall to remember people’s names. (The closest I’ve gotten is to try to first bring into my mind’s eye what their face looks like, then to try to remember their name.)

Another example in the opposite direction: learning your way around a city in a car. It’s really easy to do active recall: have Google Maps opened on your phone and ask yourself what the next direction is each time before you look down; guess what the next street is going to be before you get there; etc. But it’s much more difficult to efficiently space your reviews out: review timing ends up mostly in the hands of your travel schedule.

For more on the topic of deliberately using memory systems to quickly learn the geography of a new place, see this post.

startattheend on Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

That makes sense, I just evaluated the comment in isolation. But I believe that the first few downvotes were as "incorrect" (the red X) rather than regular downvotes (down arrow), which is why the feedback occurred to me as simply mistaken (as the comment is not false).

I've noticed, by the way, that most comments posted tend to get downvoted initially and then return to 0 over time. There may be a few regular, highly active users with high standards or something, and less casual users with lower standards which balance them out over time. I've gone to -10 and back before.