LessWrong 2.0 Reader

[link] Inferring the model dimension of API-protected LLMs
Ege Erdil (ege-erdil) · 2024-03-18T06:19:25.974Z · comments (3)

AI #56: Blackwell That Ends Well
Zvi · 2024-03-21T12:10:05.412Z · comments (16)

ARENA4.0 Capstone: Hyperparameter tuning for MELBO + replication on Llama-3.2-1b-Instruct
25Hour (aaron-kaufman) · 2024-10-05T11:30:11.953Z · comments (2)

Musings on Text Data Wall (Oct 2024)
Vladimir_Nesov · 2024-10-05T19:00:21.286Z · comments (2)

The Schumer Report on AI (RTFB)
Zvi · 2024-05-24T15:10:03.122Z · comments (3)

What I Learned (Conclusion To "The Sense Of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-03-20T21:24:37.464Z · comments (0)

[link] GPT2, Five Years On
[deleted] · 2024-06-05T17:44:17.552Z · comments (0)

[link] My Apartment Art Commission Process
jenn (pixx) · 2024-08-26T18:36:44.363Z · comments (4)

AXRP Episode 38.2 - Jesse Hoogland on Singular Learning Theory
DanielFilan · 2024-11-27T06:30:03.821Z · comments (0)

"Which chains-of-thought was that faster than?"
Emrik (Emrik North) · 2024-05-22T08:21:00.269Z · comments (4)

Augmenting Statistical Models with Natural Language Parameters
jsteinhardt · 2024-09-20T18:30:10.816Z · comments (0)

[link] Liquid vs Illiquid Careers
vaishnav92 · 2024-10-20T23:03:49.725Z · comments (7)

Flipping Out: The Cosmic Coinflip Thought Experiment Is Bad Philosophy
Joe Rogero · 2024-11-12T23:55:46.770Z · comments (17)

Basics of Handling Disagreements with People
Camille Berger (Camille Berger) · 2024-11-12T17:55:08.143Z · comments (4)

AXRP Episode 33 - RLHF Problems with Scott Emmons
DanielFilan · 2024-06-12T03:30:05.747Z · comments (0)

Childhood and Education Roundup #7
Zvi · 2024-12-09T13:10:05.588Z · comments (10)

[link] The last era of human mistakes
owencb · 2024-07-24T09:58:42.116Z · comments (2)

[link] The Cancer Resolution?
PeterMcCluskey · 2024-07-24T00:25:17.322Z · comments (27)

[link] hydrogen tube transport
bhauth · 2024-04-18T22:47:08.790Z · comments (12)

Computational Mechanics Hackathon (June 1 & 2)
Adam Shai (adam-shai) · 2024-05-24T22:18:44.352Z · comments (5)

[link] Robin Hanson & Liron Shapira Debate AI X-Risk
Liron · 2024-07-08T21:45:40.609Z · comments (4)

(Maybe) A Bag of Heuristics is All There Is & A Bag of Heuristics is All You Need
Sodium · 2024-10-03T19:11:58.032Z · comments (17)

A Sober Look at Steering Vectors for LLMs
Joschka Braun (joschka-braun) · 2024-11-23T17:30:00.745Z · comments (0)

Monthly Roundup #20: July 2024
Zvi · 2024-07-23T12:50:07.991Z · comments (9)

Meme Talking Points
ymeskhout · 2024-11-06T15:27:54.024Z · comments (0)

Important open problems in voting
Closed Limelike Curves · 2024-07-01T02:53:44.690Z · comments (1)

Attention Output SAEs Improve Circuit Analysis
Connor Kissane (ckkissane) · 2024-06-21T12:56:07.969Z · comments (2)

Confusing the metric for the meaning: Perhaps correlated attributes are "natural"
NickyP (Nicky) · 2024-07-23T12:43:18.681Z · comments (3)

DIY LessWrong Jewelry
Fluffnutt (Pear) · 2024-08-25T21:33:56.173Z · comments (0)

How good are LLMs at doing ML on an unknown dataset?
Håvard Tveit Ihle (havard-tveit-ihle) · 2024-07-01T09:04:03.687Z · comments (4)

What AI companies should do: Some rough ideas
Zach Stein-Perlman · 2024-10-21T14:00:10.412Z · comments (10)

Templates I made to run feedback rounds for Ethan Perez’s research fellows.
Henry Sleight (ResentHighly) · 2024-03-28T19:41:15.506Z · comments (0)

Monthly Roundup #16: March 2024
Zvi · 2024-03-19T13:10:05.529Z · comments (4)

[link] Information dark matter
Logan Kieller (logan-kieller) · 2024-10-01T15:05:41.159Z · comments (4)

[link] AI Safety Memes Wiki
plex (ete) · 2024-07-24T18:53:04.977Z · comments (1)

Empathy/Systemizing Quotient is a poor/biased model for the autism/sex link
tailcalled · 2024-11-04T21:11:57.788Z · comments (0)

Experimentation (Part 7 of "The Sense Of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-03-18T21:25:56.527Z · comments (0)

The slingshot helps with learning
Wilson Wu (wilson-wu) · 2024-10-31T23:18:16.762Z · comments (0)

Introducing REBUS: A Robust Evaluation Benchmark of Understanding Symbols
Arjun Panickssery (arjun-panickssery) · 2024-01-15T21:21:03.962Z · comments (0)

More on the Apple Vision Pro
Zvi · 2024-02-13T17:40:05.388Z · comments (5)

LLMs can strategically deceive while doing gain-of-function research
Igor Ivanov (igor-ivanov) · 2024-01-24T15:45:08.795Z · comments (4)

One way violinists fail
Solenoid_Entity · 2024-05-29T04:08:17.675Z · comments (5)

Rational Animations offers animation production and writing services!
Writer · 2024-03-15T17:26:07.976Z · comments (0)

[link] Vacuum: Theory and Technologies
ethanmorse · 2024-01-21T17:23:49.257Z · comments (0)

[link] FTX expects to return all customer money; clawbacks may go away
Mikhail Samin (mikhail-samin) · 2024-02-14T03:43:13.218Z · comments (1)

Deceptive agents can collude to hide dangerous features in SAEs
Simon Lermen (dalasnoin) · 2024-07-15T17:07:33.283Z · comments (2)

Proveably Safe Self Driving Cars [Modulo Assumptions]
Davidmanheim · 2024-09-15T13:58:19.472Z · comments (26)

AI #63: Introducing Alpha Fold 3
Zvi · 2024-05-09T14:20:03.176Z · comments (2)

Sparse autoencoders find composed features in small toy models
Evan Anders (evan-anders) · 2024-03-14T18:00:43.339Z · comments (12)

UDT1.01: Logical Inductors and Implicit Beliefs (5/10)
Diffractor · 2024-04-18T08:39:13.368Z · comments (2)

Recent comments

daniel-herrmann on Chance is in the Map, not the Territory

I wouldn't say that is a clear exception. There are perfectly normal, subjective-probability ways to make sense of mixed strategies in game theory. For example, this paper by Aumann and Brandenburger provides epistemic conditions for Nash equilibria that don't require objective probabilities to randomize. From their paper:

"Mixed strategies are treated not as conscious randomizations, but as conjectures, on the part of other players, as to what a player will do." (p. 1161)

In slightly more detail:

"According to [our] view, players do not randomize; each player chooses some definite action. But other players need not know which one, and the mixture represents their uncertainty, their conjecture about his choice. This is the context of our main results, which provide sufficient conditions for a profile of conjectures to constitute a Nash equilibrium." (p. 1162)

Interestingly, this paper is very motivated by embedded agency type concerns. For example, on page 1174 they write:

"Though entirely apt, use of the term “state of the world” to include the actions of the players has perhaps caused confusion. In Savage (1954), the decision maker cannot affect the state; he can only react to it. While convenient in Savage’s one person context, this is not appropriate in the interactive, many-person world under study here. Since each player must take into account the actions of the others, the actions should be included in the description of the state. Also the plain, everyday meaning of the term “state of the world” includes one’s actions: Our world is shaped by what we do. It has been objected that prescribing what a player must do at a state takes away his freedom. This is nonsensical; the player may do what he wants. It is simply that whatever he does is part of the description of the state. If he wishes to do something else, he is heartily welcome to do it, but he thereby changes the state."

In general, getting back to reflective oracles, indeed I think that is one way that one might try to provide a formalism underlying some application of game theory! And I think it is very interesting. But, as the Aumann and Brandenburger paper shows, there are totally normal ways to do this without fundamental chance. They have some references in their paper to other papers with this perspective, and it forms one of many motivations for the approach of epistemic game theory.

And, in general, I would resist the inference from "this kind of reasoning requires the world to be a certain way" to "the world must be a certain way."

screwtape on Thinking By The Clock

(Self review) I stand by this post, I think it's an important idea, I think not enough people are using this technique, and this adds nothing but a different way of writing something that was already in the rationalist canon.

If you do not sometimes stop, start a timer, think for five minutes, come to a conclusion and then move on, I believe you are missing an important mental skill and you should fix that. This skill helps me. I have observed some of the most effective people I know personally use this skill. You should at least try it.

You know what followup work I want? I want a dozen different modes of this idea. A YouTube video. The audio version is great. The fictional version in HPMOR is great. Can we get a goofy videogame that makes you use the pause button well? (I tried to get at this with Troll Timers: https://www.lesswrong.com/posts/fCg3pLZqthXsGznHP/troll-timers) I should try rewriting this as a rousing speech. It'd be cool to have it as a catchy tune. Maybe someone should TikTok the sucker.

I'm not saying it's the most important idea! Just, you know, it's broadly applicable and any mistake you make by not thinking for five minutes when you are not actually under time pressure is a stupid mistake that makes beisutsukai-san disappointed in you.

If the Best Of LessWrong collection is just for things that add to the conversation, this post doesn't belong there. I'd give it a small positive vote if I could vote on it. On the other hand if nobody else has gotten a post about this concept into the Best Of LessWrong collection yet, and some newcomers might just read the Best Of LessWrong posts, then I do kinda want something on this topic to get in there.

knight-lee on The purposeful drunkard

I think there is a typo somewhere, probably because you switched whether the vectors were rows or columns.

Based on the dimensions of the matrices, it should be $X = M_{\mathrm{upd}} \cdot S$

And $X_{\mathrm{cent}} = M_{\mathrm{upd}} S C$

And I think $X_{\mathrm{cent}} X_{\mathrm{cent}}^{T} = M_{\mathrm{upd}} S C C^{T} S^{T} M_{\mathrm{upd}}^{T}$

Instead of $X_{\mathrm{cent}}^{T} X_{\mathrm{cent}} = M_{\mathrm{upd}}^{T} S^{T} C^{T} C S M_{\mathrm{upd}}$

$S$ should still be upper triangular.
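
A quick numerical sanity check of the dimensions above, as a rough sketch only. It assumes each column of `M_upd` is one update step (so `M_upd` is d × T), that `S` is the T × T upper-triangular matrix of ones (a cumulative sum over steps), and that `C` is the usual T × T centering matrix I − (1/T)·11ᵀ. None of these definitions are confirmed by the original post, and the helper names are hypothetical.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def random_walk_matrices(d, T, seed=0):
    """Build M_upd (d x T), S (T x T) and C (T x T) under the assumed conventions."""
    rng = random.Random(seed)
    # Assumption: column t of M_upd is the t-th random update step.
    M_upd = [[rng.gauss(0, 1) for _ in range(T)] for _ in range(d)]
    # Upper-triangular ones: column t of (M_upd S) sums steps 0..t.
    S = [[1.0 if i <= j else 0.0 for j in range(T)] for i in range(T)]
    # Centering matrix: right-multiplying subtracts each row's mean over time.
    C = [[(1.0 if i == j else 0.0) - 1.0 / T for j in range(T)] for i in range(T)]
    return M_upd, S, C


d, T = 3, 6
M_upd, S, C = random_walk_matrices(d, T)

X = matmul(M_upd, S)                      # d x T: positions of the walk
X_cent = matmul(X, C)                     # d x T: positions minus their time-mean
gram = matmul(X_cent, transpose(X_cent))  # d x d, the matrix PCA diagonalizes

# The same d x d matrix written out as M_upd S C C^T S^T M_upd^T.
expanded = matmul(
    matmul(matmul(matmul(X_cent, transpose(C)), transpose(S)), transpose(M_upd)),
    [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)],
)
# Note: X_cent C^T S^T M_upd^T uses C C^T = C (C is a symmetric projection),
# so the explicit chain is rebuilt below without relying on that shortcut.
expanded = matmul(
    matmul(matmul(matmul(matmul(M_upd, S), C), transpose(C)), transpose(S)),
    transpose(M_upd),
)

print(len(gram), "x", len(gram[0]))
```

With vectors as columns, the product `X_cent X_cent^T` comes out d × d, whereas `X_cent^T X_cent` would be T × T, which is the distinction the corrected formulas turn on.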

Though don't trust me either, I often do math in a hand-wavy fashion.

My intuition was that PCA selects the "angle" you view the data from which stretches out the data as much as possible, forcing the random walk to appear relatively straighter.

But somehow the random walk is smooth over a few data points, but still turns back and forth over the duration of $T$. This contradicts my intuition and I have no idea what's going on.

curt-tigges on How do you deal w/ Super Stimuli?

I use Freedom and Limit on my computer and Stay Focused on my Android phone. The former two allow for a combination of complete blocking during certain time windows and time limits (for any website, even across browsers and even if you open an incognito window). The latter does both for my phone.

I block all social media and content during prime working hours and implement a 30-minute limit outside of that. It works pretty well. I may make it stricter because I sometimes find myself looking at Twitter, etc. when watching a TV show in the evenings.

I also use BlockTube to get rid of YouTube Shorts entirely from my web browser. They no longer show up in search results or in the menu.

Finally, I recommend the tools here, though I haven't tried all of them: https://liamrosen.com/2023/04/18/modding-social-media-to-win-the-attention-war/

cstinesublime on How do you deal w/ Super Stimuli?

I don't want to pretend that I'm someone who is immune to YouTube binges or similar behaviors. However, I am not sure why this is a problem, or what meaningful work this behavior was getting in the way of. Speaking for myself, 9 times out of 10, if I have a commitment the next morning, I won't stay up late on my computer because... I know I have a commitment at a set time. (If you forced me to hypothesize why that 1 time in 10 I don't, I'd guess that stress-related anticipation means I couldn't sleep even if I did lie down, but that is just a wild guess.)

I'm also surprised to see how most of the solutions in the comments involve removing access to anything... doing something more productive. I think there is a difference between the nebulous guilt we feel about Opportunity Cost ("oh geez I could have used that time more effectively") and specific, tangible, realistic things we could have done but didn't. I often find that YouTube binges are caused by not being able to find those activities; they do not frustrate them.

I have perennially found that whatever vice (or as you call it, 'hyperstimuli') I remove, I just replace it with another, but it's never a beneficial activity. (The one exception I can think of was when I stopped listening to music when I had a bout of insomnia and instead replaced it with lectures on Wittgenstein or Quantum Physics, because I figured "I might as well learn SOMETHING".)

This has caused me an incredible amount of frustration. For all the talk of "social media detox" and even the farcically named "dopamine detox" none seem to actually result in net increases in my well being.

Going back to what I said about specific, tangible, realistic alternatives: I have found that the only way to stop mid-way through a YouTube binge or an Instagram scroll is to be excited about a project that I have a lot of faith in my ability to complete, and a viable first step which I can do now.

This isn't fail-safe: if I'm writing a journal entry or an essay, and I have to leave in 30 minutes, you bet your bottom dollar I'll be late because I'll be so engrossed in that writing process. But that doesn't sound like a 'hyperstimuli'.

screwtape on In Defense of Parselmouths

(Self review) Do I stand by this post? Eh. Kinda sorta but I think it's incomplete.

I think there's something important in truth-telling, and getting everyone on the same page about what we mean by the truth. Since everyone will not just start telling the literal truth all the time and I don't even particularly want them to, we're going to need to have some norms and social lubricant around how to handle the things people say that aren't literal truth.

The first thing I disagree with when rereading it is sometimes even if someone is obviously and straightforwardly feeding me bullshit, I keep trying to tell the truth. Sometimes I try even harder to be precise and truthful. In a conversation with friends, I might say "that game's no fun" when the true and accurate statement is "I don't find that game fun." In a heated internet argument, I think it's useful to check my stance and use the latter kind of statement, even if the other person is saying things like "everyone who doesn't like that game is a moron."

Short of a complete guide to Truth, I'd settle for a practical "Here's how Screwtape regards the truth, read it and you'll understand when he'd say false things." This essay falls short of that.

I'd love more things in this genre. Meta-Honesty: Firming Up Honesty Around Its Edge Cases and The Onion Test For Personal And Institutional Honesty are both good examples of the genre. Even personal versions seem useful.

I think that makes this a replaceable essay. It would be fine in a Best Of collection, but it's not adding too much other than a few intuition pumps.

petermccluskey on Do humans really learn from "little" data?

"OOMs faster"? Where do you get that idea?

Dreams indicate a need for more processing than what happens when we're awake, but likely less than 2x waking time.

daniel-tan on Daniel Tan's Shortform

Feedback from Max:

The definition he originally gave leaves out a notion of 'stealthiness' as opposed to 'gibberish'; 'stealthy' reasoning is more concerning.
OOCR may not meet the opaqueness criterion, since the meaning of 'f' is pretty obvious from looking at the fine-tuning examples. This is like 'reasoning in a foreign language' or 'using esoteric terminology'; it's less legible to the average human, but not outside the realm of human semantics. (But still interesting! just that maybe OOCR isn't needed)
Backdoors have a similar concern to the above. A separate (and more important) concern is that it doesn't satisfy the definition because 'steg' typically refers to meaning encoded in generated tokens (while in 'backdoors', the hidden meaning is in the prompt tokens)

saidachmiz on Don’t Legalize Drugs

I like Dalrymple’s writing, but this piece makes it clear that he’s no philosopher. His attempted rebuttal to the “philosophic argument” is sloppy and weak, full of equivocations, failures to pursue lines of reasoning to their logical endpoints or to see obvious implications, etc. I expected more, and was disappointed.

screwtape on Social Dark Matter

I think this essay is worth including in the Best Of LessWrong collection for introducing a good conceptual handle for a phenomenon it convinced me exists in a more general form than I'd thought.

It's talking about a phenomenon that's easy to overlook. I think the phenomenon is real; for a trivial example, look at any self reported graph of height and look at the conspicuous shortage at 5'11". It comes with lots of examples. Testing this is maddeningly tricky (it's hiding from you!) but doable, especially if you're willing to generalize from one or two examples you may have an unusually good vantage point on.

I've taken to thinking of this as paired with Dark Forest Theories (https://www.lesswrong.com/posts/xDNyXGCDephBuNF8c/dark-forest-theories). If you look around and notice a gap in the world, is that because there's nothing there, or because what would be there is concealed from you?

If there's a followup post I'd love to see, that post would be on how to observe or detect the dark matter. That would be an anti-inductive game in many ways, but I expect general principles might exist- I've taken to looking at survey data with an eye towards "hrm, there's a dip or break in that line there- would I expect that spot to be Social Dark Matter?"

If someone was involved more directly working with dark matter subjects, this post would be more material to them I think. For me, it's mostly overkill, but a concept I keep in my back pocket for when it's needed.