LessWrong 2.0 Reader

A Visual Task that's Hard for GPT-4o, but Doable for Primary Schoolers
Lennart Finke (l-f) · 2024-07-26T17:51:28.202Z · comments (4)

Decent plan prize announcement (1 paragraph, $1k)
lukehmiles (lcmgcd) · 2024-01-12T06:27:44.495Z · comments (19)

Twin Peaks: under the air
KatjaGrace · 2024-05-31T01:20:04.624Z · comments (2)

[link] Beware the science fiction bias in predictions of the future
Nikita Sokolsky (nikita-sokolsky) · 2024-08-19T05:32:47.372Z · comments (20)

[link] The Best Essay (Paul Graham)
Chris_Leong · 2024-03-11T19:25:42.176Z · comments (2)

Beta Tester Request: Rallypoint Bounties
lukemarks (marc/er) · 2024-05-25T09:11:11.446Z · comments (4)

[link] MIRI's July 2024 newsletter
Harlan · 2024-07-15T21:28:17.343Z · comments (2)

[link] Executive Dysfunction 101
DaystarEld · 2024-05-23T12:43:13.785Z · comments (1)

[link] Transformer Debugger
Henk Tillman (henk-tillman) · 2024-03-12T19:08:56.280Z · comments (0)

[link] what becoming more secure did for me
Chipmonk · 2024-08-22T17:44:48.525Z · comments (5)

Using an LLM perplexity filter to detect weight exfiltration
Adam Karvonen (karvonenadam) · 2024-07-21T18:18:05.612Z · comments (11)

Housing Roundup #9: Restricting Supply
Zvi · 2024-07-17T12:50:05.321Z · comments (8)

Useful starting code for interpretability
eggsyntax · 2024-02-13T23:13:47.940Z · comments (2)

Building Trust in Strategic Settings
StrivingForLegibility · 2023-12-28T22:12:24.024Z · comments (0)

Scientific Method
Andrij “Androniq” Ghorbunov (andrij-androniq-ghorbunov) · 2024-02-18T21:06:45.228Z · comments (4)

[link] Clickbait Soapboxing
DaystarEld · 2024-03-13T14:09:29.890Z · comments (15)

I didn't think I'd take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is!
mako yass (MakoYass) · 2024-08-02T22:35:21.136Z · comments (2)

[link] The absence of self-rejection is self-acceptance
Chipmonk · 2023-12-21T21:54:52.116Z · comments (1)

[link] Was Partisanship Good for the Environmental Movement?
Jeffrey Heninger (jeffrey-heninger) · 2024-05-15T17:30:54.796Z · comments (0)

Population ethics and the value of variety
cousin_it · 2024-06-23T10:42:21.402Z · comments (11)

[link] Secret US natsec project with intel revealed
Nathan Helm-Burger (nathan-helm-burger) · 2024-05-25T04:22:11.624Z · comments (0)

Language and Capabilities: Testing LLM Mathematical Abilities Across Languages
Ethan Edwards · 2024-04-04T13:18:54.909Z · comments (2)

[link] Altruism and Vitalism Aren't Fellow Travelers
Arjun Panickssery (arjun-panickssery) · 2024-08-09T02:01:11.361Z · comments (2)

Best-of-n with misaligned reward models for Math reasoning
Fabien Roger (Fabien) · 2024-06-21T22:53:21.243Z · comments (0)

[link] Robert Caro And Mechanistic Models In Biography
adamShimi · 2024-07-14T10:56:42.763Z · comments (5)

Technology path dependence and evaluating expertise
bhauth · 2024-01-05T19:21:23.302Z · comments (2)

[link] Alignment work in anomalous worlds
Tamsin Leake (carado-1) · 2023-12-16T19:34:26.202Z · comments (4)

[link] Compensating for Life Biases
Jonathan Moregård (JonathanMoregard) · 2024-01-09T14:39:14.229Z · comments (6)

Trying to be rational for the wrong reasons
Viliam · 2024-08-20T16:18:06.385Z · comments (8)

[link] Truth is Universal: Robust Detection of Lies in LLMs
Lennart Buerger · 2024-07-19T14:07:25.162Z · comments (3)

A brief review of China's AI industry and regulations
Elliot Mckernon (elliot) · 2024-03-14T12:19:00.775Z · comments (0)

UDT1.01: Local Affineness and Influence Measures (2/10)
Diffractor · 2024-03-31T07:35:52.831Z · comments (0)

[link] Let's Design A School, Part 2.3 School as Education - The Curriculum (Phase 2, Specific)
Sable · 2024-05-15T20:58:50.981Z · comments (0)

[link] The Living Planet Index: A Case Study in Statistical Pitfalls
Jan_Kulveit · 2024-06-24T10:05:55.101Z · comments (0)

[link] Scenario planning for AI x-risk
Corin Katzke (corin-katzke) · 2024-02-10T00:14:11.934Z · comments (12)

Paper Summary: The Koha Code - A Biological Theory of Memory
jakej (jake-jenks) · 2023-12-30T22:37:13.865Z · comments (2)

Even if we lose, we win
Morphism (pi-rogers) · 2024-01-15T02:15:43.447Z · comments (17)

Foresight Institute: 2023 Progress & 2024 Plans for funding beneficial technology development
Allison Duettmann (allison-duettmann) · 2023-11-22T22:09:16.956Z · comments (1)

A bet on critical periods in neural networks
kave · 2023-11-06T23:21:17.279Z · comments (1)

[link] Cellular respiration as a steam engine
dkl9 · 2024-02-25T20:17:38.788Z · comments (1)

An evaluation of Helen Toner’s interview on the TED AI Show
PeterH · 2024-06-06T17:39:40.800Z · comments (2)

aintelope project update
Gunnar_Zarncke · 2024-02-08T18:32:00.000Z · comments (2)

How Congressional Offices Process Constituent Communication
Tristan Williams (tristan-williams) · 2024-07-02T12:38:41.472Z · comments (0)

My Alignment "Plan": Avoid Strong Optimisation and Align Economy
VojtaKovarik · 2024-01-31T17:03:34.778Z · comments (9)

[question] Would you have a baby in 2024?
martinkunev · 2023-12-25T01:52:04.358Z · answers+comments (76)

Distinctions when Discussing Utility Functions
ozziegooen · 2024-03-09T20:14:03.592Z · comments (7)

[link] Eric Schmidt on recursive self-improvement
nikola (nikolaisalreadytaken) · 2023-11-05T19:05:15.416Z · comments (3)

[link] Extinction Risks from AI: Invisible to Science?
VojtaKovarik · 2024-02-21T18:07:33.986Z · comments (7)

Evolution did a surprising good job at aligning humans...to social status
Eli Tyre (elityre) · 2024-03-10T19:34:52.544Z · comments (37)

Utility is not the selection target
tailcalled · 2023-11-04T22:48:20.713Z · comments (1)

Recent comments

going-durden on When do "brains beat brawn" in Chess? An experiment

A related thought: an intelligence can only work on the information that it has, regardless of its veracity, and it can only work on information that actually exists.

My hunch is that the plan of "AI bootstraps itself to superintelligence, then superpower, then wipes out humanity" relies on it having access to information that is either too well hidden to divine through sheer calculation and infogathering, regardless of its intelligence (ex: the locations of all the military bunkers and nuclear submarines humanity has), or simply does not exist (ex: future human strategic choices based on coin-flips).

Most AI Apocalypse scenarios depend not only on the AI being superhumanly smart, but also on it being inexplicably omniscient about things that nobody could plausibly be omniscient about.

lone17 on Refusal in LLMs is mediated by a single direction

Thanks for the insight on the locality check experiment.

For inducing refusal, I used the code from the demo notebook provided in your post. It doesn't have a section on inducing refusal, so I just inverted the difference-in-means vector and set the intervention layer to the single layer where that vector was extracted. I believe this has the same effect as what you described, which is to apply the intervention to every token at a single layer. I will check out your repo to see if I missed something. Thank you for the discussion.
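As a rough illustration of the kind of intervention discussed above (this is not the demo notebook's code), the sketch below computes a difference-in-means direction from toy activation vectors and adds it to every token position at a single layer. The activations, the function names, and the `scale` parameter are all hypothetical; the sign convention (harmful minus harmless) is an assumption, and a negative `scale` flips the direction.

```python
from statistics import fmean


def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    return [fmean(col) for col in zip(*rows)]


def difference_in_means(harmful_acts, harmless_acts):
    """Direction = mean(harmful activations) - mean(harmless activations)."""
    mu_harmful = mean_vector(harmful_acts)
    mu_harmless = mean_vector(harmless_acts)
    return [a - b for a, b in zip(mu_harmful, mu_harmless)]


def add_direction(token_acts, direction, scale=1.0):
    """Add scale * direction to the activation at every token position."""
    return [[x + scale * d for x, d in zip(tok, direction)] for tok in token_acts]


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Toy 3-dimensional activations taken from one (hypothetical) layer.
harmful = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
harmless = [[0.0, 0.0, 1.0], [0.0, 2.0, 1.0]]
direction = difference_in_means(harmful, harmless)

# Activations of a new prompt at the same layer: two token positions.
prompt_acts = [[0.5, 0.5, 0.5], [1.0, 0.0, 0.0]]
steered = add_direction(prompt_acts, direction)

# Every token's projection onto the direction grows after the intervention.
for before, after in zip(prompt_acts, steered):
    print(dot(before, direction), "->", dot(after, direction))
```

In a real model the same addition would be applied through a hook on the chosen layer's residual stream; which hook mechanism to use depends on the library and is left out here.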

going-durden on Bitter lessons about lucid dreaming

This might not always be beneficial. Lucid dreaming also means you remember much more from your dreams, which can extend the lifespan of your recurring nightmares. Not to mention, if you dream lucidly, your consciousness is not resting, and intrusive thoughts will pile up.

christian-z-r on D&D.Sci September 2022: The Allocation Helm

Just putting a guess in here, before I go check if it is true:

Actually the 'Houses' have no effect; they are just the names of the different groups. In order to get a good rating, the members of each house should be as close as possible in Stat-space, or perhaps all be high in one stat (still experimenting with this). Since the early students were all placed by a functioning hat, each house had a well-defined place in Stat space that it would carry on with. But since all current students have been randomly selected, we don't have to worry about this historical data. Instead, we should try to get the new students as close as possible to the randomly generated spot in Stat space for the current students. As such, I think Serpentyne might become the new House of Integrity. (I do believe a strange thing like this is also happening in real life, and is one of the main ways that political parties gradually change their positions in Stat space).
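One way to test a guess like this is a nearest-centroid assignment: compute each house's average position in Stat space from its current students, then place each new student in the house whose centroid is closest. The sketch below is only an illustration under that assumption; the two-stat layout, the numbers, and the house name "OtherHouse" are made up, and nothing here comes from the actual scenario data.

```python
import math


def centroid(points):
    """Average position of a group of students in Stat space."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def nearest_house(student, centroids):
    """Name of the house whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda house: math.dist(student, centroids[house]))


# Hypothetical current students, two stats each.
current_students = {
    "Serpentyne": [[8.0, 2.0], [10.0, 4.0]],
    "OtherHouse": [[2.0, 8.0], [4.0, 10.0]],
}
centroids = {house: centroid(pts) for house, pts in current_students.items()}

new_student = [7.0, 3.0]
print(nearest_house(new_student, centroids))
```

Whether closeness to the centroid actually predicts a good rating is exactly what the guess leaves open; the sketch only shows how the placement rule would be computed.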

going-durden on Bitter lessons about lucid dreaming

My hypothesis is that a lot of things that seem impossible or very hard in a dream are simply too boring to focus on. It's totally possible to consciously dream up a page of text, but who would really want to waste precious dreamtime to type?

tiago-macedo on Conservation of Expected Evidence and Random Sampling in Anthropics

But Heads outcome in Incubator Sleeping Beauty is not. You are not randomly selected among two immaterial souls to be instantiated. You are a sample of one. And as there is no random choice happening, you are not twice as likely to exist when the coin is Tails and there is no new information you get when you are created.

I am twice as likely to exist when the coin is Tails! After all, if the coin is Tails, then there are two of me. I understand how this can lead to a thirder conclusion:

Heads implies one chance for me to exist.
Tails implies two chances for me to exist.
I observe that I exist. This is predicted "twice as much" by the coin being Tails than Heads, so the probability of Tails is 2/3.

However, there is a mistake in this reasoning. The correct version is the following:

Heads implies the number of "mes" will be 1.
Tails implies the number of "mes" will be 2.
I observe that I exist. Does this mean that there is 1 of me, or 2 of me? I don't know.

So we can't extract information from my existence, and we're back to normalcy: 1/2 chance of Heads or Tails.
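The two lines of reasoning above correspond to two different ways of counting, which a small simulation can make concrete. This is a sketch under my own assumptions (a fair coin, one copy created on Heads and two on Tails, a hypothetical `simulate` function and fixed seed); it does not settle which count is the right notion of probability, it only shows where 1/2 and 2/3 each come from.

```python
import random


def simulate(n_experiments, seed=0):
    """Run the incubator setup many times and count Tails two ways."""
    rng = random.Random(seed)
    tails_experiments = 0  # experiments in which the coin landed Tails
    total_copies = 0       # copies created across all experiments
    copies_in_tails = 0    # copies created in Tails experiments
    for _ in range(n_experiments):
        tails = rng.random() < 0.5
        copies = 2 if tails else 1
        total_copies += copies
        if tails:
            tails_experiments += 1
            copies_in_tails += copies
    per_experiment = tails_experiments / n_experiments
    per_copy = copies_in_tails / total_copies
    return per_experiment, per_copy


per_experiment, per_copy = simulate(100_000)
print(f"fraction of experiments that are Tails: {per_experiment:.3f}")
print(f"fraction of copies that live in Tails:  {per_copy:.3f}")
```

Counting per experiment gives a value near 1/2, while counting per copy gives a value near 2/3; the disagreement in the thread is about which of these an observer who merely knows "I exist" should use.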

going-durden on Bitter lessons about lucid dreaming

I have a suspicion that "flying dreams" have more to do with the state of your physical body than just your mind. I noticed I only dream of flight (or rather, levitation) if my muscles are very relaxed, like after a good massage, a long hot bath, or good stretching. If I'm physically tense, either from effort or from stress, then I either cannot fly in a dream at all, or I keep losing the ability and falling, often with enough distress to wake myself up.

going-durden on Bitter lessons about lucid dreaming

In my experience, conscious Daydreaming can achieve the same results but more consistently. But then again, my imagination is extremely visual, I tend to "think in VR movies", so Lucid Daydreaming comes easier than Lucid Dreaming, and is far more controllable.

going-durden on Bitter lessons about lucid dreaming

I noticed that the ability to LD is strongly correlated with the condition known as "Maladaptive Daydreaming" (the "maladaptive" part here is subjective and situational, but it basically means the ability and need to have very addictive, vivid, VR-like daydreams that obscure waking reality).

I used to suffer from MD, until I learned to control it well enough to just be benign Daydreaming. Simultaneously, I achieved the ability to LD, which works on very similar principles to controlled Daydreaming.

The trick to LD if you are a person who daydreams visually, is to focus on plausibility. Trying to consciously train your daydreaming mind to enforce realistic, plausible daydream scenarios leads to the same mental need to "fix" unrealistic dreams, which either wakes you up from the dream or makes it Lucid.

Now, all that being said, LDs rarely approach the quality of Daydreams. It's extremely hard to make a Lucid Dream realistic and detailed enough not to feel trippy. Moreover, while most Daydreamers can make their Daydreams simulate tactile sensations, you cannot do the same in an actual dream. For one, erotic Lucid Dreaming is almost always pointless, because your lucid mind cannot force your sleeping body to actually experience sexual pleasure, let alone orgasm. If you are a bio male, it is likely you won't even achieve erection, so LD sex feels like trying to play pool with a rope.

The only good use I ever got from LDs is that they let you remember bits of your dreams better and use them as raw footage to edit into your Daydreams.

khafra on The salt in pasta water fallacy

Note also that there are several free parameters in this example. E.g., I just moved to Germany, and now have wimpy German burners on my stove. If I put on a large container with 6L or more of water, and I do not cover it, the water will never go beyond bubble formation into a light simmer, let alone a rolling boil. If I cover the container at this steady state, it reaches a rolling boil in about another 90s.