LessWrong 2.0 Reader


[link] Found Paper: "FDT in an evolutionary environment"
the gears to ascension (lahwran) · 2023-11-27T05:27:50.709Z · comments (47)

EA Infrastructure Fund's Plan to Focus on Principles-First EA
Linch · 2023-12-06T03:24:55.844Z · comments (0)

[link] align your latent spaces
bhauth · 2023-12-24T16:30:09.138Z · comments (8)

[link] Link Collection: Impact Markets
Saul Munn (saul-munn) · 2023-12-26T09:01:48.815Z · comments (0)

Without Fundamental Advances, Rebellion and Coup d'État are the Inevitable Outcomes of Dictators & Monarchs Trying to Control Large, Capable Countries
Roko · 2024-01-31T10:14:02.042Z · comments (34)

Fifteen Lawsuits against OpenAI
Remmelt (remmelt-ellen) · 2024-03-09T12:22:09.715Z · comments (4)

On the 2nd CWT with Jonathan Haidt
Zvi · 2024-04-05T17:30:05.223Z · comments (3)

Scientific Notation Options
jefftk (jkaufman) · 2024-05-18T15:10:02.181Z · comments (13)

Incentive Learning vs Dead Sea Salt Experiment
Steven Byrnes (steve2152) · 2024-06-25T17:49:01.488Z · comments (1)

[question] Me & My Clone
SimonBaars (simonbaars) · 2024-07-18T16:25:40.770Z · answers+comments (22)

Cheap Whiteboards!
Johannes C. Mayer (johannes-c-mayer) · 2024-08-08T13:52:59.627Z · comments (2)

Interpretability of SAE Features Representing Check in ChessGPT
Jonathan Kutasov (jonathan-kutasov) · 2024-10-05T20:43:36.679Z · comments (2)

European Progress Conference
Martin Sustrik (sustrik) · 2024-10-06T11:10:03.819Z · comments (11)

Why is there Nothing rather than Something?
Logan Zoellner (logan-zoellner) · 2024-10-26T12:37:50.204Z · comments (3)

Domain-specific SAEs
jacob_drori (jacobcd52) · 2024-10-07T20:15:38.584Z · comments (0)

[link] Generic advice caveats
Saul Munn (saul-munn) · 2024-10-30T21:03:07.185Z · comments (1)

[question] Any real toeholds for making practical decisions regarding AI safety?
lemonhope (lcmgcd) · 2024-09-29T12:03:08.084Z · answers+comments (6)

An AI crash is our best bet for restricting AI
Remmelt (remmelt-ellen) · 2024-10-11T02:12:03.491Z · comments (3)

Distinguishing ways AI can be "concentrated"
Matthew Barnett (matthew-barnett) · 2024-10-21T22:21:13.666Z · comments (2)

[question] What prevents SB-1047 from triggering on deep fake porn/voice cloning fraud?
ChristianKl · 2024-09-26T09:17:39.088Z · answers+comments (21)

[link] Evaluating Synthetic Activations composed of SAE Latents in GPT-2
Giorgi Giglemiani (Rakh) · 2024-09-25T20:37:48.227Z · comments (0)

There aren't enough smart people in biology doing something boring
Abhishaike Mahajan (abhishaike-mahajan) · 2024-10-21T15:52:04.482Z · comments (13)

[link] Predicting Influenza Abundance in Wastewater Metagenomic Sequencing Data
jefftk (jkaufman) · 2024-09-23T17:25:58.380Z · comments (0)

Bay Winter Solstice 2024: song leading auditions
tcheasdfjkl · 2024-11-10T23:59:08.199Z · comments (0)

the Daydication technique
chaosmage · 2024-10-18T21:47:46.448Z · comments (0)

[link] overengineered air filter shelving
bhauth · 2024-11-08T22:04:39.987Z · comments (2)

Sleeping on Stage
jefftk (jkaufman) · 2024-10-22T00:50:07.994Z · comments (3)

[link] A brief history of the automated corporation
owencb · 2024-11-04T14:35:04.906Z · comments (1)

Do Sparse Autoencoders (SAEs) transfer across base and finetuned language models?
Taras Kutsyk · 2024-09-29T19:37:30.465Z · comments (8)

Standard SAEs Might Be Incoherent: A Choosing Problem & A “Concise” Solution
Kola Ayonrinde (kola-ayonrinde) · 2024-10-30T22:50:45.642Z · comments (0)

Option control
Joe Carlsmith (joekc) · 2024-11-04T17:54:03.073Z · comments (0)

SAE features for refusal and sycophancy steering vectors
neverix · 2024-10-12T14:54:48.022Z · comments (4)

[link] Care Doesn't Scale
stavros · 2024-10-28T11:57:38.742Z · comments (1)

Thinking in 2D
sarahconstantin · 2024-10-20T19:30:05.842Z · comments (0)

SAEs you can See: Applying Sparse Autoencoders to Clustering
Robert_AIZI · 2024-10-28T14:48:16.744Z · comments (0)

The Foraging (Ex-)Bandit [Ruleset & Reflections]
abstractapplic · 2024-11-14T20:16:21.535Z · comments (3)

[question] Seeking AI Alignment Tutor/Advisor: $100–150/hr
MrThink (ViktorThink) · 2024-10-05T21:28:16.491Z · answers+comments (3)

[link] Emotional issues often have an immediate payoff
Chipmonk · 2024-06-10T23:39:40.697Z · comments (2)

Ideas for Next-Generation Writing Platforms, using LLMs
ozziegooen · 2024-06-04T18:40:24.636Z · comments (4)

Optimizing Repeated Correlations
SatvikBeri · 2024-08-01T17:33:23.823Z · comments (1)

Am I going insane or is the quality of education at top universities shockingly low?
ChrisRumanov (pseudonymous-ai) · 2023-11-20T03:53:30.056Z · comments (30)

Taking Into Account Sentient Non-Humans in AI Ambitious Value Learning: Sentientist Coherent Extrapolated Volition
Adrià Moret (Adrià R. Moret) · 2023-12-02T14:07:29.992Z · comments (31)

Quick takes on "AI is easy to control"
So8res · 2023-12-02T22:31:45.683Z · comments (49)

How do LLMs give truthful answers? A discussion of LLM vs. human reasoning, ensembles & parrots
Owain_Evans · 2024-03-28T02:34:21.799Z · comments (0)

Agent membranes/boundaries and formalizing “safety”
Chipmonk · 2024-01-03T17:55:21.018Z · comments (46)

AI #57: All the AI News That’s Fit to Print
Zvi · 2024-03-28T11:40:05.435Z · comments (14)

[link] Arrogance and People Pleasing
Jonathan Moregård (JonathanMoregard) · 2024-02-06T18:43:09.120Z · comments (7)

Exploring OpenAI's Latent Directions: Tests, Observations, and Poking Around
Johnny Lin (hijohnnylin) · 2024-01-31T06:01:27.969Z · comments (4)

The Sequences on YouTube
Neil (neil-warren) · 2024-01-07T01:44:39.663Z · comments (9)

Improving SAE's by Sqrt()-ing L1 & Removing Lowest Activating Features
Logan Riggs (elriggs) · 2024-03-15T16:30:00.744Z · comments (5)


Recent comments

viliam on D0TheMath's Shortform

if you ask mathematicians whether ZFC + not Consistent(ZFC) is consistent, they will say "no, of course not!"

I suspect that many people's intuitive interpretation of "consistent" is ω-consistent, especially if they are not aware of the distinction.

viliam on Lalit Shankar Chowdhury's Shortform

I find it difficult to make distinct categories, but there seem to be two dimensions along which to classify relations:

How intense is the relation / how much we "click" emotionally and intellectually.
Whether the relation is expected to survive the change of current context.

(Even this is not a clear distinction, because "my relatives" is kinda contextual, but the context is there forever.)

Mapping to your system: close friends = high intensity context independent; friendly acquaintances = high intensity contextual; acquaintances = low intensity contextual.

One quadrant seems to be missing, but maybe that makes sense: if the relation is low intensity, why would people bother to keep it outside of the context where it originated.

egor-timatkov on "It's a 10% chance which I did 10 times, so it should be 100%"

It's a great idea. I ended up bolding the one line that states my conclusion to make it easier to spot.

kilotaras on "It's a 10% chance which I did 10 times, so it should be 100%"

Strong agree.

Hiding the main result behind a spoiler makes for a great reveal, but is less useful for skimming.

egor-timatkov on "It's a 10% chance which I did 10 times, so it should be 100%"

That's crazy how close that is. (to the nearest half a percent) will be a fun fact that I remember now!

turntrout on Announcing turntrout.com, my new digital home

Thanks for the Quenya tip. I tried Artano and it didn't work very quickly. Given that apparently it does in fact work, I can try that again.

q-home on Q Home's Shortform

Creating an inhumanly good model of a human is related to formulating their preferences.

How does this relate to my idea? I'm not talking about figuring out human preferences.

Thus it's a step towards eliminating path-dependence of particular life stories

What is "path-dependence of particular life stories"?

I think things (minds, physical objects, social phenomena) should be characterized by computations that they could simulate/incarnate.

Are there other ways to characterize objects? Feels like a very general (or even fully general) framework. I believe my idea can be framed like this, too.

saidachmiz on Announcing turntrout.com, my new digital home

Do you think a 3-state dark mode selector is better than a 1-state (where “auto” is the only state)? My website is 1-state, on the assumption that auto will work for almost everyone and it lets me skip the UI clutter of having a lighting toggle that most people won’t use.

Gwern discusses this on his “Design Graveyard” page:

Auto-dark mode: a good idea but “readers are why we can’t have nice things”.

OSes/browsers have defined a ‘global dark mode’ toggle the reader can set if they want dark mode everywhere, and this is available to a web page; if you are implementing a dark mode for your website, it then seems natural to just make it a feature and turn on iff the toggle is on. There is no need for complicated UI-cluttering widgets with complicated implementations. And yet—if you do do that, readers will regularly complain about the website acting bizarre or being dark in the daytime, having apparently forgotten that they enabled it (or never understood what that setting meant).

A widget is necessary to give readers control, although even there it can be screwed up: many websites settle for a simple negation switch of the global toggle, but if you do that, someone who sets dark mode at day will be exposed to blinding white at night… Our widget works better than that. Mostly.

Is it possible that someday dark-mode will become so widespread, and users so educated, that we could quietly drop the widget? Yes, even by 2023 dark-mode had become quite popular, and I suspect that an auto-dark-mode would cause much less confusion in 2024 or 2025. However, we are stuck with the widget—once we had a widget, the temptation to stick in more controls (for reader-mode and then disabling/enabling popups) was impossible to resist, and who knows, it may yet accrete more features (site-wide fulltext search?), rendering removal impossible.

(The site-wide fulltext search feature has since been added, of course.)
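The difference between the "simple negation switch" and a three-state selector described in the quote can be sketched as two small functions. This is only an illustration, not the code of either website: the function names and the `os_prefers_dark` flag are hypothetical, and it assumes the OS-level setting flips between light by day and dark by night.

```python
def resolve_negation(flipped: bool, os_prefers_dark: bool) -> bool:
    """Two-state widget that only inverts the OS-level setting.

    Returns True when the page should render dark.
    """
    return os_prefers_dark != flipped


def resolve_three_state(choice: str, os_prefers_dark: bool) -> bool:
    """Three-state widget: 'auto' follows the OS, 'light'/'dark' override it."""
    if choice not in ("auto", "light", "dark"):
        raise ValueError(f"unknown choice: {choice!r}")
    if choice == "auto":
        return os_prefers_dark
    return choice == "dark"


# Assumed scenario: the reader asks for dark during the day (OS is light),
# then the OS switches to dark at night.
day, night = False, True

# Negation switch: dark by day, but inverted back to light at night.
print(resolve_negation(True, day), resolve_negation(True, night))

# Three-state selector set to 'dark': dark in both cases.
print(resolve_three_state("dark", day), resolve_three_state("dark", night))
```

Under these assumptions the negation switch reproduces the "blinding white at night" failure from the quote, while an explicit 'dark' choice stays dark regardless of the OS setting.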

ape-in-the-coat on Quantum Immortality: A Perspective if AI Doomers are Probably Right

You are right, and it's a serious counterargument to consider.
You are also right that the Anthropic Trilemma and Magic by Forgetting do not work with path-dependent identity.

Okay, glad we are on the same page here.

However, we can almost recreate the magic machine from the Anthropic Trilemma using path-based identity

I'm not sure I understand your example and how it recreates the magic. Let me try to describe it in my own words, and then correct me if I got something wrong.

You are put to sleep. Then you are split into two people. Then, at random, one of them is put into a red room and the other into a green room. Let's say that person 1 is in the red room and person 2 is in the green room. Then person 2 is split into two people: 21 and 22. Both of them are kept in green rooms. Then everyone is awakened. What should be your credence that you wake up in a red room?

Here there are three possibilities: a 50% chance to be 1 in a red room, and a 25% chance each to be 21 or 22 in a green room. No matter how many times a person in a green room is split, the total probability for greenness stays the same. All is quite normal and there is no magic.

Now let's add a twist.

Instead of putting both 21 and 22 in green rooms, one of them - let it be 21 - is put in a red room.

In this situation, the total probability for a red room is P(1) + P(21) = 75%. And if we split 2 further and put more of its parts in red rooms, we get a higher and higher probability of being in a red room. Therefore we get a magical ability to manipulate probability.

Am I getting you correctly?

I do not see anything problematic with such "manipulation of probability". We do not change our estimate just because more people with the same experience are created. We change the estimate because a different fraction of people get a different experience. This is no more magical than putting both 1 and 2 into red rooms and noticing that suddenly the probability of being in a red room has reached 100%, compared to the initial formulation where it was a mere 50%. Of course it did! That's completely lawful behaviour of probability-theoretic reasoning.
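The arithmetic in the room examples above can be checked with a short sketch. It assumes the path-based weighting used in this comment, where each split divides the parent's probability equally among its branches; the tree encoding and the function names are made up for illustration.

```python
from fractions import Fraction


def leaf_probabilities(tree, weight=Fraction(1)):
    """Yield (label, room, probability) for every final copy.

    A tree is either a leaf tuple (label, room) or a list of subtrees.
    Each split divides the parent's weight equally among its branches.
    """
    if isinstance(tree, tuple):
        label, room = tree
        yield label, room, weight
        return
    share = weight / len(tree)
    for branch in tree:
        yield from leaf_probabilities(branch, share)


def room_probability(tree, room):
    """Total probability of waking up in a room of the given colour."""
    return sum(p for _, r, p in leaf_probabilities(tree) if r == room)


# Original setup: 1 in red; 2 is split into 21 and 22, both in green.
original = [("1", "red"), [("21", "green"), ("22", "green")]]
# The twist: 21 is moved into a red room.
twist = [("1", "red"), [("21", "red"), ("22", "green")]]
# Comparison case: both 1 and 2 are put into red rooms, no further split.
both_red = [("1", "red"), ("2", "red")]

print(room_probability(original, "red"))  # 1/2
print(room_probability(twist, "red"))     # 3/4
print(room_probability(both_red, "red"))  # 1
```

The probability of red changes only when the rooms assigned to the copies change, not when more copies with the same experience are added.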

Notice that we can't actually recreate the anthropic trilemma and be certain to win the lottery this way, because we can't move people between branches. Therefore everything adds up to normality.

Also, path-dependent identity opens the door to back-causation and premonition, because if we normalize outputs of some black box where paths are mixed, similar to the magic machine discussed above

We just need to restrict the mixing of the paths, which is the restriction of QM anyway. Or maybe I'm missing something? Could you give me an example with such backwards causality? Because as far as I see, everything is quite straightforward.

The main problem of path-dependent identity is that we assume the existence of a "global hidden variable" for any observer. It is hidden as it can't be measured by an outside viewer and only represents the subjective chances of the observer to be one copy and not another. And it is global as it depends on the observer's path, not their current state. It therefore contradicts the view that mind is equal to a Turing computer (functionalism) and requires the existence of some identity carrier which moves through paths (qualia, quantum continuity, or soul).

Seems like we are just confused about this "identity" thingy and therefore don't know how to correctly reason about it. In such situations we are supposed to:

Acknowledge that we are confused
Stop speculating on top of our confusion and jumping to conclusions based on it
Outline the possible options to the best of our understanding and keep an open mind until we manage to resolve the confusion

It's already clear that "mind" and "identity" are not the same thing. We can talk about identities of things that do not possess a mind, and identities are unique, while there can exist copies of the same mind. So minds can very well be Turing computers, but identities are something else, or even not a thing at all.

Our intuitive desire to drag in consciousness/qualia/soul also appears completely unhelpful after thinking about it for the first five minutes. Non-conscious minds can do the same probability-theoretic reasoning as conscious ones. Nothing changes if 1, 21 and 22 from the problem above are not humans but programs executed on different computers.

Whatever extra variable we need, it seems to be something that a Laplace's demon would know. It is knowledge about whether a mind was split into n instances simultaneously or through multiple steps. It indeed means that something other than the immediate state of the mind is important for "identity" considerations, but this something can very well be completely physical: just the past history of causes and effects that led to this state of the mind.

yitz on Ayn Rand’s model of “living money”; and an upside of burnout

Reminds me of Internal Family Systems, which has a nice amount of research behind it if you want to learn more.