LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

Abstractions are not Natural
Alfred Harwood · 2024-11-04T11:10:09.023Z · comments (21)

AXRP Episode 36 - Adam Shai and Paul Riechers on Computational Mechanics
DanielFilan · 2024-09-29T05:50:02.531Z · comments (0)

[question] Which things were you surprised to learn are metaphors?
Gordon Seidoh Worley (gworley) · 2024-11-22T03:46:02.845Z · answers+comments (13)

[question] When engaging with a large amount of resources during a literature review, how do you prevent yourself from becoming overwhelmed?
corruptedCatapillar · 2024-11-01T07:29:49.262Z · answers+comments (2)

Fun With The Tabula Muris (Senis)
sarahconstantin · 2024-09-20T18:20:01.901Z · comments (0)

[link] Introduction to Super Powers (for kids!)
Shoshannah Tekofsky (DarkSym) · 2024-09-20T17:17:27.070Z · comments (0)

The case for more Alignment Target Analysis (ATA)
Chi Nguyen · 2024-09-20T01:14:41.411Z · comments (13)

[link] Linkpost: "Imagining and building wise machines: The centrality of AI metacognition" by Johnson, Karimi, Bengio, et al.
Chris_Leong · 2024-11-11T16:13:26.504Z · comments (6)

No Electricity in Manchuria
winstonBosan · 2024-11-19T01:11:58.661Z · comments (0)

AI Safety University Organizing: Early Takeaways from Thirteen Groups
agucova · 2024-10-02T15:14:00.137Z · comments (0)

Thoughts after the Wolfram and Yudkowsky discussion
Tahp · 2024-11-14T01:43:12.920Z · comments (13)

[question] When can I be numerate?
FinalFormal2 · 2024-09-12T04:05:27.710Z · answers+comments (3)

[link] Conventional footnotes considered harmful
dkl9 · 2024-10-01T14:54:01.732Z · comments (16)

A suite of Vision Sparse Autoencoders
Louka Ewington-Pitsos (louka-ewington-pitsos) · 2024-10-27T04:05:20.377Z · comments (0)

[link] Fictional parasites very different from our own
Abhishaike Mahajan (abhishaike-mahajan) · 2024-09-08T14:59:39.080Z · comments (0)

Improving Model-Written Evals for AI Safety Benchmarking
Sunishchal Dev (sunishchal-dev) · 2024-10-15T18:25:08.179Z · comments (0)

[link] SB 1047 gets vetoed
ryan_b · 2024-09-30T15:49:38.609Z · comments (1)

[link] A Theory of Equilibrium in the Offense-Defense Balance
Maxwell Tabarrok (maxwell-tabarrok) · 2024-11-15T13:51:33.376Z · comments (3)

A Straightforward Explanation of the Good Regulator Theorem
Alfred Harwood · 2024-11-18T12:45:48.568Z · comments (3)

[question] How should vegans think about Methionine needs?
ChristianKl · 2024-11-10T09:28:47.655Z · answers+comments (1)

Action derivatives: You’re not doing what you think you’re doing
PatrickDFarley · 2024-11-21T16:24:04.044Z · comments (0)

[link] Tokyo AI Safety 2025: Call For Papers
Blaine (blaine-rogers) · 2024-10-21T08:43:38.467Z · comments (0)

[link] "25 Lessons from 25 Years of Marriage" by honorary rationalist Ferrett Steinmetz
CronoDAS · 2024-10-02T22:42:30.509Z · comments (2)

Seeking Mechanism Designer for Research into Internalizing Catastrophic Externalities
c.trout (ctrout) · 2024-09-11T15:09:48.019Z · comments (2)

[link] A Defense of Peer Review
Niko_McCarty (niko-2) · 2024-10-22T16:16:49.982Z · comments (1)

Complete Feedback
abramdemski · 2024-11-01T16:58:50.183Z · comments (7)

[link] Foundations - Why Britain has stagnated [crosspost]
Nathan Young · 2024-09-23T10:43:20.411Z · comments (1)

[link] The Offense-Defense Balance of Gene Drives
Maxwell Tabarrok (maxwell-tabarrok) · 2024-09-27T16:47:25.976Z · comments (1)

GPT-3.5 judges can supervise GPT-4o debaters in capability asymmetric debates
Charlie George (charlie-george) · 2024-08-27T20:44:08.683Z · comments (7)

[link] The unreasonable effectiveness of plasmid sequencing as a service
Abhishaike Mahajan (abhishaike-mahajan) · 2024-10-08T02:02:55.352Z · comments (2)

Apply to the Cooperative AI PhD Fellowship by October 14th!
Lewis Hammond (lewis-hammond-1) · 2024-10-05T12:41:24.093Z · comments (0)

Launching Adjacent News
Lucas Kohorst (lucas-kohorst) · 2024-10-16T17:58:10.289Z · comments (0)

Evolution's selection target depends on your weighting
tailcalled · 2024-11-19T18:24:53.117Z · comments (22)

Boring & straightforward trauma explanation
lemonhope (lcmgcd) · 2024-11-08T09:45:19.486Z · comments (7)

[link] AI & wisdom 1: wisdom, amortised optimisation, and AI
L Rudolf L (LRudL) · 2024-10-28T21:02:51.215Z · comments (0)

Rashomon - A newsbetting site
ideasthete · 2024-10-15T18:15:02.476Z · comments (8)

Brief analysis of OP Technical AI Safety Funding
22tom (thomas-barnes) · 2024-10-25T19:37:41.674Z · comments (3)

[link] Miles Brundage: Finding Ways to Credibly Signal the Benignness of AI Development and Deployment is an Urgent Priority
Zach Stein-Perlman · 2024-10-28T17:00:18.660Z · comments (3)

[link] Day Zero Antivirals for Future Pandemics
Niko_McCarty (niko-2) · 2024-08-26T15:18:33.858Z · comments (2)

[link] AI safety tax dynamics
owencb · 2024-10-23T12:18:32.243Z · comments (0)

[link] How to choose what to work on
jasoncrawford · 2024-09-18T20:39:12.316Z · comments (6)

Plausibly Factoring Conjectures
Quinn (quinn-dougherty) · 2024-11-22T20:11:56.479Z · comments (1)

[link] Hyperpolation
Gunnar_Zarncke · 2024-09-15T21:37:00.002Z · comments (6)

Why I Think All The Species Of Significantly Debated Consciousness Are Conscious And Suffer Intensely
omnizoid · 2024-11-20T16:48:44.859Z · comments (2)

Geoffrey Hinton on the Past, Present, and Future of AI
Stephen McAleese (stephen-mcaleese) · 2024-10-12T16:41:56.796Z · comments (5)

[link] Social events with plausible deniability
Chipmonk · 2024-11-18T18:25:17.339Z · comments (22)

Boston Secular Solstice 2024: Call for Singers and Musicans
jefftk (jkaufman) · 2024-11-15T13:50:07.827Z · comments (0)

[link] on Science Beakers and DDT
bhauth · 2024-09-05T03:21:21.382Z · comments (13)

[question] What should OpenAI do that it hasn't already done, to stop their vacancies from being advertised on the 80k Job Board?
WitheringWeights (EZ97) · 2024-10-21T13:57:30.934Z · answers+comments (0)

Alignment by default: the simulation hypothesis
gb (ghb) · 2024-09-25T16:26:00.552Z · comments (39)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

williamkiely on A very strange probability paradox

My intuition was that B is bigger.

The justification was more or less the following: any time you roll until reaching two in a row, you will have also hit your second 6 at or before then. So regardless what the conditions are, $A$ must be larger than $B$ .

This seems obviously wrong. The conditions matter a lot. Without conditions that would be adequate to explain why it takes more rolls to get two 6s in a row than it does to get two 6s, but given the conditions that doesn't explain anything.

The way I think about it is that you are looking at a very long string of digits 1-6 and (for A) selecting the sequences of digits that end with two 6s in a row going backwards until just before you hit an odd number (which is not very far, since half of rolls are odd). If you ctrl+f "66" in your mind you might see that it's "36266" for a length of 4, but probably not. Half of your "66"s will be proceeded by an odd number, making half of the two-6s-in-a-row sequences length 2.

For people that didn't intuit that B is bigger, I wonder if you'd find it more intuitive if you imagine a D100 is used rather than a D6.

While two 100s in a row only happens once in 10,000 times, when they do happen they are almost always part of short sequences like "27,100,100" or "87,62,100,100" rather than "53,100,14,100,100".

On the other hand, when you ctrl+f for a single "100" in your mind and count backwards until you get another 100, you'll almost always encounter an odd number first before encountering another "100" and have to disregard the sequence. But occasionally the 100s will appear close together and by chance there won't be any odd numbers between them. So you might see "9,100,82,62,100" or "13,44,100,82,100" or "99,100,28,100" or "69,12,100,100".

Another way to make it more intuitive might be to imagine that you have to get several 100s in a row / several 100s rather than just two. E.g. For four 10ps: Ctrl+f "100,100,100,100" in your mind. Half the time it will be proceeded by an odd number for length 4, a quarter of the time it will be length 5, etc. Now look for all of the times that four 100s appear without there being any odd numbers between them. Some of these will be "100,100,100,100", but far more will be "100,32,100,100,88,100" and similar. And half the time there will be an odd number immediately before, a quarter of the time it will be odd-then-even before, etc.

kaj_sotala on Which things were you surprised to learn are not metaphors?

I have a friend with eidetic imagination who says that for her, there is literally no difference between seeing something and imagining it. Sometimes she's worried about losing track of reality if she were to imagine too much.

michael-latowicki on Missing forecasting tools: from catalogs to a new kind of prediction market

Thank you for pointing me that way. No I have not!

So I took a quick look. This looks a lot like HuggingFace. It's good to be reminded that these things exist and they do have some things in common with what I propose. As they stand, though, it's not it. Notice I'm talking about scientific models here. The mindset with which I approach this is one of theoretically-motivated, sparsely connected models, the kind you learn about when you take a university course in say, psychology or economics, not the kind you train with neural networks.

tailcalled on Benito's Shortform Feed

A possible model is that while good startups have an elevation in the "cult-factor", they have an even greater elevation in the unique factor related to the product they are building. Like SpaceX has cult-like elements but SpaceX also has Mars and Mars is much bigger than the cult-like elements, so if we define a cult to require that the biggest thing going on for them is cultishness then SpaceX is not a cult.

This is justified by LDSL (I really should write up the post explaining it...).

kaj_sotala on Which things were you surprised to learn are not metaphors?

Oh yeah, this. I used to think that "argh" or "it hurts" were just hyperbolic compliments for an excellent pun. Turns out, puns actually are painful to some people.

unexpectedvalues on Which things were you surprised to learn are not metaphors?

It took until I was today years old to realize that reading a book and watching a movie are visually similar experiences for some people!

dakara on How can we prevent AGI value drift?

What's most worrying is the fact that in your post If we solve alignment, do we die anyway? [LW · GW] you mentioned your worries about multipolar scenarios. However, I am not sure we'd be much better off in a unipolar scenario, though. If there is one group of people controlling AGI, then it might be actually even harder to make them give it up. They'd have a large amount of power and no real threat to it (no multipolar AGIs threatening to launch an attack).

However, I am not well-versed in literature on this topic, so if there is any plan for how we can safeguard ourself in such a scenario (unipolar AGI control), then I'd be very very happy to learn about it.

daniel-kokotajlo on Daniel Kokotajlo's Shortform

Maybe about a year longer? But then the METR R&D benchmark results came out around the same time and shortened them. Idk what to think now.

deepthoughtlife on LLM chatbots have ~half of the kinds of "consciousness" that humans believe in. Humans should avoid going crazy about that.

You might believe that the distinctions I make are idiosyncratic, though the meanings are in fact clearly distinct in ordinary usage, but I clearly do not agree with your misleading use of what people would be lead to think are my words and you should take care to not conflate things. You want people to precisely match your own qualifiers in cases where that causes no difference in the meaning of what is said (which makes enough sense), but will directly object to people pointing out a clear miscommunication of yours because you do not care about a difference in meaning. And you are continually asking me to give in on language regardless of how correct I may be while claiming it is better to privilege. That is not a useful approach.

(I take no particular position on physicalism at all.) Since you are a not a panpsychist, you would likely believe that consciousness is not common to the vast majority of things. That means the basic prior for if an item is conscious is, 'almost certainly not' unless we have already updated it based on other information. Under what reference class or mechanism should we be more concerned about the consciousness of an LLM than an ordinary computer running ordinary programs? There is nothing that seems particularly likely to lead to consciousness in its operating principles.

There are many people, including the original poster of course, trying to use behavioral evidence to get around that, so I pointed out how weak that evidence is.

An important distinction you seem to not see in my writing (whether because I wrote unclearly or you missed it doesn't really matter) is that when I speak of knowing the mechanisms by which an llm works is that I mean something very fundamental. We know these two things: 1)exactly what mechanisms are used in order to do the operations involved in executing the program (physically on the computer and mathematically) and 2) the exact mechanisms through which we determine which operations to perform.

As you seem to know, LLMs are actually extremely simple programs of extremely large matrices with values chosen by the very basic system of gradient descent. Nothing about gradient descent is especially interesting from a consciousness point of view. It's basically a massive use very simplified ODE solvers in a chain, which are extremely well understood and clearly have no consciousness at all if anything mathematical doesn't. It could also be viewed as just a very large number of variables in a massive but simple statistical regression. Notably, if gradient descent were related to consciousness directly, we would still have no reason to believe that an LLM doing inference rather than training would be conscious. Simple matrix math also doesn't seem like much of a candidate for consciousness either.

Someone trying to make the case for consciousness would thus need to think it likely that one of the other mechanisms in LLMs are related to consciousness, but LLMs are actually missing a great many mechanisms that would enable things like self-reflection and awareness (including a number that were included in primitive earlier neural networks such as recursion and internal loops). The people trying to make up for those omissions do a number of things to attempt to recreate it (with 'attention' being the built-in one, but also things like adding in the use of previous outputs), but those very simple approaches don't seem like likely candidates for consciousness (to me).

Thus, it remains extremely unlikely that an LLM is conscious.

When you say we don't know what mechanisms are used, you seem to be talking about not understanding a completely different thing than I am saying we understand. We don't understand exactly what each weight means (except in some rare cases that some researchers have seemingly figured out) and why it was chosen to be that rather than any number of other values that would work out similarly, but that is most likely unimportant to my point about mechanisms. This is, as far as I can tell, an actual ambiguity in the meaning of 'mechanism' that we can be talking about completely different levels at which mechanisms could operate, and I am talking about the very lowest ones.

Note that I do not usually make a claim about the mechanisms underlying consciousness in general except that it is unlikely to be these extremely basic physical and mathematical ones. I genuinely do not believe that we know enough about consciousness to nail it down to even a small subset of theories. That said, there are still a large number of theories of consciousness that either don't make internal sense, or seem like components even if part of it.

Pedantically, if consciousness is related to 'self-modeling' the implications involve it needing to be internal for the basic reason that it is just 'modeling' otherwise. I can't prove that external modeling isn't enough for consciousness, (how could I?) but I am unaware of anyone making that contention.

So, would your example be 'self-modeling'? Your brief sentence isn't enough for me to be sure what you mean. But if it is related to people's recent claims related to introspection on this board, then I don't think so. It would be modeling the external actions of an item that happened to turn out to be itself. For example, if I were to read the life story of a person I didn't realize was me, and make inferences about how the subject would act under various conditions, that isn't really self-modeling. On the other hand, in the comments to that, I actually proposed that you could train it on its own internal states, and that could maybe have something to do with this (if self-modeling is true). This is something we do not train current llms on at all though.

As far as I can tell (as someone who finds the very idea of illusionism strange), illusionism is itself not a useful point of view in regards to this dispute, because it would make the question of whether an LLM was conscious pretty moot. Effectively, the answer would be something to the effect of 'why should I care?' or 'no.' or even 'to the same extent as people.' regardless of how an LLM (or ordinary computer program, almost all of which process information heavily) works depending on the mood of the speaker. If consciousness is an illusion, we aren't talking about anything real, and it is thus useful to ignore illusionism when talking about this question.

As I mentioned before, I do not have a particularly strong theory for what consciousness actually is or even necessarily a vague set of explanations that I believe in more or less strongly.

I can't say I've heard of 'attention schema theory' before nor some of the other things you mention next like 'efference copy' (but the latter seems to be all about the body which doesn't seem all that promising a theory for what consciousness may be, though I also can't rule out that it being part of it since the idea is that it is used in self-modeling which I mentioned earlier I can't actually rule out either.).

My pet theory of emotions is that they are simply a shorthand for 'you should react in ways appropriate ways to a situation that is...' a certain way. For example (and these were not carefully chosen examples) anger would be 'a fight', happiness would be 'very good', sadness would be 'very poor' and so on. And more complicated emotions might obviously include things like it being a good situation but also high stakes. The reason for using a shorthand would be because our conscious mind is very limited in what it can fit at once. Despite this being uncertain, I find this a much more likely than emotions themselves being consciousness.

I would explain things like blindsight (from your ipsundrum link) through having a subconscious mind that gathers information and makes a shorthand before passing it to the rest of the mind (much like my theory of emotions). The shorthand without the actual sensory input could definitely lead to not seeing but being able to use the input to an extent nonetheless. Like you, I see no reason why this should be limited to the one pathway they found in certain creatures (in this case mammals and birds). I certainly can't rule out that this is related directly to consciousness, but I think it more likely to be another input to consciousness rather than being consciousness.

Side note, I would avoid conflating consciousness and sentience (like the ipsundrum link seems to). Sensory inputs do not seem overly necessary to consciousness, since I can experience things consciously that do not seem related to the senses. I am thus skeptical of the idea that consciousness is built on them. (If I were really expounding my beliefs, I would probably go on a diatribe about the term 'sentience' but I'll spare you that. As much as I dislike sentience based consciousness theories, I would admit them as being theories of consciousness in many cases.)

Again, I can't rule out global workspace theory, but I am not sure how it is especially useful. What makes a globabl workspace conscious that doesn't happen in an ordinary computer program I could theoretically program myself? A normal program might take a large number of inputs, process them separately, and then put it all together in a global workspace. It thus seems more like a theory of 'where does it occur' than 'what it is'.

'Something to do with electrical flows in the brain' is obviously not very well specified, but it could possibly be meaningful if you mean the way a pattern of electrical flows causes future patterns of electrical flows as distinct from the physical structures the flows travel through.

Biological nerves being the basis of consciousness directly is obviously difficult to evaluate. It seems too simple, and I am not sure whether there is a possibility of having such a tiny amount of consciousness that then add up to our level of consciousness. (I am also unsure about whether there is a spectrum of consciousness beyond the levels known within humans).

I can't say I would believe a slime mold is conscious (but again, can't prove it is impossible.) I would probably not believe any simple animals (like ants) are either though even if someone had a good explanation for why their theory says the ant would be. Ants and slime molds still seem more likely to be conscious to me than current LLM style AI though.

mishka on mishka's Shortform

METR releases a report, Evaluating frontier AI R&D capabilities of language model agents against human experts: https://metr.org/blog/2024-11-22-evaluating-r-d-capabilities-of-llms/

Daniel Kokotajlo and Eli Lifland both feel that one should update towards shorter timelines remaining until the start of rapid acceleration via AIs doing AI research based on this report:

https://x.com/DKokotajlo67142/status/1860079440497377641

https://x.com/eli_lifland/status/1860087262849171797