LessWrong 2.0 Reader
I wasn't able to find the full video on the site you linked, but I found it here, if anyone else has the same issue:
the-gears-to-ascension on Deep Honesty
Being able to credibly commit to doing this at appropriate times seems useful. I wouldn't want to commit to doing it at all times; becoming cooperatebot makes it rational for cooperative-but-preference-misaligned actors to exploit you. Shallow honesty seems like a good starting point for being able to say when you are attempting deep honesty, perhaps. But, for example, I would sure appreciate it if people could be less deeply honest about the path to AI capabilities. I do think the "deeply honest at the meta level" idea has some promise.
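To make the cooperatebot point concrete, here is a minimal iterated prisoner's dilemma sketch (the payoff numbers and strategies are illustrative assumptions, not anything from the comment): an unconditional cooperator is maximally exploited by a defector, while even a simple retaliating strategy caps the loss.

```python
# Toy iterated prisoner's dilemma; payoff values are standard
# illustrative ones, chosen here as an assumption.
PAYOFFS = {  # (my_move, their_move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def play(strategy_a, strategy_b, rounds=100):
    """Run the iterated game; each strategy sees the opponent's history."""
    score_a = score_b = 0
    history_a, history_b = [], []
    for _ in range(rounds):
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        score_a += PAYOFFS[(move_a, move_b)]
        score_b += PAYOFFS[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b

cooperate_bot = lambda opp: "C"                    # unconditional cooperation
defector = lambda opp: "D"                         # exploits cooperatebot
tit_for_tat = lambda opp: opp[-1] if opp else "C"  # retaliates after a defection

print(play(cooperate_bot, defector))  # (0, 500): fully exploited
print(play(tit_for_tat, defector))    # (99, 104): retaliation caps the damage
```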
mikbp on Extra Tall Crib
> at that point you should just move to something optimized for being easy to get in and out of, like a bed
Yes, yes, exactly. Isn't it much more practical to put her on a bed/mattress on the floor? That's what we do, just using the mattress from the crib.
abhimanyu-pallavi-sudhir on Some Experiments I'd Like Someone To Try With An Amnestic
It's extremely high immediate value -- it solves IP rights entirely.
It's the barbed wire for IP rights.
yori-92 on Can Kauffman's NK Boolean networks make humans swarm?
Nice! I actually had this as a loose idea in the back of my mind for a while: a network of people connected like this who signal their track of the day to each other, which could be actual fun. It is a feasible use case as well. The underlying reasoning is also that (at least for me) I would be more open to adopting an idea from a person with whom I feel a shared sense of collectivity than from an algorithm that thinks it knows me. Intrinsically, I want such an algorithm to be wrong, for the sake of my own autonomy :)
The way I see it, the relevance for alignment is to ask: what do we actually mean when saying that two intelligent agents are aligned? Are you and I aligned if we would make the same decision in a trolley problem? If we motivate our decisions in the same way? Or if we just don't kill each other? None of these are meaningful indicators of two people being aligned, let alone humans and AI, and with unreliable indicators, will we ever succeed in solving the issue? I'd say two agents are aligned when one agent's most rewarding decision also benefits the other. Generalizing and scaling that alignment to many situations and many agents/people necessitates a 'theory of mind' mechanism, as well as a way to keep certain properties invariant under scaling and translation in complex networks. This is really a physicist's way of thinking about the problem, and I am only slowly picking up the language that others in the AI/alignment fields use.
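That criterion can be stated as a small check. Here is a toy formalization (the action names, payoffs, and zero baseline are made-up assumptions): two agents count as aligned, in this narrow sense, when every reward-maximizing action for one also leaves the other better off than its baseline.

```python
# Toy version of "aligned = A's most rewarding decision also benefits B".
# Payoff tables and the zero baseline are illustrative assumptions.

def aligned(payoff_a, payoff_b, baseline_b=0.0):
    """True if every reward-maximizing action for A also benefits B."""
    best = max(payoff_a.values())
    best_actions = [a for a, r in payoff_a.items() if r == best]
    return all(payoff_b[a] > baseline_b for a in best_actions)

payoff_a = {"share": 4, "hoard": 5, "destroy": 1}
payoff_b = {"share": 3, "hoard": -2, "destroy": -5}

print(aligned(payoff_a, payoff_b))  # False: A's best action ("hoard") harms B
```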
jacques-thibodeau on jacquesthibs's Shortform
Do we expect future model architectures to be biased toward out-of-context reasoning (reasoning internally rather than in a chain-of-thought)? As in, what kinds of capabilities would lead companies to build models that reason less and less in token-space?
I mean, the first obvious thing is cost: you are training the model to internalize some of the reasoning rather than paying for the additional tokens each time you want to do complex reasoning.
The thing is, I expect we'll eventually move away from just relying on transformers with scale. And so I'm trying to refine my understanding of the capabilities that are simply bottlenecked in this paradigm, and that model builders will need to resolve through architectural and algorithmic improvements. (Of course, based on my previous posts, I still think data is a big deal.)
Anyway, this kind of thinking eventually leads to the infohazardous area of, "okay then, what does the true AGI setup look like?" This is really annoying because it has alignment implications. If we start to move increasingly towards models that are reasoning outside of token-space, then alignment becomes harder. So, are there capability bottlenecks that eventually get resolved through something that requires out-of-context reasoning?
So far, it seems like the current paradigm will not be an issue on this front. Keep scaling transformers, and you don't really get any big changes in the model's likelihood of using out-of-context reasoning.
This is not limited to out-of-context reasoning. I'm trying to get a better understanding of the (dangerous) properties future models may develop simply as a result of needing to break a capability bottleneck. My worry is that many people over-index on the current transformer+scale paradigm (which may prove insufficient for ASI), and so don't work on the right kinds of alignment or governance projects.
---
I'm unsure how big of a deal this architecture will end up being, but the rumoured xLSTM just dropped. It seemingly outperforms other models at the same size.
Maybe it ends up just being another drop in the bucket, but I think we will see more attempts in this direction.
Claude summary:
The key points of the paper are:
- Exponential gating with memory mixing, which lets the model revise what it has stored instead of being locked into past writes.
- A matrix memory in place of the scalar memory cell, which increases storage capacity.
- A reformulation that removes the sequential bottleneck, making training parallelizable.
This work is important because it presents a path forward for scaling LSTMs to billions of parameters and beyond. By overcoming key limitations of vanilla LSTMs - the inability to revise storage, limited storage capacity, and lack of parallelizability - xLSTMs are positioned as a compelling alternative to transformers for large language modeling.
Instead of doing all computation step-by-step as tokens are processed, advanced models might need to store and manipulate information in a compressed latent space, and then "reason" over those latent representations in a non-sequential way.
The exponential gating with memory mixing introduced in the xLSTM paper directly addresses this need: exponential gates let the model sharply revise what it has already stored, and memory mixing lets information flow between memory cells through the recurrent connections, so the recurrent state can act as a workspace rather than a strict transcript of the token sequence.
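To make the mechanism concrete, here is a minimal NumPy sketch of an sLSTM-style recurrence step with exponential gating and the stabilizer state, written from my reading of the paper; the weight layout, function signature, and omission of biases are my own simplifying assumptions, not the paper's reference code.

```python
import numpy as np

def slstm_step(x, h, c, n, m, W, R):
    """One sLSTM-style step. c: cell state, n: normalizer state,
    m: stabilizer state; W/R: dicts of input/recurrent weight matrices."""
    def pre(gate):  # pre-activation for a gate
        # R provides the memory mixing: the hidden state feeds back into every gate.
        return W[gate] @ x + R[gate] @ h

    z = np.tanh(pre("z"))                  # candidate cell input
    i_tilde, f_tilde = pre("i"), pre("f")  # log-space gate pre-activations
    o = 1.0 / (1.0 + np.exp(-pre("o")))    # ordinary sigmoid output gate

    # Stabilizer state keeps the exponentials from overflowing.
    m_new = np.maximum(f_tilde + m, i_tilde)
    i_gate = np.exp(i_tilde - m_new)       # exponential input gate
    f_gate = np.exp(f_tilde + m - m_new)   # exponential forget gate

    c_new = f_gate * c + i_gate * z        # revisable cell state
    n_new = f_gate * n + i_gate            # normalizer tracks total gate mass
    h_new = o * (c_new / n_new)            # normalized hidden state
    return h_new, c_new, n_new, m_new
```

The point of exponential (rather than sigmoid) gates is that a large input-gate pre-activation can effectively overwrite old memory in a single step, which is what "revising storage" means here.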
In this way, the xLSTM takes a significant step towards the kind of "reasoning outside token-space" that I suggested would be important for highly capable models. The memory acts as a workspace for flexible computation that isn't strictly tied to the input token sequence.
Now, this doesn't mean the xLSTM is doing all the kinds of reasoning we might eventually want from an advanced AI system. But it demonstrates a powerful architecture for models to store and manipulate information in a latent space, at a more abstract level than individual tokens. As we scale up this approach, we can expect models to perform more and more "reasoning" in this compressed space rather than via explicit token-level computation.
joseph-bloom on Announcing Neuronpedia: Platform for accelerating research into Sparse Autoencoders
Neuronpedia has an API (copying from a message Johnny recently wrote to someone else):
"Docs are coming soon but it's really simple to get JSON output of any feature. just add "/api/feature/" right after "neuronpedia.org".for example, for this feature: https://neuronpedia.org/gpt2-small/0-res-jb/0
the JSON output of it is here: https://www.neuronpedia.org/api/feature/gpt2-small/0-res-jb/0
(both are GET requests so you can do it in your browser)note the additional "/api/feature/"i would prefer you not do this 100,000 times in a loop though - if you'd like a data dump we'd rather give it to you directly."
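For anyone scripting against this, here is a minimal sketch of the URL pattern described above, using the example feature from the quoted message (the use of `requests` is my assumption; any HTTP client works):

```python
import requests

# Fetch one feature as JSON by inserting "/api/feature/" after the domain,
# per the quoted message. Identifiers below are the message's own example.
FEATURE_URL = "https://www.neuronpedia.org/api/feature/gpt2-small/0-res-jb/0"

response = requests.get(FEATURE_URL, timeout=30)
response.raise_for_status()
feature = response.json()
print(sorted(feature.keys()))  # inspect which fields the API returns
```

As the message says, don't hammer this endpoint in a large loop; ask for a data dump instead.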
Feel free to join the OSMI slack and post in the Neuronpedia or Sparse Autoencoder channels if you have similar questions in the future :) https://join.slack.com/t/opensourcemechanistic/shared_invite/zt-1qosyh8g3-9bF3gamhLNJiqCL_QqLFrA
yori-92 on The Social Impact of Trolley Problems
Although I somewhat agree with the comment about style, I feel the point you're making deserves more enthusiasm. How well-recognized is this trolley-problem fallacy? The way I see it, the energy spent on thinking about the trolley problem in isolation illustrates innate human short-sightedness, and perhaps a clear limit of human intelligence as well. 'Correctly' solving one trolley problem does not prevent you or someone else from being confronted with the next. My line of argument is that ethical decision-making requires an agent to also have a proper 'theory of mind': if I make this decision, what decision will the next person or agent have to deal with? If my car with four passengers chooses to avoid running over five people and hits just one, could it also put an oncoming car in the position of choosing between a collision with 8 people and evading and killing 5? And of course: whose decisions resulted in the trolley problem I'm currently facing, and what is their responsibility? I recently contributed a piece that is essentially about the propagating consequences of decisions, and I'm curious how it will be received. Could this be a bit of a blind spot in ethics and/or AI safety? Given the situations we've gotten ourselves into as a society, I feel this is also an area in which humans can very easily be outsmarted...
johannes-c-mayer on Atoms to Agents Proto-Lectures
I made a slightly improved version that adds subtitles and skips silence.
johannes-c-mayer on Applied Linear Algebra Lecture Series
Made a slightly improved version.