LessWrong 2.0 Reader
david-fendrich on Which skincare products are evidence-based?
David Sinclair mentioned in a podcast that he is also a bit worried about the long-term anabolic effects of the retinoids. He suggested cycling it, possibly synchronized with other catabolic cycling such as fasting.
Can you say more? What are "anabolic effects"? What does "cycling" mean in this context?
tailcalled on Ironing Out the Squiggles
(…Unless you do conditional sampling of a learned distribution, where you constrain the samples to be in a specific a-priori-extremely-unlikely subspace, in which case sampling becomes isomorphic to optimization in theory. (Because you can sample from the distribution of (reward, trajectory) pairs conditional on high reward.))
Does this isomorphism actually go through? I know decision transformers kinda-sorta show how you can do optimization-through-conditioning in practice, but in theory the loss function which you use to learn the distribution doesn't constrain the results of conditioning off-distribution, so I'd think you're mainly relying on having picked a good architecture which generalizes nicely out of distribution.
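A minimal sketch of the conditioning-as-optimization move, with a fitted joint Gaussian over (reward, action) pairs standing in for a learned model; the linear reward and all the numbers here are illustrative assumptions, not anything from the decision-transformer literature:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "trajectories": a 1-D action with a noisy reward.
actions = rng.normal(size=10_000)
rewards = 2.0 * actions + rng.normal(scale=0.5, size=10_000)

# "Learn" the joint distribution of (reward, action) -- here just a Gaussian fit.
data = np.stack([rewards, actions], axis=1)
mu = data.mean(axis=0)
cov = np.cov(data, rowvar=False)

# Condition on an a-priori-unlikely high reward via the Gaussian conditioning
# formula: x | r ~ N(mu_x + C_xr/C_rr * (r - mu_r), C_xx - C_xr**2/C_rr).
r_high = 6.0
cond_mean = mu[1] + cov[1, 0] / cov[0, 0] * (r_high - mu[0])
cond_std = np.sqrt(cov[1, 1] - cov[1, 0] ** 2 / cov[0, 0])

samples = rng.normal(cond_mean, cond_std, size=5)
print(samples)  # clustered around actions ~2.8, i.e. the high-reward region
```

Because the conditional mean tracks the high-reward region, drawing samples given a high reward behaves like running an optimizer over actions; the open question above is whether anything like this guarantee survives when the conditioning is off-distribution for a learned network rather than exact for a known joint.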
tailcalled on Ironing Out the Squiggles
One could reply, "Oh, sure, it's obvious that you can conditionally sample a learned distribution to safely do all sorts of economically valuable cognitive tasks, but that's not the danger of true AGI." And I ultimately think you're correct about that. But I don't think the conditional-sampling thing was obvious in 2004.
Idk. We already knew that you could use basic regression and singular-vector methods to do lots of economically valuable tasks, since that was being done in 2004. Conditional sampling "just" adds noise around these sorts of methods, so it stands to reason that this might work too.
Adding noise obviously doesn't matter in 1 dimension, except to make the outcomes worse. The reason we use it for e.g. images is that noise does matter in high-dimensional spaces: without it, you end up with the highest-probability outcome, which is out of distribution [LW · GW]. So in a way it seems like a relatively minor fix that generalizes something we already knew was profitable in lots of cases.
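To see the high-dimensional point concretely, a standard-Gaussian check (toy code, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

# The mode (highest-probability point) of a standard Gaussian is the origin,
# but samples concentrate at distance ~sqrt(d) from it as dimension d grows.
for d in (1, 10, 1000):
    x = rng.normal(size=(10_000, d))
    print(d, np.linalg.norm(x, axis=1).mean())
# Roughly 0.80, 3.08, and 31.6: in 1 dimension typical samples sit near the
# mode, but in 1000 dimensions the mode is nowhere near a typical sample.
```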
On the other hand, I didn't learn the probability thing until I played with some neural-network ideas for outlier detection and found that they didn't work. So in that sense it's literally true that it wasn't obvious (to a lot of people) back before deep learning took off.
And I can't deny that people were surprised that neural networks could learn to do art. To me this became relatively obvious with early GANs, which were later than 2004 but earlier than most people updated on it.
So basically I don't disagree but in retrospect [LW · GW] it doesn't seem that shocking [LW · GW].
cousin_it on Let's Design A School, Part 1
If a student is genuinely acting in bad faith—attending a class and ruining it for their peers—then they should be removed from the class and sent to a counselor/social worker.
The number of such students is larger than you think. But the more important question is what the social worker would do with the student - what tools would be available to them. Because by default the student will just disrupt another class tomorrow and so on. There isn't any magic method to make disruptive students non-disruptive; schools would love to have access to such magic if it existed.
richard_kennaway on AI #62: Too Soon to Tell
Is it possible that ethics-motivated laws will strange generative AI
"Strangle"?
florian_dietz on Mechanistic Interpretability Workshop Happening at ICML 2024!
Would a tooling paper be appropriate for this workshop?
I wrote a tool that helps ML researchers to analyze the internals of a neural network: https://github.com/FlorianDietz/comgra
It is not directly research on mechanistic interpretability, but it could be useful to many people working in the field.
review-bot on Shutting down AI is not enough. We need to destroy all technology.
The LessWrong Review [? · GW] runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2024. The top fifty or so posts are featured prominently on the site throughout the year. Will this post make the top fifty?
tailcalled on Mechanistically Eliciting Latent Behaviors in Language Models
Fair, its eigenvectors should be equivalent to the singular vectors of the Jacobian.
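Whatever "it" refers to in the parent, the standard fact being invoked is that the eigenvectors of JᵀJ are the right singular vectors of J; a quick numpy check with a random stand-in Jacobian:

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.normal(size=(5, 3))  # random stand-in for a Jacobian

# Right singular vectors of J (rows of Vt, ordered by descending singular value).
_, _, Vt = np.linalg.svd(J)

# Eigenvectors of J^T J; eigh returns ascending eigenvalues, so reverse the order.
eigvals, eigvecs = np.linalg.eigh(J.T @ J)
eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]

# The two bases agree up to sign.
print(np.allclose(np.abs(eigvecs), np.abs(Vt.T)))  # True
```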
mir on European Soylent alternatives
some metabolic pathways cannot be done at the same time
Have you updated on this since you made this comment (I ask to check whether I should invest in doing a search)? If not, do you now recall any specific examples?