LessWrong 2.0 Reader
Nice!! I don't know much about that moisturizer but the rest looks good to me
rosiecam on Which skincare products are evidence-based?
Seems like the evidence is overwhelmingly in favor of sunscreen. The studies I've seen against it generally don't seem to address the obvious confounder: people who tend to wear sunscreen more are also the ones whose lifestyle involves being in the sun a lot more.
rosiecam on Which skincare products are evidence-based?
Dermatica prompts you to send them photos every few months so they can check how your skin is reacting, but it's also convenient because you can look back and see the improvement.
logan-zoellner on an effective ai safety initiative
It's not trying to address present harms, it's trying to address future harms, which are the important ones.
A real AI system that kills literally everyone will do so by gaining power/resources over a period of time. Most likely it will do so the same way existing bad-agents accumulate power and resources.
Unless you're explicitly committing to the Diamondoid bacteria thing, stopping hacking is stopping AI from taking over the world.
logan-zoellner on an effective ai safety initiative
Point taken. "$$$" was not the correct framing (if we're specifically talking about the Gwern story). I will edit to say "it accumulates 'resources'".
The Gwern story has faster takeoff than I would expect (especially if we're talking a ~GPT4.5 autoGPT agent), but the focus on money vs just hacking stuff is not the point of my essay.
bogdan-ionut-cirstea on Mechanistically Eliciting Latent Behaviors in Language Models
In future work, one could imagine automating the evaluation of the coherence and generalization of learned steering vectors, similarly to how Bills et al. (2023) automate interpretability of neurons in language models. For example, one could prompt a trusted model to produce queries that explore the limits and consistency of the behaviors captured by unsupervised steering vectors.
Probably even better to use interpretability agents (e.g. MAIA, AIA) for this, especially since they can do (iterative) hypothesis testing.
fabien-roger on Fabien's Shortform
I also listened to How to Measure Anything in Cybersecurity Risk 2nd Edition by the same author. It had a huge amount of overlapping content with The Failure of Risk Management (and the non-overlapping parts were quite dry), but I still learned a few things:
I'd like to find a good resource that explains how red teaming (including intrusion tests, bug bounties, ...) can fit into a quantitative risk assessment.
the-gears-to-ascension on Biorisk is an Unhelpful Analogy for AI Risk
"In other words, AI risk looks at least as bad as bio risk, but in many ways much worse."
Agree, but I think trying to place these things, along with computer security, in a semantically meaningful hand-designed multidimensional space of factors is probably a useful exercise. Your axes of comparison are an interesting starting point.
bogdan-ionut-cirstea on Bogdan Ionut Cirstea's Shortform
I wonder how much near-term interpretability [V]LM agents (e.g. MAIA, AIA) might help with finding better probes and better steering vectors (e.g. by iteratively testing counterfactual hypotheses against potentially spurious features, a major challenge for Contrast-consistent search (CCS)).
This seems plausible since MAIA can already find spurious features, and feature interpretability [V]LM agents could have much lengthier hypothesis-iteration cycles (compared to current [V]LM agents and perhaps even to human researchers).
cousin_it on Accidental Electronic Instrument
Crosstalk is definitely a problem, e-drums and pads have it too. But are you sure the tradeoff is inescapable? Here's a thought experiment: imagine the tines sit on separate pads, or on the same pad but far from each other. (Or physically close, but sitting on long rods or something, so that the distance through the connecting material is large.) Then damping and crosstalk can be small at the same time. So maybe you can reduce damping but not increase crosstalk, by changing the instrument's shape or materials.