LessWrong 2.0 Reader

Recent comments

nim on nim's Shortform

I've found an interesting "bug" in my cognition: a reluctance to rate subjective experiences on a subjective scale useful for comparing them. When I fuzz this reluctance against many possible rating scales, I find that it seems to arise from the comparison-power itself.

The concrete case is that I've spun up a habit tracker on my phone and I'm trying to build a routine of gathering some trivial subjective-wellbeing and lifestyle-factor data into it. My prototype of this system includes tracking the high and low points of my mood through the day as recalled at the end of the day. This is causing me to interrogate the experiences as they're happening to see if a particular moment is a candidate for best or worst of the day, and attempt to mentally store a score for it to log later.

I designed the rough draft of the system with the ease of it in mind -- I didn't think it would induce such struggle to slap a quick number on things. Yet I find myself worrying more than anticipated about whether I'm using the scoring scale "correctly", whether I'm biased by the moment to perceive the experience in a way that I'd regard as inaccurate in retrospect, and so forth.

Fortunately it's not a big problem, as nothing particularly bad will happen if my data is sloppy, or if I don't collect it at all. But it strikes me as interesting, a gap in my self-knowledge that wants picking-at like peeling the inedible skin away to get at a tropical fruit.

matthew-barnett on AI Regulation is Unsafe

non-consensually killing vast amounts of people and their children for some chance of improving one's own longevity.

I think this misrepresents the scenario since AGI presumably won't just improve my own longevity: it will presumably improve most people's longevity (assuming it does that at all), in addition to all the other benefits that AGI would provide the world. Also, both potential decisions are "unilateral": if some group forcibly stops AGI development, they're causing everyone else to non-consensually die from old age, by assumption.

I understand you have the intuition that there's an important asymmetry here. However, even if that's true, I think it's important to strive to be accurate when describing the moral choice here.

amalthea on AI Regulation is Unsafe

I think the perspective that you're missing regarding 2. is that by building AGI one is taking the chance of non-consensually killing vast amounts of people and their children for some chance of improving one's own longevity.

Even if one thinks it's a better deal for them, a key point is that you are making the decision for them by unilaterally building AGI. So in that sense it is quite reasonable to see it as an "evil" action to work towards that outcome.

jam_brand on dirk's Shortform

Here's an example for you: I used to turn the faucet on while going to the bathroom, thinking it was simply due to a preference for somewhat masking the sound of my elimination habits from my housemates. Then one day I walked into the bathroom listening to something-or-other via earphones and forgot to turn the faucet on, only to realize about halfway through that I apparently didn't much care about such masking. Previously, being able to hear myself just seemed to trigger some minor anxiety that I'd failed to recognize, though its absence was indeed quite recognizable: no aural self-perception, no further problem (except for a brief bit of disorientation from the mental whiplash of being suddenly confronted with the reality that in a small way I wasn't quite the person I thought I was), not even now on the rare occasion that I do end up thinking about such things mid-elimination anyway.

faul_sname on LLMs seem (relatively) safe

Or to point to a situation where LLMs exhibit unsafe behavior in a realistic usage scenario. We don't say

a problem with discussions of fire safety is that a direct counterargument to "balloon-framed wood buildings are safe" is to tell arsonists the best way that they can be lit on fire

faul_sname on Duct Tape security

BTW as a concrete note, you may want to sub in 15 - ceil(log10(n)) instead of just "15", which really only matters if you're dealing with numbers above 10 (e.g. 1000 is represented as 0x408F400000000000, while the next float 0x408F400000000001 is 1000.000000000000114, which differs in the 13th decimal place).
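The numbers in this comment can be checked directly. Below is a minimal Python sketch, not the code under discussion: `float_to_bits`, `bits_to_float`, `safe_decimals`, and `roughly_equal` are hypothetical helper names, and clamping to 15 places for magnitudes at or below 1 is an assumption of the sketch rather than part of the suggestion.

```python
import math
import struct


def float_to_bits(x: float) -> int:
    """Return the IEEE 754 binary64 bit pattern of x as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def bits_to_float(bits: int) -> float:
    """Return the float whose IEEE 754 binary64 bit pattern is bits."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def safe_decimals(n: float, total_digits: int = 15) -> int:
    """Decimal places to compare at: total_digits - ceil(log10(|n|)).

    Clamping for |n| <= 1 is an assumption of this sketch.
    """
    if abs(n) <= 1:
        return total_digits
    return total_digits - math.ceil(math.log10(abs(n)))


def roughly_equal(a: float, b: float) -> bool:
    """Compare two floats after rounding to a magnitude-aware number of places."""
    places = safe_decimals(max(abs(a), abs(b)))
    return round(a, places) == round(b, places)


x = 1000.0
nxt = bits_to_float(float_to_bits(x) + 1)  # next representable double above 1000

print(hex(float_to_bits(x)))   # bit pattern of 1000.0
print(nxt - x)                 # gap between adjacent doubles near 1000
print(round(nxt, 13) == x)     # rounding at 13 places keeps the two apart
print(roughly_equal(nxt, x))   # rounding at 15 - ceil(log10(n)) places merges them
```

Near 1000 the gap between adjacent doubles is 2^-43, roughly 1.14e-13, so a fixed cutoff tuned for numbers near 1 is too fine there, while the magnitude-aware cutoff of 12 places treats the two neighbours as equal.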

jason-gross on Sparsify: A mechanistic interpretability research agenda

We propose a simple fix: Use $L_{0 < p < 1}$ instead of $L_{1}$, which seems to be a Pareto improvement over $L_{1}$ (at least in some real models, though results might be mixed) in terms of the number of features required to achieve a given reconstruction error.

When I was discussing better sparsity penalties with Lawrence, and the fact that I observed some instability in $L_{0 < p < 1}$ in toy models of super-position, he pointed out that the gradient of $L_{0 < p < 1}$ norm explodes near zero, meaning that features with "small errors" that cause them to have very small but non-zero overlap with some activations might be killed off entirely rather than merely having the overlap penalized.

See here for some brief write-up and animations.
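To make the exploding-gradient point concrete, here is a short derivation. It assumes the penalty acts elementwise as $|x|^{p}$ on each activation $x$, which is an assumption about the exact form and may differ from the agenda's definition.

$$\frac{d}{dx}\,|x|^{p} = p\,|x|^{p-1}\,\operatorname{sign}(x), \qquad x \neq 0.$$

For $0 < p < 1$ the exponent $p - 1$ is negative, so

$$\left|\frac{d}{dx}\,|x|^{p}\right| = \frac{p}{|x|^{1-p}} \to \infty \quad \text{as } x \to 0,$$

whereas for $p = 1$ the derivative is $\operatorname{sign}(x)$, with magnitude 1 everywhere away from zero. Under this assumption, a very small but non-zero activation gets an arbitrarily large push toward zero under $L_{0 < p < 1}$ but only a constant push under $L_{1}$.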

kingsupernova on Duct Tape security

Hmm, interesting. The exact choice of decimal place at which to cut off the comparison is certainly arbitrary, and that doesn't feel very elegant. My thinking is that within the constraint of using floating point numbers, there fundamentally isn't a perfect solution. Floating point notation changes some numbers into other numbers, so there are always going to be some cases where number comparisons are wrong. What we want to do is define a problem domain and check if floating point will cause problems within that domain; if it doesn't, go for it, if it does, maybe don't use floating point.

In this case my fix solves the problem for what I think is the vast majority of the most likely inputs (in particular it solves it for all the inputs that my particular program was going to get), and while it's less fundamental than e.g. using arbitrary-precision arithmetic, it does better on the cost-benefit analysis. (Just like how "completely overhaul our company" addresses things on a more fundamental level than just fixing the structural simulation, but may not be the best fix given resource constraints.)

The main purpose of my example was not to argue that my particular approach was the "correct" one, but rather to point out the flaws in the "multiply by an arbitrary constant" approach. I'll edit that line, since I think you're right that it's a little more complicated than I was making it out to be, and "trivial" could be an unfair characterization.

justismills on LLMs seem (relatively) safe

Maybe worth a slight update on how the AI alignment community would respond? Doesn't seem like any of the comments on this post are particularly aggressive. I've noticed an effect where I worry people will call me dumb when I express imperfect or gestural thoughts, but it usually doesn't happen. And if anyone's secretly thinking it, well, that's their business!

andrew-burns on Andrew Burns's Shortform

So the usual refrain from Zvi and others is that the specter of China beating us to the punch with AGI is not real because of limits on compute, etc. I think Zvi has tempered his position on this in light of Meta's promise to release the weights of its 400B+ model. Now there is word that SenseTime just released a model that beats GPT-4 Turbo on various metrics. Of course, maybe Meta chooses not to release its big model, and maybe SenseTime is bluffing--I would point out, though, that Alibaba's Qwen model seems to do pretty okay in the arena... Anyway, my point is that I don't think the "what if China" argument can be dismissed as quickly as some people on here seem ready to do.
