LessWrong 2.0 Reader
In terms of the big labs being inefficient: with hindsight, perhaps. Anyway, as I have said, I can't understand why they aren't putting much more effort into Dishbrain etc. If I had ~$1B and wanted to get ahead on a 5-year timescale, I would give it much more weight in probability/expected-value terms.
I’m probably typical-minding a bit here, but: you say you have had mental health issues in the past (which, based on how you describe them, sound at least superficially similar to my own), and that you feel like you’ve outlived yourself. Which, although it is a feeling I recognise, is still a surprising thing to say: even a high P(doom) only tells you that your life might soon have to stop, not that it already has! My wild-ass guess would be that, in addition to maybe having something to prove intellectually and psychologically, you feel lost, with the ability to do things (btw, I didn’t know your blog and it’s pretty neat) but nothing in particular to do. Maybe you’re considering finishing your degree because it gives you a medium-term goal with some structure in the tasks associated with it?
aaron-bergman on quila's Shortform
Thank you, that is all very kind! ☺️☺️☺️
I expect if he continues being what he is, he'll produce lots of cool stuff which I'll learn from later.
I hope so haha
jett on Transformers Represent Belief State Geometry in their Residual Stream
For the two sets of mess3 parameters I checked, the stationary distribution was uniform.
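(For what it's worth, here is a minimal sketch of that kind of check; the transition matrix below is a hypothetical doubly-stochastic stand-in, not the actual mess3 parameters from the post.)

```python
import numpy as np

# Hypothetical mess3-style 3-state transition matrix (rows sum to 1).
# The real mess3 parameters are in the post; this is just an illustration.
T = np.array([
    [0.80, 0.10, 0.10],
    [0.10, 0.80, 0.10],
    [0.10, 0.10, 0.80],
])

# The stationary distribution pi solves pi @ T = pi, i.e. it is the left
# eigenvector of T (eigenvector of T.T) with eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(T.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi /= pi.sum()
print(pi)  # -> [0.333... 0.333... 0.333...], i.e. uniform
```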
ben-lang on Losing Faith In Contrarianism
Nice post. Gets at something real.
My feeling is that a lot of contrarians get "pulled into" a more contrarian view. I have noticed myself, in discussions, proposing a specific technical point correcting a detail of a particular model. Then, when I talk to people about it, I feel like they are trying to pull me towards the simpler position ("all those idiots are wrong, it's completely different from that"). Sometimes this happens directly, with things like "ah, so you mean...". But it also happens through a much more subtle process: I talk to many people, most of whom go away thinking "OK, a specific technical correction on a topic I don't care about that much" and never talk or think about it again. But the people who come away with the exaggerated version are the ones more likely to remember it.
russellthor on Against "argument from overhang risk"
If you are referring to this:
If we institute a pause, we should expect to see (counterfactually) reduced R&D investment in improving hardware capabilities, reduced investment in scaling hardware production, reduced hardware production, reduced investment in research, reduced investment in supporting infrastructure, and fewer people entering the field.
This seems an extreme claim to me (if these effects are argued to be meaningful), especially "fewer people entering the field"! Just how long do you think a pause would need to last to make fewer people enter the field? I would expect that not only would the pause have to last, say, 5+ years, but there would also have to be a worldwide expectation that it would go on for longer to actually put people off.
Because of flow-on effects and existing commitments, reduced hardware R&D investment wouldn't start for a few years either. It's not clear that it would meaningfully happen at all, given that we also want to deploy existing LLMs everywhere. For example, in robotics I expect there will be substantial demand for hardware even without AI advances, as our current capabilities haven't been deployed there yet.
As I have said here, and probably in other places, I am quite a bit more in favor of going directly for a pause on the most advanced hardware specifically. I think it is achievable, impactful, and has clearer positive consequences (and fewer unintended negative ones) than targeting training runs of an architecture that already seems to be showing diminishing returns.
If you must go after FLOPS for training, then build in large factors of safety for architectures/systems that are substantially different from what is currently done. I am not worried about unlimited FLOPS on GPT-X, but I could be at >100× less for something that clearly looks like it has very different scaling laws.
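(To illustrate the kind of rule I mean, here is a sketch; the cap and factor are purely hypothetical numbers of my own, not a real policy proposal.)

```python
# Hypothetical compute-cap rule: cap training FLOPS, with a large safety
# factor for architectures that look substantially different from today's.
KNOWN_ARCH_CAP_FLOPS = 1e26     # assumed cap for GPT-style transformers
NOVEL_ARCH_SAFETY_FACTOR = 100  # ">100x less" for very different scaling laws

def training_run_allowed(training_flops: float, is_novel_architecture: bool) -> bool:
    """Return True if a training run fits under the (hypothetical) cap."""
    cap = KNOWN_ARCH_CAP_FLOPS
    if is_novel_architecture:
        cap /= NOVEL_ARCH_SAFETY_FACTOR
    return training_flops <= cap

print(training_run_allowed(5e25, is_novel_architecture=False))  # True: under the GPT-X cap
print(training_run_allowed(5e25, is_novel_architecture=True))   # False: over the reduced cap
```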
connor-kissane on Sparse Autoencoders Work on Attention Layer Outputs
Thanks for the comment! We always use the pre-ReLU feature activation, which is equal to the post-ReLU activation (given that the feature is active), and is a purely linear function of z. Edited the post for clarity.
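(To make the linearity point concrete, a minimal sketch assuming the standard SAE encoder form f(z) = ReLU(W_enc(z − b_dec) + b_enc); the shapes and names here are mine, not the post's.)

```python
import torch

# Assumed dimensions for an SAE on (flattened) attention head outputs z.
d_model, d_sae = 512, 2048
W_enc = torch.randn(d_sae, d_model) / d_model**0.5
b_enc = torch.zeros(d_sae)
b_dec = torch.zeros(d_model)

z = torch.randn(d_model)             # attention layer output
f_pre = W_enc @ (z - b_dec) + b_enc  # pre-ReLU activation: purely linear (affine) in z
f_post = torch.relu(f_pre)           # post-ReLU activation

# Wherever a feature is active (f_pre > 0), the two are equal.
active = f_pre > 0
assert torch.allclose(f_post[active], f_pre[active])
```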
soren-elverlin-1 on AstralCodexTen / LessWrong Meetup
You are very welcome, and I think you'll fit right in. It's quite a coincidence that you're interested in documentary productions, as a documentary producer is visiting us for the first hour.
There's a symbolic "AI Box" to contain AI discussion. I'd like to talk about RUF and the transportation infrastructure of Dath Ilan with you, but I usually end up in the box no matter what I do. :)
no77e-noi on Feeling (instrumentally) Rational
Eliezer decided to apply the label "rational" to emotions resulting from true beliefs. I think this is an understandable way to apply that word. I don't think you and Eliezer disagree about anything substantive except the application of that label.
That said, your point about keeping the label "rational" for things strictly related to the fundamental laws regulating beliefs is good. I agree it might be a better way to use the word.
My reading of Eliezer's choice is this: you use the word "rational" for the laws themselves. But you also use the word "rational" for beliefs and actions that are correct according to the laws (e.g., "It's rational to believe x!"). In the same way, you can also use the word "rational" for emotions directly caused by rational beliefs, whatever those emotions might be.
About the instrumental rationality part: if you are strict about only applying the word "rational" to the laws of thinking, then you shouldn't use it to describe emotions even when you are talking about instrumental rationality, although I agree that usage is closer to the original meaning, since there isn't the additional causal step; it's closer in the way that "rational belief" is closer to the original meaning. But note that this holds only insofar as you can control your emotions and treat them on the same level as actions. Otherwise, it would be like saying "the state of the world x that helps me achieve my goals is rational", which I haven't heard anywhere.
philosophicalsoul on Ilya Sutskever and Jan Leike resign from OpenAI
In my opinion, a class action filed by all employees allegedly prejudiced (I say 'allegedly' here, reserving the right to change 'prejudiced' in the event that new information arises) by the NDAs and gag orders would be very effective.
Were they to seek termination of these agreements on the basis of public interest in an arbitral tribunal, rather than in a court or through internal bargaining, the ex-employees would be far more likely to get compensation. The litigation costs for legal practitioners there also tend to be far lower.
Again, this assumes that the agreements they signed didn't also waive the right to class-action arbitration. If OpenAI does have agreements this cumbersome, I am worried about the ethics of everything else they are pursuing.
For further context, see: