LessWrong 2.0 Reader
there is strong reluctance from employees to reveal that LLMs have boosted productivity and/or automated certain tasks.
The thing with "boosting productivity" is tricky, because productivity is not a linear thing. For example, in software development, using a new library can make adding new features faster (more functionality out of the box), but fixing bugs slower (more complexity involved, especially behind the scenes).
So what I would expect to happen is that there is a month or two with exceptionally few bugs, the team velocity is measured and announced as a new standard, deadlines are adjusted accordingly, then a few bugs happen and now you are under a lot more pressure than before.
Similarly, with LLMs it will be difficult to explain to non-technical management that they happen to be good at some kinds of tasks but worse at other kinds. There is also a loss of control: for some reason that you do not understand, the LLM has a problem with the specific task that was assigned to you, and you are blamed for that.
buck on Buck's ShortformWhen I said "AI control is easy", I meant "AI control mitigates most risk arising from human-ish-level schemers directly causing catastrophes"; I wasn't trying to comment more generally. I agree with your concern.
nim on How would you navigate a severe financial emergency with no help or resources?You're here, which tells me you have internet access.
I mentally categorize options like Fiverr and mturk as "about as scammy as DoorDash". I don't think they're a good option, but I don't think DoorDash is a very good option either. It's probably still worth looking into online gig economy options.
What skills were you renting to companies before you became a stay-at-home parent? There are probably online options to rent the same skills to others around the world.
You write fluently in English and it sounds like English is your first language. Have you considered renting your linguistic skills to people with English as a second language? You may be able to find wealthy international people who value your proof-reading skills on their college work, or conversational skills to practice their spoken English with gentle correction as needed. It won't pay competitively with the tech industry, but it'll pay more than nothing.
If you're in excellent health, the classic "super weird side gig" is stool donor programs. https://www.lesswrong.com/posts/i48nw33pW9kuXsFBw/being-a-donor-for-fecal-microbiota-transplants-fmt-do-good [LW · GW] for more.
Another weird one that depends on your age and health and bodily situation, since you've had more than 0 kids of your own, is gestational surrogacy. Maybe not a good fit, but hey, you asked for weird.
You mention that your kids are in the picture. This suggests a couple options:
Have you contacted social services to find out what options are available to support kids whose parents are in situations like yours? You probably qualify for food stamps, and there may be options for insurance, kids' clothing, etc through municipal or school programs. If your kids are in school, asking whatever school district employee you have the best personal rapport with is an excellent starting point.
What do childcare prices look like in your area? Do you have friends who are parents and need childcare? Can you rent your time to other parents to provide childcare for their kids at a rate lower than their other options? This may or may not be feasible depending on your living situation.
It is pretty plausible to me that AI control is quite easy
I think it depends on how you're defining an "AI control success". If success is defined as "we have an early transformative system that does not instantly kill us– we are able to get some value out of it", then I agree that this seems relatively easy under the assumptions you articulated.
If success is defined as "we have an early transformative system that does not instantly kill us and we have enough time, caution, and organizational adequacy to use that system in ways that get us out of an acute risk period", then this seems much harder.
The classic race dynamic threat model seems relevant here: Suppose Lab A implements good control techniques on GPT-8 and is trying very hard to get good alignment techniques out of GPT-8 in order to align a successor GPT-9. However, Lab B is only ~2 months behind, so Lab A feels like it needs to figure all of this out within 2 months. Lab B– either because it's less cautious or because it feels it needs to cut corners to catch up– either doesn't implement the control techniques at all, or implements them but is less cautious about when it's ready to scale up to GPT-9.
I think it's fine to say "the control agenda is valuable even if it doesn't solve the whole problem, and yes other things will be needed to address race dynamics otherwise you will only be able to control GPT-8 for a small window of time before you are forced to scale up prematurely or hope that your competitor doesn't cause a catastrophe." But this has a different vibe than "AI control is quite easy", even if that statement is technically correct.
(Also, please do point out if there's some way in which the control agenda "solves" or circumvents this threat model– apologies if you or Ryan has written/spoken about it somewhere that I missed.)
cate-hall on Which skincare products are evidence-based?I live and die by hyaluronic acid. It doesn’t create permanent changes AFAIK but makes a massive difference for me day to day — plus or minus 5 years depending.
dagon on Johannes C. Mayer's Shortform"Mathematical descriptions" is a little ambiguous. Equations and models are terse. The mapping of such equations to human-level system expectations (anticipated conditional experiences) can require quite a bit of verbosity.
I think that's what you're saying with the "algorithms and data structures" part, but I'm unsure if you're claiming that the property specification of the math is sufficient as a description, and comparable in fidelity to the algorithmic implementation.
algon on How to write Pseudocode and why you shouldAn example of you writing pseudocode would've helped a great deal, especially if it illustrated what you thought was a core skill.
chris_leong on Visible Thoughts Project and Bounty AnnouncementThis comes out to ~600 pages of text per submission, which is extremely far beyond anything that current technology could leverage. Current NLP systems are unable to reason about more than 2048 tokens at a time, and handle longer inputs by splitting them up. Even if we assume that great strides are made in long-range attention over the next year or two, it does not seem plausible to me to anticipate SOTA systems in the near future to be able to use this dataset to its fullest.
It's interesting to come across this comment in 2024 given how much things have changed already.
viliam on Let's Design A School, Part 2.1 School as Education - StructureI like this a lot! I think you did a great job explaining how the details are connected.
At the root, the problem is "we cannot teach everyone individually". We do not have enough teachers for that; and the computer solutions are not good enough yet. (Perhaps soon they will get good enough, at least in the sense of "everyone gets their own AI tutor, and there are still human teachers as a backup". But we are not there yet.) Many things that are unpleasant about schools were invented as a solution to "how to teach 300 kids using only 30 teachers, especially when most of them - both kids and teachers - are not very bright". These solutions seem like a local maximum (we already made many small improvements that worked in isolation), but it also seems like we could do much better with a greater redesign.
Another sad constraint is that many students would be unwilling to cooperate even with a much better designed system. Any solution needs to provide answers for what to do about students who will try their hardest to undermine the system, no matter how irrational such behavior may seem to us. Kids, especially at puberty, often try to impress their peers by doing various destructive and self-destructive things. Assume that every school will have some bullies, some kids who want to hide in a place out of sight and use drugs, etc.
andy-arditi on Mechanistically Eliciting Latent Behaviors in Language ModelsAwesome work, and nice write-up!
One question that I had while reading the section on refusals: