Posts

What software projects would be helpful for different groups? 2020-03-28T07:50:19.686Z

Comments

Comment by davelaing on AI Will Not Want to Self-Improve · 2023-05-17T04:05:06.391Z · LW · GW

I might have missed something, but it looks to me like the first ordering might be phrased as though the self-improvement and the risk aversion are actually happening simultaneously.

If an AI had the ability to self-improve for a couple of years before it developed risk aversion, for instance, I think we end up in the "maximal self improvement" / "high risk" outcomes.

This seems like a big assumption to me:

But self-improvement additionally requires that the AI be aware that it is an AI and be able to perform cutting-edge machine learning research. Thus, solving self-improvement appears to require more, and more advanced, capabilities than apprehending risk.

If an AI has enough resources and is doing the YOLO version of self-improvement, it doesn't seem like it necessarily requires much in the way of self-awareness or risk apprehension - particularly if it is willing to burn resources on the task. If you ask a current LLM how to take over the world, it says things that appear like "evil AI cosplay" - I could imagine something like that leading to YOLO self-improvement that has some small risk of stumbling across a gain that starts to compound.

There seem to be a lot of big assumptions in this piece, doing a lot of heavy lifting. Maybe I've gotten more used to LW-style conversational norms about tagging things as assumptions, and it's actually fine? My gut instinct is something like "all of these assumptions stack up to target this at a really thin slice of reality, and I shouldn't update much on it directly".

Comment by davelaing on A TAI which kills all humans might also doom itself · 2023-05-17T01:00:52.899Z · LW · GW

This is the kind of thing that has been in my head as a "nuclear meltdown rather than nuclear war" sort of outcome. I've been pondering what the largest bad outcome might be that requires the least increase over the capabilities we have today.

A Big Bad scenario I've been mentally poking is "what happens if the internet went away, and stayed away?". I'd struggle to communicate, inform myself about things, pay for things. I can imagine it would severely degrade the various businesses / supply chains I implicitly rely on. People might panic. It seems like it would be pretty harmful.

That scenario assumes an AI capable enough to seize, for example, most of the compute in the big data centers, enough of the internet to secure communication between them, and enough power to keep them all running.

There are plenty of branches from there.

Maybe it is smart enough to realize that it would still need humans, and bargain. I'm assuming a strong enough AI would bargain in ways that more or less mean it would get what it wanted.

The "nuclear meltdown" scenario is way at the other end. A successor to ChaosGPT cosplays at being a big bad AI without having to think through the extended consequences and tries to socially engineer or hack its way to control of a big chunk of compute / communications / power - as per the cosplay. The AI is successful enough to cause dire consequences for humanity. Later on it, when it realizes that it needs some maintenance done, it reaches out to the appropriate people, no one is there to pick up the phone - which doesn't work anyway - and eventually it falls to all of the bits that were still relying on human input.

I'm trying not to anchor on the concrete details. I saw a lot of discussion trying to specifically rebut the nanotech parts of Eliezer's points, which seemed kind of backwards? Or not embodying what I think of as security mindset?

The point, as I understood it, is that something smarter than us could take us down with a plan that is very smart - possibly to the point that it sounds like science fiction, or at least that we wouldn't reliably predict it in advance. Playing Whack-A-Mole with the examples doesn't help you, because you're not trying to secure yourself against a small, finite set of examples. To win, you need to come up with something that prevents the disasters you hadn't specifically thought about.

So I'm still trying to zoom out. What is the most harm that might plausibly be caused by the weakest system? I'm still finding the area of the search space at the intersection of "capable enough to cause harm" and "not capable enough to avoid hurting the AI's own interests" interesting, because that seems like it might come up sooner than some other scenarios.

Comment by davelaing on Contra Yudkowsky on AI Doom · 2023-04-24T02:35:37.462Z · LW · GW

The little pockets of cognitive science that I've geeked out about - usually in the predictive processing camp - have featured researchers who are either quite surprised by, or going to great lengths to double underline, the importance of language and culture in our embodied / extended / enacted cognition.

A simple version of the story I have in my head is this: We have physical brains thanks to evolution, and then, by being embodied predictive perception/action loops out in the world, we started transforming our world into affordances for new perceptions and actions. Things took off when language became a thing - we could transmit categories and affordances and all kinds of other highly abstract language things in ways that are surprisingly efficient for brains and have really high leverage for agents out in the world.

So I tend towards viewing our intelligence as resting on both our biological hardware and on the cultural memplexes we've created and curated and make use of pretty naturally, rather than just on our physical hardware. My gut sense - which I'm up for updates on - is that for the more abstract cognitive stuff we do, a decently high percentage of the fuel is coming from the language+culture artifact we've collectively made and nurtured.

One of my thoughts here is (and leaning heavily on metaphor to point at an idea, rather than making a solid concrete claim): maybe that makes arguments about the efficiency of the human brain less relevant here?

If you can run the abstract cultural code on different hardware, then looking at the tradeoffs made could be really interesting - but I'm not sure what it tells you about scaling floors or ceilings. I'd be particularly interested in whether running that cultural code on a different substrate opens the doors to glitches that are hard to find or patch, or to other surprises.

The shoggoth meme that has been going around also feels like it applies. If an AI can run our cultural code, that is a good chunk of the way to effectively putting on a human face for a time. Maybe it actually has a human face, maybe it is just wearing a mask. So far I haven't seen arguments that tilt me away from thinking of it like a mask.

For me, it doesn't seem to imply that LLMs are or will remain a kind of "child of human minds". As far as I know, almost all we know is how well they can wear the mask. I don't see how it follows that the way they think/behave/do what they do would necessarily grow and evolve in human-like ways if they were scaled up or given enough agency to reach for more resources.

I guess this is my current interpretation of "alien mind space". Maybe lots of really surprising things can run our cultural code - in the same way that people have ported the game Doom to all kinds of surprising substrates, that have weird overlaps and non-overlaps with the original hardware the game was run on.

Comment by davelaing on Best arguments against the outside view that AGI won't be a huge deal, thus we survive. · 2023-03-28T03:06:59.531Z · LW · GW

Motivation: I'm asking this question because one thing I notice is that there's the unstated assumption that AGI/AI will be a huge deal, and how much of a big deal would change virtually everything about LW works, depending on the answer. I'd really like to know why LWers hold that AGI/ASI will be a big deal.

 

This is confusing to me. 

I've read lots of posts on here about why AGI/AI would be a huge deal, and the ones I'm remembering seemed to do a good job at unpacking their assumptions (or at least a better job than I would do by default). It seems to me like those assumptions have been stated and explored at great length, and I'm wondering how we've ended up looking at the same site and getting such different impressions. 

(Holden's posts seem pretty good at laying out a bunch of things and explicitly tagging the assumptions as assumptions, as an example.)

Although that... doesn't feel fair on my part? 

I've spent some time at the AI Risk for Computer Scientists workshops, and I might have things I learned from those and things I've learned from LessWrong mixed up in my brain. Or maybe they prepared me to understand and engage with the LW content in ways that I otherwise wouldn't have stumbled onto?

There are a lot of words on this site - and some really long posts. I've been browsing them pretty regularly for 4+ years now, and that doesn't seem like a burden I'd want to place on someone in order to listen to them. I'm sure I'm missing stuff that the longer term folks have soaked into their bones.

Maybe there's something like a "y'all should put more effort into collation and summary of your points if you want people to engage" point that falls out of this? Or something about "have y'all created an in-group, and to what extent is that intentional/helpful-in-cases vs accidental?"

Comment by davelaing on Best arguments against the outside view that AGI won't be a huge deal, thus we survive. · 2023-03-28T02:39:25.737Z · LW · GW

It seems - at least to me - like the argumentation around AI and alignment would be a good source of new beliefs, since I can't figure it all out on my own. People also seem to be figuring out new things fairly regularly.

Between those two things, I'm struggling to understand what it would be like to assert a static belief of "field X doesn't matter" in a way that is reasonably grounded in what is coming out of field X, particularly as field X evolves.

Like, if I believe that AI Alignment won't matter much and I use that to write off the field of AI Alignment, it feels like I'm either pre-emptively ignoring potentially relevant information, or I'm making a claim that I have some larger grounded insights into how the field is confused.

I get that we're all bounded and don't have the time or energy or inclination to engage with every field and every argument within those fields. If the claim was something like "I don't see AI alignment as a personal priority to invest my time/energy in" that feels completely fine to me - I think I would have nodded and kept scrolling rather than writing something.

Worrying about where other people are spending their energy is also fine! If it were me, I'd want to be confident that I knew something they'd all missed; otherwise I'd be in a failure mode I sometimes get into, where I'm on a not-so-well-grounded hamster wheel of worrying.

I guess I'm trying to tease apart the cases where you are saying "I have a belief that I'm not willing to spend time/energy to update" vs "I also believe that no updates are coming and so I'm locking in my current view based on that meta-belief". 

I'm also curious!

If you've seen something that would tip my evidential scales the whole way to "the field is built on sketchy foundations, with probability that balances out the expected value of doom if AI alignment is actually a problem", then I'd really like to know! Although I haven't seen anything like that yet.

And I'm also curious about what prongs I might be missing around the "people following their expected values to prevent P(doom) look like folks who were upset about nothing in the timelines where we all survived to be having after-the-fact discussions about them" ;)

Comment by davelaing on Best arguments against the outside view that AGI won't be a huge deal, thus we survive. · 2023-03-28T01:13:50.633Z · LW · GW

Meta: I might be reading some of the question incorrectly, but my impression is that it lumps "outside views about technology progress and hype cycles" together with "outside views about things people get doom-y about".

If it is about "people being doom-y" about things, then I think we are more playing in the realm of things where getting it right on the first try or first few tries matter.

Expected values seem relevant here. If people think there is a 1% chance of a really bad outcome and try to steer against it, then even if they are correct you are going to see 99 people pointing at things that didn't turn out to be a big deal for every 100 times this comes up. And if that 1 other person actually stopped something bad from happening, we're much less likely to remember the time that "a bad thing failed to happen because it was stopped a few causal steps early".
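
To spell out the arithmetic (these numbers are mine, purely for illustration), here's a minimal sketch in Haskell:

    -- Purely illustrative numbers: 100 independent warnings, each about a
    -- genuine 1% risk that people then try to steer against.
    expectedDisasters :: Double
    expectedDisasters = fromIntegral warnings * riskPerWarning
      where
        warnings       = 100 :: Int
        riskPerWarning = 0.01
    -- expectedDisasters works out to 1.0: even with perfectly calibrated
    -- warnings, roughly 99 out of 100 will look like "worrying about nothing"
    -- after the fact.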

There also seems to be a thing there where the doom-y folks are part of the dynamic equilibrium. My mind goes to nuclear proliferation and climate change.

Folks got really worried about us all dying in a global nuclear war, and that hasn't happened yet, and so we might be tempted to conclude that the people who were worried were just panicking and were wrong. It seems likely to me that some part of the reason we didn't all die in a global nuclear war was that people were worried enough about it to collectively push over some unknowable-in-advance line, where that led to enough coordination to at least stop things going terminally bad on short notice. Even then, we've still had wobbles.

If the general response to the doom-y folks back then had been "Nah, it'll be fine", delivered with enough skill / volume / force to cause people to stop waving their warning flags and generally stop trying to do things, my guess is that we might have had much worse outcomes.

Comment by davelaing on [deleted post] 2023-03-23T23:09:49.422Z

I've split this off into its own comment, to talk a little more about how I've found Kegan-related things useful for myself.

I'm skeptical that global stages are actually real, and I still think there is plenty of use to be had from the thinking and theory behind it. I treat it as a lossy model, and I still find it helpful.

An example of something suggested by Kegan's theories that has helped me: communicating across sizeable developmental gaps or subject/object divides is really difficult, and if you can spot that in advance you can route around some trouble. 

One of Kegan's students - Jennifer Garvey Berger - wrote a book about applying this stuff in a corporate context called "Changing on the Job" which I like. In it, she mentions that it is really common that people who are operating at around Kegan's 4th level in a corporate context often ask people who report to them to "show more initiative". If those direct reports are operating at around Kegan's 3rd level, they're just not going to properly understand what is being pointed to by the word "initiative". More words don't seem to help. Everyone gets frustrated. There are things you can do in the meantime, but they definitely weren't obvious to me.

I've also been using Kegan's levels to orient myself around the discussions of Frame Control.

Assuming I'm understanding what is being pointed at with Frame Control:

  • I would guess that someone around the 3rd level would be particularly vulnerable to Frame Control. 
    • They're largely picking up their framing from those around them in a way that isn't legible to them. 
    • They also wouldn't be able to properly understand the concept of Frame Control - even if they could repeat and talk about the definition. It would be like the conceptual equivalent of word salad. 
  • I would guess that someone around the 4th level would find it really threatening. 
    • They've just gained the ability to spot frames and intentionally frame things for themselves, and they've become aware of how many frames were being placed for them by their surrounding culture in ways that were previously invisible to them. 
    • One of Kegan's findings was that when people reached particular developmental milestones and could then see what they were missing earlier in time, they had really strong averse feelings to the idea of regressing.
  • I would guess that people past the 4th level would have really mixed views on the benefits and risks of attempting such a thing, and that you'd have to be in this group to be able to consistently and skillfully attempt to control the frames of those around you.
    • This would also imply that it would be fairly rare to come across people who can skillfully control framing, unless you were hanging out in contexts that were biased towards including a higher proportion of people operating at this level.

I've enjoyed diving into the Kegan related resources just because they're so mind-bending and "other" from what I'd previously come across. I initially spread the word in my local circles, just to share the enjoyment of how "other" they were. It was a bit later on that I started finding them useful.

These days I get more use from the developmental model from Basseches and Mascolo's "Psychotherapy as a developmental process" - who are skeptical of global stages - but I still find myself using Kegan's ideas fairly often as well, and I don't think I would have properly appreciated Basseches and Mascolo without spending a bunch of time with Kegan first.

I don't think everyone needs to know about these things though. I ran into rationality and Kegan just before I became a manager for the first time and geeked out about everything I could reach, and it took lots of time and energy to get to the point where I was getting mileage out of it. My context made that time and energy worthwhile. Other people may not be in contexts where it pays off.

I mostly wanted to paint a bit of a word-picture of how I think I've had Kegan pay rent for me.

Comment by davelaing on [deleted post] 2023-03-23T22:41:48.627Z

This reads to me like you're making a universal claim that these things aren't useful - based on "Some of these concepts are useful. Some aren't" and "I recommend evicting from your thoughts".

If that is your claim, I'd like to see lots more evidence or argument to go along with it - enough to balance the scales against the people who have been claiming to find these things useful.

If what you are saying is more that you don't find them useful yourself, or that you are skeptical of other people's claims that they are getting use out of these things, that is another matter entirely! Although in this case I'm left wondering why your call to action is "people should stop using these things" rather than "could people explain to me how they get use out of these things?"

Personally, I've had wins from thinking things through - in advance - when using the concepts of Stag Hunts and Kegan levels. All of the instances I can remember were while I was managing teams of people, so maybe they have different amounts of usefulness in different contexts?

Comment by davelaing on Omicron Post #6 · 2021-12-15T22:37:35.244Z · LW · GW

It looks like there might be an Omicron variant which doesn't have the S gene dropout [1]. I'm wondering how that might impact various modelling efforts, but haven't had time to think it through.

[1] https://www.abc.net.au/news/2021-12-08/qld-coronavirus-covid-omicron-variant/100682280

Comment by davelaing on What are good resources for learning functional programming? · 2019-07-05T02:03:41.962Z · LW · GW

Most of my resources are Haskell related.

If you are new to programming, I usually recommend "How to Design Programs". It is the only text I know of that seems to teach people how to design programs, rather than expecting that they'll work it out themselves based on writing code for a few years.

For a starting point for programmers, I usually recommend the Spring 2013 version of CIS194 - "Introduction to Haskell" - from UPenn. The material is good quality and it has great homework. Our meetup group relayed the lectures, so there are videos available here.

"Introduction to Functional Programming using Haskell (2nd Edition)" by Richard Bird is also really good if you want to get some hands on experience with using equational reasoning to prove things about programs, or to partially synthesize programs from their specifications. It is aimed at undergraduates, but is more advanced than CIS194.

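To give a flavour of what that equational reasoning looks like (this is my own toy example, not one taken from the book), here is the standard proof that mapping twice is the same as mapping once with a composed function:

    -- A local copy of map, so the equations below refer to code we can see.
    map' :: (a -> b) -> [a] -> [b]
    map' _ []     = []
    map' f (x:xs) = f x : map' f xs

    -- Claim: map' f (map' g xs) = map' (f . g) xs, by induction on xs.
    --
    -- Case []:
    --   map' f (map' g []) = map' f [] = [] = map' (f . g) []
    --
    -- Case (x:xs), assuming the claim holds for xs:
    --   map' f (map' g (x:xs))
    --     = map' f (g x : map' g xs)       -- definition of map'
    --     = f (g x) : map' f (map' g xs)   -- definition of map'
    --     = (f . g) x : map' (f . g) xs    -- definition of (.) and the induction hypothesis
    --     = map' (f . g) (x:xs)            -- definition of map'

The book works through a lot of material in this style, building from small lemmas like this one towards synthesizing programs from their specifications.
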
"Parallel and Concurrent Programming in Haskell" by Simon Marlow is great for doing more applied work with Haskell, but I'd do CIS194 first.

I currently work for a group that runs free FP courses. There is an introductory course here that can be done on your own but is challenging - we only cover a subset of the content when we actually run the course. There is an applied course here that is easier to tackle without an instructor, but requires that you're comfortable with the concepts from what we teach during the introductory course.

Comment by davelaing on Open Thread May 2019 · 2019-05-03T06:34:25.594Z · LW · GW

Thanks! I'm not on Facebook, but I have reached out to the not-very-active Slate Star Codex meetup folks and hope to have a chat with them about what meetup options would work for them. I'll talk to some of my collaborators about reaching out to the Facebook group.

Comment by davelaing on Open Thread May 2019 · 2019-05-03T01:30:11.338Z · LW · GW

Hi all. My name is Dave. I recently went along to some AI Risk for Computer Scientists workshops, consequently read Rationality: AI to Zombies, HPMOR and The Codex, and have been generally playing with CFAR tools and slowly thinking more and more AI safety related thoughts.

A few coworkers have also been along to those workshops, and some other people in my various circles have been pretty interested in the whole environment, and so I'm currently polling a few people for interest in setting up a LessWrong meetup in Brisbane, Australia. I'm looking forward to seeing what comes of that.

I've also ramped up my lurking on LessWrong itself, and so hopefully you'll see me in the comments section whenever I next feel like I have something interesting to add :)